Kuribo64
04-26-24 10:04 PM
Main - General Chat - The theory behind reverse engineering files


SuperMario64DS
Posted on 01-20-16 02:22 AM (rev. 3 of 01-20-16 07:33 AM) Link | #67110
I'm overwhelmed by the number of possible scenarios on each front. Any given value could mean almost anything, or only make sense once you view the data as one particular kind of pattern.

Is there a logical approach? As in, methods for approaching a problem and working through it? What should one be familiar with before tackling different areas (audio, image data, etc.)?

What I'm working with is an older PSX title. Not much to say other than that I've discovered basic information about the level layout and the control scheme (it's too horrid to describe: what happens is governed in part by the assigned animation).

The typical issue is that I can locate data, but can't get anything out of it. Are the '.ani' files animation files? Are the '.mod' files mesh data? In both cases, yes, but it's difficult to derive anything from either when they're in a large archive file that may or may not use a form of compression conceived by the office intern.

Clearly there is a method to this madness (the number of moddable games speaks for itself), but I have not found it. When looking into something new, such as a format, what should you look for, how do you work through it, and how do you test results?

Arisotura
Posted on 01-20-16 09:38 PM Link | #67120
I don't really know how to explain it, but I guess a first thing is to know what you're looking for, under which form it could appear, etc...

For example, if you're looking at an image file and trying to figure out the format, it will likely be one of the formats the console hardware supports. You can try using an emulator (with sufficient tools) to determine which format is used. A good emu/debugger would also help you find out where the data comes from and how it's read-- chances are it's compressed, and in that case, you'll have to figure that compression out. Some games will just use standard compression algorithms, others not-- Nintendo loves inventing new LZ77 variants every time for example.

This also applies to audio data.
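
To give an idea of what "figuring the compression out" ends up looking like, here is a minimal sketch of a decompressor for the LZ77 layout used by the GBA/DS BIOS (type byte 0x10), as described in public documentation such as GBATEK. This is only one of the variants mentioned above; a given game (and certainly a PSX title) may use something else entirely. The function name is made up for this example.

```python
def decompress_lz10(data: bytes) -> bytes:
    """Decompress a buffer in the GBA/DS BIOS LZ77 layout (type 0x10).

    Layout assumed here (from public docs, not from any specific game):
      byte 0      : 0x10
      bytes 1..3  : decompressed size, little-endian
      then groups : 1 flag byte followed by 8 items, flag bits read MSB first
                    bit 0 -> one literal byte
                    bit 1 -> two bytes: 4-bit (length - 3), 12-bit (distance - 1)
    """
    if len(data) < 4 or data[0] != 0x10:
        raise ValueError("not an LZ77 type-0x10 stream")
    out_size = int.from_bytes(data[1:4], "little")
    out = bytearray()
    pos = 4
    while len(out) < out_size:
        flags = data[pos]
        pos += 1
        for bit in range(7, -1, -1):
            if len(out) >= out_size:
                break
            if flags & (1 << bit):
                b0, b1 = data[pos], data[pos + 1]
                pos += 2
                length = (b0 >> 4) + 3
                distance = (((b0 & 0x0F) << 8) | b1) + 1
                if distance > len(out):
                    raise ValueError("back-reference points before start of output")
                # Copy byte by byte so overlapping references repeat correctly.
                for _ in range(length):
                    out.append(out[-distance])
            else:
                out.append(data[pos])
                pos += 1
    return bytes(out[:out_size])


# Hand-built stream: literals "ABC", then "copy 9 bytes from 3 bytes back".
sample = bytes([0x10, 0x0C, 0x00, 0x00, 0x10, 0x41, 0x42, 0x43, 0x60, 0x02])
print(decompress_lz10(sample))
```

When a format is unknown, a quick test is to run candidate decompressors like this one over the data and check whether the declared size and the output look plausible.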

However, things that are more high-level, like level formats or scripts, will have their own format and don't depend on any hardware, so figuring it out might be harder.

Finding blocks of data requires looking at the file in a hex editor and trying to find data patterns. You can try subtly modifying the data and seeing what happens, and from there guess what it's supposed to be. An emulator with a debugger also helps. Once you find blocks of data, you have to work out the rest of the file structure, namely, what tells where the blocks are, etc...
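
The "what tells where the blocks are" step can be partly automated. Below is a minimal sketch, assuming a little-endian target (true for both the DS and the PSX): once you know the offset or ID of a block, search the rest of the file (or a RAM dump) for that number stored as a 16-bit or 32-bit value. The function name and parameters are invented for this example.

```python
import struct


def find_value(data: bytes, value: int, width: int = 4, align: int = 1) -> list[int]:
    """Return every offset where `value` is stored as a little-endian integer.

    width: size of the stored integer in bytes (2 or 4).
    align: only report offsets that are multiples of this (tables are often aligned).
    """
    fmt = {2: "<H", 4: "<I"}[width]
    needle = struct.pack(fmt, value)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        if pos % align == 0:
            hits.append(pos)
        pos = data.find(needle, pos + 1)  # step by 1 so overlapping hits are kept
    return hits


# Tiny fake file: the value 0x1234 appears at offset 8 (aligned) and 13 (unaligned).
blob = bytes(8) + struct.pack("<I", 0x1234) + b"\x00" + struct.pack("<I", 0x1234)
print(find_value(blob, 0x1234))           # all hits
print(find_value(blob, 0x1234, align=4))  # aligned hits only
```

Hits that sit next to other plausible offsets or IDs are good candidates for a header or table; isolated hits are usually coincidence.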


As an example, here's how I worked out SM64DS's level format back then. I knew which files contained the level's models, collision maps and minimaps, but didn't know where the level headers were. ("level headers" basically define all sorts of parameters, like which model the course uses, where the objects are, etc.)

I don't remember it well, but I was able to find a table mapping filenames to internal file IDs. From that, I had the internal IDs of some course model files. Looking for them (probably modifying desmume to log when they appear, or using the RAM search) revealed the level headers, and from there, the various object lists. All that junk is stored in overlays, interleaved with code and other data (mostly level-specific objects (yes)), making it a pain to edit.

(how does SM64DSe make it easier to work with? it cheats, it adds new overlays, moves the level data there, and modifies the game code to read from there; the old overlays are still loaded so that their junk stays usable, etc)

____________________
NSMBHD - Kafuka - Jul
melonDS the most fruity DS emulator there is

zafkflzdasd

SuperMario64DS
Posted on 01-22-16 10:29 PM Link | #67146
Thanks for the insight. I'm shocked by the lack of information on subjects such as these given how widespread these types of communities are.

If I may inquire...

Posted by StapleButter
For example, if you're looking at an image file and trying to figure out the format, it will likely be one of the formats the console hardware supports...


If I understand your implications correctly, how are these formats figured out beforehand? I'm aware of instances in which devkits and documents are leaked, but in communities such as this, I do wonder how so many of the image and audio formats have been cracked (bti, brstm, etc). Are these generally based on, or similar enough to, 'standard' variants to be accessible, or have they found their way into the hands of modders by other means?

Posted by StapleButter
However, things that are more high-level, like level formats or scripts, will have their own format and don't depend on any hardware, so figuring it out might be harder.


Conversely, I've had a much easier time understanding these because they're not just 'audio?', 'images?', 'mesh?', but rather along the lines of co-ords, ids, unless it's designed to be cryptic.

Posted by StapleButter
(how does SM64DSe make it easier to work with? it cheats, it adds new overlays, moves the level data there, and modifies the game code to read from there; the old overlays are still loaded so that their junk stays usable, etc)


I never understood Nintendo's thinking more clearly than after the release of Super Mario Maker: slap limitations on 'just because'. The fact that they seem to deliberately write this line of code:

if (notAllowedInLevel(object->id, level->id))
    die();

...Irks me. Who cares if you want to use a static boat object on a course that's not Daisy Circuit?!

Unrelated, I see this term 'overlay' referred to here on occasion. Is it a synonym? Are you referring to act selection? Sections of the file?

---

I'm aware of a few debuggers, but I've heard they're dodgy. I'm not sure if anything sufficient exists for the PSX, but I'll search around.

In my favor, the PSX has a MIPS R3000-series processor which... Yes, you guessed it, is the predecessor to the MIPS R4000 series, used by the N64. Anything designed to disassemble N64 binaries -should- work in this case.

Arisotura
Posted on 01-22-16 11:07 PM Link | #67147
Nintendo doesn't go out of their way to add limitations. They code what they need, and keep in mind that most of those games aren't designed to be tampered with in the first place. Nintendo is known for hardcoding things... but to be honest, every game is going to be more or less of a pile of hacks. Quality varies. Rayman Origins Wii is a mess for example, and uses weirdass formats.

In the case of NSMBDS and SM64DS, the object limitations aren't deliberately coded in. They use a system known as overlays to save RAM. The objects' code/data are organized in these overlays, and the level header indicates which overlays should be loaded through a system of banks (each bank corresponds to a RAM address, the overlays in it are those which are set to load at that address).

This way, many objects not required for a given level are just not loaded at all.

And if you try using objects not loaded, well, it doesn't work. (NSMB ignores them, SM64DS crashes)

In SM64DS, the object bank system bleeds into the level system-- some objects only appear in one level, thus they're stored in the same overlay as the level's header.
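
A toy model of the bank system described above may make it clearer why the "limitations" fall out of the design rather than being coded in. This is only a sketch: the class names, IDs and the idea of an always-available "common" object set are assumptions for illustration, not the games' actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlay:
    overlay_id: int
    bank: int                # bank = the fixed RAM address this overlay loads at
    objects: frozenset       # object IDs whose code/data live in this overlay


class OverlayManager:
    def __init__(self, overlays, common_objects=()):
        self.overlays = {o.overlay_id: o for o in overlays}
        self.common = set(common_objects)  # assumed: objects that are always loaded
        self.loaded = {}                   # bank -> Overlay currently occupying it

    def load_level(self, bank_settings):
        """bank_settings maps bank -> overlay ID, like a level header would."""
        self.loaded.clear()
        for bank, overlay_id in bank_settings.items():
            overlay = self.overlays[overlay_id]
            if overlay.bank != bank:
                raise ValueError(f"overlay {overlay_id} does not load into bank {bank}")
            # Only one overlay fits at a given address, so this replaces the old one.
            self.loaded[bank] = overlay

    def can_spawn(self, object_id):
        """An object only works if its code is actually in RAM right now."""
        return object_id in self.common or any(
            object_id in ov.objects for ov in self.loaded.values()
        )


manager = OverlayManager(
    [Overlay(1, bank=0, objects=frozenset({100})),
     Overlay(2, bank=0, objects=frozenset({200}))],
    common_objects={1},
)
manager.load_level({0: 1})
print(manager.can_spawn(100), manager.can_spawn(200))
```

Overlays 1 and 2 share bank 0, so a level can have object 100 or object 200 available, but never both.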


If the overlay shit is still unclear to you, I'll provide an example...



As for how data formats are cracked... again, looking at patterns, debugging, etc...

As an example, here's how I worked out some data formats used in SM64DS. I don't remember details well, though.

The model format used is simple (compared to the typical Nintendo model format). I started with files I knew represented simple models. From reading DS docs, I knew what packed GX command lists (defining geometry) looked like, and similar data appeared in the files. From there I knew the geometry was directly stored as GX command lists (as opposed to storing separate positions/colors/texcoords). Makes sense, the game would just have to read the command list and copy it to the GX FIFO (from which the 3D hardware would process it and render things).
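
For reference, a rough sketch of walking a packed GX command list. It assumes the packing described in public DS documentation (GBATEK): each 32-bit word holds up to four command bytes, lowest byte first, followed by the parameter words of each command in order. The parameter counts below cover only a small subset of commands and are quoted from memory of those docs, so treat them as something to verify rather than as a reference.

```python
import struct

# Parameter counts in 32-bit words, for a SUBSET of geometry commands only.
PARAM_COUNTS = {
    0x00: 0,  # NOP (also used as padding in a packed word)
    0x20: 1,  # COLOR
    0x21: 1,  # NORMAL
    0x22: 1,  # TEXCOORD
    0x23: 2,  # VTX_16
    0x24: 1,  # VTX_10
    0x40: 1,  # BEGIN_VTXS
    0x41: 0,  # END_VTXS
}


def parse_packed_commands(data: bytes):
    """Split a packed command list into (command, params) tuples."""
    count = len(data) // 4
    words = struct.unpack(f"<{count}I", data[: count * 4])
    commands = []
    i = 0
    while i < len(words):
        packed = words[i]
        i += 1
        for shift in (0, 8, 16, 24):       # lowest byte is the first command
            cmd = (packed >> shift) & 0xFF
            if cmd not in PARAM_COUNTS:
                raise ValueError(f"command 0x{cmd:02X} not in this sketch's table")
            n = PARAM_COUNTS[cmd]
            if i + n > len(words):
                raise ValueError("command list is truncated")
            if cmd != 0x00:
                commands.append((cmd, words[i:i + n]))
            i += n
    return commands


# BEGIN_VTXS, COLOR, VTX_16, END_VTXS packed into one word, then their parameters.
sample = struct.pack("<5I", 0x41232040, 0, 0x7FFF, 0x00010002, 0x00000003)
for cmd, params in parse_packed_commands(sample):
    print(hex(cmd), [hex(p) for p in params])
```

If a region of a file parses cleanly with a table like this (known command bytes, sensible parameter counts, no leftover garbage), that is good evidence it really is a command list.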

Next, I had to figure out the structure of the file, where those command lists start and what points to them. Easy enough in this case, all the offsets in BMD files are absolute (some more complex files may use offsets relative to some point in the file).
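
The absolute/relative distinction is small in code but easy to get wrong when guessing at a format. A minimal sketch, assuming 32-bit little-endian offsets and a made-up 16-byte file (this is not the BMD layout):

```python
import struct


def resolve_offset(data: bytes, field_pos: int, base: int = 0) -> int:
    """Read the 32-bit offset stored at field_pos and return a file position.

    base=0 treats the stored value as absolute (counted from the start of the
    file); any other base treats it as relative to that point.
    """
    (stored,) = struct.unpack_from("<I", data, field_pos)
    target = base + stored
    if target >= len(data):
        raise ValueError("offset points outside the file; wrong base or not an offset")
    return target


# Fake file: two offset fields, then two 4-byte blocks.
blob = struct.pack("<II", 8, 4) + b"DATA" + b"MORE"
first = resolve_offset(blob, 0)            # absolute: 8
second = resolve_offset(blob, 4, base=8)   # relative to position 8: 8 + 4 = 12
print(blob[first:first + 4], blob[second:second + 4])
```

When candidate offsets land outside the file or in the middle of obviously unrelated data, trying a different base is usually the first thing to check.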

Knowing the structure also allowed figuring out the rest of the data: bones, how geometry is grouped, material data, textures, all the kind of shit you can expect to find in a model file. Again, most of it is stored in a format the hardware can directly understand: transform values are encoded in the proper fixed-point formats, material colors are encoded in 15-bit BGR, textures are encoded in formats supported by the hardware...
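
Two of those hardware encodings are easy to show. The sketch below assumes the usual DS conventions from public docs: 15-bit colors with red in the low 5 bits and blue in the high 5, and signed fixed-point values with 12 fractional bits. The helper names are invented for this example.

```python
def bgr555_to_rgb888(value: int) -> tuple:
    """Convert a 15-bit BGR color (red in bits 0-4, blue in bits 10-14) to 8-bit RGB."""
    def expand(c5: int) -> int:
        # Replicate the top bits so 0 maps to 0 and 31 maps to 255.
        return (c5 << 3) | (c5 >> 2)

    r5 = value & 0x1F
    g5 = (value >> 5) & 0x1F
    b5 = (value >> 10) & 0x1F
    return expand(r5), expand(g5), expand(b5)


def fixed_to_float(raw: int, frac_bits: int = 12, total_bits: int = 32) -> float:
    """Interpret an unsigned raw integer as a signed fixed-point number."""
    if raw & (1 << (total_bits - 1)):   # sign bit set: two's complement negative
        raw -= 1 << total_bits
    return raw / (1 << frac_bits)


print(bgr555_to_rgb888(0x001F))    # pure red
print(fixed_to_float(0x00001000))  # 1.0 with 12 fractional bits
print(fixed_to_float(0xFFFFF000))  # -1.0
```

Values that decode to round numbers (1.0, 0.5, -1.0) or to sensible colors are a strong hint that the guess about a field's encoding is right.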



How are the hardware formats figured out? Well, people reverse-engineering the console, SDK leaks, etc... We're reaching into a whole new world, that of console hacking. ROM hacking can't exist without console hacking, though.

How the hardware data formats are reverse-engineered, well, I don't really know, I'm not into that. I've been mostly using existing documentation to do what I've done.


"In my favor, the PSX has a MIPS R3000-series processor which... Yes, you guessed it, is the predecessor to the MIPS R4000 series, used by the N64. Anything designed to disassemble N64 binaries -should- work in this case."

yup, I guess, try out


CodingKoopa
Posted on 01-23-16 12:41 AM (rev. 2 of 01-23-16 12:48 AM) Link | #67148
To add to what Arisotura said, the game's code is usually split into 3 different files.
The DS has 2 CPUs.

The ARM9 does most of the work, and is therefore usually the only CPU you have to work with. Code for the ARM9 is located in arm9.bin, in the filesystem of a DS game.

The ARM7 does more minor things, such as Wi-Fi and sound. Code for it is located in arm7.bin, in the filesystem of a DS game.

Finally, if you open the overlays folder inside kiwi.ds, you'll find a bunch of files named overlay_xxxx.bin. These .bins are additional code that the game can load and unload at will. If you want to edit these, you'll first have to decompress them with NSMBe (don't use the latest version, you have to use a specific one), and then make your changes.

You won't be editing these files manually, but it's useful to know IMO.

Also, if you're gonna be ASM hacking any DS game, I would strongly suggest taking a look at the ASM Hacking forum at NSMBHD.

I probably should have read all of the original post :P



Arisotura
Posted on 01-23-16 12:45 AM Link | #67149
in the case of the DS, yeah


but it appears the OP is working on PSX games


SuperMario64DS
Posted on 01-23-16 08:02 PM (rev. 3 of 01-23-16 08:08 PM) Link | #67163
@TheKoopaKingdom Not related, but interesting. I knew the DS/GBA had two processors, but I didn't know what the second one was for. I do think the system they have set up for patching is interesting, it never occurred to me that you could compile and insert if you have information beforehand.

@StapleButter

I believe I understand the overlays. Only general purpose/common objects are always loaded (defined elsewhere), and anything else depends on what's indexed by the current level/act. And you say actual code is included too (so if Boss X only exists in level 7, his actual scripting may be within the act's data rather than elsewhere). Is this correct?

I've heard of GX, but only in libraries for Wii development. I've never been able to locate a definition, but from your description I assume it's instructions to draw graphics rather than something that's interpreted into that.

Not related to current topic directly, but still interesting. I'd like to see documentation if any.

