How Does iTunesDBParser Work?

iTunesDBParser parses the iTunesDB file. Like so many file formats from large corporations, this one is closed and proprietary. To parse the iTunesDB you need to reverse engineer the database format. This article over at linuxjournal.com explains the typical process you have to go through to reverse engineer this particular file.

However, thanks to the interesting and dedicated work of all the developers involved with ipodlinux (a fantastic example of how powerful open source and international collaboration can be), most of that painstaking work has already been done. Click here to see the spec of an almost completely reverse-engineered iTunesDB file.

iTunesDBParser tries to build an object representation of each object in the iTunesDB file, and then serializes (outputs/dumps) the track and playlist information to the filesystem. The parser therefore does not work by hunting down specific markers or strings in the file (treating it merely as a series of bytes to iterate through). Instead it works sequentially, creating a Java object for each object in the iTunesDB file.
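
To make the sequential approach concrete, here is a minimal sketch rather than the project's actual code. It assumes a simplified toy layout in which every object starts with a four-character tag followed by a little-endian 32-bit length that covers the whole object. Real iTunesDB objects carry more fields and body data than this walk handles, and the class, method, and field names are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class, method, and field names are hypothetical.
class SequentialWalkSketch {

    // One Java object built from an 8-byte header: a 4-character tag plus a length.
    static final class Header {
        final String tag;
        final int headerLength;

        Header(String tag, int headerLength) {
            this.tag = tag;
            this.headerLength = headerLength;
        }
    }

    // Walks the buffer front to back, creating one Header per object.
    // Assumed toy layout: tag (4 bytes), little-endian length (4 bytes),
    // then (length - 8) further bytes that this sketch simply skips.
    static List<Header> readHeaders(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        List<Header> headers = new ArrayList<>();
        while (buf.remaining() >= 8) {
            int start = buf.position();
            byte[] tagBytes = new byte[4];
            buf.get(tagBytes);
            int headerLength = buf.getInt();
            // Reject lengths that cannot be valid instead of reading garbage.
            if (headerLength < 8 || headerLength > data.length - start) {
                throw new IllegalArgumentException("Bad length at offset " + start);
            }
            String tag = new String(tagBytes, StandardCharsets.US_ASCII);
            headers.add(new Header(tag, headerLength));
            // Jump to the start of the next object.
            buf.position(start + headerLength);
        }
        return headers;
    }
}
```

The point of the sketch is the control flow: the parser never searches for a marker. It reads an object at the current position, learns the object's length from the object itself, and moves on to the next one.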

The iTunesDB file is largely a hierarchical set of objects. The easiest way to visualize this is (once again) as XML. The XML representation of the database below is a slightly modified version of what you will find on the ipodlinux.org/ITunesDB page.

iTunesDBParser uses the Spring Framework. That may be a little bit of overkill in this case, but it gives both me and all the other developers who work on the project some structure and a standard for how things should be done. Another reason Spring is a good choice, in my opinion, is the large amount of 'configuration' information that this project needs, which fits nicely into a Spring XML configuration file. Very little is actually hardcoded to expect a particular type or order of objects at a given place.

Briefly, there is a 'parser' class for each object (mhlt, mhyp, mhod, etc.) in the iTunesDB file. Each of these parser classes implements the Parser interface. The Spring configuration specifies which parsers are the children of the current parser, along with the length of each data item, its name, and so on. The benefit is that when the database structure changes, it should just be a matter of changing the XML config file instead of hunting down the place in the code that needs to be modified.
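
As a rough illustration of that design, the sketch below shows a parser whose name, data items, and child parsers are all injected through setters, which is the style Spring's XML property injection relies on. It is not the project's real code: the Parser interface shown here, the FieldSpec and ConfiguredParser classes, the field names, and the mhyp-to-mhod nesting are assumptions made for the example. The wiring is done in plain Java so the block runs without Spring on the classpath.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface; the project's real Parser interface may differ.
interface Parser {
    String getName();
    List<Parser> getChildren();
    Map<String, Long> parseFields(ByteBuffer buf);
}

// Hypothetical description of one data item: its name and length in bytes.
final class FieldSpec {
    final String name;
    final int length;

    FieldSpec(String name, int length) {
        this.name = name;
        this.length = length;
    }
}

// A parser whose behaviour comes entirely from injected configuration.
final class ConfiguredParser implements Parser {
    private String name;
    private List<FieldSpec> fields = new ArrayList<>();
    private List<Parser> children = new ArrayList<>();

    // JavaBean-style setters, so a container could populate them from config.
    public void setName(String name) { this.name = name; }
    public void setFields(List<FieldSpec> fields) { this.fields = fields; }
    public void setChildren(List<Parser> children) { this.children = children; }

    public String getName() { return name; }
    public List<Parser> getChildren() { return children; }

    // Reads each configured item as a little-endian unsigned integer.
    public Map<String, Long> parseFields(ByteBuffer buf) {
        Map<String, Long> values = new LinkedHashMap<>();
        for (FieldSpec field : fields) {
            long value = 0;
            for (int i = 0; i < field.length; i++) {
                value |= (buf.get() & 0xFFL) << (8 * i);
            }
            values.put(field.name, value);
        }
        return values;
    }
}

// Plain-Java stand-in for what the XML configuration would declare.
final class ParserWiringSketch {
    static Parser buildPlaylistParser() {
        ConfiguredParser mhod = new ConfiguredParser();
        mhod.setName("mhod");

        ConfiguredParser mhyp = new ConfiguredParser();
        mhyp.setName("mhyp");
        // Field names and lengths here are made up for illustration.
        mhyp.setFields(Arrays.asList(
                new FieldSpec("exampleLength", 4),
                new FieldSpec("exampleCount", 2)));
        mhyp.setChildren(Arrays.<Parser>asList(mhod));
        return mhyp;
    }
}
```

With this shape, a change in the database layout, such as an item growing from two bytes to four, would only touch the field list supplied to the parser, not the parsing loop itself.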