Last updated: Sun, 09th October, 2005 @ 05:00am +0000

How Does iTunesDBParser Work?

iTunesDBParser parses the iTunesDB file. Like so many file formats from large corporations, this one is closed and proprietary, so to parse the iTunesDB you need to reverse engineer the database format. An article over at linuxjournal.com explains the typical process you have to go through to reverse engineer this particular file.

However, thanks to the work and dedication of all the developers involved with ipodlinux (a fantastic example of how powerful open source and international collaboration can be), most of that painstaking work has already been done. The ipodlinux.org/ITunesDB page has the spec of an almost completely reverse-engineered iTunesDB file.

iTunesDBParser builds an object representation of each object in the iTunesDB file, and then serializes (outputs/dumps) the track and playlist information to the filesystem. The parser therefore does not work by hunting down specific markers or strings in the file (treating it merely as a series of bytes to iterate through). Instead it works sequentially, creating a Java object for each object it encounters in the iTunesDB file.
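
To make the sequential approach concrete, here is a minimal sketch of reading the start of one object. It assumes that each object begins with a four-character ASCII identifier (mhbd, mhit, and so on) followed by a little-endian 32-bit header length; check the ipodlinux spec before relying on that layout. The class and field names (ChunkHeaderSketch, ChunkHeader, headerLength) are made up for this illustration and are not taken from the iTunesDBParser source.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class ChunkHeaderSketch {

    /** Hypothetical value object holding the first two fields of an object. */
    static final class ChunkHeader {
        final String tag;        // four-character identifier, e.g. "mhbd"
        final int headerLength;  // assumed: size of the object's header in bytes

        ChunkHeader(String tag, int headerLength) {
            this.tag = tag;
            this.headerLength = headerLength;
        }
    }

    /** Reads one header at the buffer's current position and advances past it. */
    static ChunkHeader readHeader(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN);   // assumption: integers are little-endian
        byte[] tagBytes = new byte[4];
        buf.get(tagBytes);                    // consume the 4-byte identifier
        String tag = new String(tagBytes, StandardCharsets.US_ASCII);
        int headerLength = buf.getInt();      // consume the 4-byte length
        return new ChunkHeader(tag, headerLength);
    }

    public static void main(String[] args) {
        // Hand-built sample bytes, not copied from a real iTunesDB file.
        byte[] sample = {'m', 'h', 'b', 'd', 0x68, 0, 0, 0};
        ChunkHeader header = readHeader(ByteBuffer.wrap(sample));
        System.out.println(header.tag + " headerLength=" + header.headerLength);
    }
}
```

Because the buffer position advances with every read, a parser built this way never has to search for markers: it always knows which object it is standing on and where the next one starts.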

The iTunesDB file is largely a hierarchical set of objects. The easiest way to visualize this is (once again) as XML. The XML representation of the database below is a slightly modified version of what you will find on the ipodlinux.org/ITunesDB page.

<mhbd description="This is a database">
  <mhsd description="This is a list holder, which holds either a mhlt or an mhlp">
    <mhlt description="This holds a list of all the songs on the iPod">
      <mhit description="This describes a particular song">
        <mhod description="These hold strings associated with a song" />
        <mhod description="Things like Artist, Song Title, Album, etc." />
      </mhit>
      <mhit description="This is another song. And so on.">
        <mhod description="These hold strings associated with a song" />
        <mhod description="Things like Artist, Song Title, Album, etc." />
      </mhit>
    </mhlt>
  </mhsd>
  <mhsd description="Here's the list holder again.. This time, it's holding an mhlp">
    <mhlp description="This holds a bunch of playlists. In fact, all the playlists.">
      <mhyp description="This is a playlist.">
        <mhod description="These mhods hold info about the playlists like the name of the list." />
        <mhip description="This mhip holds a reference to a particular song on the iPod." />
      </mhyp>
      <mhyp description="This is another playlist. And so on.">
        <mhod description="Note that the mhods also hold other things for smart playlists" />
        <mhip description="This mhip holds a reference to a particular song on the iPod." />
      </mhyp>
    </mhlp>
  </mhsd>
</mhbd>

iTunesDBParser uses the Spring Framework. That may be a little bit of overkill in this case, but it gives both myself and all other developers who work on the project some structure and a standard for how things should be done. Another reason Spring is a good choice, in my opinion, is the large amount of 'configuration' information that this project needs, which fits nicely into a Spring XML configuration file. Very little is hardcoded: the code itself makes almost no assumptions about the type or order of the objects that appear at a given place in the file.

Briefly, there is a 'parser' class for each object (mhlt, mhyp, mhod, etc.) in the iTunesDB file. Each of these parser classes implements the Parser interface. The Spring configuration specifies which parser is the child of the current parser, along with the length of each data item, its name, and so on. The benefit is that when the database structure changes, it should just be a matter of changing the XML config file instead of hunting down the place in the code that needs to be modified.
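
The configuration-driven design described above can be sketched in plain Java. This is only an illustration under stated assumptions: the Parser interface shown here, the ConfigurableParser class, the setChildParser method and the field names songCount and id are all hypothetical, and the toy byte layout (a four-character identifier followed directly by the configured fields) is far simpler than the real format. In the real project the wiring done in main would live in the Spring XML configuration file rather than in code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfiguredParserSketch {

    /** Hypothetical stand-in for the project's Parser interface. */
    interface Parser {
        void parse(ByteBuffer in, List<String> out);
    }

    /** A parser whose behaviour comes entirely from injected configuration. */
    static final class ConfigurableParser implements Parser {
        private final String tag;                   // identifier this parser expects
        private final Map<String, Integer> fields;  // field name -> length in bytes (1 to 8)
        private Parser childParser;                 // set by whoever does the wiring

        ConfigurableParser(String tag, Map<String, Integer> fields) {
            this.tag = tag;
            this.fields = fields;
        }

        void setChildParser(Parser childParser) {
            this.childParser = childParser;
        }

        @Override
        public void parse(ByteBuffer in, List<String> out) {
            byte[] tagBytes = new byte[4];
            in.get(tagBytes);
            String found = new String(tagBytes, StandardCharsets.US_ASCII);
            if (!found.equals(tag)) {
                throw new IllegalStateException("expected " + tag + " but found " + found);
            }
            // Read each configured field as a little-endian unsigned value.
            for (Map.Entry<String, Integer> field : fields.entrySet()) {
                long value = 0;
                for (int i = 0; i < field.getValue(); i++) {
                    value |= ((long) (in.get() & 0xFF)) << (8 * i);
                }
                out.add(tag + "." + field.getKey() + "=" + value);
            }
            if (childParser != null) {
                childParser.parse(in, out);  // hand over to the configured child
            }
        }
    }

    public static void main(String[] args) {
        // This wiring is what a container configuration would normally express.
        Map<String, Integer> listFields = new LinkedHashMap<>();
        listFields.put("songCount", 4);   // hypothetical field name
        Map<String, Integer> songFields = new LinkedHashMap<>();
        songFields.put("id", 4);          // hypothetical field name

        ConfigurableParser mhlt = new ConfigurableParser("mhlt", listFields);
        ConfigurableParser mhit = new ConfigurableParser("mhit", songFields);
        mhlt.setChildParser(mhit);

        // Hand-built sample bytes in the toy layout, not a real iTunesDB.
        byte[] sample = {'m', 'h', 'l', 't', 1, 0, 0, 0, 'm', 'h', 'i', 't', 42, 0, 0, 0};
        List<String> out = new ArrayList<>();
        mhlt.parse(ByteBuffer.wrap(sample), out);
        out.forEach(System.out::println);
    }
}
```

The point of the sketch is that ConfigurableParser contains no knowledge of any particular object: adding a field or changing which parser follows which only touches the wiring, which is the property the XML configuration is meant to provide.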