Target software: Skype for Windows/Linux
What’s it all about?
Have you ever tried to export all your Skype history? It’s a little bit boring to copy and paste every conversation into a separate text file, isn’t it? It’s even worse if you use the Linux version of this famous chat client.
Have you ever wondered if it’s possible to read your Skype chat history without being logged into your Skype account? And have you ever wondered if it’s possible to read someone else’s chat history without using the Skype client and without knowing the Skype account password?
The answer to the last few questions is ‘Yes’ :) Below you’ll find a description of the Skype chat log format. I don’t claim that this is a complete description, but the information was enough for me to write a small tool that extracts the data from the log files and saves it in human-readable form in a simple text file. You can download the first version of my SkypeChatLogReader at the end of this article.
Personal motivation
So… what’s my personal story? The last time I decided to reinstall my Ubuntu, I saved everything I thought I would need after the fresh install, including the .Skype directory. After I finished installing the fresh Ubuntu copy and all the software I needed (Skype, of course), I completely forgot to copy the content of the .Skype directory into my new home folder, so I started using my freshly installed Skype. As you can guess, I had no access to my old Skype history and had to find a way to read it.
Before starting my own research on how Skype stores our precious chats, I googled around for tools that could help with reading Skype history. I found a small tool called Skypr, but it didn’t work for me at all, so I decided to write my own :)
First things first
It was clear to me that Skype keeps all the chat history in one single directory. On Windows systems this is “Documents and Settings\windows_username\Application Data\Skype” and on Linux systems it is /home/user_name/.Skype. For every user who has ever used the current Skype installation, a separate subdirectory is created, named after that user’s Skype identification name (note that this is the Skype username, not the Skype screen name). For example, if UserA and UserB (both are Skype idents) are using the same Skype installation, then you should find 2 separate directories:
…\Documents and Settings\windows_user\Application Data\Skype\UserA
…\Documents and Settings\windows_user\Application Data\Skype\UserB
In each of these subdirectories there are several .dbb files. These files were my main target, because they seemed to keep all the Skype logs. Some of them are described below:
* call*.dbb – files that keep the call history
* chatmsg*.dbb and chat*.dbb – files that keep the chat history
* profile*.dbb – files that keep user profile details
* user*.dbb – files that keep users’ profile details
* transfer*.dbb – files that keep transfer information
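As a small illustration of the list above, here is a Python sketch that groups the .dbb files of one profile directory by their name prefix. The function name `group_dbb_files` is made up for the example, and it assumes that every file is named as a lowercase prefix followed by an optional number (as in chatmsg256.dbb), which may not hold for every Skype version.

```python
import re
from pathlib import Path


def group_dbb_files(profile_dir):
    """Group the *.dbb files of one Skype profile directory by name prefix.

    Assumption: names look like <prefix><optional number>.dbb,
    e.g. chatmsg256.dbb -> prefix "chatmsg".
    """
    groups = {}
    for path in sorted(Path(profile_dir).glob("*.dbb")):
        match = re.match(r"([a-z]+)\d*\.dbb$", path.name)
        kind = match.group(1) if match else "other"
        groups.setdefault(kind, []).append(path.name)
    return groups


# Example call (the path follows the Linux layout described above):
# print(group_dbb_files("/home/user_name/.Skype/UserA"))
```

The result is a plain dictionary, so the chat history files end up under the "chatmsg" and "chat" keys.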
You can view part of the information in those files with a simple text editor or with “cat” on Linux. It was obvious that these are binary data files, so I moved on to reviewing them with a hex editor (Ghex was good enough for me – it came with the standard packages for Ubuntu Hardy). You can use any hex editor you like. My efforts were concentrated on finding any repeating pattern of bytes in the chatmsg*.dbb files.
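If you prefer a script to staring at a hex editor, a rough way to hunt for repeating byte patterns is to count every short byte sequence in the file. This is only a sketch of that idea, not the way I actually did it; the function name `common_sequences` and the default length of 4 bytes are arbitrary choices for the example.

```python
from collections import Counter


def common_sequences(data: bytes, length: int = 4, top: int = 5):
    """Return the `top` most frequent byte sequences of the given length."""
    counts = Counter(
        data[i:i + length] for i in range(len(data) - length + 1)
    )
    return counts.most_common(top)


# Tiny made-up input: the sequence b"l33l" occurs three times.
sample = b"l33l" + b"abc" + b"l33l" + b"xyz" + b"l33l"
print(common_sequences(sample, 4, 1))
```

On a real file, runs of padding bytes (for example zeros) will probably dominate the top of the list, so expect to scroll past them before anything interesting shows up.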
After some time it became clear that the details of a single chat message are stored as a single record in a chatmsg*.dbb file. It’s not clear why there are so many chatmsg*.dbb files in this directory, but I’m pretty sure there is a direct connection between the length of a chat message and the file Skype uses to store it in the history. Most messages are shorter than 256 bytes and are stored in chatmsg256.dbb, so this file is usually the biggest one.
Here is the pattern I was able to find in all chatmsg*.dbb files:
1. Every record starts with byte sequence 0x6C 0x33 0x33 0x6C (l33l in ASCII)
2. The next 14 bytes have an unknown (at least to me) purpose
3. 0xE0 0x03 – marker for the beginning of the chat members field
   * the first chat member’s username is prefixed with 0x23 (# in ASCII)
   * two chat members are separated by 0x2F (/ in ASCII)
   * the second chat member’s username is prefixed with 0x24 ($ in ASCII)
   * the list of chat members ends with 0x3B (; in ASCII)
   Remark: I still have some problems with the correct interpretation of this field for records with more than two chat members
4. The bytes after 0x3B, up to the next marker described below, have unknown content
5. 0xE5 0x03 – marker for the beginning of a 6-byte sequence that holds the message timestamp. The numbers in all chat logs are stored in little-endian format. The fifth and sixth bytes seem to be constant in all the records: 0x04 0x03. The sixth byte is not used in the actual timestamp calculation (for now; maybe it will be needed at some later point), so bytes 1 to 5 represent the message timestamp in standard Unix format. Normally only 4 bytes are needed to store a Unix timestamp, so at first I thought that bytes 5 and 6 were not used at all. But after some calculations it became clear that the first 4 bytes on their own do not give the actual time since 1/1/1970. It also became clear that the most significant bit of each of the first 4 bytes is always 1, so it seems logical to me to conclude that those bits are sign bits and should not be used in the timestamp calculation. Stripping the most significant bit from each of the first 4 bytes and combining the remaining bits gives a 28-bit value. The standard Unix time representation needs 32 bits, so we just ‘borrow’ the last 4 bits of the 5th byte (they become the highest 4 bits of the result). This 32-bit combination gives the Unix timestamp of the chat message
6. 0xE8 0x03 – marker for the beginning of the sender username field. The field ends with a zero byte (0x00)
7. 0xEC 0x03 – marker for the beginning of the sender screen name field. The field ends with a zero byte (0x00)
8. 0xFC 0x03 – marker for the beginning of the message field. The field ends with a zero byte (0x00)
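To make the pattern above more concrete, here is a minimal Python sketch of a parser for it. This is not the SkypeChatLogReader source: the function names (`decode_timestamp`, `iter_records`, `parse_record`), the dictionary keys and the sample record are all made up for illustration. It assumes that the markers appear in the order listed, that marker bytes and the text "l33l" never occur inside other fields, and that text is UTF-8, and real files may break any of these. The timestamp layout also looks a lot like a 7-bits-per-byte variable-length integer, where the top bit only says that another byte follows, but that reading is a guess.

```python
from datetime import datetime, timezone

RECORD_MAGIC = b"l33l"   # every record starts with these four bytes
HEADER_SKIP = 14         # bytes of unknown purpose after the magic


def decode_timestamp(field: bytes) -> int:
    """Rebuild the Unix timestamp from the bytes that follow 0xE5 0x03."""
    if len(field) < 5:
        raise ValueError("need at least 5 timestamp bytes")
    value = 0
    for i in range(4):
        # drop the top bit, keep 7 data bits; the first byte is the lowest
        value |= (field[i] & 0x7F) << (7 * i)
    # the low 4 bits of the 5th byte become bits 28..31
    value |= (field[4] & 0x0F) << 28
    return value


def iter_records(data: bytes):
    """Yield the raw bytes of each record, split on the l33l magic."""
    start = data.find(RECORD_MAGIC)
    while start >= 0:
        nxt = data.find(RECORD_MAGIC, start + len(RECORD_MAGIC))
        yield data[start:nxt] if nxt >= 0 else data[start:]
        start = nxt


def _read_field(record: bytes, marker: bytes, start: int, end_byte: bytes):
    """Return (field bytes, offset after the field) or (None, start)."""
    pos = record.find(marker, start)
    if pos < 0:
        return None, start
    begin = pos + len(marker)
    end = record.find(end_byte, begin)
    if end < 0:
        end = len(record)
    return record[begin:end], end + 1


def _text(raw):
    return None if raw is None else raw.decode("utf-8", errors="replace")


def parse_record(record: bytes) -> dict:
    """Pull the known fields out of one record; missing fields become None."""
    pos = len(RECORD_MAGIC) + HEADER_SKIP
    raw_members, pos = _read_field(record, b"\xe0\x03", pos, b";")
    members = [m.lstrip("#$") for m in (_text(raw_members) or "").split("/") if m]

    timestamp = None
    ts_pos = record.find(b"\xe5\x03", pos)
    if ts_pos >= 0 and ts_pos + 7 <= len(record):
        timestamp = decode_timestamp(record[ts_pos + 2:ts_pos + 7])
        pos = ts_pos + 8   # skip the marker and the 6 timestamp bytes

    sender, pos = _read_field(record, b"\xe8\x03", pos, b"\x00")
    screen_name, pos = _read_field(record, b"\xec\x03", pos, b"\x00")
    message, pos = _read_field(record, b"\xfc\x03", pos, b"\x00")
    return {
        "members": members,
        "timestamp": timestamp,
        "sender": _text(sender),
        "screen_name": _text(screen_name),
        "message": _text(message),
    }


# Hand-made sample record (invented users and text, not real Skype data).
sample = (b"l33l" + bytes(14) + b"\xe0\x03#alice/$bob;" + b"\x01\x02"
          + b"\xe5\x03\x80\xf2\xde\xc5\x04\x03"
          + b"\xe8\x03alice\x00" + b"\xec\x03Alice A.\x00"
          + b"\xfc\x03Hello Bob\x00")
for rec in iter_records(sample):
    fields = parse_record(rec)
    when = datetime.fromtimestamp(fields["timestamp"], tz=timezone.utc)
    print(when.isoformat(), fields["sender"], fields["message"])
```

With a real file you would read the whole chatmsg*.dbb file into memory as bytes and loop over `iter_records` in the same way. Records with more than two chat members will probably need extra care, as noted in the remark above.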
As a result…
Based on this pattern I wrote a small tool called SkypeChatLogReader. Its functionality is very limited, but hopefully sooner or later (it depends on my muses :) ) it will be expanded into something really friendly.
SkypeChatLogReader is a simple .NET console application and requires .NET 2.0 or Mono 2 to run.
You have to pass the name of the chatmsg*.dbb file to parse; this is the only mandatory parameter. The optional parameters are the output filename, a sender filter and a date filter.
Download Skype Chat Log Reader
———
Update (05/05/2011): The Skype Chat Log Reader works only with old Skype versions (below Skype 4) and exits with an exception when it encounters non-UTF-8 symbols. I’m planning to patch it if I find the source files :)