One Weird Trick to Shrink Your PANDA Malware Logs by 84%

When I wrote about some of the lessons learned from PANDA Malrec's first 100 days of operation, one of the things I mentioned was that the storage requirements for the system were extremely high. In the four months since, the storage problem only got worse: as of last week, we were storing 24,000 recordings of malware, coming in at a whopping 2.4 terabytes of storage.

The amount of data involved poses problems not just for our own storage but also for others wanting to make use of the recordings for research. 2.4 terabytes is a lot, especially when it's spread out over 24,000 HTTP requests. If we want our data to be useful to researchers, it would be great if we could find better ways of compressing the recording logs.

As it turns out, we can! The key is to look closely at what makes up a PANDA recording:

- The log of non-deterministic events (the -rr-nondet.log files)
- The initial QEMU snapshot (the -rr-snp files)

The first of these is highly redundant and actually ...
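
To get a feel for where the bytes go, here is a minimal Python sketch that tallies the two kinds of files named above across a directory of recordings. The file suffixes come from the post; the directory layout, the function names, and the use of decimal terabytes are assumptions made for illustration, not a description of how Malrec actually stores things.

```python
from pathlib import Path

# Suffixes are the ones mentioned in the post; everything else here
# (directory layout, function names) is a hypothetical illustration.
COMPONENT_SUFFIXES = {
    "nondet_log": "-rr-nondet.log",
    "snapshot": "-rr-snp",
}


def component_sizes(recording_dir):
    """Return total bytes per recording component found under recording_dir."""
    totals = {name: 0 for name in COMPONENT_SUFFIXES}
    for path in Path(recording_dir).rglob("*"):
        if not path.is_file():
            continue
        for name, suffix in COMPONENT_SUFFIXES.items():
            if path.name.endswith(suffix):
                totals[name] += path.stat().st_size
    return totals


def average_recording_size(total_bytes, num_recordings):
    """Average bytes per recording."""
    return total_bytes / num_recordings


# Back-of-the-envelope check using the figures from the post,
# assuming 1 terabyte = 10**12 bytes.
avg = average_recording_size(2_400_000_000_000, 24_000)
print(f"average recording size: {avg / 10**6:.0f} MB")
```

With the numbers quoted above, this works out to about 100 MB per recording on average. Running `component_sizes` on a local copy of the recordings would show how that total splits between the logs and the snapshots.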