<h2 style="text-align: left;">Someone’s Been Messing With My Subnormals!</h2><p><i>Push the Red Button: Malware, encryption, reverse engineering, networking, and other arcana. By Brendan Dolan-Gavitt. Published 2022-09-06; updated 2022-09-07.</i></p><p><i><b>TL;DR: After noticing an annoying warning, I went on an absurd yak shave, and discovered that because of a tiny handful of Python packages built with an appealing-sounding but dangerous compiler option, more than 2,500 Python packages, some with more than a million downloads per month, could end up causing any program that uses them to compute incorrect numerical results.</b></i></p><h3 style="text-align: left;">Once Upon a Time in My Terminal</h3><p>Recently, whenever I tried to import certain Python packages (notably, some models from Huggingface Transformers), I would see this weird and unsightly warning:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Python 3.8.10 (default, Jun 22 2022, 20:18:18)<span class="Apple-converted-space"> </span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Type 'copyright', 'credits' or 'license' for more information</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s3" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>1</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s4" style="color: #2d961e; font-variant-ligatures: no-common-ligatures;"><b>from</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"> </span><span class="s5" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>transformers</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"> </span><span class="s4" style="color: #2d961e; font-variant-ligatures: no-common-ligatures;"><b>import</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"> CodeGenForCausalLM</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: 
no-common-ligatures;">/home/moyix/.virtualenvs/sfcodegen/lib/python3.8/site-packages/numpy/core/getlimits.py:498: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>setattr(self, word, getattr(machar, word).flat[0])</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">/home/moyix/.virtualenvs/sfcodegen/lib/python3.8/site-packages/numpy/core/getlimits.py:88: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>return self._float_to_str(self.smallest_subnormal)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">/home/moyix/.virtualenvs/sfcodegen/lib/python3.8/site-packages/numpy/core/getlimits.py:498: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.</span></p><p class="p1" 
style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>setattr(self, word, getattr(machar, word).flat[0])</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">/home/moyix/.virtualenvs/sfcodegen/lib/python3.8/site-packages/numpy/core/getlimits.py:88: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>return self._float_to_str(self.smallest_subnormal)</span></p><p><i><b>Someone was messing with my floating point subnormals!</b></i> Aside from being really annoying, the warnings also made me a little worried. Floating point math is notoriously tricky, and if something is changing the behavior of the floating point unit (FPU) on the CPU, it can cause all sorts of weird problems. 
For example, <a href="https://simonbyrne.github.io/notes/fastmath/#flushing_subnormals_to_zero">some numerical algorithms depend on the standard FPU behavior and will fail to converge</a> if the FPU is set to treat subnormal/denormal numbers as zero (on x86, by setting the FTZ/DAZ flags in the MXCSR register).</p><p>Some Googling <a href="https://github.com/Qiskit/qiskit-aer/issues/1461">led me to this issue</a>, which pointed toward some shared library compiled with the gcc/clang option <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span> that was being loaded as the culprit. It turns out (somewhat insanely) that when <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span> is enabled, the compiler will link in a constructor that sets the FTZ/DAZ flags whenever the library is loaded — even on shared libraries, which means that any application that loads that library will have its floating point behavior changed <i>for the whole process</i>. And <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span>, which sounds appealingly like a "make my program go fast" flag, automatically enables <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>, so some projects may unwittingly turn it on without realizing the implications.</p><p>But what shared libraries were even being loaded? We can find out by taking the PID of our Python interpreter and then looking at /proc/[PID]/maps; this will show all the mapped memory regions in the process and show the names for the ones that are file-backed (which shared libraries are). 
We can then filter that by grepping for <span style="font-family: courier;">r-xp</span> (readable + executable + copy-on-write) to only list code sections, and then look for shared objects:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s2" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s2" style="font-variant-ligatures: no-common-ligatures;">$ cat /proc/2902749/maps | grep 'r-xp' | grep -F '.so' | wc -l</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s2" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">158</span></p><p>Oof, 158 shared objects? In <i>my</i> Python process? It's more common than you think.</p><p>So now I wanted to narrow down which library (or libraries) was actually setting FTZ/DAZ. After going down a somewhat painful blind alley where I tried to modify QEMU's user-mode emulation so that all updates to MXCSR would get logged along with the name of the library containing the current program counter (NB: it turns out to be really annoying to map the program counter back to a library name), I hit on a simpler strategy.</p><p>Python lets you load arbitrary shared objects using ctypes.CDLL. So if I could just load each of those shared libraries one at a time and then trigger the numpy code that prints the warning, I could identify which library was messing with my floating point behavior. 
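</p><p>To get the list of candidate libraries for such a loop, the grep pipeline above can be approximated in Python. This is only a rough sketch: the helper name <span style="font-family: courier;">executable_shared_objects</span> and the sample text are made up for illustration, and unlike the <span style="font-family: courier;">wc -l</span> count it de-duplicates paths that are mapped more than once.</p>
<pre>
```python
def executable_shared_objects(maps_text):
    """Return the sorted, de-duplicated paths of executable .so mappings."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, then an optional path.
        fields = line.split(None, 5)
        if len(fields) != 6:
            continue  # anonymous mapping with no backing file
        perms, path = fields[1], fields[5].strip()
        if perms == "r-xp" and ".so" in path:
            paths.add(path)
    return sorted(paths)


# Hypothetical sample in the /proc/[PID]/maps format, for illustration only.
SAMPLE = """\
7f1c2a000000-7f1c2a1c0000 r-xp 00022000 08:01 1311 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2a1c0000-7f1c2a1c4000 rw-p 001e0000 08:01 1311 /usr/lib/x86_64-linux-gnu/libc.so.6
7f1c2b000000-7f1c2b010000 r-xp 00001000 08:01 2001 /tmp/venv/foo.so
7ffd5e000000-7ffd5e021000 rw-p 00000000 00:00 0 [stack]
"""

for path in executable_shared_objects(SAMPLE):
    print(path)
```
</pre>
<p>On Linux, passing the contents of <span style="font-family: courier;">/proc/self/maps</span> (or of another process's maps file) instead of the sample should give the real list.</p><p>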
By reading through the getlimits.py file mentioned in the warning, I figured out that the check could be triggered by printing out <span style="font-family: courier;">numpy.finfo(numpy.float32)</span>, and ended up with this script:</p><p class="p1" style="background-color: black; color: #2d961e; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>import</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #4a00ff; font-variant-ligatures: no-common-ligatures;"><b>sys</b></span></span></p><p class="p1" style="background-color: black; color: #2d961e; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>from</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #4a00ff; font-variant-ligatures: no-common-ligatures;"><b>ctypes</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>import</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> CDLL</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">CDLL(sys</span><span class="s4" style="color: #6c6c6c; font-variant-ligatures: no-common-ligatures;">.</span><span 
class="s1" style="font-variant-ligatures: no-common-ligatures;">argv[</span><span class="s4" style="color: #6c6c6c; font-variant-ligatures: no-common-ligatures;">1</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">])</span></span></p><p class="p1" style="background-color: black; color: #2d961e; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>import</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #4a00ff; font-variant-ligatures: no-common-ligatures;"><b>numpy</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>as</b></span><span class="s2" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #4a00ff; font-variant-ligatures: no-common-ligatures;"><b>np</b></span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s5" style="color: #2d961e; font-variant-ligatures: no-common-ligatures;">print</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(np</span><span class="s4" style="color: #6c6c6c; font-variant-ligatures: no-common-ligatures;">.</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">finfo(np</span><span class="s4" style="color: #6c6c6c; font-variant-ligatures: no-common-ligatures;">.</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">float32))</span></span></p><p>We can check that it works by building an empty C 
file as a shared library with and without <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span>. Without <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span>:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ pygmentize foo.c</span></span></p><p class="p2" style="background-color: black; color: #4a00ff; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s4" style="color: #c7206a; font-variant-ligatures: no-common-ligatures;">void</span><span class="s5" style="color: #bdbdbd; font-variant-ligatures: no-common-ligatures;"> </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">empty</span><span class="s6" style="color: white; font-variant-ligatures: no-common-ligatures;">()</span><span class="s5" style="color: #bdbdbd; font-variant-ligatures: no-common-ligatures;"> </span><span class="s6" style="color: white; font-variant-ligatures: no-common-ligatures;">{</span><span class="s5" style="color: #bdbdbd; font-variant-ligatures: no-common-ligatures;"> </span><span 
class="s6" style="color: white; font-variant-ligatures: no-common-ligatures;">}</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ gcc -fpic -shared foo.c -o foo.so</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ python fptest.py ./foo.so</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Machine parameters for float32</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; 
font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">---------------------------------------------------------------</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">precision = <span class="Apple-converted-space"> </span>6 <span class="Apple-converted-space"> </span>resolution = 1.0000000e-06</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">machep =<span class="Apple-converted-space"> </span>-23 <span class="Apple-converted-space"> </span>eps =<span class="Apple-converted-space"> </span>1.1920929e-07</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">negep = <span class="Apple-converted-space"> </span>-24 <span class="Apple-converted-space"> </span>epsneg = <span class="Apple-converted-space"> </span>5.9604645e-08</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">minexp = <span class="Apple-converted-space"> </span>-126 <span 
class="Apple-converted-space"> </span>tiny = <span class="Apple-converted-space"> </span>1.1754944e-38</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">maxexp =<span class="Apple-converted-space"> </span>128 <span class="Apple-converted-space"> </span>max =<span class="Apple-converted-space"> </span>3.4028235e+38</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">nexp =<span class="Apple-converted-space"> </span>8 <span class="Apple-converted-space"> </span>min =<span class="Apple-converted-space"> </span>-max</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">smallest_normal = 1.1754944e-38 <span class="Apple-converted-space"> </span>smallest_subnormal = 1.4012985e-45</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">---------------------------------------------------------------</span></p><p>But with <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span> enabled:</p><p class="p1" style="background-color: black; color: white; 
font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">/home/moyix/.virtualenvs/sfcodegen/lib/python3.8/site-packages/numpy/core/getlimits.py:498: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>setattr(self, word, getattr(machar, word).flat[0])</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">/home/moyix/.virtualenvs/sfcodegen/lib/python3.8/site-packages/numpy/core/getlimits.py:88: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>return self._float_to_str(self.smallest_subnormal)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Machine parameters 
for float32</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">---------------------------------------------------------------</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">precision = <span class="Apple-converted-space"> </span>6 <span class="Apple-converted-space"> </span>resolution = 1.0000000e-06</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">machep =<span class="Apple-converted-space"> </span>-23 <span class="Apple-converted-space"> </span>eps =<span class="Apple-converted-space"> </span>1.1920929e-07</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">negep = <span class="Apple-converted-space"> </span>-24 <span class="Apple-converted-space"> </span>epsneg = <span class="Apple-converted-space"> </span>5.9604645e-08</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; 
font-variant-ligatures: no-common-ligatures;">minexp = <span class="Apple-converted-space"> </span>-126 <span class="Apple-converted-space"> </span>tiny = <span class="Apple-converted-space"> </span>1.1754944e-38</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">maxexp =<span class="Apple-converted-space"> </span>128 <span class="Apple-converted-space"> </span>max =<span class="Apple-converted-space"> </span>3.4028235e+38</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">nexp =<span class="Apple-converted-space"> </span>8 <span class="Apple-converted-space"> </span>min =<span class="Apple-converted-space"> </span>-max</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">smallest_normal = 1.1754944e-38 <span class="Apple-converted-space"> </span>smallest_subnormal = 0.0000000e+00</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">---------------------------------------------------------------</span></p><p>Great! So now we have a detector for shared libraries that set FTZ/DAZ. 
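</p><p>For what it's worth, the condition numpy is warning about can probably also be probed without numpy. The sketch below is not what the script above does, and the function names are made up for illustration. It relies on the fact that the smallest normal double times machine epsilon is exactly the smallest subnormal (2<sup>-1074</sup>) under standard IEEE 754 behavior, and it assumes the interpreter performs the multiplication on the hardware at runtime, so that flush-to-zero mode would turn the result into zero.</p>
<pre>
```python
import sys


def smallest_subnormal():
    # 2**-1022 * 2**-52 == 2**-1074, the smallest positive subnormal double.
    # With flush-to-zero enabled, this product is expected to come out as 0.0.
    return sys.float_info.min * sys.float_info.epsilon


def subnormals_flushed():
    """Return True if the subnormal product above was flushed to zero."""
    return smallest_subnormal() == 0.0


print("smallest subnormal:", smallest_subnormal())
print("subnormals flushed:", subnormals_flushed())
```
</pre>
<p>To test a particular library this way, one would load it with <span style="font-family: courier;">ctypes.CDLL</span> in a fresh interpreter before calling <span style="font-family: courier;">subnormals_flushed()</span>, since the flags stay set for the rest of the process.</p><p>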
It's kind of annoying that we have to load the library to check, but we'll fix that later. Running it in a loop over all the libraries that we found were loaded in the Python process earlier, I found that the culprit was <b>gevent</b>, of all things (why is an event-based networking library messing with floating point behavior??).</p><p>Of course, now that I knew it was <b>gevent</b>, some more Googling located the relevant bug report. It seems that it was a <a href="https://github.com/gevent/gevent/pull/1820">known bug with an attempted fix</a>, but the fix <a href="https://github.com/gevent/gevent/pull/1864">didn't quite work</a> (it turns out that when you use <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span>, <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-fno-fast-math</span> does not, in fact, disable fast math. lol. lmao.) and so the most recent version of <b>gevent</b> on PyPI still messes with floating point behavior for no good reason.</p><h3 style="text-align: left;">What else is out there?</h3><p>With the immediate mystery solved, I wanted to figure out how many other projects might have (intentionally or inadvertently) enabled <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span> in their shared libraries uploaded to PyPI. So I decided to take the top 25% of projects on PyPI by number of downloads and scan their binary wheels (Python jargon for precompiled binaries) to see if they, too, messed with floating point behavior.</p><p>Let's start by finding the top 25% of projects. We can get a list of the number of downloads for each project on PyPI <a href="https://packaging.python.org/en/latest/guides/analyzing-pypi-package-downloads/">by querying a handy BigQuery table that PyPI publishes</a>. 
Skipping over all the warnings about how inaccurate the download numbers are, I was able to write this little bit of BigQuery SQL that gives the total downloads for each project over the past 30 days:</p><div style="background-color: #fffffe; font-family: Roboto Mono, Menlo, Monaco, Courier New, monospace; font-size: 12px; line-height: 18px; white-space: pre;"><div><span style="color: #d81b60;">#standardSQL</span></div><div><span style="color: #3367d6;">SELECT</span> file.<span style="color: #3367d6;">project</span>, <span style="color: #3367d6;">COUNT</span><span style="color: #37474f;">(*)</span> <span style="color: #3367d6;">AS</span> num_downloads</div><div><span style="color: #3367d6;">FROM</span> <span style="color: #0d904f;">`bigquery-public-data.pypi.file_downloads`</span></div><div><span style="color: #3367d6;">WHERE</span></div><div> <span style="color: #d81b60;">-- Only query the last 30 days of history</span></div><div> <span style="color: #3367d6;">DATE</span><span style="color: #37474f;">(</span><span style="color: #3367d6;">timestamp</span><span style="color: #37474f;">)</span></div><div> <span style="color: #3367d6;">BETWEEN</span> <span style="color: #3367d6;">DATE_SUB</span><span style="color: #37474f;">(</span><span style="color: #3367d6;">CURRENT_DATE</span><span style="color: #37474f;">()</span>, <span style="color: #3367d6;">INTERVAL</span> <span style="color: #f4511e;">30</span> DAY<span style="color: #37474f;">)</span></div><div> <span style="color: #3367d6;">AND</span> <span style="color: #3367d6;">CURRENT_DATE</span><span style="color: #37474f;">()</span></div><div><span style="color: #3367d6;">GROUP</span> <span style="color: #3367d6;">BY</span></div><div> file.<span style="color: #3367d6;">project</span></div></div><p>And then save that as a CSV file for further analysis. 
Once we have the CSV, we can drop it into pandas (or your favorite data analysis toolbox) and extract out a list of the names of the top 25% of projects by download count pretty easily:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ ipython</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Python 3.8.10 (default, Jun 22 2022, 20:18:18)<span class="Apple-converted-space"> </span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Type 'copyright', 'credits' or 'license' for more information</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">IPython 8.4.0 -- An enhanced Interactive Python. Type '?' 
for help.</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p3" style="background-color: black; color: #2d961e; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>1</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>import</b></span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>pandas</b></span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>as</b></span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>pd</b></span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p3" style="background-color: black; color: #2d961e; font-size: 17px; 
font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>2</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>import</b></span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>numpy</b></span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;"><b>as</b></span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;"> </span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>np</b></span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p4" style="background-color: black; color: #9fa01c; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>3</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span 
class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;">df = pd.read_csv(</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">'pypi_downloads_20220901_30d.csv'</span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;">)</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>4</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">top_25p = df[df[</span><span class="s6" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'num_downloads'</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">] > df[</span><span class="s6" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'num_downloads'</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">].quantile(</span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">0.75</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">)]</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>5</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">top_25p.head()</span></span></p><p class="p5" style="background-color: black; color: #b42419; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">Out[</span><span class="s7" style="color: #fc2118; font-variant-ligatures: no-common-ligatures;"><b>5</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">]:<span class="Apple-converted-space"> </span></span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>project<span class="Apple-converted-space"> </span>num_downloads</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 
0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">0<span class="Apple-converted-space"> </span>lima <span class="Apple-converted-space"> </span>2384</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">1<span class="Apple-converted-space"> </span>kiwisolver <span class="Apple-converted-space"> </span>25721252</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">2 <span class="Apple-converted-space"> </span>quill-delta <span class="Apple-converted-space"> </span>6128</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">3 <span class="Apple-converted-space"> </span>aiorpcx<span class="Apple-converted-space"> </span>12998</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">4<span class="Apple-converted-space"> </span>flake8-flask <span class="Apple-converted-space"> </span>6843</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>6</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s8" style="color: #2d961e; font-variant-ligatures: no-common-ligatures;">len</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(top_25p)</span></span></p><p class="p5" style="background-color: black; color: #b42419; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">Out[</span><span class="s7" style="color: #fc2118; font-variant-ligatures: no-common-ligatures;"><b>6</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">]: </span><span class="s5" style="color: white; font-variant-ligatures: no-common-ligatures;">102864</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s4" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>7</b></span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">np.savetxt(</span><span class="s6" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'top_25p_projects.txt'</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">, top_25p[</span><span class="s6" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'project'</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">].values, fmt=</span><span class="s6" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'</span><span class="s9" style="color: #bd618f; font-variant-ligatures: no-common-ligatures;"><b>%s</b></span><span class="s6" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'</span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">)</span></span></p><p>Okay, around 100K projects, that's manageable. Now it was time to try to download the wheels for those projects, extract them, and check for any floating point funny business. But wait, how are we going to do that check? I don't really want to load a bunch of .so files downloaded off the internet, given that any of them could execute arbitrary code. 
[My worries here about executing arbitrary code will seem richly ironic later in the post.]</p><h3 style="text-align: left;">Interlude: A Static Checker for <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier; font-size: medium;">crtfastmath</span></h3><p>How can we check if a library will mess with FTZ/DAZ without actually running any of its code? Let's go back to how the code that sets these bits actually gets executed in the first place. Shared libraries on Linux get loaded by the dynamic loader, which loops over the contents of the .init_array section looking for constructors that should be called when the library is loaded and then calling each one. We can print out the contents of this section using objdump:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ objdump -s -j .init_array foo.so</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">foo.so: <span class="Apple-converted-space"> </span>file format elf64-x86-64</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Contents of section .init_array:</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>3e78 10110000 00000000 40100000 00000000<span class="Apple-converted-space"> </span>........@.......</span></p><p>Each entry in this array is a pointer stored in little-endian form; on a 64-bit system a pointer is 8 bytes, so in this example we have two constructors, and we can print their addresses with this awful incantation (my specialty!):</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: 
no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ objdump -s -j .init_array foo.so | sed -e '1,/Contents/ d' | cut -c 7-40 | xxd -r -p | od -An -t x8 -w8</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>0000000000001110</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>0000000000001040</span></p><p>Now we can disassemble them with objdump. 
The first one is not very exciting:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ objdump -d --start-address=0x0000000000001110 foo.so | head -20</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">foo.so: <span class="Apple-converted-space"> </span>file format elf64-x86-64</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: 
normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Disassembly of section .text:</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">0000000000001110 <frame_dummy>:</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1110: <span class="Apple-converted-space"> </span>f3 0f 1e fa <span class="Apple-converted-space"> </span>endbr64<span class="Apple-converted-space"> </span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: 
no-common-ligatures;"><span class="Apple-converted-space"> </span>1114: <span class="Apple-converted-space"> </span>e9 77 ff ff ff<span class="Apple-converted-space"> </span>jmpq <span class="Apple-converted-space"> </span>1090 <register_tm_clones></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1119: <span class="Apple-converted-space"> </span>0f 1f 80 00 00 00 00<span class="Apple-converted-space"> </span>nopl <span class="Apple-converted-space"> </span>0x0(%rax)</span></p><p>But the second one is what we want to look for (note: in stripped binaries, you won't see the nice function name, so we'll have to detect it by looking at the actual instructions):</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>~</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ objdump -d --start-address=0x0000000000001040 foo.so | head -20</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" 
style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">foo.so: <span class="Apple-converted-space"> </span>file format elf64-x86-64</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">Disassembly of section .text:</span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; 
margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">0000000000001040 <set_fast_math>:</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1040: <span class="Apple-converted-space"> </span>f3 0f 1e fa <span class="Apple-converted-space"> </span>endbr64<span class="Apple-converted-space"> </span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1044: <span class="Apple-converted-space"> </span>0f ae 5c 24 fc<span class="Apple-converted-space"> </span>stmxcsr -0x4(%rsp)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1049: <span class="Apple-converted-space"> </span>81 4c 24 fc 40 80 00<span class="Apple-converted-space"> </span>orl<span class="Apple-converted-space"> </span>$0x8040,-0x4(%rsp)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1050: <span 
class="Apple-converted-space"> </span>00<span class="Apple-converted-space"> </span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1051: <span class="Apple-converted-space"> </span>0f ae 54 24 fc<span class="Apple-converted-space"> </span>ldmxcsr -0x4(%rsp)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1056: <span class="Apple-converted-space"> </span>c3<span class="Apple-converted-space"> </span>retq<span class="Apple-converted-space"> </span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>1057: <span class="Apple-converted-space"> </span>66 0f 1f 84 00 00 00<span class="Apple-converted-space"> </span>nopw <span class="Apple-converted-space"> </span>0x0(%rax,%rax,1)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>105e: <span class="Apple-converted-space"> </span>00 00<span 
class="Apple-converted-space"> </span></span></p><p>This snippet uses stmxcsr to get the value of the MXCSR register, ORs it with 0x8040 (FTZ | DAZ), and then uses ldmxcsr to write it back into the MXCSR register. Simple enough!</p><p>I wrote a <a href="https://gist.github.com/moyix/6ebe542affd555218bdb82b40ec49291">hacky Python script</a> that uses basically these same objdump commands to disassemble each constructor up until the first return (retq) instruction, and report if any of them contain all three of stmxcsr, ldmxcsr, and the constant 0x8040. This will miss any code pattern that doesn't exactly match the code found in gcc's <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">crtfastmath</span> library, but it's good enough and shouldn't have false positives.</p><p>Unfortunately, it also turns out to be quite slow, since it invokes objdump once to get the .init_array entries, and then N times to disassemble (once for each constructor). This really adds up when you want to scan thousands of files, especially because (as I now am doomed to know forever) shared libraries built in C++ can have thousands of constructors (the highest I saw was in <span style="font-family: courier;">scine-serenity-wrapper</span>'s <span style="font-family: courier;">serenity.module.so</span>, which has a whopping 11,841 different constructors (!!!)).</p><p>Instead I switched to searching for the <i>exact</i> byte sequence that encodes the stmxcsr, orl, ldmxcsr, and retq instructions. This is even more brittle than the disassembly-based technique, but since none of the instructions involved use any memory addresses (which would change depending on exactly where <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">crtfastmath</span> was linked into the binary), the sequence is pretty consistent over the range of compilers I could check (gcc-{5,7,8,9,11} and clang-{10,11,12,14}). 
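</p><p>As a rough illustration of this byte-level check (a sketch, not the linked script itself), the instruction bytes from the disassembly above can be concatenated and compared against the start of each constructor. The constant and function names are made up for this example, and the input is assumed to be the raw bytes found at a constructor's entry point:</p><pre>
```python
# Bit masks in the x86 MXCSR control/status register.
MXCSR_DAZ = 0x0040  # denormals-are-zero (bit 6)
MXCSR_FTZ = 0x8000  # flush-to-zero (bit 15)

# Instruction bytes copied from the set_fast_math disassembly shown above.
STMXCSR = bytes.fromhex("0fae5c24fc")    # stmxcsr -0x4(%rsp)
ORL = bytes.fromhex("814c24fc40800000")  # orl $0x8040,-0x4(%rsp); imm32 is little-endian
LDMXCSR = bytes.fromhex("0fae5424fc")    # ldmxcsr -0x4(%rsp)
RETQ = bytes.fromhex("c3")               # retq
ENDBR64 = bytes.fromhex("f30f1efa")      # optional prefix emitted by some compilers

FAST_MATH_BODY = STMXCSR + ORL + LDMXCSR + RETQ


def looks_like_set_fast_math(code: bytes) -> bool:
    """Return True if `code` starts with the set_fast_math byte pattern.

    Hypothetical helper: `code` is assumed to be the bytes located at one
    constructor's entry point. An optional leading endbr64 is skipped.
    """
    if code.startswith(ENDBR64):
        code = code[len(ENDBR64):]
    return code.startswith(FAST_MATH_BODY)
```
</pre><p>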
The only difference I saw is that newer compilers start the function with <span style="font-family: courier;">endbr64</span> (an instruction that tells Intel's Control-flow Enforcement Technology, an exploit mitigation technique, that this is a valid jump target).</p><p>The <a href="https://gist.github.com/moyix/2154125d0cb9947ec0525fb49449fab7">final script can be found here</a>. It uses <a href="https://github.com/eliben/pyelftools">pyelftools</a> to parse the binary and extract the list of constructors, then tests for a string match at each of the constructors listed. This is <i>much</i> faster than multiple calls to objdump and can check thousands of binaries per second, which is the kind of scale we need.</p><h3 style="text-align: left;">Obtaining <strike>100GB</strike> 11.6TB of Shared Libraries from PyPI</h3><p>In the first draft of this post, I was going to wimp out and just do the top 25%, and only one binary wheel for the most recent version of each project. But my ambition and curiosity (and a strong desire to procrastinate on preparing for my first lecture on Tuesday) got the better of me, so I decided to go ahead and get <i>all</i> of the x86-64 binary wheels for every version of all packages on PyPI.</p><p>We start by getting a list of all Python package names using the <a href="https://pypi.org/simple/">PyPI Simple Index</a>, which is basically just a giant HTML file with one link per package.</p><p>Next we want to get the metadata for each package, including the list of releases and all the URLs to the actual binary packages. Each package has a JSON description at https://pypi.org/pypi/NAME/json; for example, here's the <a href="https://pypi.org/pypi/gevent/json">metadata for gevent</a>, the package that kicked off this quixotic quest. 
We can download all of them using wget; I'll also request gzip compression so that I can save PyPI some bandwidth:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>/fastdata/pypi_wheels</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ cat all_packages.txt | sed 's|^|https://pypi.org/pypi/|;s|$|/json|' > pypi_all_urls.txt</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>/fastdata/pypi_wheels</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ wget --no-verbose --header="Accept-Encoding: gzip" -O 
pypi_all.json.gz -i pypi_all_urls.txt 2> wget_errors.txt</span></span></p><p>When you use -O with multiple files like this, wget will concatenate all of them together into one big file, but luckily it turns out that the concatenation of multiple gzip files is itself a valid gzip file, so we end up with all package metadata (well, almost all – about 10,723 packages returned 404, but that's not too bad compared to the 386,544 packages we <i>did</i> manage to get info for).</p><p>After about an hour and an embarrassing amount of searching StackOverflow, I came up with a terrifying <span style="font-family: courier;">jq</span> one-liner to extract out all the URLs of the x86-64 Linux wheels from the 1.5GB of compressed JSON data:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>/fastdata/pypi_wheels</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ </span><span style="font-variant-ligatures: no-common-ligatures;">zcat</span><span class="Apple-converted-space" style="font-variant-ligatures: no-common-ligatures;"> </span><span style="font-variant-ligatures: no-common-ligatures;">pypi_all.json.gz | jq -cr '.releases[] | .[] | [select(.packagetype | contains("bdist_wheel"))] | .[] | select(.url | test("manylinux.*x86_64")) | [.size, .url] | join(" ")' > pypi_all_wheels.txt</span></span></p><p>This gives us 269,752 packages to download, along with their sizes. 
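</p><p>For anyone who finds the jq expression hard to read, roughly the same filter can be written in Python. This is only a sketch and not the command used above: the function name and the sample record are invented for illustration, and it assumes nothing about the metadata beyond the fields the jq expression itself touches (releases, packagetype, url, and size).</p><pre>
```python
import json
import re

# Same pattern that the jq filter passes to test().
MANYLINUX_X86_64 = re.compile(r"manylinux.*x86_64")


def linux_wheels(metadata: dict) -> list:
    """Return (size, url) pairs for manylinux x86-64 wheels in one package's JSON."""
    found = []
    # .releases[] | .[]  : every file of every release
    for files in metadata.get("releases", {}).values():
        for f in files:
            # select(.packagetype | contains("bdist_wheel")), then test() on the URL
            if "bdist_wheel" in f["packagetype"] and MANYLINUX_X86_64.search(f["url"]):
                found.append((f["size"], f["url"]))
    return found


# Hypothetical, trimmed-down record shaped like https://pypi.org/pypi/NAME/json.
sample = json.loads("""
{"releases": {"1.0": [
  {"packagetype": "bdist_wheel", "size": 100,
   "url": "https://example.invalid/pkg-1.0-cp38-cp38-manylinux1_x86_64.whl"},
  {"packagetype": "sdist", "size": 50,
   "url": "https://example.invalid/pkg-1.0.tar.gz"},
  {"packagetype": "bdist_wheel", "size": 70,
   "url": "https://example.invalid/pkg-1.0-cp38-cp38-win_amd64.whl"}
]}}
""")

wheels = linux_wheels(sample)
# Total download size in bytes for the selected wheels.
total_bytes = sum(size for size, _ in wheels)
```
</pre><p>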
Adding up all the sizes, we get a grand total of 4 TB, which I managed to download in about 12 hours using <a href="https://aria2.github.io/">aria2</a> (while I love <span style="font-family: courier;">wget</span>, for a job this big I wanted something that could save/resume sessions and make many connections at once).</p><p>Now we have a small snag. While I do have quite a lot (10 TB) of NVME storage, it's still not enough to unpack all the shared objects at once for scanning. Instead I put together a <a href="https://gist.github.com/moyix/71896182bc5937fb8a1a882d765bc8ac">small script</a> to unpack just one package at a time, scan it, and then remove the extracted data. Finally, I am <a href="https://git.savannah.gnu.org/cgit/parallel.git/tree/doc/citation-notice-faq.txt">legally obligated</a> to inform you that I used GNU Parallel to scan all the wheels:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>/fastdata/pypi_wheels</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ find wheels -type f -print0 | parallel --progress -0 --xargs python extract_and_scan.py {#} {}</span></span></p><p>[This didn't go quite as smoothly as I'm making out — when you're scanning ~2.4 million shared libraries across ~270K wheels, you find all sorts of edge cases in your code; I discovered <a href="https://twitter.com/moyix/status/1565525352180580352">.so files that were in fact text files</a>, ARM and 
32-bit x86 libraries bundled into an x86-64 package, and some partially mangled zip files.]</p><p>And so now we have everything we need to count how many x86-64 shared libraries there are in all of PyPI that were linked with <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="font-variant-ligatures: no-common-ligatures;">(sfcodegen) </span><span class="s2" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;"><b>moyix@isabella</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s3" style="color: #400bd9; font-variant-ligatures: no-common-ligatures;"><b>/fastdata/pypi_wheels</b></span><span class="s1" style="font-variant-ligatures: no-common-ligatures;">$ grep -F 'contains ffast-math constructor' ffast_results_all.txt | wc -l</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">5830</span></p><p>A mere 5,830 out of the 2.4 million we scanned!</p><h3 style="text-align: left;">On the Origin of Bad Floating Point Math</h3><p>To find out a little more about the packages that they're associated with, we can extract the dist-info metadata from each wheel. Unfortunately <a href="https://pypi.org/project/wheel-inspect/">the first library I found to do this</a> was extremely strict, and barfed on the many, many wheels in my dataset with slightly malformed metadata. 
I went on <i>yet another </i>giant yak shaving expedition and <a href="https://gist.github.com/moyix/1bf820837930ec56d214952b0cce2d32">wrote my own robust parser</a>.</p><p>Along the way I learned a lot of fun facts about Python's packaging metadata. Did you know that the format of the METADATA file is actually based on email? And that because email is notoriously difficult to specify, the standard says that the format is "[...] what the standard library email.parser module can parse using the compat32 policy"? Or that the various files that can appear in the dist-info directory are an exciting menagerie of CSV, JSON, and Windows INI formats? So much knowledge that I now wish I could unlearn!</p><p>Anyway, let's get some stats. I dumped all shared library info and package metadata into a big pandas DataFrame to play around with for this.</p><p>First, how many distinct packages have at least one binary that was built with <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>?</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s2" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>28</b></span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">df[</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'Package'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">].unique()</span></span></p><p class="p2" style="background-color: black; color: #b42419; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; 
font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s3" style="font-variant-ligatures: no-common-ligatures;">Out[</span><span class="s5" style="color: #fc2118; font-variant-ligatures: no-common-ligatures;"><b>28</b></span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">]:<span class="Apple-converted-space"> </span></span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;">array(['archive-pdf-tools', 'bgfx-python',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'bicleaner-ai-glove', 'BTrees', 'cadbiom',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'ctranslate2', 'dyNET', 'dyNET38', 'gevent',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'glove-python-binary', 'higra', 'hybridq', 'ikomia',</span></p><p class="p1" style="background-color: black; color: white; 
font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'ioh', 'jij-cimod', 'lavavu', 'lavavu-osmesa',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'MulticoreTSNE', 'neural-compressor', 'nwhy',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'openjij', 'openturns', 'perfmetrics', 'pHashPy',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'pyace-lite', 'pyapr', 'pycompadre',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'pycompadre-serial', 'PyKEP', 'pykep',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: 
normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'pylimer-tools', 'pyqubo', 'pyscf', 'PyTAT',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'python-prtree', 'qiskit-aer', 'qiskit-aer-gpu',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'RelStorage', 'sail-ml', 'segmentation', 'sente',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'sinr', 'snapml', 'superman', 'symengine',</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>'systran-align', 'texture-tool', 'tsne-mp', 'xcsf'],</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; 
line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"><span class="Apple-converted-space"> </span>dtype=object)</span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s2" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>29</b></span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s3" style="color: #2d961e; font-variant-ligatures: no-common-ligatures;">len</span><span class="s4" style="font-variant-ligatures: no-common-ligatures;">(df[</span><span class="s5" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'Package'</span><span class="s4" style="font-variant-ligatures: no-common-ligatures;">].unique())</span></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s3" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"></span></p><p class="p2" style="background-color: black; color: #b42419; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s4" style="font-variant-ligatures: no-common-ligatures;">Out[</span><span class="s6" style="color: #fc2118; font-variant-ligatures: no-common-ligatures;"><b>29</b></span><span class="s4" style="font-variant-ligatures: no-common-ligatures;">]: </span><span class="s7" style="color: white; font-variant-ligatures: 
no-common-ligatures;">49</span></span></p><p>Not nearly as bad as I thought! There are only 49 packages that, at some point in their history, released a binary wheel that had a shared library linked with <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>. We can find out more about them by having pandas make an HTML table with the package summary for us:</p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s2" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>54</b></span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">counts = df.groupby([</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'Package'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">,</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'Summary'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">]).size()</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s3" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: 
courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s2" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>55</b></span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">counts = counts.reset_index().rename(columns={</span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">0</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">:</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'Count'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">})</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s3" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s2" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>56</b></span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">counts = counts.sort_values(by=</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'Count'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">,ascending=</span><span class="s5" style="color: #2d961e; 
font-variant-ligatures: no-common-ligatures;"><b>False</b></span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">)</span></span></p><p class="p2" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px; min-height: 20px;"><span style="font-family: courier;"><span class="s3" style="font-variant-ligatures: no-common-ligatures;"></span><br /></span></p><p class="p1" style="background-color: black; color: white; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier;"><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">In [</span><span class="s2" style="color: #2fe71a; font-variant-ligatures: no-common-ligatures;"><b>57</b></span><span class="s1" style="color: #2fb41d; font-variant-ligatures: no-common-ligatures;">]: </span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">counts.to_html(</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'count.html'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">,justify=</span><span class="s4" style="color: #9fa01c; font-variant-ligatures: no-common-ligatures;">'center'</span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">,index=</span><span class="s5" style="color: #2d961e; font-variant-ligatures: no-common-ligatures;"><b>False</b></span><span class="s3" style="font-variant-ligatures: no-common-ligatures;">)</span></span></p><p>And we get a (pretty bare-bones) table <a href="https://moyix.net/~moyix/ffast_summary.html">that you can look at here</a>. 
Here are the first few rows:</p><div><center><table border="1" class="dataframe"><thead><tr style="text-align: center;"><th>Package</th><th>Summary</th><th>Count</th></tr></thead><tbody><tr><td>BTrees</td><td>Scalable persistent object containers</td><td>1166</td></tr><tr><td>gevent</td><td>Coroutine-based network library</td><td>1054</td></tr><tr><td>qiskit-aer</td><td>Qiskit Aer - High performance simulators for Qiskit</td><td>589</td></tr><tr><td>qiskit-aer-gpu</td><td>Qiskit Aer - High performance simulators for Qiskit</td><td>448</td></tr><tr><td>ctranslate2</td><td>Optimized inference engine for OpenNMT models</td><td>335</td></tr><tr><td>snapml</td><td>Snap Machine Learning</td><td>258</td></tr><tr><td>neural-compressor</td><td>Repository of Intel® Neural Compressor</td><td>234</td></tr><tr><td>RelStorage</td><td>A backend for ZODB that stores pickles in a relational database.</td><td>222</td></tr><tr><td>ikomia</td><td>Ikomia Python API for Computer Vision workflow and plugin integration in Ikomia Studio</td><td>165</td></tr></tbody></table></center></div><p>Unsurprisingly, a lot of these are various kinds of scientific software. I have never met a scientist who can resist the lure of fast-but-dangerous math when doing numerical simulations. But others, I think, are much more likely to simply be mistakes:</p><p></p><ul style="text-align: left;"><li><b><a href="https://pypi.org/project/BTrees/">BTrees</a></b> - Scalable persistent object containers. I don't think a generic data structure library should be changing the floating point behavior. And indeed if we look at their GitHub repo, there's an <a href="https://github.com/zopefoundation/BTrees/pull/179">open pull request to disable the use of -Ofast</a>.</li><li><b><a href="https://pypi.org/project/gevent/">gevent</a></b> - Coroutine-based network library. 
I covered this one in the intro; it is definitely way out of line for a networking library to be messing with the FPU. Earlier we found <a href="https://github.com/gevent/gevent/pull/1864">the pull request that fixes it</a> (also still unmerged, sadly).</li><li><b><a href="https://pypi.org/project/RelStorage/">RelStorage</a></b> - A backend for ZODB that stores pickles in a relational database. I can't imagine how storing pickles in a database would need floating point math, so this seems like a mistake. This time I don't see any issues or PRs asking about <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span>, but we can see in <a href="https://github.com/zodb/relstorage/blob/de786b3e0d434748e1f56e9b02089662c84ac76b/scripts/releases/make-manylinux#L33-L34">the script used to generate the manylinux builds</a> that it is indeed enabled.</li><li><b><a href="https://pypi.org/project/perfmetrics/">perfmetrics</a></b> - Send performance metrics about Python code to Statsd. This has nothing to do with floating point math. Once again no PR or issue, but I notice that this, RelStorage, and BTrees are all Zope Foundation projects. 
The BTrees issue mentioned the Zope Foundation meta project as the source of the build configurations, and look what we find there (in <a href="https://github.com/zopefoundation/meta/blob/master/config/c-code/tests.yml.j2#L3-L6">config/c-code/tests.yml.j2</a>):</li></ul><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><p class="p1" style="background-color: black; color: #2fb41d; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span style="font-family: courier; font-variant-ligatures: no-common-ligatures;"># Initially copied from</span></p><p class="p1" style="background-color: black; color: #2fb41d; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"># https://github.com/actions/starter-workflows/blob/main/ci/python-package.yml</span></p><p class="p1" style="background-color: black; color: #2fb41d; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"># And later based on the version jamadden updated at</span></p><p class="p1" style="background-color: black; color: #2fb41d; font-size: 17px; font-stretch: normal; font-variant-east-asian: normal; font-variant-numeric: normal; line-height: normal; margin: 0px;"><span class="s1" style="font-family: courier; font-variant-ligatures: no-common-ligatures;"># gevent/gevent, and then at zodb/relstorage and zodb/perfmetrics</span></p></blockquote><p></p><p style="text-align: left;">So in fact all the packages we've seen so far can trace their use of <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: 
courier;">-Ofast</span> to <b>gevent</b>!</p><h3 style="text-align: left;">How Bad is This? It <i>Depends</i></h3><div>Well, who cares about all this, you might ask; I can just avoid using those libraries until the problem is fixed, right? Well, maybe. But I had no idea I was using <b>gevent</b> until numpy started yelling at me, and I still don't know the exact chain of dependencies that caused it to get installed in the first place. So can we get a count of how many packages might (directly or indirectly) depend on a package that has an <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span> library?</div><div><br /></div><div>This is not as easy as it seems. Python packaging metadata is in a pretty sorry state, and you can't easily get a simple list of the dependencies of each package in machine-readable form. It turns out that the recommended way to get the dependencies of a particular package is to... just install it with pip and see what happens. And since we're looking for <i>reverse</i> dependencies, it's even worse -- we would have to install every package on PyPI and see if any of them pulled in one of the libraries we discovered!</div><div><br /></div><div>I actually started down this path and set about running <span style="font-family: courier;">pip install --dry-run --ignore-installed --report</span> on all 397,267 packages. This turned out to be a <i><b>terrible</b></i> idea. Unbeknownst to me, <span style="color: red;"><b>even with --dry-run pip will execute arbitrary code found in the package's setup.py</b></span>. In fact, <b><span style="color: red;">merely asking pip to <i>download</i> a package can execute arbitrary code</span></b> (see pip issues <a href="https://github.com/pypa/pip/issues/7325">7325</a> and <a href="https://github.com/pypa/pip/issues/1884">1884</a> for more details)! 
So when I tried to dry-run install almost 400K Python packages, <a href="https://twitter.com/moyix/status/1566561433898426368">hilarity ensued</a>. I spent a long time <a href="https://twitter.com/moyix/status/1566578412663209984">cleaning up the mess</a>, and discovered some <a href="https://twitter.com/moyix/status/1566609622680608770">pretty poor setup.py practices</a> along the way. But hey, at least I got <a href="https://twitter.com/moyix/status/1566612152558944257">two free pictures of anime catgirls</a>, deposited directly into my home directory. Convenient!</div><div><br /></div><div>Once I had managed to clean up the mess (or hopefully, anyway<span face="Roboto, arial, sans-serif" style="color: #202124;"><span style="font-size: 14px;">—</span></span>I never did find out what package tried to execute sudo), I decided I needed a different approach. <a href="https://twitter.com/sirdarckcat">Eduardo Vela</a> helpfully pointed me to <a href="https://deps.dev/">Google's Open Source Insights project, deps.dev</a>, which catalogues the dependencies and dependents (reverse dependencies) for PyPI, npm, Go, Maven, and Cargo. They don't have an official API, but a bit of poking around in Chrome Devtools' Network tab turned up a simple JSON endpoint that I could query. 
And I figured since I was only querying 422 package+version combinations, I could probably get away with it.</div><div><br /></div><div>That, finally, let me produce this table, <a href="https://moyix.net/~moyix/ffast_reverse_dep_counts.html">which you can see in its full form here</a> (I've omitted all packages with fewer than 4 dependents below since this post is already pretty long):</div><div><br /></div><div><br /></div><div><center><table border="1"><tbody></tbody><thead><tr><th style="text-align: start;">Package</th><th style="text-align: start;">Version</th><th style="text-align: end;">Count</th></tr></thead><tbody><tr><td>gevent</td><td>21.12.0</td><td style="text-align: end;">1592</td></tr><tr><td>BTrees</td><td>4.10.0</td><td style="text-align: end;">618</td></tr><tr><td>gevent</td><td>20.9.0</td><td style="text-align: end;">51</td></tr><tr><td>gevent</td><td>21.1.2</td><td style="text-align: end;">36</td></tr><tr><td>gevent</td><td>20.6.2</td><td style="text-align: end;">24</td></tr><tr><td>gevent</td><td>21.8.0</td><td style="text-align: end;">13</td></tr><tr><td>perfmetrics</td><td>3.2.0</td><td style="text-align: end;">12</td></tr><tr><td>ctranslate2</td><td>2.17.0</td><td style="text-align: end;">9</td></tr><tr><td>gevent</td><td>20.12.1</td><td style="text-align: end;">9</td></tr><tr><td>qiskit-aer</td><td>0.8.2</td><td style="text-align: end;">9</td></tr><tr><td>qiskit-aer</td><td>0.10.3</td><td style="text-align: end;">9</td></tr><tr><td>dyNET</td><td>2.1.2</td><td style="text-align: end;">8</td></tr><tr><td>qiskit-aer</td><td>0.9.1</td><td style="text-align: end;">7</td></tr><tr><td>dyNET38</td><td>2.1</td><td style="text-align: end;">6</td></tr><tr><td>gevent</td><td>20.6.1</td><td style="text-align: end;">6</td></tr><tr><td>qiskit-aer</td><td>0.7.5</td><td style="text-align: end;">6</td></tr><tr><td>qiskit-aer</td><td>0.1.1</td><td style="text-align: end;">5</td></tr><tr><td>gevent</td><td>20.6.0</td><td style="text-align: 
end;">4</td></tr><tr><td>glove-python-binary</td><td>0.2.0</td><td style="text-align: end;">4</td></tr><tr><td>pyqubo</td><td>1.0.13</td><td style="text-align: end;">4</td></tr><tr><td>qiskit-aer</td><td>0.6.1</td><td style="text-align: end;">4</td></tr><tr><td>qiskit-aer</td><td>0.7.3</td><td style="text-align: end;">4</td></tr><tr><td>qiskit-aer</td><td>0.9.0</td><td style="text-align: end;">4</td></tr><tr></tr></tbody></table></center><br /></div><div>A total of <b>2,514</b> packages eventually depend on a package that uses <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>. (This number is still approximate; deps.dev doesn't tell you the actual names and versions of all of the reverse dependencies (just a random sample), so the counts here may include some duplicates. It may also <i>underestimate</i> the extent of the problem, since there are some fairly popular packages like PyTorch that aren't on PyPI).</div><div><br /></div><div>Also, even with only the random sample of dependents provided by <a href="http://deps.dev">deps.dev</a>, we can use the download count CSV we generated at the beginning of this post to get a sense of how popular the affected projects are. Assuming we have a list of the names of the reverse dependencies in a file named "rdeps.txt", we can just do:</div><div><pre style="background-color: black; color: white; white-space: pre-wrap;">(sfcodegen) <span style="color: #4e9a06;"><b>moyix@isabella</b></span>:<span style="color: #3465a4;"><b>/fastdata/pypi_wheels</b></span>$ ipython
Python 3.8.10 (default, Jun 22 2022, 20:18:18)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>1</b></span><span style="color: #4e9a06;">]: </span><span style="color: #008700;"><b>import</b></span> <span style="color: #3465a4;"><b>pandas</b></span> <span style="color: #008700;"><b>as</b></span> <span style="color: #3465a4;"><b>pd</b></span>
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>2</b></span><span style="color: #4e9a06;">]: </span>df = pd.read_csv(<span style="color: #c4a000;">'pypi_downloads_20220901_30d.csv'</span>)
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>3</b></span><span style="color: #4e9a06;">]: </span>rdeps = <span style="color: #008700;">set</span>(<span style="color: #008700;">open</span>(<span style="color: #c4a000;">'rdeps.txt'</span>).read().splitlines())
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>4</b></span><span style="color: #4e9a06;">]: </span>sorted_df = df.sort_values(by=<span style="color: #c4a000;">'num_downloads'</span>, ascending=<span style="color: #008700;"><b>False</b></span>, ignore_index=<span style="color: #008700;"><b>True</b></span>)
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>5</b></span><span style="color: #4e9a06;">]: </span>sorted_df[<span style="color: #c4a000;">'rank'</span>] = sorted_df.index+<span style="color: #4e9a06;">1</span>
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>6</b></span><span style="color: #4e9a06;">]: </span>rdep_ranks = sorted_df[sorted_df[<span style="color: #c4a000;">'project'</span>].isin(rdeps)]
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>7</b></span><span style="color: #4e9a06;">]: </span>top20 = rdep_ranks.nsmallest(<span style="color: #4e9a06;">20</span>, <span style="color: #c4a000;">'rank'</span>)
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>8</b></span><span style="color: #4e9a06;">]: </span><span style="color: #008700;"><b>from</b></span> <span style="color: #3465a4;"><b>tabulate</b></span> <span style="color: #008700;"><b>import</b></span> tabulate
<span style="color: #4e9a06;">In [</span><span style="color: #8ae234;"><b>9</b></span><span style="color: #4e9a06;">]: </span><span style="color: #008700;">print</span>(tabulate(top20, headers=<span style="color: #c4a000;">'keys'</span>, tablefmt=<span style="color: #c4a000;">'psql'</span>, showindex=<span style="color: #008700;"><b>False</b></span>))
+--------------------------+-----------------+--------+
| project | num_downloads | rank |
|--------------------------+-----------------+--------|
| geventhttpclient | 1116639 | 1296 |
| locust | 1045789 | 1345 |
| flask-socketio | 914846 | 1424 |
| dagster | 497982 | 1974 |
| grequests | 365292 | 2328 |
| dedupe | 358737 | 2352 |
| websocket | 346515 | 2397 |
| gevent-websocket | 322249 | 2466 |
| dagster-graphql | 310382 | 2525 |
| locust-plugins | 273452 | 2669 |
| interpret-community | 268129 | 2692 |
| zope-index | 240577 | 2836 |
| dedupe-variable-datetime | 223080 | 2932 |
| parallel-ssh | 220882 | 2947 |
| azureml-interpret | 211233 | 3009 |
| locustio | 102322 | 4203 |
| allennlp | 92268 | 4498 |
| interpret | 91404 | 4521 |
| rasa-sdk | 87875 | 4609 |
| pykafka | 78679 | 4837 |
+--------------------------+-----------------+--------+</pre></div><div><br /></div><div>So there are some very popular packages that will pull in one of the 49 packages that were built with <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>! With a little more work we can include the name of the fast-math-enabled package as well:</div><div><br /></div><div><pre style="background-color: black; color: white; white-space: pre-wrap;">+--------------------------+----------+-----------------+--------+
| project | source | num_downloads | rank |
|--------------------------+----------+-----------------+--------|
| geventhttpclient | gevent | 1116639 | 1296 |
| locust | gevent | 1045789 | 1345 |
| flask-socketio | gevent | 914846 | 1424 |
| dagster | gevent | 497982 | 1974 |
| grequests | gevent | 365292 | 2328 |
| dedupe | btrees | 358737 | 2352 |
| websocket | gevent | 346515 | 2397 |
| gevent-websocket | gevent | 322249 | 2466 |
| dagster-graphql | gevent | 310382 | 2525 |
| locust-plugins | gevent | 273452 | 2669 |
| interpret-community | gevent | 268129 | 2692 |
| zope-index | btrees | 240577 | 2836 |
| dedupe-variable-datetime | btrees | 223080 | 2932 |
| parallel-ssh | gevent | 220882 | 2947 |
| azureml-interpret | gevent | 211233 | 3009 |
| locustio | gevent | 102322 | 4203 |
| allennlp | gevent | 92268 | 4498 |
| interpret | gevent | 91404 | 4521 |
| rasa-sdk | gevent | 87875 | 4609 |
| pykafka | gevent | 78679 | 4837 |
+--------------------------+----------+-----------------+--------+</pre></div><div><br /></div><div>You can find the <a href="https://moyix.net/~moyix/rdep_ranks.html">full version of this table here</a>.</div><h3 style="text-align: left;">Conclusion</h3><h4 style="text-align: left;">So after all this work, what did we learn?</h4><p style="text-align: left;"></p><ul style="text-align: left;"><li>Turning on <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span> will end up turning on <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>, and that can cause all sorts of problems for any program unlucky enough to load a library built with them.</li><li>Even if you explicitly ask for no fast math, you will still get fast math as long as <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-Ofast</span> is enabled.</li><li>It is surprisingly feasible (though perhaps not wise) for a single individual with a good internet connection to download 4 TB of Python packages and scan 11 TB of shared libraries in a single day.</li><li>It is <i>definitely</i> not wise to try to run pip download or pip install --dry-run on every package listed in PyPI, at least not without some good sandboxing, because it will execute tons of random code from setup.py files and leave you with a giant mess to clean up.</li><li>Because of the <a href="https://en.wikipedia.org/wiki/Software_supply_chain">highly connected nature of the modern software supply chain</a>, even though a mere 49 packages were actually built with <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>, thousands of other packages, with a total of at least <b>9.7 <i>million</i></b> downloads over the past 30 days, are affected.</li></ul><h4 style="text-align: left;">What can we actually do about it?</h4><div>Well, for now you can just 
try to be careful about what libraries you use, perhaps with the help of the tables I generated in this post. If you have a numerical function in Python that you really don't want to be affected by this, I wrote a <a href="https://gist.github.com/moyix/5bac4b2e383a466b7d015b8c04db13b5">somewhat alarming script, ensure_fp.py</a>, that provides a decorator named <span style="font-family: courier;">@ensure_clean_fpu_state</span>. The decorator resets the value of MXCSR to its power-on state for the duration of the function. (I say "alarming" because it runs assembly code from Python by mapping an executable memory region and then copying in the raw code bytes; hopefully it goes without saying that it needs some work before it's production-ready.)<p></p></div><div><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/a/AVvXsEid854WZEzp-Rtw3KuEZVBoPtT7eSci150VHH0wWvQ7QureR9pKovwQQPq1_e89uIeE1CA9g4SmxDvE6YQnWzEzVQMRuJe-bXI1frjcaxzNKpWKPfSYq3NKfLNkITBfVttrLil4FY4Um5wacFHUFVIXKj9u8MXwVZK9nrGO6p7Xu7M2lorX_v3swPOyPQ" style="margin-left: auto; margin-right: auto;"><img alt="" data-original-height="613" data-original-width="1549" height="254" src="https://blogger.googleusercontent.com/img/a/AVvXsEid854WZEzp-Rtw3KuEZVBoPtT7eSci150VHH0wWvQ7QureR9pKovwQQPq1_e89uIeE1CA9g4SmxDvE6YQnWzEzVQMRuJe-bXI1frjcaxzNKpWKPfSYq3NKfLNkITBfVttrLil4FY4Um5wacFHUFVIXKj9u8MXwVZK9nrGO6p7Xu7M2lorX_v3swPOyPQ=w640-h254" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Some absolutely 31337 ASCII-art I put together for <span style="font-family: courier;">ensure_fp.py</span>. 
Thanks to <a href="https://twitter.com/bjg/status/1566518384245854208">Ben Gras</a>, <a href="https://twitter.com/peter_a_goodman/status/1566506623895486464">Peter Goodman</a>, and <a href="https://twitter.com/DRMacIver/status/1566506166183776256">David R. MacIver</a> on Twitter for helping me rephrase the caption so that it was fully justified to the width of that box in the top-left.</td></tr></tbody></table><br /><br /></div><div>Longer-term, gcc and clang should provide more sane defaults. Ideally, <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span> should simply not enable FTZ/DAZ; that functionality (if anyone wants it) could be split out into an option and <i>definitely</i> not enabled by <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">-ffast-math</span>. A less radical compromise would be to at least avoid linking in <span style="background-color: rgba(175, 184, 193, 0.2); color: #24292f; font-family: courier;">crtfastmath</span> when building a shared library. I'm not all that optimistic about this, though, given that <a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522">the relevant gcc bug report will turn 10 years old at the end of this November and is still marked NEW</a>. Still, maybe if more people complain about it (ideally with a pull request or patch attached) it will get fixed.</div><div><br /></div><div>Oh, and as for <b>gevent</b>? 
I decided to just replace the code that messes with MXCSR with no-ops until they get around to making a release that doesn't mess with my FPU.</div>Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com6tag:blogger.com,1999:blog-6787362638788314904.post-6539654551401373342022-02-05T18:36:00.005-05:002022-02-05T18:36:43.525-05:00On Building 30K Debian Packages<p>As part of my ongoing attempts to create some nice datasets for training <a href="https://huggingface.co/moyix/csrc_774m">large code models for C/C++</a>, I've recently been attempting to build every package in Debian Unstable from source using <a href="https://github.com/rizsotto/Bear">bear</a> to log the compilation and generate a <a href="https://clang.llvm.org/docs/JSONCompilationDatabase.html">compile_commands.json database</a> for each build. Since it's not possible, in general, to parse C/C++ code without knowing what flags were used (e.g., so you can find header files, know what preprocessor defines are in use, etc.), this will open up some nice possibilities like:</p><p></p><ul style="text-align: left;"><li>Getting ASTs for each source file</li><li>Rebuilding each file and generating its LLVM IR (-emit-llvm) or assembly (-S)</li><li>Extracting comments associated with individual functions</li></ul>I'll probably have more to say about this dataset once I actually get around to doing something fun with it, but for now I wanted to just jot down some notes on stuff I wish I had known before trying to do this:<p></p><p></p><ul style="text-align: left;"><li><b>Isolation</b>: Run the build for each package in some kind of isolated environment. You know how packages sometimes have install-time conflicts? It's 100x worse for build-time conflicts.</li><li><b>Use an SSD</b>: Make sure to build things somewhere with fast storage. A huge amount of compiling stuff is just reading it off disk and writing it back. 
Because my main Docker stores its images on spinning rust, I ran a separate Docker daemon for the SSD with a <a href="https://gist.github.com/moyix/8668f411c1dd87a07457b329e3008530">minimal config file</a>. Then you can just set DOCKER_HOST=unix:///var/run/docker-nvme.sock and build/run your images.</li><li><b>Log everything, especially exit codes</b>. I got through a whole pass before realizing I didn't have a reliable way to tell which packages had built successfully (dpkg-buildpackage emits an exciting array of inconsistent messages), and had to re-run everything.</li><li><b>Turn off stuff you don't want</b>. I don't care about running tests or building documentation, so I set DEB_BUILD_OPTIONS="nodoc notest nocheck". Unfortunately, not every package respects the build options, but it's worth a try.</li><li><b>Don't build as root</b>. A number of packages detect if you're trying to build stuff as root and will die (coreutils is one example). This is an easy mistake to make in Docker, where running as root is the default. Run as a normal user, and use "dpkg-buildpackage -rfakeroot" so that it can pretend to be root for packages that <i>do</i> want to be built as root.</li><li><b>Run non-interactively</b>. There are a few packages that, when installed, try to ask the user some questions and will hang forever unless DEBIAN_FRONTEND=noninteractive is set. So set it, and make sure it gets passed on to child processes (a particularly annoying example is sudo, where you have to add -E to make it inherit the environment).</li><li><b>Use timeouts</b>. Particularly in an isolated environment like Docker, sometimes stuff will just hang during build (or maybe in some cases it's bear's fault, IDK). Some common culprits I've found so far are xvfb-run and erl_child_setup, and (maybe) things that expect dbus to be present. Aside from setting a timeout, I also ran a script in the background to find and kill any of those processes that were hanging around longer than a few minutes. 
[Actually, rather than killing them, which will make them exit with a non-zero status and cause the build to error out, I used this <a href="https://twitter.com/moyix/status/1484342467205816325">nice trick from Kyle Huey</a> to <a href="https://gist.github.com/moyix/95ca9a7a26a639b2322c36c7411dc3be">attach to them with gdb and inject a call to exit(0)</a>]</li><li><b>Clean up</b>. Since you're using a nice fast SSD, it's probably not enormous (mine is a measly 2TB). Builds are big. You may want to remember to move your build artifacts to somewhere roomier so that you don't run out of space (this tends to make build systems very unhappy).</li><li><b>Stay up to date</b>. Initially I just parsed Sources.gz, grabbed all the source packages, and then tried to fetch their build-deps. But it turns out Debian moves too fast for this; by the time I got around to building some package a few days later, its build-deps had in some cases been updated and weren't available in apt any more. Now I instead start each build with an apt-get -y update, and then fetch the most recent sources package info and build dependencies right before attempting the build.</li><li><b>Avoid shell hackery</b>. This is probably controversial, and I'm sure someone better and more careful at bash could do it, but trying to automate everything in a language where failures are silent and can do exciting things like call "rm -rf /" when you meant "rm -rf ${foo}/${bar}" is painful. Python has its own issues, but it was nice to at least get noisy errors as soon as things went wrong (example script: <a href="https://gist.github.com/moyix/f64f35a563f74d23293c4c17fb15c904">this one which uses python-apt to get source package info</a>, rather than "parsing" Sources.gz with grep/awk/sed).</li><li><b>Expect to be disappointed</b>. Even after all of this a lot of stuff is going to fail to build. 
Other things will be weird in ways you never dreamed software could be weird (hello, packages that spend 12 hours generating documentation using xsltproc!). You'll find fun stuff like packages that have clear security vulnerabilities, as revealed by compiler diagnostics like -Wformat-security (presumably these packages built fine under older, dumber compilers). Some of this can probably be mitigated by targeting Debian stable; unstable is, well, unstable, and brokenness is expected.</li></ul>No doubt I've missed lots of things that make this a more pleasant and reliable experience! There are a number of other projects that are also attempting to build all (or large portions) of Debian, which I probably should have looked at in more detail before attempting to roll my own (my only excuse is that I wanted something I knew how to extend and modify to do weird stuff like tracing build commands and recompiling individual files with other flags):<div><ul style="text-align: left;"><li><a href="https://reproducible.seal.purdue.wtf/">Reproducible Debian</a> uses <a href="https://github.com/kpcyrd/rebuilderd">rebuilderd</a> to measure how much of Debian can be rebuilt <a href="https://reproducible-builds.org/">reproducibly</a>, ideally producing bit-for-bit identical binaries. This is run by NYU CCS alum <a href="https://badhomb.re/">Santiago Torres-Arias</a>, now an assistant prof at Purdue!</li><li><a href="https://wiki.debian.org/ReproducibleBuilds">Debian also has its own reproducible builds project</a> (at least, I <i>think</i> the two projects are separate); you can see the status of their efforts on the <a href="https://tests.reproducible-builds.org/debian/reproducible.html">reproducible builds dashboard</a>.</li><li><a href="https://wiki.debian.org/qa.debian.org/ArchiveTesting">Debian's QA team runs periodic build attempts</a>. 
I think these use <a href="https://wiki.debian.org/sbuild">sbuild</a> and some scripts to run things on AWS.</li><li>These folks are attempting to <a href="https://clang.debian.net/">build all of Debian with clang</a>. They're taking a slightly more invasive approach than I'd like, though, by deleting gcc/g++ and replacing it with a symlink to clang/clang++.</li></ul>I'm hoping to dig into these more established efforts and see what tips and tricks I can steal for my own infrastructure. And if you know of other helpful hints, please let me know!</div>Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com2tag:blogger.com,1999:blog-6787362638788314904.post-8407090599464143032018-10-17T13:58:00.002-04:002018-10-17T13:58:38.167-04:00A couple ideas that went nowhere<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
I suspect a lot of people in academia end up having a lot of ideas and projects that went nowhere for any number of reasons – maybe there were insurmountable technical challenges, maybe the right person to work on it never materialized, or maybe it just got crowded out by other projects and never picked back up. Here are a couple of mine. For each I'll try to indicate why it fell by the wayside, and whether I think it could be resurrected (if you're interested in doing some idea necromancy, let me know! :)).<br />
<br />
<h3 style="text-align: left;">
Detecting Flush+Flush </h3>
</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
Among the flurry of microarchitectural side channel attacks that eventually culminated in the devastating Spectre and Meltdown attacks was one that has received relatively little attention: Flush+Flush. The base of the attack is the observation that <code style="background-color: ghostwhite; border: 1px solid rgb(222, 222, 222); color: #444444; font: 12px Monaco, "Courier New", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", monospace; margin: 0px; padding: 0px 0.2em;">clflush</code> takes a different amount of time depending on whether the address to be flushed was already in the cache or not. Gruss et al. <a href="https://arxiv.org/abs/1511.04594" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">had a nice paper on this variant of the attack at DIMVA 2016</a>.</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
The interesting thing about Flush+Flush to me is not its speed (which I believe has since been surpassed) but the fact that it is <i style="margin: 0px; padding: 0px;">stealthy</i>: unlike Flush+Reload, which causes an unusually large number of cache misses in the attacking process, Flush+Flush only causes cache misses in the <i style="margin: 0px; padding: 0px;">victim</i> process. So you can tell that an attack is happening, but you can't tell which process is actually carrying it out, and that is exactly what you would need to know to stop the attack without taking down the whole machine.</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
The key idea of the project was that even if you have only a global detector of whether an attack is going on system-wide, you can convert this into something that detects which process is attacking with a simple trick. Assuming you have control of the OS scheduler, just pause half the processes on the system and see if the attack stops. If it does, then the attacker must have been in the half you paused; you can now repeat this procedure on those processes and find the attacker via binary search. There are some wrinkles here: what if the attacker is carrying out the attack from multiple cooperating processes? What if the attacker probabilistically chooses whether or not to continue the attack at each time step?</div>
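<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
To make the binary search concrete, here is a minimal sketch in Python. It is not code from the project: the function names and the <code>attack_visible</code> callback are made up for illustration, the detector is simulated, and it assumes the simplest case of a single attacker that attacks continuously (so it handles neither of the wrinkles above).</div>

```python
from typing import Callable, Sequence, Set


def find_culprit(pids: Sequence[int],
                 attack_visible: Callable[[Set[int]], bool]) -> int:
    """Binary-search for the one process responsible for a global signal.

    attack_visible(paused) is a hypothetical callback: it pauses the given
    processes, samples the system-wide detector, resumes them, and returns
    True if the attack was still observed while they were paused.
    """
    candidates = list(pids)
    if not candidates:
        raise ValueError("need at least one candidate process")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if attack_visible(set(half)):
            # Attack continued with `half` paused: the attacker is in the rest.
            candidates = candidates[len(half):]
        else:
            # Attack stopped: the attacker is among the paused processes.
            candidates = half
    return candidates[0]


def make_fake_detector(attacker: int) -> Callable[[Set[int]], bool]:
    """Simulated detector: the attack is visible unless `attacker` is paused."""
    def attack_visible(paused: Set[int]) -> bool:
        return attacker not in paused
    return attack_visible


# 100 pretend processes, one of which (pid 42) plays the attacker.
print(find_culprit(list(range(1, 101)), make_fake_detector(42)))
```

<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
With n candidate processes this needs about log2(n) detector samples, which is the whole appeal; a noisy detector or an attacker that pauses at random would need repeated samples per step.</div>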
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
It's a pretty simple idea, but I think it may have applications beyond detecting Flush+Flush. For example, I recently <a href="https://twitter.com/moyix/status/1050402642067972096" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">complained on Twitter</a> that it's currently hard to tell what's causing my Mac's fan to spin. I have a detector (fan is on); if I can systematically freeze and unfreeze processes on the system, I can convert that to something that tells me which process is responsible. It's also very similar to the well-known ideas of <a href="https://git-scm.com/docs/git-bisect" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">code bisection</a> and (my favorite name) <a href="https://dl.acm.org/citation.cfm?id=358695" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">Wolf Fence Debugging</a>. So if this defense is ever implemented I hope it becomes known as the wolf fence defense :)</div>
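<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
For the "freeze and unfreeze" part, you don't necessarily need scheduler control: on a POSIX system, one userspace approximation is to send SIGSTOP and SIGCONT. The sketch below is an assumption about how one might do that, not something we built; the <code>frozen</code> helper and <code>demo</code> function are hypothetical names, and it only works on processes you have permission to signal.</div>

```python
import os
import signal
import subprocess
import sys
import time
from contextlib import contextmanager
from typing import Iterable, Iterator


@contextmanager
def frozen(pids: Iterable[int]) -> Iterator[None]:
    """Pause the given processes with SIGSTOP; resume them with SIGCONT on exit."""
    stopped = []
    try:
        for pid in pids:
            try:
                os.kill(pid, signal.SIGSTOP)
                stopped.append(pid)
            except ProcessLookupError:
                pass  # the process exited before we could pause it
        yield
    finally:
        # Always resume whatever we managed to stop, even if sampling failed.
        for pid in stopped:
            try:
                os.kill(pid, signal.SIGCONT)
            except ProcessLookupError:
                pass


def demo() -> bool:
    """Freeze a throwaway child briefly and check that it survives being resumed."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    try:
        with frozen([child.pid]):
            time.sleep(0.1)  # a real detector (fan on? attack seen?) goes here
        return child.poll() is None  # still running after SIGCONT
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(demo())
```

<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
The context manager is there so that processes get resumed even if the detector raises; freezing half of a desktop session and then crashing would be a bad day.</div>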
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
This project failed primarily due to personnel issues: a couple different students took it on as a semester or internship project, but didn't manage to finish it up before they graduated or moved on to other things. It also turned out to be more difficult than expected to reliably carry out Flush+Flush attacks; at the time we started the project, a lot less was known about the Intel microarchitecture, and we were tripped up by features like <a href="https://software.intel.com/en-us/articles/disclosure-of-hw-prefetcher-control-on-some-intel-processors" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">hardware prefetching</a>. We were also hampered by the lack of widely available implementations of hardware performance counter-based detections, and our own implementation was prone to false positives (at least in this we are not alone: a recent SoK paper has pointed out that <a href="https://oaklandsok.github.io/papers/das2019.pdf" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">many of these HPCs are not very reliable for security purposes</a>).</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
I think this could still make a nice little paper, particularly if some of the more complex scenarios and advanced attackers were taken into account. When I discussed this with a more theoretically minded friend, he thought there might be connections to <a href="https://en.wikipedia.org/wiki/Compressed_sensing" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">compressed sensing</a>; meanwhile, a game theoretically oriented colleague I spoke to thought it sounded like an instance of a <a href="https://en.wikipedia.org/wiki/Multi-armed_bandit" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">Multi-Armed Bandit problem</a>. So there's still plenty of meat to this problem for an interested researcher.</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
<h3 style="text-align: left;">
Reverse Engineering Currency Detection Code</h3>
</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
It is by now well-known that a wide variety of software and hardware, from cameras to photo editing software such as Photoshop, implement <a href="https://www.blogger.com/u/1/blogger.g?blogID=6787362638788314904" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">currency detection</a> in order to help combat counterfeiting. But surprisingly little is known about these algorithms: Steven Murdoch did some research on the topic using <a href="https://murdoch.is/projects/currency/" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">a mix of black-box techniques and reverse engineering all the way back in 2004</a>, but as far as I know never published a paper on the topic.</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
Reverse engineering the precise detection technique could have a lot of benefits. First, as a matter of principle, I think defensive techniques <i style="margin: 0px; padding: 0px;">must</i> be attacked if we are to rely on them; the fact that this has been used in the wild for more than 15 years across a wide range of devices without any rigorous public analysis is really surprising to me. Second, there is a fun application if we can pin down precisely what makes the currency detector fire: we could create something that placed a currency watermark on arbitrary documents, making them impossible to scan or edit on devices that implement the currency detector! We could even, perhaps, imagine making T-shirts that trigger the detector when photographed :) I believe this idea has been floated before with the <a href="https://en.wikipedia.org/wiki/EURion_constellation" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">EURion constellation</a>, but based on Murdoch's research we know that the EURion is not the only feature used (and it may not be used at all by modern detectors).</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
Our technical approach to this problem was to use dynamic taint analysis and measure the amount of computation performed by Photoshop in each pixel of the input image. I now think this was a mistake. First, many image analyses compute <i style="margin: 0px; padding: 0px;">global </i>features (such as color histograms) over the whole image; taint analysis on these will only tell you that every pixel contributes to the decision [1]. A more profitable approach might be something like <a href="http://bitblaze.cs.berkeley.edu/papers/diffslicing_oakland11.pdf" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">differential slicing</a>, which pins down the causal differences between two closely related execution traces. This would hopefully help isolate the code responsible for the detection for more manual analysis.</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
As with the Flush+Flush detection above, this project could still be viable if approached by someone with strength in binary analysis and some knowledge of image processing and watermarking algorithms.</div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
<br /></div>
<div style="-webkit-text-stroke-width: 0px; color: black; font-family: helvetica, arial, freesans, clean, sans-serif; font-size: 13.34px; font-style: normal; font-variant-caps: normal; font-variant-ligatures: normal; font-weight: 400; letter-spacing: normal; line-height: 1.5em; margin: 1em 0px; orphans: 2; padding: 0px; text-align: start; text-decoration-color: initial; text-decoration-style: initial; text-indent: 0px; text-transform: none; white-space: normal; widows: 2; word-spacing: 0px;">
<hr />
1. This also made me start thinking about variants of taint analysis that would be better suited to image analyses. One possibility is something based on quantitative information flow; <a href="http://groups.csail.mit.edu/pag/pubs/secret-max-flow-pldi2008.pdf" style="color: #4183c4; margin: 0px; padding: 0px; text-decoration: none;">Steven McCamant et al.</a> had some success at using this to analyze the security of image redactions. Even with traditional dynamic taint analysis, though, I think it might be possible to try tainting the results of intermediate stages of the computation (e.g., in the histogram example, one could try tainting each bucket and seeing how it contributed to the eventual decision, or for an FFT one could apply the taint analysis after the transformation into the frequency domain).</div>
</div>
<hr />
<b>Of Bugs and Baselines</b> (Brendan Dolan-Gavitt, published 2018-03-19, updated 2018-03-20)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="text-align: left;">
<i><b>Summary</b>: recently published results on the LAVA-M synthetic bug dataset are exciting. However, I show that much simpler techniques can also do startlingly well on this dataset; we need to be cautious in our evaluations and not rely too much on getting a high score on a single benchmark.</i><br />
<i><br /></i></div>
<h3 style="text-align: left;">
A New Record</h3>
<br />
The <a href="http://moyix.blogspot.com/2016/10/the-lava-synthetic-bug-corpora.html">LAVA synthetic bug corpora</a> have been available now for about a year and a half. I've been really excited to see new bug-finding approaches (particularly fuzzers) use the LAVA-M dataset as a benchmark, and to watch as performance on that dataset steadily improved. Here's how things have progressed over time.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9VQTDzco7PXKrKwPaqXTn2B84c0iBS28ZsUEv25HWAQuTTriRHezaU2ljS81oVUVrlt3c2tNJyeQhejubL_Sy2enJV31tmfAtnxcSZz9QI8PsWox0MORTLk1ZF6cq6F4SRBaI1b0cuseQ/s1600/lava_over_time.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="600" data-original-width="800" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9VQTDzco7PXKrKwPaqXTn2B84c0iBS28ZsUEv25HWAQuTTriRHezaU2ljS81oVUVrlt3c2tNJyeQhejubL_Sy2enJV31tmfAtnxcSZz9QI8PsWox0MORTLk1ZF6cq6F4SRBaI1b0cuseQ/s400/lava_over_time.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Performance on the LAVA-M dataset over time. Note that because the different utilities have differing numbers of bugs, this picture presents a slightly skewed view of how successful each approach was by normalizing the performance on each utility. Also, SBF was only evaluated on base64 (where it did very well), and Vuzzer's performance on md5sum is due largely to a problem in the LAVA-M dataset.</td></tr>
</tbody></table>
You may notice that <a href="https://arxiv.org/abs/1803.01307">Angora (Chen and Chen, to appear at Oakland ’18)</a>, on the far right, has come within striking distance of perfect performance on the dataset! This is a great result, and the authors pulled it off by combining several really neat techniques (I mention a few of them in <a href="https://twitter.com/moyix/status/974005431801667585">this Twitter thread</a>). They also managed to do it in just <b>10 minutes</b> for base64, md5sum, and uniq, and <b>45 minutes</b> for who. Kudos to the authors!<br />
<br />
My feeling is that the effective “shelf life” of this dataset is pretty much up – current techniques are very close to being able to cover everything in this dataset. This is perhaps not too surprising, since the coreutils are quite small, and the trigger mechanism used by the original LAVA system (a comparison against a 4-byte magic number) is a bit too simple, since you can often solve it by <a href="http://moyix.blogspot.com/2016/07/fuzzing-with-afl-is-an-art.html">extracting constants from the binary</a>. Luckily, we have been working on new bug injection techniques and new corpora, and we will hopefully have some results to announce soon – watch this space :)<br />
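<br />
To make the constant-extraction trick concrete, here is a rough sketch of turning disassembly text into AFL dictionary entries. It is not the script used for the experiments in this post: the regular expression assumes AT&amp;T-syntax objdump output (immediates written as $0x...), and the function names and the entry-naming scheme are made up for illustration.<br />

```python
import re
import struct

# Immediates in AT&T-syntax disassembly look like "$0x6c6175de".
IMMEDIATE_RE = re.compile(r"\$0x([0-9a-fA-F]+)")


def constants_from_disassembly(text):
    """Collect immediates that need more than 16 bits but fit in 32 bits."""
    values = set()
    for match in IMMEDIATE_RE.finditer(text):
        value = int(match.group(1), 16)
        if 0xFFFF < value <= 0xFFFFFFFF:
            values.add(value)
    return sorted(values)


def dictionary_lines(values):
    """Format each constant as AFL dictionary entries, in both byte orders."""
    lines = []
    for value in values:
        for label, fmt in (("le", "<I"), ("be", ">I")):
            escaped = "".join(f"\\x{b:02x}" for b in struct.pack(fmt, value))
            lines.append(f'const_{value:08x}_{label}="{escaped}"')
    return lines


sample = "cmp    $0x6c6175de,%eax\nmov    $0x1,%ebx\n"
for line in dictionary_lines(constants_from_disassembly(sample)):
    print(line)
```

In practice the disassembly text would come from running objdump on the target binary, and the resulting file would be passed to afl-fuzz with its -x option.<br />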
<br />
<h3 style="text-align: left;">
Covering your Baseline</h3>
<div>
<br />
When talking about progress in bug-finding, it is important to figure out a good baseline. One such baseline, of course, is the set of results we provided in the original paper (the first two bars on the graph). However, during the summer following the paper's publication, I pointed out <a href="http://moyix.blogspot.com/2016/07/fuzzing-with-afl-is-an-art.html">two fairly simple ways to strengthen the baseline for fuzzing</a>: using a constant dictionary and using a compiler pass to split up integer comparisons. The former is built into AFL, and the latter was implemented in an AFL fork by <a href="https://lafintel.wordpress.com/2016/08/15/circumventing-fuzzing-roadblocks-with-compiler-transformations/">laf-intel</a>.</div>
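<div>
Why does splitting comparisons help? A toy model, which is not laf-intel's actual transformation: treat the fuzzer as a hill climber that keeps a mutated input only when it reaches a branch it has not seen before. A single 4-byte comparison offers one all-or-nothing branch, while nested one-byte comparisons reward each correct prefix byte. The magic value, feedback functions, and search loop below are all illustrative assumptions.</div>

```python
import random

MAGIC = bytes.fromhex("6c6175de")  # example 4-byte trigger value


def whole_compare_feedback(guess):
    """One branch: nothing new is observed unless all four bytes match."""
    return 1 if guess == MAGIC else 0


def split_compare_feedback(guess):
    """Nested byte comparisons: each matching prefix byte reaches a new branch."""
    depth = 0
    for got, want in zip(guess, MAGIC):
        if got != want:
            break
        depth += 1
    return depth


def guided_search(feedback, max_tries, seed=0):
    """Mutate one byte at a time, keeping mutants that increase the feedback."""
    rng = random.Random(seed)
    best = bytes(4)
    best_score = feedback(best)
    for tries in range(1, max_tries + 1):
        candidate = bytearray(best)
        candidate[rng.randrange(4)] = rng.randrange(256)
        score = feedback(bytes(candidate))
        if score > best_score:
            best, best_score = bytes(candidate), score
        if best == MAGIC:
            return tries
    return None  # gave up without finding the magic value


print("split compare:", guided_search(split_compare_feedback, 200_000))
print("whole compare:", guided_search(whole_compare_feedback, 10_000))
```

<div>
In this model the whole-value comparison can never succeed, because a single-byte mutation of the all-zero starting input cannot match four nonzero bytes at once and no intermediate progress is ever rewarded. Real fuzzers have more mutation operators than this, but the feedback problem is the same.</div>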
<div>
<br /></div>
<div>
What does our baseline look like when we use these techniques? To test this, I ran fuzzers on the LAVA-M dataset for 1 hour per program. This is a very short amount of time, but the intent here is to see what's possible with minimal effort and simple techniques. For the dictionary, I used a <a href="https://gist.github.com/moyix/c042090d9beb6b1a7cb39f6162cd6128#file-make_testcases-sh">simple script</a> that extracts strings and also runs objdump to extract integer constants. I tested:<br />
<br />
<ul style="text-align: left;">
<li>AFL with a dictionary</li>
<li>laf-intel with AFL's "fidgety" mode (the -d option) and LAF_SPLIT_COMPARES=1</li>
<li>The same laf-intel configuration, but with a 256KiB map size rather than the default 64KiB.</li>
</ul>
The laf-intel configurations were suggested by <a href="http://www.carolemieux.com/">Caroline Lemieux</a> and relayed to me by <a href="https://twitter.com/ElectronicKiwi/status/974426690045620224">Kevin Laeufer</a>, who also pointed out that <a href="https://twitter.com/ElectronicKiwi/status/974464933973786624">one really ought to be doing multiple runs of each fuzzer</a> since the instrumentation as well as some fuzzing stages are nondeterministic – advice I have, alas, ignored for now. The larger map size helps reduce hash collisions with laf-intel – splitting up comparisons adds a large number of new branches to the program, so fuzzing performance may suffer with too small a map.<br />
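<br />
A back-of-the-envelope way to see why map size matters: AFL's coverage map has one byte per slot, so a 64KiB map has 65,536 slots. If we assume (as a simplification) that each edge lands in a uniformly random slot, the expected number of edges that end up sharing a slot can be estimated as below. The edge count of 20,000 is a made-up figure for illustration, not a measurement of any LAVA-M program.<br />

```python
def expected_colliding_edges(num_edges, map_slots):
    """Expected number of edges that land in an already-occupied slot,
    assuming each edge hashes to a uniformly random slot."""
    expected_occupied = map_slots * (1 - (1 - 1 / map_slots) ** num_edges)
    return num_edges - expected_occupied


hypothetical_edges = 20_000  # illustrative only
for kib in (64, 256):
    slots = kib * 1024
    collisions = expected_colliding_edges(hypothetical_edges, slots)
    print(f"{kib}KiB map: about {collisions:.0f} colliding edges")
```

Quadrupling the map size cuts the expected collisions by roughly a factor of four in this regime, at the cost of a larger map to scan after every execution.<br />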
<br />
Here's what our new baseline results look like when put next to published work.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieW2wdnERXL6592TWWsJ_zhIl4VJWMhx1NDXufeNHwGM4ox7JV8JSPON5KBBsRV2vvwA_5hy0HVhVkJe3vMXATsvi-6ErR6JxgBtHEdRigQPDlRyWjXcHerEzzzdfQy43hUE8r0sLWjKF0/s1600/lava_m_withbase.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="600" data-original-width="800" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieW2wdnERXL6592TWWsJ_zhIl4VJWMhx1NDXufeNHwGM4ox7JV8JSPON5KBBsRV2vvwA_5hy0HVhVkJe3vMXATsvi-6ErR6JxgBtHEdRigQPDlRyWjXcHerEzzzdfQy43hUE8r0sLWjKF0/s400/lava_m_withbase.png" width="400" /></a></div>
<br />
In this light, published tools are still much stronger than our original weak baseline, but (aside from Angora) are not better than AFL with a dictionary. I suspect that this says more about deficiencies in our dataset than about the tools themselves (more on this later).<br />
<br /></div>
<h3 style="text-align: left;">
Who's Who</h3>
<div>
<br /></div>
<div>
Because the who program has so many bugs, and published tools seem to have had some trouble with it, I also wanted to see how different techniques would fare when given 24 hours to chew on it. I should note here, for clarity, that I do not think 2000+ bugs in who is a <i>reasonable</i> benchmark by which to judge bug-finding tools. You should not use the results of this test to decide what you will use to fuzz your software. But it does help us characterize what works well on LAVA-style bugs in a very simple, small program that takes structured binary input.</div>
<div>
<br /></div>
<div>
The contestants this time:</div>
<div>
<ul style="text-align: left;">
<li>AFL, default mode, with dictionary</li>
<li>AFL, fidgety mode (-d option), with dictionary</li>
<li><a href="https://github.com/mboehme/aflfast">AFLFast</a>, with dictionary</li>
<li>AFL, fidgety mode, with the laf-intel pass, with all four combinations of {dict,nodict} x {64KiB,256KiB} map.</li>
</ul>
You can see the results below; I've included Angora in the chart since it's the best-performing of the non-baseline approaches. Note, though, that Angora only ran for 45 minutes – it's possible it would have done better if allowed to run for 24 hours.</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiePvYyo6SfJhX9TlPou-bujMkBg3gPoH-cP12VnD8303nAN6OynEtm2hotqSPZxEp_gGyDOx4YRoVp9e4U-uKuGJhb2uoUIuksnIdr1VplyEYFkp4Zsq-VdUyR7xhim_G8Qr1gDuwSNyq0/s1600/who_results.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="600" data-original-width="800" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiePvYyo6SfJhX9TlPou-bujMkBg3gPoH-cP12VnD8303nAN6OynEtm2hotqSPZxEp_gGyDOx4YRoVp9e4U-uKuGJhb2uoUIuksnIdr1VplyEYFkp4Zsq-VdUyR7xhim_G8Qr1gDuwSNyq0/s400/who_results.png" width="400" /></a></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
The top of the chart, at 2136, represents the total number of (validated) bugs in who. We can see that the <i>combination</i> of our techniques comes close to finding all of the bugs (93.7%), and any variant of AFL that was able to use a dictionary does very well. The only exception to this trend is AFL with default settings – this is because by default, AFL starts with <i>deterministic</i> fuzzing. With a large dictionary, this can take an inordinately long time: we noticed that after 24 hours, default AFL still hadn't finished a full cycle, and was only 12% of the way through the stage. Fidgety mode (the -d option) bypasses deterministic fuzzing and does much better here.<br />
<br /></div>
<h3 style="text-align: left;">
Notes on Seed Selection</h3>
<br />
When we put together the LAVA-M benchmark, one thing we thought would be important is to include the seed file we used for our experiments. The performance of mutation-based fuzzers like AFL can vary significantly depending on the quality of their input corpus, so to reproduce our results you really need to know what seed we used. Sadly, I almost never see papers on fuzzing actually talk about their seeds. (Perhaps they usually start off with no seeds? But this is an unrealistic worst-case for AFL.)<br />
<br />
In the experiments above, I used the corpus-provided seeds for all programs <b>except</b> md5sum. In the case of md5sum, the program is run with "md5sum -c", which causes it to go compute and check the MD5 of each file listed in the input file. The seed we used in the LAVA-M corpus lists 20 files, which slows down md5sum significantly as it has to compute 20 MD5 sums. Replacing the seed with one that only lists a single, small file allows AFL to achieve more than 10 times as many executions per second. In retrospect, our seed choice when running the experiments for the paper was definitely sub-optimal.<br />
<br />
<h3 style="text-align: left;">
Conclusion</h3>
<div>
<br />
The upshot? We can almost completely solve the LAVA-M dataset using stock AFL with a dictionary. The only utility that this doesn't work for is "who"; however, a few simple tricks (a dictionary, constant splitting, and skipping the deterministic phase of AFL with "-d") and 24 hours suffice to get us to very high coverage (1731/2136 bugs, or 81%). So although the results of fuzzers like Angora are exciting, we should be cautious not to read too much into their performance on LAVA-M and instead also ask how they perform at finding important bugs in programs we care about. (John Regehr points out that the most important question to ask about a bug-finding paper is “<a href="https://twitter.com/johnregehr/status/974100798467538944">Did you report the bugs, and what did they say?</a>”)<br />
<br />
To be honest, I think this is not a great result for the LAVA-M benchmark. If a fairly simple baseline can find most of its bugs, then we have miscalibrated its difficulty: ideally we want our bug datasets to be a bit beyond what the best techniques can do today. This also confirms our intuition that it's time to create a new, harder dataset of bugs!<br />
<br />
<h3 style="text-align: left;">
Acknowledgements</h3>
<br />
Thanks to Caroline Lemieux and Kevin Laeufer for feedback on a draft of this post, which fixed several mistakes and provided several interesting insights (naturally, if you see any more errors – that's my fault). Additional thanks are due to Kevin Laeufer (again), Alex Gantman, John Regehr, Kristopher Micinski, Sam Tobin-Hochstadt, Michal Zalewski (aka lcamtuf), and Valentin Manès (aka Jiliac) for a lively Twitter argument that turned into some actual useful science.</div>
</div>
<hr />
<b>NYC Area Security Folks – Come to SOS!</b> (Brendan Dolan-Gavitt, published 2016-10-20, updated 2016-11-07)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
Every year the NYU School of Engineering hosts Cyber Security Awareness Week (CSAW) – the largest student-run security event in the country. This year, we're trying something new that combines two of my favorite things: <b>security </b>and <b>open source</b>.<br />
<br />
The inaugural <a href="https://csaw.engineering.nyu.edu/events/security-open-source-workshop">Security: Open Source (SOS) workshop</a>, held this <b>November 10 at NYU Tandon</b>, will feature the creators of some really cool new security tools talking about their projects. It's happening the day before one of the best CTF challenges out there, so we're expecting an audience that's not afraid of technical detail :)<br />
<br />
What will you hear about at SOS? Here are some of the cool speakers and topics:<br />
<br />
<ul style="text-align: left;">
<li><b>Félix Cloutier</b> will tell us about his open-source decompiler, <a href="https://zneak.github.io/fcd/">fcd</a>. This is a great example of incorporating cutting edge academic research into an open-source tool that anyone can use. Félix is also a former CSAW CTF competitor.</li>
<li><b>Mike Arpaia, </b>co-founder of <a href="https://www.kolide.co/">Kolide</a>, will talk about <a href="https://osquery.io/">osquery</a>, a new open-source operating system instrumentation framework and toolset he created while at Facebook. Mike will talk about his experience managing an open-source security project and how to make it successful.</li>
<li><b>Patrick Hulin</b> from <a href="https://www.ll.mit.edu/">MIT Lincoln Laboratory</a> will talk about a new differential debugging technique he's devised. Patrick is one of the lead developers on <a href="https://github.com/moyix/panda">PANDA</a>, and he'll talk about how he used another great open-source tool, <a href="http://rr-project.org/">Mozilla rr</a>, to <i>automatically</i> do root-cause debugging on devilishly tricky record/replay bugs.</li>
<li><b>Jamie Levy</b>, one of the core developers on the <a href="http://www.volatilityfoundation.org/">Volatility memory forensics framework</a>, will talk about taking memory forensics to the next level. Jamie is one of the most talented forensic investigators and developers I know and this should be a great talk!</li>
<li><b>Jonathan Salwan</b> and <b>Romain Thomas</b> from <a href="http://www.quarkslab.com/">Quarkslab</a> will present a deep dive on <a href="http://triton.quarkslab.com/">Triton</a>, their exciting binary analysis platform that combines symbolic execution and dynamic taint analysis, and demonstrate how it can be used to defeat virtualization-based obfuscation techniques.</li>
<li><b>Ryan Stortz</b> from <a href="https://www.trailofbits.com/">Trail of Bits</a> will talk about how they took the <a href="https://www.cybergrandchallenge.com/">DARPA Cyber Grand Challenge</a> test programs and <a href="https://blog.trailofbits.com/2016/08/01/your-tool-works-better-than-mine-prove-it/">ported them to run on OS X and Linux</a>. This opens up some really cool possibilities for using them to evaluate the effectiveness of different security tools!</li>
<li><b>Andrew Dutcher</b> of <a href="http://www.cs.ucsb.edu/~yans/">UCSB</a> will talk about <a href="http://angr.io/">angr</a>, their Python-based binary analysis platform that aims to bring together tons of state-of-the-art analyses under one roof. They've recently used it to get <a href="https://www.cs.ucsb.edu/news/2190">third place in the DARPA Cyber Grand Challenge</a>, and it's become a popular tool for CTF players around the world.</li>
</ul>
SOS will take place in the Pfizer Auditorium at the NYU Tandon School of Engineering in Brooklyn from 10:30am-5:30pm on November 10, the day before the CSAW CTF.<br />
<div>
<br /></div>
<div>
So what are you waiting for? Go <a href="https://csaw.engineering.nyu.edu/events/security-open-source-workshop/register-sos">register</a>!</div>
</div>
<hr />
<b>The LAVA Synthetic Bug Corpora</b> (Brendan Dolan-Gavitt, published 2016-10-08)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I'm planning a longer post discussing how we evaluated the LAVA bug injection system, but since we've gotten approval to release the test corpora I wanted to make them available right away.<br />
<br />
The corpora described in the paper, LAVA-1 and LAVA-M, can be downloaded here:<br />
<br />
<a href="http://panda.moyix.net/~moyix/lava_corpus.tar.xz">http://panda.moyix.net/~moyix/lava_corpus.tar.xz</a> (101M)<br />
<br />
Quoting from the included README:<br />
<br />
<blockquote class="tr_bq">
<span class="s1">This distribution contains the automatically generated bug corpora used </span>in the paper, "LAVA: Large-scale Automated Vulnerability Addition".
<br />
<br />
<span class="s1">LAVA-1 is a corpus consisting of 69 versions of the "file" utility, each </span>of which has had a single bug injected into it. Each bug is a named branch in a git repository. The triggering input can be found in the file named CRASH_INPUT. To run the validation, you can use validate.sh, which builds each buggy version of file and evaluates it on the corresponding triggering input.
<br />
<br />
<span class="s1">LAVA-M is a corpus consisting of four GNU coreutils programs (base64, </span>md5sum, uniq, and who), each of which has had a large number of bugs added. Each injected, validated bug is listed in the validated_bugs file, and the corresponding triggering inputs can be found in the inputs subdirectory. To run the validation, you can use the validate.sh script, which builds the buggy utility and evaluates it on triggering and non-triggering inputs.
<br />
<br />
<span class="s1">For both corpora, the "backtraces" subdirectory contains the output of </span>gdb's backtrace command for each bug.</blockquote>
<div class="p1">
<br /></div>
<div class="p1">
Enjoy!</div>
</div>
<hr />
<b>Fuzzing with AFL is an Art</b> (Brendan Dolan-Gavitt, published 2016-07-21)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<i>Using one of the test cases from the <a href="http://moyix.blogspot.com/2016/07/the-mechanics-of-bug-injection-with-lava.html">previous post</a>, I examine what affects AFL's ability to find a bug placed by LAVA in a program. Along the way, I found what's probably a harmless bug in AFL, and some interesting factors that affect its performance. Although its interface is admirably simple, AFL can still require some tuning, and unexpected things can determine its success or failure on a bug.</i><br />
<br />
<a href="http://lcamtuf.coredump.cx/afl/">American Fuzzy Lop</a>, or AFL for short, is a powerful coverage-guided fuzzer developed by <a href="http://lcamtuf.coredump.cx/">Michal Zalewski (lcamtuf)</a> at Google. Since its release in 2013, it has racked up an impressive set of trophies in the form of <a href="http://lcamtuf.coredump.cx/afl/#bugs">security vulnerabilities in high-profile software</a>. Given its phenomenal success on real world programs, I was curious to explore in detail how it worked on an automatically generated bug.<br />
<br />
I started off with the <a href="https://gist.github.com/moyix/f27f6d8cccd35d561d9379656090ddf6">toy program</a> we looked at in the previous post, with a single bug added. The bug added by LAVA will trigger whenever the first four bytes of a float-type <span style="font-family: "courier new" , "courier" , monospace;">file_entry</span> are set to <span style="font-family: "courier new" , "courier" , monospace;">0x6c6175de</span> or <span style="font-family: "courier new" , "courier" , monospace;">0xde75616c</span>, and will cause <span style="font-family: "courier new" , "courier" , monospace;">printf</span> to be called with an invalid format string, crashing the program.<br />
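<br />
The two trigger values are simply the same four bytes in opposite byte orders, so the bug fires regardless of which endianness the attacker (or fuzzer) happens to guess. A small sketch, assuming for illustration that the program reads the field as a little-endian 32-bit integer; the triggers function is a stand-in for the injected check, not code from the toy program:<br />

```python
import struct

MAGIC = 0x6C6175DE
SWAPPED = 0xDE75616C

little_endian_bytes = struct.pack("<I", MAGIC)  # de 75 61 6c
big_endian_bytes = struct.pack(">I", MAGIC)     # 6c 61 75 de


def triggers(first_four_bytes):
    """Stand-in for the injected check: compare against both byte orders."""
    value = struct.unpack("<I", first_four_bytes)[0]
    return value in (MAGIC, SWAPPED)


print(little_endian_bytes.hex(), triggers(little_endian_bytes))
print(big_endian_bytes.hex(), triggers(big_endian_bytes))
```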
<br />
After verifying that the bug could be triggered reliably, I compiled it with <span style="font-family: "courier new" , "courier" , monospace;">afl-gcc</span> and started a fuzzing run. To get things started, I used a well-formed input file for the program that contained both int and float <span style="font-family: Courier New, Courier, monospace;">file_entry</span> types:<br />
<br />
<script src="https://gist.github.com/moyix/89960deb69e823dfbfcee5009a914ac9.js"></script><br />
Because I'm lucky enough to have a 24 core server sitting around, I gave it 24 cores (one using -M and the rest using -S) and let it run for about 4 and a half days, fully expecting that it would find the input in that time.<br />
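<br />
For anyone who wants to replicate the parallel setup, the command lines look roughly like what this sketch generates: one master instance (-M) and the rest secondaries (-S), all sharing one output directory. The target path, directory names, and the assumption that the program takes its input file as an argument (@@) are placeholders rather than my exact invocation.<br />

```python
def afl_parallel_commands(cores, target="./toy", in_dir="in", out_dir="sync"):
    """Build afl-fuzz command lines for one -M instance and cores-1 -S instances."""
    commands = []
    for i in range(cores):
        role = "-M" if i == 0 else "-S"
        commands.append(
            ["afl-fuzz", "-i", in_dir, "-o", out_dir, role, f"fuzzer{i:02d}", target, "@@"]
        )
    return commands


for command in afl_parallel_commands(24):
    print(" ".join(command))
```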
<br />
This did not turn out so well.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRrC_f1_G8fF84zyl8tXg3hv2Z_dv8Ew0a5o6jHPMO3qUqSidoIYsOH1rPIWoRTvWlK0UvRYO7AuPdlGtOkNRfpGW2UWFm9iABe0XVkjtNmYfHsEkD6ZjXjjtzXu8bejQyAo7r-tT_ivhf/s1600/Screenshot+2016-07-16+12.06.06.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="290" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRrC_f1_G8fF84zyl8tXg3hv2Z_dv8Ew0a5o6jHPMO3qUqSidoIYsOH1rPIWoRTvWlK0UvRYO7AuPdlGtOkNRfpGW2UWFm9iABe0XVkjtNmYfHsEkD6ZjXjjtzXu8bejQyAo7r-tT_ivhf/s400/Screenshot+2016-07-16+12.06.06.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Around 20 billion executions later, AFL had found zilch.</td></tr>
</tbody></table>
<br />
At this point, I turned to Twitter, where John Regehr suggested that I look into what coverage AFL was achieving. I realized that I actually had no idea how AFL's instrumentation worked, and that this would be a great opportunity to find out.<br />
<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
<a href="https://twitter.com/moyix">@moyix</a> you should look at the actual coverage achieved by afl's test cases and see what is going on</div>
— John Regehr (@johnregehr) <a href="https://twitter.com/johnregehr/status/754351267070087168">July 16, 2016</a></blockquote>
<script async="" charset="utf-8" src="//platform.twitter.com/widgets.js"></script>
<br />
<h2 style="text-align: left;">
Diving Into AFL's Instrumentation</h2>
<div>
<br /></div>
<div>
The basic afl-gcc and afl-clang tools are actually very simple. They wrap gcc and clang, respectively, and modify the compile process to emit an intermediate assembly code file (using the -S option). Finally, they do some simple string matching (in C, ew) to find out where to add in calls to AFL's coverage logging functions. You can get AFL to save the assembly code it generates using the AFL_KEEP_ASSEMBLY environment variable, and see exactly what it's doing. (There's actually also a newer way of getting instrumentation that was added recently using an LLVM pass; more on this later.)</div>
<div>
<br /></div>
<table style="text-align: center;"><tbody>
<tr><td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-JwP16YW5OZc9gU1Kuvg51loabHDk1ee2sOGwP68SqN7iPSlEXPOlpIG5cqk6ki5t-cU_Jr7YoLdsSdouVgrYU3Ak9YbTwuOv0_BQ1GrFIjNzIkffqYn6dPQMnPXrQtcsLq0NDV26yDpV/s1600/Screenshot+2016-07-17+19.09.23.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="92" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-JwP16YW5OZc9gU1Kuvg51loabHDk1ee2sOGwP68SqN7iPSlEXPOlpIG5cqk6ki5t-cU_Jr7YoLdsSdouVgrYU3Ak9YbTwuOv0_BQ1GrFIjNzIkffqYn6dPQMnPXrQtcsLq0NDV26yDpV/s200/Screenshot+2016-07-17+19.09.23.png" width="200" /></a></div>
<div style="text-align: center;">
<br /></div>
</td><td><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgC6yQDrhz2TFt8iSSv8ageXFOrMb3ICLlICikjfNWK6LjA7KGvh176_pw7-_i9e2CGsZKh9J6HvkISVQs6Pd3RQhbLJjsnJqwU9fdf2VoJj0AaWW7MmrTWpYl4Sy3E6X9fnduy9r-jep4-/s1600/Screenshot+2016-07-17+19.11.22.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgC6yQDrhz2TFt8iSSv8ageXFOrMb3ICLlICikjfNWK6LjA7KGvh176_pw7-_i9e2CGsZKh9J6HvkISVQs6Pd3RQhbLJjsnJqwU9fdf2VoJj0AaWW7MmrTWpYl4Sy3E6X9fnduy9r-jep4-/s320/Screenshot+2016-07-17+19.11.22.png" width="297" /></a></div>
<div style="text-align: center;">
<br /></div>
</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<span style="font-size: x-small;">Left, the original assembly code. Right, the same code after AFL's instrumentation has been added.</span></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div>
After looking at the generated assembly, I noticed that the code corresponding to the buggy branch of the if statement wasn't getting instrumented. This seemed like it could be a problem, since AFL can't try to use coverage to reach a part of the program if there's no logging to tell it that an input has caused it to reach that point.<br />
<br />
Looking into the source code of <span style="font-family: Courier New, Courier, monospace;">afl-as</span>, the program that instruments the assembly code, I noticed a curious bit of code:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEio3Ki9E5uPpJ_RH0epjaDiaETUSPCCoOWzkruLLdyYID2nwrKh7RutI8NX-7bNmoxkaGyWfbQDwd_EIRLZXTAj9dxq6DUM_fjyHSkNjsF4n_DUdMO03n4vNil7o5C1NtMnRNn_0Ra3G3D1/s1600/Screenshot+2016-07-16+15.02.04.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="100" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEio3Ki9E5uPpJ_RH0epjaDiaETUSPCCoOWzkruLLdyYID2nwrKh7RutI8NX-7bNmoxkaGyWfbQDwd_EIRLZXTAj9dxq6DUM_fjyHSkNjsF4n_DUdMO03n4vNil7o5C1NtMnRNn_0Ra3G3D1/s400/Screenshot+2016-07-16+15.02.04.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">AFL skips labels following p2align directives in the assembly code.</td></tr>
</tbody></table>
<br />
According to the comment, this should only affect programs compiled under OpenBSD. However, the branch I wanted instrumented was being affected by this even though I was running under Linux, not OpenBSD, and there were no jump tables present in the program.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPalZ1SZ1uEewJ4xXAIjbWU-NzJA6FKs8zbW3KXvxclAzvpVnBfES4VRO3JtkSe4QkZ-jtBjgqLTFT1TVke6JuIPGPduUgvR0UtUitxV163mVI8YuwUCNu8HBb3us_qvb08lF1r8gVW7L_/s1600/Screenshot+2016-07-16+15.01.40.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="137" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgPalZ1SZ1uEewJ4xXAIjbWU-NzJA6FKs8zbW3KXvxclAzvpVnBfES4VRO3JtkSe4QkZ-jtBjgqLTFT1TVke6JuIPGPduUgvR0UtUitxV163mVI8YuwUCNu8HBb3us_qvb08lF1r8gVW7L_/s200/Screenshot+2016-07-16+15.01.40.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The .L18 block should be instrumented by AFL, but won't be because it's right after an alignment statement.</td></tr>
</tbody></table>
<br />
Since I'm not on OpenBSD, I just commented out this if statement. As an alternate workaround, you can also add "<span style="font-family: Courier New, Courier, monospace;">-fno-align-labels -fno-align-loops -fno-align-jumps</span>" to the compile command (at the cost of potentially slower binaries). After making the change I restarted, once again confident AFL would soon find my bug.<br />
<br />
Alas, it was not to be. Another 17 hours of fuzzing on 24 cores yielded nothing, and so I went back to the drawing board. I am still fairly sure I found a real bug in AFL, but fixing it didn't help find the bug I was interested in. (Note: it's possible that if I had waited four days again it would have found my bug. On the other hand, <a href="http://lcamtuf.coredump.cx/afl/status_screen.txt">AFL's cycle counter</a> had turned green, indicating that it thought there was little benefit in continuing to fuzz.)<br />
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6whB9jvjIv8V2LePQ5Qf1PCWwm7sgZE0-VEo4GtG66jd1YkyWDRnHb1BgFOX8sggANvX4D0MAgyqOZLNlZc787aw5NtuIKfpr-dykVqPa7-1TfrHyFkWdO-fLeMbMoU1Fn47SF9yASUU4/s1600/Screenshot+2016-07-17+12.30.19.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="129" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6whB9jvjIv8V2LePQ5Qf1PCWwm7sgZE0-VEo4GtG66jd1YkyWDRnHb1BgFOX8sggANvX4D0MAgyqOZLNlZc787aw5NtuIKfpr-dykVqPa7-1TfrHyFkWdO-fLeMbMoU1Fn47SF9yASUU4/s320/Screenshot+2016-07-17+12.30.19.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">5.2 billion executions, no crashes :(</td></tr>
</tbody></table>
<br />
<br />
<h2 style="text-align: left;">
“Unrolling” Constants</h2>
</div>
<div>
<br /></div>
<div>
Thinking about what AFL would need to do to find the bug, I realized that its chances of hitting our failing test case were pretty low. AFL will only prioritize a test case if it has seen that it leads to new coverage. In the case of our toy program, it would have to guess one of the two exact 32-bit trigger values at exactly the right place in the file, and the odds of this happening are pretty slim.<br />
<br />
At this point I remembered a post by lcamtuf that described how <a href="https://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.html">AFL managed to figure out that an XML file could contain CDATA tags</a> even though its original test cases didn't contain any examples that used CDATA. He also calls out our bug as exactly the kind of thing AFL is not designed to find:<br />
<br />
<blockquote class="tr_bq">
<span style="background-color: white; color: #111111; font-family: "trebuchet" , "trebuchet ms" , "arial" , sans-serif; font-size: 14.4px; line-height: 21.6px;">What seemed perfectly clear, though, is that the algorithm wouldn't be able to get past "atomic", large-search-space checks such as:</span><br />
<div style="background-color: white; color: #111111; font-family: Trebuchet, "Trebuchet MS", Arial, sans-serif; font-size: 14.4px; line-height: 21.6px;">
</div>
<code style="background-color: white; color: #666666;">if (strcmp(header.magic_password, "<span style="color: teal;">h4ck3d by p1gZ</span>")) goto terminate_now;</code><span style="background-color: white; color: #111111; font-family: "trebuchet" , "trebuchet ms" , "arial" , sans-serif; font-size: 14.4px; line-height: 21.6px;"></span><br />
<div style="background-color: white; color: #111111; font-family: Trebuchet, "Trebuchet MS", Arial, sans-serif; font-size: 14.4px; line-height: 21.6px;">
</div>
<span style="background-color: white; color: #111111; font-family: "trebuchet" , "trebuchet ms" , "arial" , sans-serif; font-size: 14.4px; line-height: 21.6px;">...or:</span><br />
<div style="background-color: white; color: #111111; font-family: Trebuchet, "Trebuchet MS", Arial, sans-serif; font-size: 14.4px; line-height: 21.6px;">
</div>
<code style="background-color: white; color: #666666;">if (header.magic_value == <span style="color: teal;">0x12345678</span>) goto terminate_now;</code><span style="background-color: white; color: #111111; font-family: "trebuchet" , "trebuchet ms" , "arial" , sans-serif; font-size: 14.4px; line-height: 21.6px;"></span><br />
<div style="background-color: white; color: #111111; font-family: Trebuchet, "Trebuchet MS", Arial, sans-serif; font-size: 14.4px; line-height: 21.6px;">
</div>
<span style="background-color: white; color: #111111; font-family: "trebuchet" , "trebuchet ms" , "arial" , sans-serif; font-size: 14.4px; line-height: 21.6px;"></span></blockquote>
<div>
<br /></div>
So how was AFL able to generate a CDATA tag out of thin air? It turns out that libxml2 has a set of macros that expand out some string comparisons into character-by-character comparisons that use simple if statements. This allows AFL to discover valid strings character by character, since each correct character will add new coverage, and cause further fuzzing to be done with that input.<br />
<br />
We can also apply this to our test program. Rather than checking for the fixed constant <span style="font-family: "courier new" , "courier" , monospace;">0x6c6175de</span>, we can compare each byte individually. This should allow AFL to identify the trigger value one byte at a time. The <a href="https://gist.github.com/moyix/ae51abcc3e199323d27b5669e653a8fe">new code</a> looks like this:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeyFBLC_83TuIvJlvbDvQ7SiQqrKkLADUcgKZWl385Z9J2W8od7nmQMez3hXY7tqYwJSYJekhwaAh_vfKfidoCY_kRr1h11-cT1e9uETwMd82_MJCjWSh0Kc7hdL0_hWucg4lRaRVnOJyA/s1600/Screenshot+2016-07-17+20.08.21.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="136" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjeyFBLC_83TuIvJlvbDvQ7SiQqrKkLADUcgKZWl385Z9J2W8od7nmQMez3hXY7tqYwJSYJekhwaAh_vfKfidoCY_kRr1h11-cT1e9uETwMd82_MJCjWSh0Kc7hdL0_hWucg4lRaRVnOJyA/s400/Screenshot+2016-07-17+20.08.21.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">The monolithic if statement has been replaced by 4 individual branches.</td></tr>
</tbody></table>
<br />
Once we make this change and compile with afl-gcc, AFL finds a crash in just 3 minutes on a single CPU!<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDB3JkT-n02DpfVL-CZvDA95Je6pZZn9MCqS0NRDGq9KocE0K5zYSXx2QlteVuVqVU47eO2YkuS38X9l1GJ4GdubEa7gzZL9aaULtcv0YNTggr3bLMdGEUhVZEFEIIdPiGiOhW1UBcBc8c/s1600/Screenshot+2016-07-17+20.11.57.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="245" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjDB3JkT-n02DpfVL-CZvDA95Je6pZZn9MCqS0NRDGq9KocE0K5zYSXx2QlteVuVqVU47eO2YkuS38X9l1GJ4GdubEa7gzZL9aaULtcv0YNTggr3bLMdGEUhVZEFEIIdPiGiOhW1UBcBc8c/s400/Screenshot+2016-07-17+20.11.57.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">AFL has found the bug!</td></tr>
</tbody></table>
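<br />
To see why splitting the comparison helps so much, here is a toy Python model of the feedback loop. This is not AFL itself: <span style="font-family: Courier New, Courier, monospace;">coverage_split</span>, <span style="font-family: Courier New, Courier, monospace;">coverage_monolithic</span> and <span style="font-family: Courier New, Courier, monospace;">guided_search</span> are names made up for this sketch, and "coverage" is simplified to the number of nested branches reached. The search keeps a mutation only when it reaches a branch it hasn't seen before.<br />
<br />
```python
TRIGGER = bytes.fromhex("6c6175de")  # one of the trigger values from the toy program

def coverage_split(data):
    """Four one-byte checks: coverage grows with every matching prefix byte."""
    depth = 0
    for got, want in zip(data, TRIGGER):
        if got != want:
            break
        depth += 1
    return depth

def coverage_monolithic(data):
    """One four-byte check: coverage only changes on an exact match."""
    return 4 if bytes(data[:4]) == TRIGGER else 0

def guided_search(coverage, max_execs=5000):
    """Try each value for each byte, keeping a change only if coverage grows."""
    best = bytearray(4)
    best_cov = coverage(best)
    execs = 0
    for pos in range(4):
        for value in range(256):
            if execs == max_execs or best_cov == 4:
                return best_cov == 4, execs
            candidate = bytearray(best)
            candidate[pos] = value
            execs += 1
            cov = coverage(candidate)
            if cov > best_cov:
                # New coverage: keep this input and move on to the next byte.
                best, best_cov = candidate, cov
                break
    return best_cov == 4, execs

print(guided_search(coverage_split))       # (True, 548)
print(guided_search(coverage_monolithic))  # (False, 1024)
```
<br />
With per-byte feedback the toy search needs a few hundred executions. With a single four-byte check it gets no signal at all, and changing one byte at a time never stumbles on the trigger.<br />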
<br />
This also makes me wonder if it might be worthwhile to implement a compiler pass that breaks down large integer comparisons into byte-sized chunks that AFL can deal with more easily. For string comparisons, one can already substitute in an inline implementation of strcmp/memcmp; <a href="https://github.com/mirrorer/afl/blob/master/experimental/instrumented_cmp/instrumented_cmp.c">an example is available in the AFL source</a>.<br />
<br />
<h2>
A Hidden Coverage Pitfall</h2>
<div>
<br /></div>
<div>
While investigating the coverage issues, I noticed that AFL has a new compiler: <span style="font-family: "courier new" , "courier" , monospace;">afl-clang-fast</span>. This module, contributed by <a href="http://seclab.cs.sunysb.edu/lszekeres/">László Szekeres</a>, performs instrumentation as an LLVM pass rather than by modifying the generated assembly code. As a result, it should be less brittle and allow for more instrumentation options; from what I can tell it's slated to become the default compiler for AFL at some point.</div>
<div>
<br /></div>
<div>
However, I discovered that its instrumentation is not identical to the instrumentation done by <span style="font-family: "courier new" , "courier" , monospace;">afl-as</span>. Whereas <span style="font-family: "courier new" , "courier" , monospace;">afl-as</span> instruments each x86 assembly conditional branch (that is, any of the instructions starting with "j" aside from "jmp"), <span style="font-family: "courier new" , "courier" , monospace;">afl-clang-fast</span> works at the level of LLVM basic blocks, which are closer to the blocks of code found in the original source. And since by default AFL adds -O3 to the compile command, multiple conditional checks may end up getting merged into a single basic block.</div>
<div>
<br /></div>
<div>
As a result, even though we have added multiple if statements to our source, the generated LLVM looks more like our original statement – the AFL instrumentation is only placed in the innermost if body, and so AFL is once again forced to try to guess the entire 32-bit trigger all at once.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQisBQELk5uBq5xEsBt7fUWwGaNXyq872x2-j4jGuhFwbkYzOUnsuv6TVbt4FGFAmdBcMWOTOAiZ2TJda5O0FrY30VQL5VHfLI8im0-4F0mnm0EGrzg5tqkOuUXoif8E71_QZ_NT64gvi8/s1600/Screenshot+2016-07-17+20.29.58.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="242" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQisBQELk5uBq5xEsBt7fUWwGaNXyq872x2-j4jGuhFwbkYzOUnsuv6TVbt4FGFAmdBcMWOTOAiZ2TJda5O0FrY30VQL5VHfLI8im0-4F0mnm0EGrzg5tqkOuUXoif8E71_QZ_NT64gvi8/s400/Screenshot+2016-07-17+20.29.58.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Using the LLVM instrumentation mode, AFL is no longer able to find our bug.</td></tr>
</tbody></table>
<div>
<br />
We can tell AFL not to enable the compiler optimizations, however, by setting the AFL_DONT_OPTIMIZE environment variable. If we do that and recompile with <span style="font-family: Courier New, Courier, monospace;">afl-clang-fast</span>, the if statements do not get merged, and AFL is able to find the trigger for the bug in about 7 minutes.<br />
<br />
So this is something to keep in mind when using <span style="font-family: Courier New, Courier, monospace;">afl-clang-fast</span>: the instrumentation does not work in quite the same way as the traditional <span style="font-family: Courier New, Courier, monospace;">afl-gcc</span> mode, and in some special cases you may need to use AFL_DONT_OPTIMIZE in order to get the coverage instrumentation that you want.</div>
<br />
<h2 style="text-align: left;">
Making AFL Smarter with a Dictionary</h2>
<br />
Although it's great that we were able to get AFL to generate the triggering input that reveals the bug by tweaking the program, it would be nice if we could somehow get it to find the bugs in our original programs.<br />
<br />
AFL is having trouble with our bugs because they require it to guess a 32-bit input all at once. The search space for this is pretty large: even supposing that it starts systematically flipping bits in the right part of the file, it's going to take an average of 2 billion executions to find the right value. And of course, unless it has some reason to believe that working on that part of the file will get improved coverage, it won't be focusing on the right file position, making it even less likely it will find the right input.<br />
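<br />
The 2 billion figure is just half of the 32-bit search space. Here's a quick back-of-the-envelope check in Python, assuming (as a simplification) that each execution tests exactly one candidate value at the right file offset:<br />
<br />
```python
SEARCH_SPACE = 2 ** 32          # possible 32-bit values
TRIGGERS = 2                    # 0x6c6175de or 0xde75616c

# Enumerating all values in order without repeats, a single target is
# found after (N + 1) / 2 tries on average.
expected_single = (SEARCH_SPACE + 1) / 2

# Guessing uniformly at random (with repeats), each try hits one of the
# two triggers with probability 2 / 2**32.
p_hit = TRIGGERS / SEARCH_SPACE
expected_random = 1 / p_hit

print(f"{expected_single:.3e}")   # 2.147e+09
print(f"{expected_random:.3e}")   # 2.147e+09
```
<br />
Either way you land at roughly two billion executions, and that's before accounting for AFL having to pick the right offset in the first place.<br />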
<br />
However, we can give AFL a leg up by allowing it to pick inputs that aren't completely random. One of AFL's features is that it supports using a <i>dictionary</i> of values when fuzzing. This is basically just a set of tokens that it can use when mutating a file instead of picking values at random. So one <a href="https://www.ll.mit.edu/mission/cybersec/publications/publication-files/full_papers/07-03-02_Leek_TR-1112.pdf">classic trick</a> is to take all of the constants and strings found in the program binary and add them to the dictionary. Here's a quick and dirty script that extracts the constants and strings from a binary for use with AFL:<br />
<br />
<script src="https://gist.github.com/moyix/c042090d9beb6b1a7cb39f6162cd6128.js"></script>
Once we give AFL a dictionary, it finds <b>94%</b> of our bugs (<b>149/159</b>) within 15 minutes!<br />
<br />
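As a rough illustration of the dictionary idea (this is not the script linked above, just a minimal sketch), the Python below pulls printable strings out of a blob of bytes and encodes a hand-supplied list of 32-bit constants in both byte orders, then prints them in AFL's <span style="font-family: Courier New, Courier, monospace;">name="value"</span> dictionary format. The function names, the <span style="font-family: Courier New, Courier, monospace;">token_N</span> labels and the length cap are assumptions of the sketch. In practice the constants would come from disassembling the binary rather than from a hard-coded list.<br />
<br />
```python
import re

def printable_strings(blob, min_len=4, max_len=32):
    """Runs of printable ASCII, similar in spirit to the `strings` tool."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    found = [m.group() for m in pattern.finditer(blob)]
    # max_len is an arbitrary cap for this sketch; AFL limits token length.
    return [s for s in found if max_len >= len(s)]

def constant_tokens(constants):
    """Encode each 32-bit constant in both byte orders."""
    tokens = []
    for value in constants:
        tokens.append(value.to_bytes(4, "little"))
        tokens.append(value.to_bytes(4, "big"))
    return tokens

def build_dictionary(blob, constants):
    """Return lines in AFL's name="value" dictionary format."""
    tokens = sorted(set(printable_strings(blob) + constant_tokens(constants)))
    lines = []
    for index, token in enumerate(tokens):
        # Escaping every byte as \xNN keeps the line printable and unambiguous.
        escaped = "".join("\\x%02x" % byte for byte in token)
        lines.append('token_%d="%s"' % (index, escaped))
    return lines

blob = b"\x00\x01hello world\x00\xffabc\x00"   # stand-in for a binary's bytes
for line in build_dictionary(blob, [0x6C6175DE]):
    print(line)
```
<br />
Emitting both byte orders matters here: the two trigger values in the toy program, 0x6c6175de and 0xde75616c, are byte-swapped versions of each other.<br />
<br />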
Now, does this mean that LAVA's bugs are too easy to find? At the moment, probably yes. In the real world, the triggering conditions will not always be something you can just extract with <span style="font-family: Courier New, Courier, monospace;">objdump</span> and <span style="font-family: Courier New, Courier, monospace;">strings</span>. The key improvement needed in LAVA is a wider variety of triggering mechanisms, which is something we're working on.<br />
<br />
<br />
<h2 style="text-align: left;">
Conclusion</h2>
</div>
<div>
<br /></div>
<div>
By looking in detail at a bug we already knew was there, we found out some very interesting facts about AFL:</div>
<div>
<br /></div>
<div>
<ul style="text-align: left;">
<li>Its ability to find bugs is strongly related to the quality of its coverage instrumentation, and that instrumentation can vary due both to bugs in AFL and inherent differences in the various compile-time passes AFL supports.</li>
<li>The structure of the code also heavily influences AFL's behavior: seemingly small differences (making 4 one-byte comparisons vs one 4-byte comparison) can have a huge effect.</li>
<li>Seeding AFL with even a naïve dictionary can be devastatingly effective.</li>
</ul>
</div>
<div>
<br /></div>
<div>
In the end, this is precisely what we hoped to accomplish with LAVA. By carefully examining cases where current bug-finding tools have trouble on our synthetic bugs, we can better understand how they work and figure out how to make them better at finding real bugs as well.<br />
<br />
<h2 style="text-align: left;">
Thanks</h2>
</div>
<div>
<br /></div>
<div>
Thanks to Josh Hofing, Kevin Chung, and Ryan Stortz for helpful feedback and comments on this post, and of course Michal Zalewski for making AFL.</div>
</div>
Brendan Dolan-Gavitt<br />
<br />
<b>The Mechanics of Bug Injection with LAVA</b> (2016-07-11)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<i>This is the second in a series of posts about evaluating and improving bug detection software by automatically injecting bugs into programs. Part one, which discussed the setting and motivation, is available <a href="http://moyix.blogspot.com/2016/06/how-to-add-a-million-bugs-to-a-program.html">here</a>.</i><br />
<i><br /></i>
Now that we understand why we might want to automatically add bugs to programs, let's look at how we can actually do it. We'll first investigate an existing approach (mutation testing), show why it doesn't work very well in our scenario, and then develop a more sophisticated injection technique that tells us exactly how to modify the program to insert bugs that meet the goals we laid out in the introductory post.<br />
<br />
<h2 style="text-align: left;">
A Mutant Strawman that Doesn't Work</h2>
<br />
One way of approaching the problem of bug injection is to just pick parts of the program that we think are currently correct and then mutate them somehow. This, essentially, is the idea behind <a href="https://en.wikipedia.org/wiki/Mutation_testing">mutation testing</a>: you use some predefined <i>mutation operators</i> that mangle the program somehow and then declare that it is now buggy.<br />
<br />
For example, we could take every instance of <span style="font-family: "courier new" , "courier" , monospace;">strncpy</span> and change it to <span style="font-family: "courier new" , "courier" , monospace;">strcpy</span>. Presumably, this would add lots of potential buffer overflows to a program that previously had none.<br />
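<br />
A mutation operator like this can be as crude as a textual rewrite. The Python below is only a sketch: real mutation testing tools usually work on a parsed representation of the program, and the regular expression here (an assumption of this example) only handles simple calls whose length argument has at most one level of parentheses.<br />
<br />
```python
import re

# Matches strncpy(dst, src, n) where dst and src are simple expressions and
# n may contain one level of parentheses, e.g. sizeof(buf).
STRNCPY_CALL = re.compile(
    r"\bstrncpy\s*\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*(?:[^()]|\([^()]*\))+?\)"
)

def mutate_strncpy(source):
    """Mutation operator: drop the length argument, turning strncpy into strcpy."""
    return STRNCPY_CALL.sub(r"strcpy(\1, \2)", source)

original = "strncpy(buf, argv[1], sizeof(buf));"
print(mutate_strncpy(original))   # strcpy(buf, argv[1]);
```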
<br />
Unfortunately, this method has a couple of problems. First, it is likely that many such changes will break the program on <i>every</i> input, which would make the bug trivial to find. The following program will always fail if <span style="font-family: "courier new" , "courier" , monospace;">strncpy</span> is changed to <span style="font-family: "courier new" , "courier" , monospace;">strcpy</span>:<br />
<br />
<script src="https://gist.github.com/moyix/11bc2a9ccf1074aaeb72473644bf7084.js"></script>
We also face the opposite problem: if the bug doesn't trigger every time, we won't necessarily know how to trigger it when we want to. This will make it hard to prove that there really is a bug, and violates one of the requirements we described last time: <i>each bug must come with a triggering input that proves the bug exists</i>. If we wanted to find the triggering input for a given mutation, we'd have to find an input that reaches our mutant, which is actually a large part of what makes finding bugs hard!<br />
<br />
<h2 style="text-align: left;">
Dead, Uncomplicated and Available Data</h2>
<br />
Instead of doing random, local mutations, LAVA first tries to characterize the program's behavior on some concrete input. We'll run the program on an input file, and then try to see where that input data reaches in the program. This solves the triggering problem because we will know a concrete path through the program, and the input needed to traverse that path. Now, if we can place bugs in code along that path, we will be able to reach them using the concrete input we know about.<br />
<br />
We need a couple other properties. Because we want to create bugs that are triggered only for certain values, we will want the ability to manipulate the input of the program. However, doing so might cause the program to take a different path, and the input data may get transformed along the way, making it difficult to predict what value it will have when we actually want to use it to trigger our bug.<br />
<br />
To resolve this, we will try to find parts of the program's input data that are:<br />
<br />
<ul style="text-align: left;">
<li><b>Dead</b>: not currently used much in the program (i.e., we can set them to arbitrary values)</li>
<li><b>Uncomplicated</b>: not altered very much (i.e., we can predict their value throughout the program's lifetime)</li>
<li><b>Available</b> in some program variables</li>
</ul>
<br />
We'll call data that satisfies these three properties a <b>DUA</b>. DUAs try to capture the notion of <i>attacker-controlled data</i>: a DUA is something that can be set to an arbitrary value without changing the program's control flow, is available somewhere along the program path we're interested in, and whose value is predictable.<br />
<br />
<h2 style="text-align: left;">
Measuring Liveness and Complication with Dynamic Taint Analysis</h2>
<div>
<br />
Having defined these properties, we need some way to measure them. We'll do that using a technique called <i>dynamic taint analysis</i><sup><a href="#f1" name="top1">1</a></sup>. You can think of dynamic taint analysis like a <a href="https://en.wikipedia.org/wiki/Positron_emission_tomography">PET scan</a> or a <a href="https://en.wikipedia.org/wiki/Upper_gastrointestinal_series#Types_of_barium-contrast_imaging">barium swallow</a>, where a tracer (a radionuclide or a contrast agent) is introduced into a patient, allowed to propagate throughout the body, and then a scan checks to see where it ends up. Similarly, with taint analysis, we can mark some data, allow it to propagate through the program, and later query to see where it ended up. This is an extremely useful feature in all sorts of reverse engineering and security tasks.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgk6BvWKGmyM4nS5lvuidwpxICA6RfoMr6wTwci7PJY89XkgOJsCpq33rSbdRMbYlM0nXp-Dk4QOMKrdkrcuL8h560pxcBBmvY4smDUPQYk_kT-e5FNLc6jWTiFFzD6O2fDhYJijy1MFE8m/s1600/PET.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgk6BvWKGmyM4nS5lvuidwpxICA6RfoMr6wTwci7PJY89XkgOJsCpq33rSbdRMbYlM0nXp-Dk4QOMKrdkrcuL8h560pxcBBmvY4smDUPQYk_kT-e5FNLc6jWTiFFzD6O2fDhYJijy1MFE8m/s1600/PET.jpg" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Like a PET scan, dynamic taint analysis works by seeing where marked input ends up in your program.</td></tr>
</tbody></table>
<br />
To find out where input data is <i><b>available</b></i>, we can taint the input data to the program – essentially assigning a unique label to each byte of the program's input. Then, as the program runs, we'll propagate those labels as data is copied around the program, and query any variables in scope as the program runs to see if they are derived from some portion of the input data, and if so, from precisely which bytes.<br />
<br />
Next, we want to figure out what data is currently unused. To do so, we'll extend simple dynamic taint analysis by checking, every time there's a branch in the program, whether the data used to decide it was tainted, and if so, which input bytes were used to make the decision. At the end, we'll know exactly how many branches in the program each byte of the input was used to decide. This measure is known as <i><b>liveness</b></i>.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgshWkaOlD76tBs4t1Ir-bCzSrshD25C4zJcIhMZ62oTKrBA3qbfSfaq1J9zNNQEd2zt2TUUs8dwA1CfgWd7OhDbzDu0WLRW8dP5zjGrg8NvqYmu28LOXJiwmD2wg1rmnQEiLvXQ8ehy1wQ/s1600/liveness.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="175" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgshWkaOlD76tBs4t1Ir-bCzSrshD25C4zJcIhMZ62oTKrBA3qbfSfaq1J9zNNQEd2zt2TUUs8dwA1CfgWd7OhDbzDu0WLRW8dP5zjGrg8NvqYmu28LOXJiwmD2wg1rmnQEiLvXQ8ehy1wQ/s320/liveness.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Liveness measures how many branches use each input byte.</td></tr>
</tbody></table>
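<br />
The liveness bookkeeping can be sketched in a few lines of Python. This is a toy model, not PANDA's taint system: the <span style="font-family: Courier New, Courier, monospace;">Tainted</span> wrapper, the <span style="font-family: Courier New, Courier, monospace;">branch</span> hook and the four-byte "file format" are all invented for illustration.<br />
<br />
```python
from collections import Counter

class Tainted:
    """A value plus the set of input byte offsets it was derived from."""
    def __init__(self, value, labels):
        self.value = value
        self.labels = frozenset(labels)

def measure_liveness(data):
    """Run a toy parser over data and count tainted branches per input byte."""
    liveness = Counter({offset: 0 for offset in range(len(data))})

    def branch(cond):
        # Every input byte that influenced this decision becomes more live.
        for offset in cond.labels:
            liveness[offset] += 1
        return cond.value

    def equals(tainted, constant):
        # A comparison result inherits the labels of its tainted operand.
        return Tainted(tainted.value == constant, tainted.labels)

    # Taint the input: one unique label (its offset) per byte.
    inp = [Tainted(byte, {offset}) for offset, byte in enumerate(data)]

    # Toy parser: bytes 0 and 1 are a magic header; bytes 2 and 3 are
    # payload that never decides a branch.
    if branch(equals(inp[0], ord("L"))):
        if branch(equals(inp[1], ord("V"))):
            pass
    if branch(equals(inp[0], 0)):
        pass
    return liveness

liveness = measure_liveness(b"LV\x01\x02")
dead_bytes = [offset for offset, count in liveness.items() if count == 0]
print(dict(liveness))  # {0: 2, 1: 1, 2: 0, 3: 0}
print(dead_bytes)      # [2, 3]
```
<br />
Bytes 2 and 3 never influence a branch, so they are the ones we could overwrite without changing the path the program takes.<br />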
<br />
Finally, we want some measure of how complicated the data in each tainted program variable is. We can do this with another addition to the taint analysis. In standard taint analysis, whenever data is copied or computed in the program, the taint system checks if the source operands are tainted and if so propagates the taint labels to the destination. If we want to measure how <i>complicated</i> a piece of data is – that is, how much it has been changed since it was first introduced to the program – we can simply add a new rule that increments a counter whenever an arithmetic operation on tainted data occurs. That is, if you have something like <span style="font-family: "courier new" , "courier" , monospace;">c = a + b</span>; then the <i><b>taint compute number (TCN)</b> </i>of c is <span style="font-family: "courier new" , "courier" , monospace;">tcn(c) = max(tcn(a),tcn(b)) + 1</span>.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_fK8xZwReBWOzJxZf78JM3EToQL0gSqW6tZDxYfH72hfJFbz5HoZkWumsQsJ2Vm7Jz81p4DAZv-SuENKDAEdZMNsMRHfA3AgMQSaFZ0_6cwQRtHo3vQqSx9IIOk03_bHQYsKcVDl6owCf/s1600/tcn.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="295" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_fK8xZwReBWOzJxZf78JM3EToQL0gSqW6tZDxYfH72hfJFbz5HoZkWumsQsJ2Vm7Jz81p4DAZv-SuENKDAEdZMNsMRHfA3AgMQSaFZ0_6cwQRtHo3vQqSx9IIOk03_bHQYsKcVDl6owCf/s320/tcn.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><div class="p1">
TCN measures how much computation has been done on a variable at a given point in the program.</div>
</td></tr>
</tbody></table>
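<br />
To make the TCN rule concrete, here is a toy Python model. It is only a sketch and not how PANDA's taint system is implemented; the <span style="font-family: Courier New, Courier, monospace;">Tainted</span> class and the helper names are made up for illustration. Each value carries the set of input byte offsets it derives from, plus a counter that grows with every arithmetic operation on tainted data:<br />

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value plus its taint: which input bytes it came from, and its TCN."""
    value: int
    labels: frozenset = frozenset()  # offsets of the input bytes involved
    tcn: int = 0                     # taint compute number

    def __add__(self, other):
        # Arithmetic: union the labels, and if anything is tainted apply
        # tcn(c) = max(tcn(a), tcn(b)) + 1.
        labels = self.labels | other.labels
        tcn = max(self.tcn, other.tcn) + 1 if labels else 0
        return Tainted(self.value + other.value, labels, tcn)


def copy_value(src):
    # A plain copy propagates the labels but is not counted as computation.
    return Tainted(src.value, src.labels, src.tcn)


def from_input(data, offset):
    # A freshly read input byte: one unique label, TCN of zero.
    return Tainted(data[offset], frozenset({offset}), 0)


data = b"\x01\x02\x03"
a = from_input(data, 0)
b = from_input(data, 1)
c = a + b                                  # labels {0, 1}, TCN 1
d = copy_value(c) + from_input(data, 2)    # labels {0, 1, 2}, TCN 2
```

The copy in the last line leaves the TCN alone; only the two additions count.<br />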
<br />
On the implementation side, all this is done using <a href="https://github.com/moyix/panda">PANDA</a>, our platform for dynamic analysis. PANDA's <a href="https://github.com/moyix/panda/blob/master/docs/manual.md#taint-related-plugins">taint system</a> allows us to taint an input file with unique byte labels. To query the state of program variables, we use a <a href="http://clang.llvm.org/docs/Tooling.html">clang tool</a> that modifies the original program source code<sup><a href="#f2" name="top2">2</a></sup> to add code that asks PANDA to query and log the taint information about a particular program variable. When we run the program under PANDA, we'll get a log telling us exactly which program variables were tainted, how complicated the data was, and how live each byte of input is.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiquROONOK-abFf-J6D6OpjmrvyhuQh9Fjp7Iwl3vbBlva2Bx0o7f-cAeSPNApAUD5qvqPyWVkXgX4ejS9UmevK8gweACkr6M8_DFHaXrHlAcSL7gU5RnQ5HejTWtn3fI0X5rI5GRdSY5fq/s1600/PANDA.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="102" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiquROONOK-abFf-J6D6OpjmrvyhuQh9Fjp7Iwl3vbBlva2Bx0o7f-cAeSPNApAUD5qvqPyWVkXgX4ejS9UmevK8gweACkr6M8_DFHaXrHlAcSL7gU5RnQ5HejTWtn3fI0X5rI5GRdSY5fq/s320/PANDA.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">PANDA's taint system allows us to find DUAs in the program.</td></tr>
</tbody></table>
<br />
After running PANDA, we can pick out the variables that are uncomplicated and derived from input bytes with low liveness. These are our <b>DUAs</b>, approximations of attacker-controlled data that can be used to create bugs.<br />
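<br />
As a sketch of that selection step, assume the PANDA log has already been boiled down to one record per taint query and a per-byte liveness count. The record layout, the variable names other than <span style="font-family: Courier New, Courier, monospace;">head.reserved</span>, and the thresholds below are all assumptions for illustration, not LAVA's actual data structures:<br />

```python
from dataclasses import dataclass


@dataclass
class TaintQuery:
    """One logged taint query for a program variable (hypothetical record)."""
    variable: str
    labels: frozenset   # input byte offsets the variable derives from
    tcn: int            # taint compute number at the query point


def find_duas(queries, liveness, max_tcn=0, max_liveness=0):
    """Keep variables that are uncomplicated and built from low-liveness bytes.

    `liveness` maps an input byte offset to the number of branches using it.
    The default thresholds are illustrative only.
    """
    duas = []
    for q in queries:
        if not q.labels:
            continue  # untainted, so not derived from the input at all
        if q.tcn > max_tcn:
            continue  # too heavily transformed
        if any(liveness.get(off, 0) > max_liveness for off in q.labels):
            continue  # at least one byte already influences control flow
        duas.append(q)
    return duas


queries = [
    TaintQuery("head.reserved", frozenset({8, 9, 10, 11}), tcn=0),
    TaintQuery("head.num_entries", frozenset({4, 5}), tcn=0),
    TaintQuery("checksum", frozenset({0, 1, 2, 3}), tcn=3),
]
liveness = {4: 2, 5: 2}   # bytes 4 and 5 each fed two branches; the rest none
duas = find_duas(queries, liveness)
```

With the strict defaults only <span style="font-family: Courier New, Courier, monospace;">head.reserved</span> survives; loosening the thresholds admits the other two.<br />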
<br /></div>
<div>
<h2 style="text-align: left;">
Finding Attack Points</h2>
</div>
<div>
<br /></div>
<div>
With some DUAs in hand, we now have the raw material we need to create our bugs. The last missing piece is finding some code we want to affect: places where we can use the data from a DUA to trigger some buggy effect on the program, which we call <i><b>attack points (ATPs)</b></i>. In our current implementation, we look for places in the program where pointers are passed into functions. We can then use the DUA to modify the pointer, which will hopefully cause the program to perform an out-of-bounds read or write – a classic memory safety violation.<br />
<br />
Because we want the bug to trigger only under certain conditions, we will also add code at the attack point that checks if the data from the DUA has a specific value or is in a specific range of values. This gives us some control over how much of the input space triggers the bug. The current implementation can produce both specific-value triggers (DUA == magic_value) and range-based triggers of varying sizes (x &lt; DUA &lt; y).<br />
<br />
Each LAVA bug, then, is just a pair (DUA, ATP) where the attack point occurs in the program trace after the DUA. If there are many DUAs and many attack points, then we will be able to inject a number of bugs roughly proportional to the product of the two. In large programs like Wireshark, this adds up to <i>hundreds of thousands of potential bugs</i> for a single input file! In our tests, multiple files increased the number of bugs roughly linearly, in proportion to the amount of coverage achieved by the extra input. Thus, with just a handful of input files on a complex program you can easily reach millions of bugs.</div>
<div>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8ZkA-IeH_VNHy3Q7T_KL5YcMz1TgEaiDc8cVe4ibJNQRKb5kak4sj1fO0imJ213BgFSuRJ-OvCblq2PLpb_J1opJ26b-AV3Hdm__kSCeHnTH2H6wvJwSH0UXYqwlxMtM0nUM-dM00fkFU/s1600/lavabug.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="99" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh8ZkA-IeH_VNHy3Q7T_KL5YcMz1TgEaiDc8cVe4ibJNQRKb5kak4sj1fO0imJ213BgFSuRJ-OvCblq2PLpb_J1opJ26b-AV3Hdm__kSCeHnTH2H6wvJwSH0UXYqwlxMtM0nUM-dM00fkFU/s320/lavabug.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Our "formula" for injecting a bug. Any (DUA, ATP) pair where the DUA occurs before the attack point is a potential bug we can inject.</td></tr>
</tbody></table>
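<br />
The pairing rule is simple enough to sketch in a few lines of Python. This assumes each DUA and attack point has been reduced to a name and a position in the execution trace; the positions, and the names <span style="font-family: Courier New, Courier, monospace;">ent.type</span> and <span style="font-family: Courier New, Courier, monospace;">print_entry</span>, are invented for the example:<br />

```python
def potential_bugs(duas, atps):
    """Pair every DUA with every attack point that occurs later in the trace.

    Both arguments are lists of (name, trace_position) tuples.
    """
    return [(dua, atp)
            for dua, dua_pos in duas
            for atp, atp_pos in atps
            if dua_pos < atp_pos]


duas = [("head.reserved", 10), ("ent.type", 40)]
atps = [("consume_record", 30), ("print_entry", 50)]
bugs = potential_bugs(duas, atps)
```

Two DUAs and two attack points give at most four pairs, but only three are usable here, because <span style="font-family: Courier New, Courier, monospace;">ent.type</span> does not exist yet when <span style="font-family: Courier New, Courier, monospace;">consume_record</span> is reached.<br />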
</div>
<h2 style="text-align: left;">
Modifying the Source Code</h2>
<div>
<br />
The last step is to modify the source code to add our bug. We will insert code in two places:<br />
<ol style="text-align: left;">
<li>At the DUA site, to save a copy of the input data to a global variable.</li>
<li>At the attack point, to retrieve the DUA's data, check if it satisfies the trigger condition, and use it to corrupt the pointer.</li>
</ol>
By doing so, we create a new data flow between the place where our attacker-controlled data is available and the place where we want to manifest the bug.<br />
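<br />
LAVA makes these edits to C source, but the shape of the two insertions can be modeled in Python. Treat this as a rough analogy under stated assumptions: the header layout, the magic value, and the function names are hypothetical, a list index stands in for the pointer, and an <span style="font-family: Courier New, Courier, monospace;">IndexError</span> stands in for the memory corruption:<br />

```python
LAVA_STASH = {}        # stands in for the global variable LAVA adds
MAGIC = 0x6C617661     # hypothetical trigger value ("lava" in ASCII)


def parse_header(data):
    # Hypothetical layout: 4 bytes of magic, then a 4-byte reserved field.
    reserved = int.from_bytes(data[4:8], "little")
    LAVA_STASH["reserved"] = reserved   # (1) inserted at the DUA site
    return reserved


def read_record(records, index):
    # (2) inserted at the attack point: fetch the DUA, test the trigger,
    # and corrupt the "pointer" (here a list index) only when it matches.
    dua = LAVA_STASH.get("reserved", 0)
    index = index + dua * (dua == MAGIC)
    return records[index]               # out of range when triggered


records = ["r0", "r1", "r2"]

parse_header(b"HDR0" + (0).to_bytes(4, "little"))
benign_result = read_record(records, 1)      # untouched index, normal read

parse_header(b"HDR0" + MAGIC.to_bytes(4, "little"))
try:
    read_record(records, 1)
    triggered = False
except IndexError:
    triggered = True                         # the injected bug fired
```

The stash is what creates the new data flow: nothing in the original program connected the reserved field to the record lookup.<br />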
<br />
<h2 style="text-align: left;">
A Toy Example</h2>
<br />
To see LAVA in action, let's step through a full example. Have a look at this small program, which parses and prints information about a very simple binary file format:<br />
<br />
<script src="https://gist.github.com/moyix/93cd687fde9fb965cfb7d508118d27c1.js"></script><br />
We start by instrumenting the source code to add taint queries. The queries will be inserted to check taint on program variables, and, for aggregate data structures, the members inside each structure. The result is a bit too long to include inline, since it quadruples the size of the original program, but you can <a href="https://gist.github.com/moyix/f417ae57a06fd00573787144ab694513">see it in this gist</a>.<br />
<br />
When we compile and run that program on some input inside of PANDA with taint tracking enabled, we get information about taint compute numbers and the liveness of each byte of the input. For example, here's the liveness map for a small (88 byte) input:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8_cheF8eiObXBjoaz1ldzYEGYd5ldZrFczFYiiIxtaA2dBR134pfbxGFtenDHafR_nUkR6ja9qgga3EOprl8ks1f4tMZpio8Uwj_M_x584tesXGPoY2t0GRl8O_-Os8BXWUlF7dBwHEu7/s1600/toy_dead_data.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="270" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8_cheF8eiObXBjoaz1ldzYEGYd5ldZrFczFYiiIxtaA2dBR134pfbxGFtenDHafR_nUkR6ja9qgga3EOprl8ks1f4tMZpio8Uwj_M_x584tesXGPoY2t0GRl8O_-Os8BXWUlF7dBwHEu7/s400/toy_dead_data.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Liveness map for the input to our toy program. The bytes with a white background are completely dead – they can be set to arbitrary values without affecting the behavior of the program.</td></tr>
</tbody></table>
<br />
LAVA's analysis finds 82 DUAs and 8 attack points, for a total of 407 potential bugs. Not all of these bugs will be viable: because we want to measure the effect of liveness and taint compute number, the current implementation does not impose limits on how live or complicated the DUAs used in bugs are.<br />
<br />
To make sure that an injected bug really is a bug, we do two tests. First, we run the modified program on a non-triggering input, and verify that it runs correctly. This ensures that we didn't accidentally break the program in a way we weren't expecting. Second, we run it on the triggering input and check that it causes a crash (a segfault or bus error). If it passes both tests we deem it a valid bug. This could miss some valid bugs, of course – not all memory corruptions will cause the program to crash – but we're interested mainly in bugs that we can easily prove are real. Another approach might be to run the buggy program under <a href="https://github.com/google/sanitizers/wiki/AddressSanitizer">Address Sanitizer</a> and check to see if it flags any memory errors. After validation, we find that LAVA is able to inject <b>159 bugs</b> into the toy program, for a yield of around <b>39%</b>.<br />
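<br />
A minimal sketch of the two validation tests, assuming a POSIX system where Python's <span style="font-family: Courier New, Courier, monospace;">subprocess</span> module reports death by signal as a negative return code. Treating a zero exit status as "runs correctly" is a simplification of mine, and the function names are illustrative:<br />

```python
import signal
import subprocess

# Signals we count as evidence of memory corruption (POSIX only).
CRASH_SIGNALS = {int(signal.SIGSEGV), int(signal.SIGBUS)}


def is_memory_crash(returncode):
    """True if the process was killed by SIGSEGV or SIGBUS."""
    return returncode < 0 and -returncode in CRASH_SIGNALS


def validate_bug(binary, benign_input, trigger_input, timeout=10):
    """Apply both tests: clean exit on the benign input, crash on the trigger."""
    quiet = {"stdout": subprocess.DEVNULL, "stderr": subprocess.DEVNULL}
    benign = subprocess.run([binary, benign_input], timeout=timeout, **quiet)
    trigger = subprocess.run([binary, trigger_input], timeout=timeout, **quiet)
    return benign.returncode == 0 and is_memory_crash(trigger.returncode)


# The yield for the toy program: 159 validated bugs out of 407 candidates.
yield_fraction = 159 / 407
```

The yield works out to roughly 0.39, which is where the 39% figure comes from.<br />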
<br />
Let's look at an example bug (I've cleaned up the source a little bit by hand to make it easier to read; programmatically generated code is not pretty):<br />
<br />
<script src="https://gist.github.com/moyix/7117e2c29f3bf716e1047ceca5ffdc45.js"></script><br />
On lines 6–15, after parsing the file header, we add code that saves off the value of the <span style="font-family: Courier New, Courier, monospace;">reserved</span> field<sup><a href="#f3" name="top3">3</a></sup>, which our analysis correctly told us was dead, uncomplicated, and available in <span style="font-family: Courier New, Courier, monospace;">head.reserved</span>. Then, on line 20, we retrieve the value and conditionally add it to the pointer <span style="font-family: Courier New, Courier, monospace;">ent</span> that is being passed to <span style="font-family: Courier New, Courier, monospace;">consume_record</span> (checking the value in both possible byte orders, because endianness is hard). When <span style="font-family: Courier New, Courier, monospace;">consume_record</span> tries to access fields inside the <span style="font-family: Courier New, Courier, monospace;">file_entry</span>, it crashes. In this case, the DUA and attack point were in the same function, and so the use of a global variable was not actually necessary, but in a larger program the DUA and attack point could be in different functions or even different compilation units.<br />
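<br />
The "both byte orders" check is easy to get wrong, so here is the idea in isolation as a Python sketch. The magic value is hypothetical, and the real check lives in the generated C code shown in the gist above:<br />

```python
MAGIC = 0x6C617661  # hypothetical 32-bit trigger value


def byteswap32(value):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def trigger_fires(dua):
    # Accept the magic value whether the field was read big- or little-endian.
    return dua == MAGIC or dua == byteswap32(MAGIC)
```

Checking both orders means whoever crafts the triggering input does not need to know how the program reads the field.<br />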
<br />
If you like, you can <a href="http://laredo-13.mit.edu/~brendan/toy_example_distrib.tar.gz">download all 407 buggy program versions</a>, along with the original source code and triggering inputs. Note that the current implementation does not make any attempt to hide the bugs from human eyes, so you will very easily be able to spot them by looking at the source code.<br />
<br /></div>
<div>
<h2 style="text-align: left;">
Next Time</h2>
</div>
<div>
<br />
Having developed a bug injection system, we would like to know how well it performs. In the next post, we'll examine questions of <i>evaluation</i>: how <i>many</i> bugs can we inject, and how do the liveness and taint compute measures influence the number of viable bugs? How <i>realistic</i> are the bugs? (much more complicated than it may first appear!) And how <i>effective</i> are some common bug-finding techniques like symbolic execution and fuzzing? We'll explore all these and more.</div>
<div>
<br /></div>
<div>
<br /></div>
<hr width="80%" />
<span class="Apple-style-span" style="font-size: x-small;"><br />
<a name="f1"><b>1 </b></a>Having worked with dynamic program analysis for so long, I sometimes forget how ridiculous the term "dynamic taint analysis" is. If you're looking for another way to say the same thing, you can use "information flow" but dynamic taint analysis is the name that seems to have stuck.<a href="#top1"><sup>↩</sup></a><br />
</span>
<span class="Apple-style-span" style="font-size: x-small;"><br />
<a name="f2"><b>2 </b></a>Getting taint information by instrumenting the source works, but has a few drawbacks. Most notably, it causes a huge increase in the size of the source program, and slows it down dramatically. We're currently finishing up a new method, <a href="https://github.com/moyix/panda/blob/master/qemu/panda_plugins/pri_taint/USAGE.md">pri_taint</a>, which can do the taint queries on <em>uninstrumented</em> programs as long as they have debug symbols. This should allow LAVA to scale to larger programs like Firefox.<a href="#top2"><sup>↩</sup></a><br />
</span>
<span class="Apple-style-span" style="font-size: x-small;"><br />
<a name="f3"><b>3 </b></a>The slightly weird ({ }) construct is a non-standard extension to C called a <a href="https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html">statement expression</a>. It allows multiple statements to be executed in a block with control over what the block as a whole evaluates to. It's a nice feature to have available for automatically generated code, as it allows you to insert arbitrary statements in the middle of an expression without worrying about messing up the evaluation.<a href="#top3"><sup>↩</sup></a><br />
</span>
</div>
Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com3tag:blogger.com,1999:blog-6787362638788314904.post-71626290038581277492016-06-07T09:08:00.003-04:002016-07-11T08:57:19.051-04:00How to add a million bugs to a program (and why you might want to)<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<i>This is the first in a series of posts about evaluating and improving bug detection software by automatically injecting bugs into programs. You can find part two, with technical details of our bug injection technique, <a href="">here</a>.</i><br /><br />
<span style="font-weight: normal;"><span style="font-family: inherit;">In this series of posts, I'm going to describe how to <em style="font-weight: normal;">automatically</em> put bugs in programs, a topic on which we just published a <a href="http://www.ieee-security.org/TC/SP2016/papers/0824a110.pdf" style="font-weight: normal;">paper</a> at Oakland, one of the top academic security conferences. The system we developed, <b>LAVA</b>, can put millions of bugs into real-world programs. Why would anyone want to do this? Are my coauthors and I sociopaths who just want to watch the world burn? No, but to see why we need such a system requires a little bit of background, which is what I hope to provide in this first post.</span></span><br />
<span style="font-weight: normal;"><span style="font-family: "times" , "times new roman" , serif;"><br /></span></span></div>
I am sure this will come as a shock to most, but <strong>programs written by humans have bugs</strong>. Finding and fixing them is immensely time-consuming; just how much of a developer's time is spent debugging is hard to pin down, but estimates range between <a href="http://programmers.stackexchange.com/a/91764">40%</a> and <a href="http://coralogix.com/this-is-what-your-developers-are-doing-75-of-the-time-and-this-is-the-cost-you-pay/">75%</a>. And of course these errors can be not only costly for developers but catastrophic for users: attackers can exploit software bugs to run their own code, install malware, set your computer on fire, etc.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwiFIrJMdSmvC_O888As28srR0ptT7NToAKt3AfnZAWALlSgvpX0k3Bm2NkWdLphtuyyk9MUTDY84OhK3exeO-OEjvmtNXGminMdaNJ7GBa9CIMVwyVwzsYdmtp7amGdPXuruJa_FPE6Aa/s1600/Computer_bomb.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="270" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwiFIrJMdSmvC_O888As28srR0ptT7NToAKt3AfnZAWALlSgvpX0k3Bm2NkWdLphtuyyk9MUTDY84OhK3exeO-OEjvmtNXGminMdaNJ7GBa9CIMVwyVwzsYdmtp7amGdPXuruJa_FPE6Aa/s320/Computer_bomb.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Weekly World News has known about this problem for <i>years</i>.</td></tr>
</tbody></table>
<br />
<br />
It should come as little surprise, then, that immense effort has been expended in finding ways to locate and fix bugs <em>automatically</em>. On the academic side, techniques such as <a href="ftp://ftp.cs.wisc.edu/paradyn/technical_papers/fuzz.pdf">fuzzing</a>, <a href="http://llvm.org/pubs/2008-12-OSDI-KLEE.pdf">symbolic execution</a>, <a href="http://link.springer.com/chapter/10.1007%2F3-540-10003-2_69">model checking</a>, <a href="http://www.di.ens.fr/~cousot/publications.www/CousotCousot-POPL-77-ACM-p238--252-1977.pdf">abstract interpretation</a>, and <a href="https://www.internetsociety.org/sites/default/files/blogs-media/driller-augmenting-fuzzing-through-selective-symbolic-execution.pdf">creative combinations</a> of those techniques, have been proposed and refined for the past 25 years. Nor has industry been idle: companies like <a href="http://www.coverity.com/">Coverity</a>, <a href="http://www8.hp.com/us/en/software-solutions/application-security/">Fortify</a>, <a href="http://www.veracode.com/">Veracode</a>, <a href="http://www.klocwork.com/">Klocwork</a>, <a href="http://www.grammatech.com/">GrammaTech</a>, and many more will happily sell (or rent) you a product that automatically finds bugs in your program.<br />
<br />
Great, so by now we must surely have solved the problem, right? Well, not so fast. We should probably check to see how well these tools and techniques work. Since they're detectors, the usual way would be to measure the <em><b>false positive</b></em> and <em><b>false negative</b></em> rates. To measure false positives, we can just run one of these tools on our program, go through the output, and decide whether we think each bug it found is real.<br />
<br />
The same strategy does <em>not</em> work for measuring false negatives. If a bug finder reports finding 42 bugs in a program, we have no way of knowing whether that's 99% or 1% of the total. And this seems like the piece of information we'd most like to have!<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2pDTTxnMgAcmDexmjz9gbgt_CYBFk0T1zXbC2Duqg5d-DK7a16czklds_tYSGtJIAZ3uZrpCZ04voe67Gb2FdGiYlhyphenhyphenKKGCX8a9EFHPwFiqCqVTA93cedwE8h8E8h7InaWAIckR-zFUxN/s1600/heartbleed.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2pDTTxnMgAcmDexmjz9gbgt_CYBFk0T1zXbC2Duqg5d-DK7a16czklds_tYSGtJIAZ3uZrpCZ04voe67Gb2FdGiYlhyphenhyphenKKGCX8a9EFHPwFiqCqVTA93cedwE8h8E8h7InaWAIckR-zFUxN/s200/heartbleed.png" width="165" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"><span style="font-size: x-small;">Heartbleed: detectable with static analysis tools, but only after the fact.</span></td></tr>
</tbody></table>
<br />
<br />
To measure false negatives, we need a source of bugs so that we can tell how many of them our bug-finder detects. One strategy might be to look at historical bug databases and see how many of those bugs are detected. Unfortunately, these sorts of corpora are fixed in size – there are only so many bugs out there, and analysis tools will, over time, be capable of detecting most of them. We can see how this dynamic played out with <a href="http://heartbleed.com/">Heartbleed</a>: shortly after the bug was found, <a href="http://security.coverity.com/blog/2014/Apr/on-detecting-heartbleed-with-static-analysis.html">Coverity</a> and <a href="http://blogs.grammatech.com/finding-heartbleed-with-codesonar">GrammaTech</a> found ways to improve their software so that it could find Heartbleed.<br />
<br />
Let me be clear – it's a <em>good thing</em> that vendors can use test cases like these to improve their products! But it's <em>bad</em> when these test cases are in short supply, leaving users with no good way of evaluating false negatives and bug finders with no clear path to improving their techniques.<br />
<br />
This is where LAVA enters the picture. If we can find a way to automatically add realistic bugs to pre-existing programs, we can both measure how well current bug finding tools are doing, and provide an endless stream of examples that bug-finding tools can use to get better.<br />
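<br />
Once the set of injected bugs is known, the false negative measurement becomes simple bookkeeping. A sketch in Python, with made-up bug identifiers and a hypothetical <span style="font-family: Courier New, Courier, monospace;">score_detector</span> helper:<br />

```python
def score_detector(injected, reported):
    """Compare a bug finder's reports against the known set of injected bugs."""
    injected, reported = set(injected), set(reported)
    found = injected & reported
    missed = injected - reported
    return {
        "found": len(found),
        "missed": len(missed),
        "false_negative_rate": len(missed) / len(injected) if injected else 0.0,
    }


injected = {"bug1", "bug2", "bug3", "bug4", "bug5"}
reported = {"bug2", "bug4", "something_else"}
score = score_detector(injected, reported)
```

Note that a report outside the injected set (like <span style="font-family: Courier New, Courier, monospace;">something_else</span> here) is not automatically a false positive: it could be a real bug that was in the program all along.<br />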
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirUJDlohp69d9VQa0Yy-FofEizZsgvNqkfU8ETMW8rLgQ2PPtJtluE0lUCaKozda-uD72OiyeVY3S-os4m9aLUyycu-TZm0AIw4_JcDR7ikNYDuaz-qagIrL8w2Koymalfj1UqA2ubDPYb/s1600/lavalogo.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="265" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirUJDlohp69d9VQa0Yy-FofEizZsgvNqkfU8ETMW8rLgQ2PPtJtluE0lUCaKozda-uD72OiyeVY3S-os4m9aLUyycu-TZm0AIw4_JcDR7ikNYDuaz-qagIrL8w2Koymalfj1UqA2ubDPYb/s320/lavalogo.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">LAVA: Large-scale Automated Vulnerability Analysis</td></tr>
</tbody></table>
<br />
<h2 id="goals-for-bug-corpora">
Goals for Automated Bug Corpora</h2>
<br />
So what do we want out of our bug injection? In our paper, we defined five goals for automated bug injection, requiring that injected bugs<br />
<ol style="text-align: left;">
<li><div class="p1">
Be cheap and plentiful</div>
</li>
<li><div class="p1">
Span the execution lifetime of a program</div>
</li>
<li><div class="p1">
Be embedded in representative control and data flow</div>
</li>
<li><div class="p1">
Come with a triggering input that proves the bug exists</div>
</li>
<li>
<div class="p1">
<span class="s1">Manifest for a very small fraction of possible inputs</span></div>
</li>
</ol>
The first goal we've already discussed – if we want to evaluate tools and enable "hill climbing" by bug finders, we will want a lot of bugs. If it's too expensive to add a bug, or if we can only add a handful per program, then we don't gain much by doing it automatically – expensive humans can already add small numbers of bugs to programs by hand.<br />
<br />
The next two relate to whether our (necessarily artificial) bugs are reasonable proxies for real bugs. This is a tricky and contentious point, which we'll return to in part three. For now, I'll note that the two things called out here – occurring throughout the program and being embedded in "normal" control and data flow – are intended to capture the idea that program analyses will need to do essentially the same reasoning about program behavior to find them as they would for any other bugs. In other words, they're intended to help ensure that getting better at finding LAVA bugs will make tools better at understanding programs generally.<br />
<br />
The fourth is important because it allows us to demonstrate, conclusively, that the bugs we inject are real problems. Concretely, with LAVA we can demonstrate an input for each bug we inject that causes the program to crash with a segfault or bus error.<br />
<br />
The final property is critical but not immediately obvious. We don't want the bugs we inject to be too <i>easy</i> to find. In particular, if a bug manifests on most inputs, then it's trivial to find it – just run the program and wait for the crash. We might even want this to be a tunable parameter, so that we could specify what fraction of the input space of a program causes a crash and dial the difficulty of finding the right input up or down.<br />
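<br />
To get a feel for what such a dial might look like, suppose the bug only fires when some 32-bit value taken from the input passes a range check. The width of the field and the helper name are assumptions for this sketch, and it counts values of that one field rather than whole inputs:<br />

```python
def trigger_fraction(low, high, bits=32):
    """Fraction of values v with low < v < high, for a `bits`-wide field."""
    matching = max(0, high - low - 1)
    return matching / 2 ** bits


exact = 1 / 2 ** 32                          # a single magic value
narrow = trigger_fraction(0x1000, 0x1100)    # 255 matching values
wide = trigger_fraction(0, 2 ** 31)          # just under half of all values
```

Widening or narrowing the range moves the bug anywhere between "one value in four billion" and "almost always crashes".<br />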
<br />
<h2 id="ethics-of-bug-injection">
Ethics of Bug Injection</h2>
<div>
<br /></div>
<div>
A common worry about bug injection is that it could be misused to add backdoors into legitimate software. I think these worries are, for the most part, misplaced. To see why, consider the goals of a would-be attacker trying to sneak a backdoor into some program. They want:</div>
<div>
<ol style="text-align: left;">
<li>A way to get the program to do something bad on some secret input.</li>
<li>Not to get caught (i.e., to be stealthy, and for the bugs to be <i>deniable</i>).</li>
</ol>
Looking at (1), it's clear that one bug suffices to achieve the goal; there's no need to add millions of bugs to a program. Indeed, adding millions of bugs <i>harms </i>goal (2) – it would require lots of changes to the program source, which would be very difficult to hide.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpeQEZCVUkKIP_iUYriAM5PO9smxkIriLiEdu9c-xaB9MvrilBgK1Cp3Xtg0PSEcyI_TtlcqUpiiAfCEaVUnVi3kJXMzYcHqHgWxYyITLkGLrx9mhyjFnJqMytp8LUTGmo77zNr5c47dbl/s1600/Screenshot+2016-06-07+00.48.16.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgpeQEZCVUkKIP_iUYriAM5PO9smxkIriLiEdu9c-xaB9MvrilBgK1Cp3Xtg0PSEcyI_TtlcqUpiiAfCEaVUnVi3kJXMzYcHqHgWxYyITLkGLrx9mhyjFnJqMytp8LUTGmo77zNr5c47dbl/s1600/Screenshot+2016-06-07+00.48.16.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">A <a href="https://freedom-to-tinker.com/blog/felten/the-linux-backdoor-attempt-of-2003/">Linux kernel backdoor attempt</a> from 2003. Can you spot the bugdoor?</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
In other words, the benefit that LAVA provides is in <b>adding lots of bugs at scale</b>. An attacker that wants to add a backdoor can easily do it by hand – they only need to add one, and even if it takes a lot of effort to understand the program, that effort will be rewarded with extra stealth and deniability. Although the bugs that LAVA injects are realistic in many ways, they do not look like mistakes a programmer would have naturally made, which means that manual code review would be very likely to spot them.</div>
<div>
<br /></div>
<div>
(There is one area where LAVA might help a would-be attacker – the analysis we do to locate portions of the program that have access to attacker-controlled input could conceivably speed up the process of inserting a backdoor by hand. But this analysis is quite general, and is useful for far more than just adding bugs to programs.)<br />
<br /></div>
<h2 style="text-align: left;">
The Road Ahead</h2>
<div>
<br /></div>
<div>
The next post will discuss the actual mechanics of automated bug injection. We'll see how, using some new taint analyses in PANDA, we can analyze a program to find small modifications that cause attacker-controlled input to reach sensitive points in the program and <i>selectively</i> trigger memory safety errors when the input is just right.</div>
<div>
<br /></div>
<div>
Once we understand how LAVA works, the final post will be about evaluation: how can we tell if LAVA succeeded in its goals of injecting massive numbers of realistic bugs? And how well do current bug-finders fare at finding LAVA bugs?</div>
<div>
<br /></div>
<h2 style="text-align: left;">
Credits</h2>
<div>
<br /></div>
<div>
The idea for LAVA originated with Tim Leek of MIT Lincoln Laboratory. Our paper lists authors alphabetically, because designing, implementing and testing it truly was a group effort. I am honored to share a byline with <span title="MIT Lincoln Laboratory">Patrick Hulin</span>, <span title="Northeastern University">Engin Kirda</span>, <span title="MIT Lincoln Laboratory">Tim Leek</span>, <span title="Northeastern University">Andrea Mambretti</span>, <span title="Northeastern University">Wil Robertson</span>, <span title="MIT Lincoln Laboratory">Frederick Ulrich</span>, and <span title="MIT Lincoln Laboratory">Ryan Whelan</span>.</div>
</div>Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com0tag:blogger.com,1999:blog-6787362638788314904.post-61597250023286090922016-01-05T19:16:00.003-05:002016-01-05T19:16:39.788-05:00PANDA Plugin Documentation<div dir="ltr" style="text-align: left;" trbidi="on">
It's been a very long time coming, but over the holiday break I went through and created basic documentation for all 54 currently-available PANDA plugins. Each plugin now includes a manpage-style document named <span style="font-family: Courier New, Courier, monospace;">USAGE.md</span> in its plugin directory.<br />
<br />
You can find a master list of each plugin and a link to its man page here:<br />
<a href="https://github.com/moyix/panda/blob/master/docs/Plugins.md">https://github.com/moyix/panda/blob/master/docs/Plugins.md</a><br />
<br />
Hopefully this will help people get started using PANDA to do some cool reverse engineering!</div>
Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com4tag:blogger.com,1999:blog-6787362638788314904.post-77669023335042741512015-10-02T16:55:00.000-04:002015-10-02T16:55:11.503-04:00PANDA VM Update October 2015<div dir="ltr" style="text-align: left;" trbidi="on">
The PANDA virtual machine has once again been updated, and you can download it from:<br />
<br />
<div style="text-align: center;">
<a href="http://laredo-13.mit.edu/~brendan/pandavm-20151002.ova">http://laredo-13.mit.edu/~brendan/pandavm-20151002.ova</a></div>
<br />
Notable changes:<br />
<br />
<ul style="text-align: left;">
<li>We fixed a <a href="https://github.com/moyix/panda/issues/18">record/replay bug</a> that was preventing Debian Wheezy and above from replaying properly.</li>
<li>The <a href="http://wiki.osdev.org/Kernel_Debugging#Use_gdb_with_Qemu">QEMU GDB stub</a> now works during replay, so you can break, step, etc. at various points during the replay to figure out what's going on. We still haven't implemented reverse-step though – hopefully in a future release.</li>
<li>Thanks to Manolis Stamatogiannakis, the <a href="https://github.com/moyix/panda/tree/master/qemu/panda_plugins/osi_linux">Linux OS Introspection code</a> can now resolve file descriptors to actual filenames. Tim Leek then extended the file_taint plugin to use this information, so file-based tainting should be more accurate now, even if things like dup() are used.</li>
<li>We have added support for more versions of Windows in the <a href="https://github.com/moyix/panda/blob/master/docs/syscalls2.md">syscalls2</a> code.</li>
</ul>
Enjoy!</div>
Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com3tag:blogger.com,1999:blog-6787362638788314904.post-72450789164478691182015-08-27T16:37:00.000-04:002015-08-28T20:14:29.659-04:00(Sys)Call Me Maybe: Exploring Malware Syscalls with PANDA<div dir="ltr" style="text-align: left;" trbidi="on">
System calls are of great interest to researchers studying malware, because they are the only way that malware can have any effect on the world – writing files to the hard drive, manipulating the registry, sending network packets, and so on all must be done by making a call into the kernel.<br />
<br />
In Windows, the system call interface is not publicly documented, but there have been lots of good reverse engineering efforts, and we now have <a href="http://j00ru.vexillium.org/ntapi/">full tables of the names of each system call</a>; in addition, by using the Windows debug symbols, we can figure out how many arguments each system call takes (though not yet their actual types).<br />
<br />
I recently ran 24,389 malware replays under PANDA and recorded all the system calls made, along with their arguments (just the top-level argument, without trying to descend into pointer types or dereference handle types). So for each replay, we now have a log file that looks like:<br />
<br />
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2340 NtGdiFlush</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2340 NtUserGetMessage 0175feac 00000000 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtCreateEvent 0058f8d8 001f0003 00000000 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForMultipleObjects 00000002 0058f83c 00000001 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtSetEvent 000002ec 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForSingleObject 000002f0 00000000 0058f89c</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtReleaseWorkerFactoryWorker 00000050</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtReleaseMutant 00000098 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForSingleObject 000005a4 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForMultipleObjects 00000002 00dbf49c 00000001 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtReleaseMutant 00000098 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForMultipleObjects 00000002 00dbf4a8 00000001 00000000 00dbf4c8</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForMultipleObjects 00000002 00dbf49c 00000001 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtClearEvent 000002ec</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtReleaseMutant 00000098 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForMultipleObjects 00000002 00dbf49c 00000001 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtReleaseMutant 000001e8 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtWaitForMultipleObjects 00000002 00dbf3b8 00000001 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtReleaseMutant 00000158 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtCreateEvent 00dbeed4 001f0003 00000000 00000000 00000000</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtDuplicateObject ffffffff fffffffe ffffffff 002edf50 00000000 00000000 00000002</span></span></div>
<br />
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">3f9b2120 NtTestAlert</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">...</span></span></div>
<br />
<div class="p1">
<span class="s1">The first column identifies the process that made the call, using its address space as a unique identifier. The second gives the name of the call, and the remaining columns show the arguments passed to the function.</span></div>
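<div class="p1">
<span class="s1">If you want to work with these logs programmatically, a line in the format described above can be split into its three parts with a few lines of Python. This is only a sketch: it assumes you have the text output shown above (for example from showsc), and the names <code>SyscallRecord</code> and <code>parse_syscall_line</code> are made up for illustration rather than part of PANDA.</span></div>

```python
from typing import NamedTuple


class SyscallRecord(NamedTuple):
    """One line of the text log (hypothetical helper type, not a PANDA API)."""
    asid: int     # address-space identifier of the calling process
    name: str     # system call name, e.g. "NtCreateEvent"
    args: tuple   # top-level argument values as integers (not dereferenced)


def parse_syscall_line(line: str) -> SyscallRecord:
    """Parse one whitespace-separated log line with hex-encoded fields."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"not a syscall line: {line!r}")
    asid = int(fields[0], 16)
    args = tuple(int(field, 16) for field in fields[2:])
    return SyscallRecord(asid, fields[1], args)


rec = parse_syscall_line("3f9b2120 NtSetEvent 000002ec 00000000")
print(hex(rec.asid), rec.name, [hex(a) for a in rec.args])
```

<div class="p1">
<span class="s1">Calls without arguments, such as NtGdiFlush above, simply come back with an empty argument tuple.</span></div>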
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1">As usual, this data can be freely downloaded; the data set is 38GB. Each log file is compressed; you can use the showsc program (included in the tarball) to display an individual log file:</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"><span class="s1">$ ./showsc 32 </span>32bit/008d065f-7f5d-4a86-9995-970509ff3999_syscalls.dat.gz</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1">You can download the data set here:</span></div>
<div class="p1">
</div>
<ul style="text-align: left;">
<li><a href="http://laredo-13.mit.edu/~brendan/malrec/malrec_syscalls.tar">Direct download</a> (<a href="http://laredo-13.mit.edu/~brendan/malrec/malrec_syscalls.tar.md5">md5</a>)</li>
<li><a href="http://laredo-13.mit.edu/~brendan/malrec/malrec_syscalls.torrent">Torrent download</a></li>
</ul>
<br />
<h3 style="text-align: left;">
<span class="s1">Interesting Malware System Calls</span></h3>
<div class="p1">
<span class="s1">As a first pass, we can look at what the <i>least</i> commonly used system calls are. These may be interesting because rarely used system calls are more likely to contain bugs; in the context of malware, invoking a vulnerable system call can be a way to achieve <i>privilege escalation</i>.</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1">Here are a few that turned up when I sorted the system calls in the malrec dataset by frequency and then searched Google for some of the least common:</span></div>
<div class="p1">
</div>
<ul style="text-align: left;">
<li><a href="http://j00ru.vexillium.org/?p=1455">NtUserMagControl</a> (1 occurrence) One of many functions found by j00ru to cause crashes due to invalid pointer dereferences when called from the context of the CSRSS process</li>
<li><a href="http://j00ru.vexillium.org/?p=866">NtSetLdtEntries</a> (2 occurrences) Used as an anti-debug trick by some malware</li>
<li><a href="http://j00ru.vexillium.org/?p=1479">NtUserInitTask</a> (3 occurrences) Used as part of an exploit for <a href="http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-2553">CVE-2012-2553</a></li>
<li><a href="https://www.exploit-db.com/exploits/3755/">NtGdiGetNearestPaletteIndex</a> (3 occurrences) Used in an exploit for <a href="https://technet.microsoft.com/library/security/ms07-017">MS07-017</a></li>
<li><a href="http://j00ru.vexillium.org/?p=783">NtQueueApcThreadEx</a> (5 occurrences) Mentioned as a way to get attacker-controlled code into the kernel, allowing one to bypass <a href="http://blog.ptsecurity.com/2012/09/intel-smep-overview-and-partial-bypass.html">SMEP</a></li>
<li><a href="http://h30499.www3.hp.com/t5/HP-Security-Research-Blog/Just-another-day-at-the-office-A-ZDI-analyst-s-perspective-on/ba-p/6710637">NtUserConvertMemHandle</a> (5 occurrences) Used to replace a freed kernel object with attacker data in an exploit for <a href="http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-0058">CVE-2015-0058</a></li>
<li><a href="https://www.exploit-db.com/bypassing-uac-with-user-privilege-under-windows-vista7-mirror/">NtGdiEnableEudc</a> (9 occurrences) Used in a privilege escalation exploit where NtGdiEnableEudc assumes a certain registry key is of type REG_SZ without checking, allowing an attacker to overflow a stack buffer (<strike>I was unable to find anything about whether this has been patched</strike> – <b>Update</b>: <a href="https://twitter.com/markwo/status/637021939790319616"><span id="goog_1746497464"></span>Mark Wodrich<span id="goog_1746497465"></span></a> points out that this is <a href="http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-4398">CVE-2010-4398</a> and it was patched in <a href="https://technet.microsoft.com/library/security/ms11-011">MS11-011</a>)</li>
<li><a href="https://github.com/JeremyFetiveau/Exploits/blob/master/MS10-058.cpp">NtAllocateReserveObject</a> (11 occurrences) Used for a kernel pool spray</li>
<li><a href="http://blog.cr0.org/2010/01/cve-2010-0232-microsoft-windows-nt-gp.html">NtVdmControl</a> (55 occurrences) Used for the famous <a href="http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-0232">CVE-2010-0232</a> bug; Tavis Ormandy won the Pwnie for Best Privilege Escalation Bug in 2010 for this.</li>
</ul>
<div>
Of course, we can't say for sure that the replays that execute these calls actually contain exploitation attempts. After all, there are benign ways to use each of the calls, or they wouldn't be in Windows in the first place :) But these are a few that may reward closer examination; if they are in fact exploit attempts, you can then use PANDA's record and replay facility to step through the exploit in as much detail as you like. You can even use PANDA's <a href="https://github.com/moyix/panda/commit/58f2d4a1823f5257841ace63266de17f93e33905">recently-fixed QEMU gdb stub</a> to go through the exploit instruction by instruction.</div>
<div>
<br /></div>
<div>
You can peruse the full list of system calls and their frequencies here: <a href="http://laredo-13.mit.edu/~brendan/malrec/sc32_counts.txt">32-bit</a>, <a href="http://laredo-13.mit.edu/~brendan/malrec/sc64_counts.txt">64-bit</a>. Let me know if you find any other interesting calls in there :)</div>
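<div>
A frequency table like the ones linked above could be rebuilt from the text logs roughly as follows. This is a sketch under the assumption that each line has the process identifier in the first column and the call name in the second; <code>count_syscalls</code> and <code>least_common</code> are illustrative names, not the scripts I actually used.</div>

```python
from collections import Counter


def count_syscalls(lines):
    """Count how often each system call name appears in text log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:   # skip blank or truncated lines
            counts[fields[1]] += 1
    return counts


def least_common(counts, n=10):
    """Return the n rarest calls as (name, count) pairs, rarest first."""
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))[:n]


# Tiny made-up sample in the same format as the logs
sample = [
    "3f9b2120 NtReleaseMutant 00000098 00000000",
    "3f9b2120 NtReleaseMutant 00000098 00000000",
    "3f9b2120 NtClearEvent 000002ec",
]
print(least_common(count_syscalls(sample), n=1))
```

<div>
To cover the whole data set you would feed in the lines from every log and add the per-file counters together; <code>Counter</code> objects support <code>+</code> for exactly that.</div>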
<h3 style="text-align: left;">
Updates 8/28/2015</h3>
<div>
If you want to know which log files have which system calls without processing all of them, I have created an index that lists the unique calls for each replay:</div>
<div>
<ul style="text-align: left;">
<li><a href="http://laredo-13.mit.edu/~brendan/malrec/sc/32bit/allcall.txt.gz">32-bit system call index</a></li>
<li><a href="http://laredo-13.mit.edu/~brendan/malrec/sc/64bit/allcall.txt.gz">64-bit system call index</a></li>
</ul>
<div>
Also, Reddit user <a href="https://www.reddit.com/r/Malware/comments/3in9mf/syscall_me_maybe_exploring_malware_syscalls_with/cuijurp">trevlix wondered</a> whether the lack of pointer dereferencing was inherent to PANDA or something I'd just left out. My response:</div>
</div>
<div>
<br /></div>
<div>
<form action="https://www.reddit.com/r/Malware/comments/3in9mf/syscall_me_maybe_exploring_malware_syscalls_with/#" class="usertext" id="form-t1_cuimmcy8m0" style="clear: left; font-family: verdana, arial, helvetica, sans-serif; font-size: small; margin: 0px; padding: 0px;">
<div class="usertext-body may-blank-within md-container " style="margin: 0px; padding: 0px;">
<div class="md" style="background-color: rgb(240, 243, 252) !important; color: #222222; font-size: 1.07692307692308em; margin: 5px 0px; max-width: 60em; overflow-wrap: break-word; padding: 0px; word-wrap: break-word;">
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-bottom: 0.357142857142857em; padding: 0px;">
Yes, it is possible to do that. I just wasn't able to because I didn't have access to full system call prototypes. E.g., to follow pointers for something like <code style="background-color: #fcfcf7; border-radius: 2px; border: 1px solid rgb(238, 238, 210); font-family: monospace, monospace; line-height: 1em; margin: 0px 2px; padding: 0px 4px; white-space: nowrap; word-break: normal;">NtCreateFile</code>, you need to know that its full prototype is</div>
<pre style="background-color: #fcfcf7; border-radius: 2px; border: 1px solid rgb(238, 238, 210); margin-bottom: 0.357142857142857em; margin-top: 0.357142857142857em; overflow: auto; padding: 4px 9px;"><code style="background-color: transparent; border-radius: 2px; border: 0px; display: block; font-family: monospace, monospace; font-size: 1em; line-height: 1.42857142857143em; margin: 0px 2px; padding: 0px !important; word-break: normal;">NTSTATUS NtCreateFile(
_Out_ PHANDLE FileHandle,
_In_ ACCESS_MASK DesiredAccess,
_In_ POBJECT_ATTRIBUTES ObjectAttributes,
_Out_ PIO_STATUS_BLOCK IoStatusBlock,
_In_opt_ PLARGE_INTEGER AllocationSize,
_In_ ULONG FileAttributes,
_In_ ULONG ShareAccess,
_In_ ULONG CreateDisposition,
_In_ ULONG CreateOptions,
_In_ PVOID EaBuffer,
_In_ ULONG EaLength
);
</code></pre>
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-bottom: 0.357142857142857em; margin-top: 0.357142857142857em; padding: 0px;">
You furthermore have to know how big an <code style="background-color: #fcfcf7; border-radius: 2px; border: 1px solid rgb(238, 238, 210); font-family: monospace, monospace; line-height: 1em; margin: 0px 2px; padding: 0px 4px; white-space: nowrap; word-break: normal;">OBJECT_ATTRIBUTES</code> struct is, so that when you dereference the pointer you know how many bytes to read and store in the log.</div>
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-bottom: 0.357142857142857em; margin-top: 0.357142857142857em; padding: 0px;">
If you wanted to collect extra information about any of the logs posted, it's possible since they are full-system traces and can be replayed :) Supposing you have a syscall trace file like <code style="background-color: #fcfcf7; border-radius: 2px; border: 1px solid rgb(238, 238, 210); font-family: monospace, monospace; line-height: 1em; margin: 0px 2px; padding: 0px 4px; white-space: nowrap; word-break: normal;">0a1a1a77-d4f1-43e0-bc14-4f34f7d96820_syscalls.dat.gz</code>, you can use the UUID to find it on malrec and download the log file:</div>
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-bottom: 0.357142857142857em; margin-top: 0.357142857142857em; padding: 0px;">
<a class="imgScanned" href="http://laredo-13.mit.edu/~brendan/malrec/rr/0a1a1a77-d4f1-43e0-bc14-4f34f7d96820.rr" rel="nofollow" style="color: #551a8b; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-decoration: none;">http://laredo-13.mit.edu/~brendan/malrec/rr/0a1a1a77-d4f1-43e0-bc14-4f34f7d96820.rr</a></div>
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-bottom: 0.357142857142857em; margin-top: 0.357142857142857em; padding: 0px;">
Then you'd just unpack that log (<code style="background-color: #fcfcf7; border-radius: 2px; border: 1px solid rgb(238, 238, 210); font-family: monospace, monospace; line-height: 1em; margin: 0px 2px; padding: 0px 4px; white-space: nowrap; word-break: normal;">scripts/rrunpack.py</code> in the PANDA directory) and replay it with a PANDA plugin that understands how to dereference the various pointers involved. For reference, you can see the PANDA plugin I originally used to gather the syscall traces:</div>
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-bottom: 0.357142857142857em; margin-top: 0.357142857142857em; padding: 0px;">
<a class="imgScanned" href="https://gist.github.com/moyix/43d3ea40e8dedea103a4" rel="nofollow" style="color: #551a8b; margin-left: 0px; margin-right: 0px; margin-top: 0px; text-decoration: none;">https://gist.github.com/moyix/43d3ea40e8dedea103a4</a></div>
<div style="font-size: 1em; line-height: 1.42857142857143em; margin-top: 0.357142857142857em; padding: 0px;">
And you can see on lines 108 and 119 where you'd have to add in code to read the dereferenced values.</div>
</div>
</div>
</form>
</div>
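<div>
To make the point about struct sizes concrete, here is a small Python sketch of what "knowing how big an OBJECT_ATTRIBUTES is" means for a 32-bit guest. The field list follows the documented OBJECT_ATTRIBUTES layout; everything else is an assumption for illustration: the class and function names are made up, the sample bytes are invented, and the step that actually reads the bytes out of guest memory (which depends on the PANDA plugin API) is left out.</div>

```python
import ctypes
import struct


class ObjectAttributes32(ctypes.LittleEndianStructure):
    """OBJECT_ATTRIBUTES as laid out in a 32-bit Windows guest.

    Handles and pointers are modelled as 32-bit integers because they are
    guest values, not pointers the analysis host can follow directly.
    """
    _fields_ = [
        ("Length", ctypes.c_uint32),
        ("RootDirectory", ctypes.c_uint32),             # HANDLE
        ("ObjectName", ctypes.c_uint32),                # PUNICODE_STRING
        ("Attributes", ctypes.c_uint32),
        ("SecurityDescriptor", ctypes.c_uint32),        # PVOID
        ("SecurityQualityOfService", ctypes.c_uint32),  # PVOID
    ]


def decode_object_attributes(raw: bytes) -> ObjectAttributes32:
    """Decode bytes that were read from guest memory at the argument pointer."""
    return ObjectAttributes32.from_buffer_copy(raw)


# Number of bytes a plugin would need to read and store for this argument
size = ctypes.sizeof(ObjectAttributes32)

# Invented example bytes standing in for a guest memory read
raw = struct.pack("<6I", size, 0, 0x0058F8D8, 0x40, 0, 0)
oa = decode_object_attributes(raw)
print(size, hex(oa.ObjectName))
```

<div>
Note that <code>ObjectName</code> is itself a pointer to a UNICODE_STRING, so getting at the actual file name takes two more reads: one for that struct and one for the buffer it points to. That chain of type knowledge is what the full prototypes would provide.</div>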
</div>
Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com0tag:blogger.com,1999:blog-6787362638788314904.post-80549230453621098512015-08-24T09:55:00.000-04:002015-08-24T09:56:36.029-04:00One Weird Trick to Shrink Your PANDA Malware Logs by 84%<div dir="ltr" style="text-align: left;" trbidi="on">
When I wrote about some of the lessons learned from <a href="http://moyix.blogspot.com/2014/12/reproducible-malware-analyses-for-all.html">PANDA Malrec</a>'s first <a href="http://moyix.blogspot.com/2015/03/100-days-of-malware.html">100 days of operation</a>, one of the things I mentioned was that the storage requirements for the system were extremely high. In the four months since, the storage problem only got worse: as of last week, we were storing 24,000 recordings of malware, coming in at a whopping 2.4 terabytes of storage.<br />
<br />
The amount of data involved poses problems not just for our own storage but also for others wanting to make use of the recordings for research. 2.4 terabytes is a lot, especially when it's spread out over 24,000 HTTP requests. If we want our data to be useful to researchers, it would be great if we could find better ways of compressing the recording logs.<br />
<br />
As it turns out, we can! The key is to look closely at what makes up a PANDA recording:<br />
<ul style="text-align: left;">
<li>The log of non-deterministic events (the -rr-nondet.log files)</li>
<li>The initial QEMU snapshot (the -rr-snp files)</li>
</ul>
<div>
The first of these is highly redundant and actually compresses quite well already – the <span style="font-family: Courier New, Courier, monospace;">xz</span> compression used by PANDA's <span style="font-family: Courier New, Courier, monospace;">rrpack.py</span> usually manages to get around a 5-6X reduction for the nondet log. The snapshots also compress pretty well, at around 4X.</div>
<div>
<br /></div>
<div>
So where can we find further savings? The trick is to notice that for the malware recordings, each run is started by first reverting the virtual machine to the same state. That means that the initial snapshot files for our recordings are almost all identical! In fact, if we do a byte-by-byte diff, the vast majority differ by only a few bytes – most likely a timer value that increments in the short time between when we revert to the snapshot and begin our recording.</div>
<div>
<br /></div>
<div>
With this observation in hand, we can instead store the malware recordings in a new format. The nondet log will still be compressed with <span style="font-family: Courier New, Courier, monospace;">xz</span>, but the snapshot for each recording will now be stored as a binary diff with respect to a reference snapshot. Because we have two separate recording platforms and have changed the initial environment used by Malrec a few times, the total number of reference snapshots we need is 8 – but this is a huge improvement over storing 24,000 snapshots! The binary diff for each recording then requires only a handful of bytes to specify.</div>
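<div>
The idea of storing a snapshot as a diff against a reference can be illustrated with a toy example. This is not the format used in the archive (see the README there for that); it is a minimal sketch that assumes both snapshots have the same size and records only the byte runs that differ. The function names are made up.</div>

```python
def make_diff(reference: bytes, snapshot: bytes):
    """Return [(offset, replacement_bytes), ...] that turns reference into snapshot."""
    if len(reference) != len(snapshot):
        raise ValueError("this sketch assumes equal-sized snapshots")
    diff = []
    i, n = 0, len(snapshot)
    while i < n:
        if reference[i] == snapshot[i]:
            i += 1
            continue
        start = i
        # Extend the run for as long as the two snapshots keep differing
        while i < n and reference[i] != snapshot[i]:
            i += 1
        diff.append((start, snapshot[start:i]))
    return diff


def apply_diff(reference: bytes, diff) -> bytes:
    """Rebuild the snapshot from the reference plus the recorded runs."""
    out = bytearray(reference)
    for offset, data in diff:
        out[offset:offset + len(data)] = data
    return bytes(out)


# Made-up data: a 64-byte "snapshot" where only a 4-byte timer value changed
reference = bytes(64)
changed = bytearray(reference)
changed[16:20] = b"\x01\x02\x03\x04"
snapshot = bytes(changed)

diff = make_diff(reference, snapshot)
print(diff)
print(apply_diff(reference, diff) == snapshot)
```

<div>
A pure-Python byte loop like this would be far too slow for real multi-hundred-megabyte snapshots; it is only meant to show why a recording whose snapshot differs from the reference in a few bytes costs almost nothing to store.</div>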
<div>
<br /></div>
<div>
The upshot of all of this is that a dataset of 24,189 PANDA malware recordings now takes up just 387 GB, a savings of 84%. This is pretty astonishing – the recordings in the archive contain 476 <i>trillion</i> instructions' worth of execution, meaning our storage rate is 1147.5 instructions per byte! As a point of comparison, <a href="http://www.eecs.harvard.edu/~skanev/papers/ispass11zcompr.pdf">one recent published instruction trace compression scheme</a> achieved 2 bits per instruction; our compression is 0.007 bits per instruction – though this comparison is somewhat unfair since that paper can't assume a shared starting point.</div>
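<div>
For anyone who wants to check the arithmetic, here it is spelled out using the rounded figures quoted in this post. Two assumptions are baked in: "2.4 terabytes" is taken as 2,400 GB, and a GB is taken as 2<sup>30</sup> bytes. Because the inputs are rounded, the result lands close to, but not exactly on, the 1147.5 figure above.</div>

```python
GIB = 2**30                  # assumption: "GB" here means 2**30 bytes

original_gb = 2400           # "2.4 terabytes", taken as 2,400 GB
archive_gb = 387             # size of the new archive
instructions = 476e12        # "476 trillion" instructions, rounded

savings = 1 - archive_gb / original_gb
instr_per_byte = instructions / (archive_gb * GIB)
bits_per_instr = 8 / instr_per_byte

print(f"savings: {savings:.0%}")
print(f"instructions per byte: {instr_per_byte:.1f}")
print(f"bits per instruction: {bits_per_instr:.4f}")
```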
<div>
<br /></div>
<div>
You can download this data set as a single file from our MIT mirror; please share and mirror this as widely as you like! There is a README included in the archive that contains instructions for extracting and replaying any of the recordings. Click the link below to download:</div>
<div>
<br /></div>
<div style="text-align: center;">
<span style="font-size: large;"><a href="http://laredo-13.mit.edu/~brendan/malrec/malrec_archive_20150813.tar">The Malrec Dataset</a> (<a href="http://laredo-13.mit.edu/~brendan/malrec/malrec_archive_20150813.tar.md5">MD5</a>)</span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
Stay tuned, too – there's more cool stuff on the way. Next time, I'll be writing about one of the things you can do with a full-trace recording dataset like this: extracting system call traces with arguments. And of course that means I'll have a syscall dataset to share then as well :)</div>
</div>
Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com4tag:blogger.com,1999:blog-6787362638788314904.post-59825004322047025562015-04-13T19:22:00.001-04:002015-04-13T19:22:33.201-04:00PANDA VM Update April 2015<div dir="ltr" style="text-align: left;" trbidi="on">
The PANDA virtual machine has been updated to the latest version of PANDA, which corresponds to commit <a href="https://github.com/moyix/panda/commit/ce866e1508719282b970da4d8a2222f29f959dcd">ce866e1508719282b970da4d8a2222f29f959dcd</a>. You can download it here:<br />
<br />
<div>
<ul style="text-align: left;">
<li><a href="http://laredo-13.mit.edu/~brendan/pandavm-20150413.tar.bz2">http://laredo-13.mit.edu/~brendan/pandavm-20150413.tar.bz2</a></li>
</ul>
Some notable changes:</div>
<div>
<ul style="text-align: left;">
<li>The taint system has been rewritten and is now available as the taint2 plugin. It is at least 10x faster, and uses much less memory. You can check out an example of how to use it in the recently updated <a href="https://github.com/moyix/panda/blob/master/docs/tainted_instructions.md">tainted instructions tutorial</a>.</li>
<li>Since taint is now usable, I have increased the amount of memory in the VM to 4GB, which is reasonable for most tasks that use taint.</li>
<li>PANDA now understands system calls and their arguments on Linux (x86 and ARM) and Windows 7 (x86). This is available in the syscalls2 plugin, and even has <a href="https://github.com/moyix/panda/blob/master/docs/syscalls2.md">some documentation</a>.</li>
<li>There is now a generic logging format for PANDA, which uses Protocol Buffers. Check out the <a href="https://github.com/moyix/panda/blob/master/docs/pandalog.md">pandalog documentation</a> for more details.</li>
</ul>
<div>
There's lots more that has changed, and I will try to write up a more detailed post soon about all the cool stuff PANDA can now do!</div>
</div>
</div>
Brendan Dolan-Gavitthttp://www.blogger.com/profile/17143824408632888880noreply@blogger.com0tag:blogger.com,1999:blog-6787362638788314904.post-68638193371621535742015-03-24T08:39:00.000-04:002015-05-28T22:09:29.957-04:00100 Days of Malware<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
It's now been a little over 100 days since I started <a href="http://moyix.blogspot.com/2014/12/reproducible-malware-analyses-for-all.html">running malware samples in PANDA</a> and making the executions publicly available. In that time, we've analyzed 10,794 pieces of malware, which generated:<br />
<ul style="text-align: left;">
<li>10,794 <a href="http://panda.gtisc.gatech.edu/malrec/rr/">record/replay logs</a>, representing 226,163,195,948,195 instructions executed</li>
<li>10,794 <a href="http://panda.gtisc.gatech.edu/malrec/pcap">packet captures</a>, totaling 26GB of data and 33,968,944 packets</li>
<li>10,794 <a href="http://panda.gtisc.gatech.edu/malrec/movies/"><span id="goog_1013890001"></span>movies<span id="goog_1013890002"></span></a>, which are interesting enough that I'll give them their own section</li>
<li>10,794 <a href="http://panda.gtisc.gatech.edu/malrec/vt/">VirusTotal reports</a>, indicating what level of detection they had when they were run by malrec</li>
<li>107 <a href="http://panda.gtisc.gatech.edu/malrec/torrent/">torrents</a>, containing downloads of the above</li>
</ul>
I've been pleased by the interest malrec has generated. We've had visitors from over 6000 unique IPs, in 89 different countries:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjcLHVz42_kB-sx00Um3JPT86pqth47xn-KvGNzYZODw4q0mBlQRxabYZ0w3gkYxbvRDV-Humk0PcFpTjhFC9UQBvMWnxQABX1HhgIpw0v8D-EbdV3oI5AhgZ331xlvCDtj7sL4dT1C6uM/s1600/geoip.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="205" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjcLHVz42_kB-sx00Um3JPT86pqth47xn-KvGNzYZODw4q0mBlQRxabYZ0w3gkYxbvRDV-Humk0PcFpTjhFC9UQBvMWnxQABX1HhgIpw0v8D-EbdV3oI5AhgZ331xlvCDtj7sL4dT1C6uM/s1600/geoip.png" width="400" /></a></div>
<h3 style="clear: both; text-align: left;">
The Movies</h3>
<div>
There's a lot of great stuff in these ~10K movies. An easy way to get an idea of what's in there is to sort by filesize; because of the way MP4 encoding works, larger files in general mean that there's more going on on-screen (though only up to a point – the largest ones seemed to be mostly command prompts scrolling, which wasn't very interesting). I took it upon myself to watch the ones between ~300KB and 1MB, and found some fun videos:</div>
<div>
<br /></div>
Several games:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/5e77ec8d-0f01-4fe7-a20f-cc00c37edf20.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Puzzle</em><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/10a2201f-abd8-4b49-b7bf-172547af5cd6.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Anime-like game</em><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/1713e1e2-52c8-4db7-93c5-ad6132e312b6.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>"Wild Spirit"</em><br />
<br />
Some fake antivirus:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/7cd6edef-0b8c-4f6c-95ac-7b4e799c54a4.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/5c232b60-bf36-4465-be69-5092fac5cea5.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<br />
Extortion attempts were <em>really</em> popular:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/d925fc9e-7903-4739-8c8c-cc023b16b450.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/2ce9f60f-5f82-489b-a3dd-aeed17eb2a84.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<br />
Though they didn't always invest in fancy art design:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/2435fa1e-71fe-42b1-a48e-c8c3f85e4443.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Extortion (low-rent)</em><br />
<br />
Download managers and trojaned installers were very popular:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/189fe7d2-ff03-429e-b4d0-81e885718568.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Broken trojaned installer</em><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/1fa8f42d-83d7-48cf-ab6c-f0b7f774132e.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Download manager</em><br />
<br />
Some used legitimate-looking documents to disguise nefarious intentions:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/329e24aa-7286-42ad-9efd-04fcc2061ed7.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Some maritime history</em><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/b54e5c40-f237-46db-b4a6-838b8c258960.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>German Britfilms</em><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/5bb096c9-b3af-424f-9214-3dc509a312a2.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Chinese newspaper</em><br />
<br />
Finally, there's the weird and random:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/809fc48c-ef0f-4a64-8150-c9ffe3345574.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Trust Me, I'm a Doctor</em><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/malrec/movies/8c7a2ecd-3959-45c5-9841-725c3aababe0.mp4" type="video/mp4"></source>
Your browser does not support the video tag.</video>
</div>
<em>Trees</em><br />
<br />
<strong>Not pictured:</strong> there was even one sample that played a <a href="http://laredo-13.mit.edu/~brendan/malrec/movies/b433707a-ee9f-4304-aeec-ecc3f0bbf3b7.mp4">porn video (NSFW)</a> while it infected you; I guess it was intended as a distraction?</div>
</div>
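<div>
If you want to try the same size-based triage on a local copy of the movies, something like the following would do it. This is a sketch: the function name is made up, it assumes the files have already been downloaded into one directory, and the 300KB and 1MB bounds are simply the approximate range mentioned at the start of this section.</div>

```python
from pathlib import Path


def movies_in_size_range(directory, low=300 * 1024, high=1024 * 1024):
    """Return (size, path) pairs for .mp4 files with low <= size <= high, largest first."""
    found = []
    for path in Path(directory).glob("*.mp4"):
        size = path.stat().st_size
        if low <= size <= high:
            found.append((size, path))
    return sorted(found, reverse=True)


# Example usage (the directory name is hypothetical):
# for size, path in movies_in_size_range("malrec_movies"):
#     print(f"{size:>8} {path.name}")
```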
<br />
<h3 style="text-align: left;">
<strong>Antivirus Results</strong></h3>
<div>
After malrec runs a sample in the PANDA sandbox, it checks to see what VirusTotal thinks about the file, and saves the result in VirusTotal's JSON format. From this, we can find out what the most popular families in our corpus are. For example, here's what McAfee thought about our samples:</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWbSybUP5uarK-uwWOxyep1g7xgSrdygAAbmgeuyAMBDtETLIz8dqGoflTAek6fbkDgGbcFL1Hfk1JNTQZfyisDk2pjIVkdqmJ7MnTDS1RwJi04pJO8e_nNJ_Yq1qy-hnLkwbc2vXPaKlz/s1600/McAfee.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="282" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjWbSybUP5uarK-uwWOxyep1g7xgSrdygAAbmgeuyAMBDtETLIz8dqGoflTAek6fbkDgGbcFL1Hfk1JNTQZfyisDk2pjIVkdqmJ7MnTDS1RwJi04pJO8e_nNJ_Yq1qy-hnLkwbc2vXPaKlz/s1600/McAfee.png" width="400" /></a></div>
<br />
<div>
<br /></div>
<div>
<br /></div>
<div>
And Symantec:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2V9bA2NK3kP7i0sKNM-JV7CI6twLjnVDCQCbcjbBr09ca3A73Pt116C2zSXsC4RGMOj6j6NDU_9be6Sh5jTK6bdyZkLJBacZhVPZFZHPbgKxxm8nhpiz8pJimaRVOxNKivfICGoBlgj0b/s1600/Symantec.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="249" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2V9bA2NK3kP7i0sKNM-JV7CI6twLjnVDCQCbcjbBr09ca3A73Pt116C2zSXsC4RGMOj6j6NDU_9be6Sh5jTK6bdyZkLJBacZhVPZFZHPbgKxxm8nhpiz8pJimaRVOxNKivfICGoBlgj0b/s1600/Symantec.png" width="320" /></a></div>
<div>
<br /></div>
<div>
In the graphs above, "None" includes both cases where the AV didn't detect anything <em>and</em> cases where the sample hadn't been submitted to VirusTotal yet. You therefore should probably not use this data to try to draw any conclusions about the relative efficacy of Symantec vs. McAfee. I'd like to go back and see how the detection rate has changed, but unfortunately my VirusTotal API key is currently rate-limited, so running all 10,794 samples would be a pain.</div>
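As a rough illustration of how tallies like the ones in the graphs above could be produced from the saved reports, here is a minimal sketch. It assumes each report follows the VirusTotal v2 file-report layout, in which a <code>scans</code> object maps a vendor name to <code>detected</code> and <code>result</code> fields; the helper names and the one-JSON-file-per-sample directory layout are hypothetical and are not taken from malrec.

```python
import json
from collections import Counter
from pathlib import Path


def av_label(report, vendor):
    """Return the vendor's label for one report, or "None" if there is none.

    Assumes the VirusTotal v2 file-report layout, where report["scans"]
    maps a vendor name to {"detected": bool, "result": str or None}.
    """
    scan = report.get("scans", {}).get(vendor)
    if not scan or not scan.get("detected"):
        # Covers both "not detected" and "sample never submitted".
        return "None"
    return scan.get("result") or "None"


def tally_labels(reports, vendor):
    """Count how often each label appears across an iterable of reports."""
    return Counter(av_label(report, vendor) for report in reports)


def load_reports(directory):
    """Yield parsed reports from *.json files (hypothetical on-disk layout)."""
    for path in sorted(Path(directory).glob("*.json")):
        with open(path) as f:
            yield json.load(f)
```

Calling <code>tally_labels(load_reports("reports"), "McAfee").most_common(10)</code> would then list the ten most frequent labels, with "None" lumping together undetected and never-submitted samples, just as in the graphs.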
<div>
<br /></div>
<h3 style="text-align: left;">
Bugs in PANDA</h3>
<div>
Making recordings on such a large scale has exposed bugs in PANDA, some of which we have fixed, others of which need further investigation:</div>
<div>
<ul style="text-align: left;">
<li>One sample, 5309206b-e76f-417a-a27e-05e7f20c3c9d, ran a tight loop of rdtsc queries without interrupts. Because PANDA's replay queue would continue filling until it saw the next interrupt, this meant that the queue could grow very large and exhaust physical memory. This was fixed by <a href="https://github.com/moyix/panda/commit/d34104a8fbfb58160360e4ab9f2e28d458502527">limiting the queue to 65,536 entries</a>.</li>
<li>Our support for 64-bit Windows is much less reliable than I would like. Of the 153 64-bit samples, 45 failed to replay (29.4%). We clearly need to do better here!</li>
<li>We see some sporadic replay failures in 32-bit recordings as well, but they are much more rare: out of the 10,641 32-bit recordings we have, only 14 failed to replay. I suspect that some of these are due to a <a href="https://github.com/moyix/panda/issues/18">known bug involving the recording of port I/O</a>.</li>
<li>One sample, MD5 1285d1893937c3be98dcbc15b88d9433, we have not even been able to record, because it causes QEMU to run out of memory; if you'd like to play with it, you can <a href="http://laredo-13.mit.edu/~brendan/1285d1893937c3be98dcbc15b88d9433.zip">download it here</a>.</li>
</ul>
<h3 style="text-align: left;">
Conclusions</h3>
</div>
<div>
With our current disk usage (1.2TB used out of 2TB), I'm anticipating that we'll be able to run for another 50 days or so; hopefully by then I'll be able to arrange to get more storage.</div>
<div>
<br /></div>
<div>
Meanwhile, I've started to do some deeper research on the corpus, including visualization and <a href="https://github.com/moyix/panda/tree/master/qemu/panda_plugins/memstrings">mining memory for printable strings</a>. I'm looking forward to getting a clearer picture of what's in our corpus as it grows!</div>
</div>
<h2>
Reproducible Malware Analyses for All</h2>
<i>Posted by Brendan Dolan-Gavitt on December 9, 2014 (updated December 10, 2014)</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<i><b>Summary</b>: With help from <a href="https://www.gtisc.gatech.edu/">GTISC</a>, I have begun running 100 malware samples per day and posting the PANDA record & replay logs online at </i><a href="http://panda.gtisc.gatech.edu/malrec/" style="font-style: italic;">http://panda.gtisc.gatech.edu/malrec/</a><i>. The goal is to lower the barriers to entry for doing dynamic malware research, and to make such research </i>reproducible<i>.</i><br />
<br />
Today, I spoke at the ACSAC Malware Memory Forensics workshop in New Orleans about a problem that I think has been largely ignored in existing dynamic malware analysis research: <b>reproducibility</b>.<br />
<br />
To make results reproducible, a computer science researcher typically needs to do three things:<br />
<ol style="text-align: left;">
<li>Carefully and precisely describe their methods.</li>
<li>Release the code they wrote for their system or analysis.</li>
<li>Release the data the analysis was performed on.</li>
</ol>
Of course, even research published at top conferences may fail at some of these criteria; a <a href="http://reproducibility.cs.arizona.edu/">recent study by Collberg et al.</a> attempted to obtain the code associated with 613 recent papers from ACM conferences, and was able to obtain, build, and run the code for only 102. (I'm eliding a lot of important detail here; please do read the original study!)<br />
<br />
Rather than discuss sharing of code today, however, I'd like to talk about sharing data, and particularly sharing data in malware analysis.<br />
<br />
For static analysis of malware, sharing the malware executable is usually sufficient to satisfy the requirement for releasing data; anyone can then go and look at the same static code and reach the same conclusions by following the author's description. A number of sites exist to provide access to such malware samples, such as <a href="http://virusshare.com/">VirusShare</a>, <a href="http://openmalware.org/">OpenMalware</a>, and <a href="http://contagiodump.blogspot.com/">Contagio</a>.<br />
<br />
The data associated with a dynamic analysis is more difficult to share. Software execution is by nature ephemeral: each run of a program may be slightly different based on things like timings, the availability of network servers, the versions of software installed on the machine, and more. This problem is especially apparent with malware, which typically has a short "shelf life". Many malware samples need to contact their command and control servers to operate, and these C&C servers often disappear within days or weeks after a piece of malware is released. Malware may even be designed to "self-destruct" after a certain date, exiting immediately if it is run too long after its creation.<br />
<br />
Thus, a researcher who tries to reproduce a dynamic malware analysis by running a sample from last year will almost certainly discover that the malware no longer has the behavior originally seen. As a result, <i>most dynamic analyses of malware are currently not reproducible in any meaningful sense</i>.<br />
<br />
Record and replay provides a solution. As I have <a href="http://moyix.blogspot.com/2014/01/panda-reproducibility-and-open-science.html">discussed in the past</a>, record and replay allows one to reproduce a whole-system dynamic execution by creating a compact log of the nondeterministic inputs to a system. These logs can be <a href="http://www.rrshare.org/">shared</a> and then <a href="https://github.com/moyix/panda/blob/master/docs/record_replay.md">replayed in PANDA</a>, allowing anyone to re-run the exact execution and be assured that every instruction will be executed exactly the same way.<br />
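To make the idea concrete, here is a toy sketch of record and replay in Python. It is not how PANDA is implemented (PANDA records at the level of the whole emulated machine); the <code>Recorder</code>, <code>Replayer</code>, and <code>program</code> names are invented for illustration. The point is only that once every nondeterministic input is logged, re-running the same deterministic code against the log reproduces the original execution exactly.

```python
import random
import time


class Recorder:
    """Runs live: passes nondeterministic inputs through and logs them."""

    def __init__(self):
        self.log = []

    def nondet(self, source):
        value = source()
        self.log.append(value)
        return value


class Replayer:
    """Runs from a log: returns the recorded values in their original order."""

    def __init__(self, log):
        self._values = iter(log)

    def nondet(self, source):
        # The live source is never consulted during replay.
        return next(self._values)


def program(env):
    """Deterministic apart from its two nondeterministic inputs."""
    seed = env.nondet(lambda: random.getrandbits(32))
    stamp = env.nondet(time.time)
    return (seed * 31 + int(stamp)) % 1000003


recorder = Recorder()
original = program(recorder)                 # live run, inputs are logged
replayed = program(Replayer(recorder.log))   # replay from the log alone
assert original == replayed
```

The log here holds only two values, which mirrors why real record/replay logs can be compact: everything else is recomputed during replay.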
<br />
To put my malware where my mouth is, I've set up a site where, every day, 100 new malware record/replay logs and associated PCAPs will be posted. This is currently something of a trial run, so there may be some changes as I shake out the bugs; in particular, I hope to give it a nicer interface than just a brute listing of all the MD5s. Check it out:<br />
<br />
<a href="http://panda.gtisc.gatech.edu/malrec/">http://panda.gtisc.gatech.edu/malrec/</a><br />
<br />
Here are some ideas for what to do with this data:<br />
<ol style="text-align: left;">
<li><a href="https://github.com/moyix/panda/tree/master/qemu/panda_plugins/replaymovie">Create movies</a> of all the malware executions and watch them to see if there's anything interesting. For example, here's a hilarious extortion attempt from last night:<br /><center>
<video controls="" height="480" width="640">
<source src="http://panda.gtisc.gatech.edu/malrec/movies/d04af20b-acd0-474d-8315-a4f192183acc.mp4" type="video/mp4"></source>
Your browser does not support the video tag.
</video></center>
</li>
<li>Use something like <a href="http://www.cc.gatech.edu/~brendan/tzb_author.pdf">TZB</a> to find all printable strings accessed in memory throughout the entire execution, and build a search engine that indexes all of these strings, so you could search for "bitcoin" and find all the bitcoin-stealing samples in the corpus.</li>
<li>Create system call traces and then use them to automatically apply behavioral labels to the corpus.</li>
<li>Go apply your expertise in machine learning to do something really cool that I haven't even thought of because I'm bad at machine learning, <i>without</i> having to set up your own malware analysis platform.</li>
</ol>
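As a small sketch of the string search engine idea from the list above, the following builds an inverted index from tokens to sample IDs. It assumes the printable strings have already been extracted per sample by some other tool; the function names and the sample data are made up for illustration.

```python
import re
from collections import defaultdict


def build_index(strings_by_sample):
    """Map each lowercase alphanumeric token to the sample IDs containing it."""
    index = defaultdict(set)
    for sample_id, strings in strings_by_sample.items():
        for s in strings:
            for token in re.findall(r"[a-z0-9]+", s.lower()):
                index[token].add(sample_id)
    return index


def search(index, term):
    """Return the sorted sample IDs whose strings contained the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical extracted strings, keyed by hypothetical sample IDs.
extracted = {
    "sample-a": ["Send 1 Bitcoin to unlock your files", "kernel32.dll"],
    "sample-b": ["GET /index.html HTTP/1.1"],
    "sample-c": ["bitcoin wallet.dat", "kernel32.dll"],
}
index = build_index(extracted)
print(search(index, "bitcoin"))
```

A real corpus would need an on-disk index rather than an in-memory dictionary, but the lookup structure would be the same.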
I'm really excited to see what we can accomplish.<br />
The malware recordings are graciously hosted by the <a href="https://www.gtisc.gatech.edu/">Georgia Tech Information Security Center</a>, who are also providing me with access to malware samples. Thanks in particular to <a href="http://www.cc.gatech.edu/people/paul-royal">Paul Royal</a> and <a href="http://www.cc.gatech.edu/people/adam-allred">Adam Allred</a> for helping me make this a reality after I pitched it at CSAW THREADS.</div>
<h2>
Replaying Regin in PANDA</h2>
<i>Posted by Brendan Dolan-Gavitt on November 26, 2014</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
Regin, a piece of state-sponsored malware that may have been used to attack telecoms and cryptographers, has recently come to light. There are <a href="http://www.kaspersky.com/about/news/virus/2014/Regin-a-malicious-platform-capable-of-spying-on-GSM-networks">several</a> <a href="http://www.symantec.com/connect/blogs/regin-top-tier-espionage-tool-enables-stealthy-surveillance">good</a> <a href="https://firstlook.org/theintercept/2014/11/24/secret-regin-malware-belgacom-nsa-gchq/">writeups</a> out there, and I encourage you to check them out.<br />
<br />
Getting access to samples in cases like this is often a challenge. Luckily, both <a href="https://s3.amazonaws.com/tiregin/regin.zip">The Intercept</a> and <a href="https://twitter.com/VXShare/status/537078969737428992">VXShare</a> (<b>warning</b>: both links contain live malware) have released samples thought to be associated with Regin, so that others can perform independent analysis. So far, it appears that the samples are all of the "stage1" component of the malware, rather than the initial "stage0" infector or the later stages.<br />
<br />
In order to allow others to do dynamic analysis of this malware, I built a very small malware sandbox setup using PANDA. The sandbox essentially just executes a sample for five minutes, recording it using PANDA's record and replay facility. The process is slightly complicated by the fact that most of the stage1 samples are kernel-mode components; to (hopefully) deal with this I use the <span style="font-family: Courier New, Courier, monospace;">sc</span> utility to create and start a service with the malware sample.<br />
<br />
So, for normal executables:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">start sample.exe</span><br />
<br />
And for the kernel mode components:<br />
<br />
<span style="font-family: Courier New, Courier, monospace;">sc create sample binPath= sample.exe type= kernel</span><br />
<span style="font-family: Courier New, Courier, monospace;">sc start sample</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
So, without further ado, here are the recordings, associated PCAPs, and videos of the samples being executed:<br />
<br />
<a href="http://laredo-13.mit.edu/~brendan/regin/">http://laredo-13.mit.edu/~brendan/regin/</a><br />
<br />
The <span style="font-family: Courier New, Courier, monospace;">index.txt</span> file shows the mapping between the original sample names and the auto-generated names used by the malware sandbox, along with the MD5s of each sample. Note that I have not tried to ensure that these samples really are Regin, and at least one (sample ID <span style="font-family: Courier New, Courier, monospace;">26ed64ef-fcde-4171-99aa-e1e46301315d</span>, MD5 <span style="font-family: Courier New, Courier, monospace;">0e783c9ea50c4341313d7b6b4037245b</span>) seems to in fact be a <a href="http://www.megasecurity.org/trojans/q/qqrein/Qqrein_all.html">QQ info stealer</a>. There are also a few duplicates due to overlaps in the samples provided by The Intercept and VXShare; I have kept both in case a differential analysis between two runs turns out to be useful.<br />
<br />
Happy malware analysis! And if you have more samples, please get in touch on Twitter (<a href="https://twitter.com/moyix">@moyix</a>) or <a href="mailto:brendan@cs.columbia.edu">email me</a>!</div>
<h2>
PANDA VM Updated</h2>
<i>Posted by Brendan Dolan-Gavitt on October 6, 2014</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
By popular request, I've updated the PANDA VM to a more recent version of PANDA. Get it here:<br />
<br />
<a href="http://amnesia.gtisc.gatech.edu/~moyix/pandavm-20141005.tar.bz2">pandavm-20141005.tar.bz2</a><br />
<br />
The version in the VM is based on Git revision <span style="background-color: white; color: #444444; line-height: 16.7999992370605px;"><a href="https://github.com/moyix/panda/commit/28787825aaf514da22e11650fdfca3ba82b9fc57" style="font-family: Consolas, 'Liberation Mono', Menlo, Courier, monospace; font-size: 12px;">28787825aaf514da22e11650fdfca3ba82b9fc57</a><span style="font-family: inherit;">.</span></span><br />
<span style="background-color: white; color: #444444; line-height: 16.7999992370605px;"><span style="font-family: inherit;"><br /></span></span>
<span style="background-color: white; color: #444444; line-height: 16.7999992370605px;"><span style="font-family: inherit;">Enjoy!</span></span></div>
<h2>
Breaking Spotify DRM with PANDA</h2>
<i>Posted by Brendan Dolan-Gavitt on July 3, 2014 (updated July 6, 2014)</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<i><b>Disclaimer</b>: Although I think DRM is both stupid and evil, I don't advocate pirating music. Therefore, this post will stop short of providing a turnkey solution for ripping Spotify music, but it will fully describe the theory behind the technique and its implementation in PANDA. Don't be evil.</i><br />
<br />
<i><b>Update 7/6/2014: </b>The following post assumes you know what PANDA is (a platform for dynamic analysis based on QEMU). If you want to know more, check out <a href="http://moyix.blogspot.com/2013/09/announcing-panda-platform-for.html">my introductory post on PANDA</a>.</i><br />
<br />
This past weekend I spoke at REcon, a conference on reverse engineering held every year in Montreal. I had a fantastic time there getting to meet other people interested in problems of memory analysis, reverse engineering, and dynamic analysis. One of the topics of my REcon talk was how to use PANDA to break Spotify DRM, and since the video from the talk won't be posted for a while, I thought I'd write up a post showing how we can use PANDA and statistics to pull out unencrypted OGGs from Spotify.<br />
<br />
<h3 style="text-align: left;">
Gathering Data</h3>
<br />
The first step is to gather some data. We want to know what function inside Spotify is doing the actual decryption of the songs, so that we can then hook it and pull out the decrypted (but not decompressed) audio file. So to start with, we'll take a recording of Spotify playing a song; we can then apply whatever analysis we want to the replay. Working with a replay rather than a live system will also make our job considerably easier – no need to worry that we're going to slow things down enough to trip anti-debugging measures or network timeouts. I've prepared a record/replay log of <a href="http://www.rrshare.org/detail/28/">Spotify playing 30 seconds of a song</a>, which you can use to follow along with what comes next. The recording is 12 billion instructions, which gives us a lot of data to work with!<br />
<br />
Just for fun, here's a movie of that replay, generated by taking screenshots throughout the replay and then stitching them into a video:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<video controls="" height="240" width="320">
<source src="http://laredo-13.mit.edu/~brendan/replaymovies/fixedlen/spotify.mp4" type="video/mp4"></source>
Your browser does not support the video tag.
</video></div>
<br />
<br />
<h3 style="text-align: left;">
Some Theory</h3>
<div>
<br /></div>
The next challenge is to figure out how we can identify the function that takes in encrypted data and outputs decrypted data. For this we turn to the <a href="https://www.usenix.org/node/182951">excellent work of Ruoyu Wang, Yan Shoshitaishvili, Christopher Kruegel, and Giovanni Vigna</a> [1]. Their clever insight was that when you look at the distribution of bytes in encrypted vs. compressed streams, the <i>byte entropy</i> of the two is very similar, but compressed streams don't look very <i>random</i>. To illustrate this, let's look at the histograms for an encrypted mp3 file, and its decrypted version. First, encrypted:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://i.imgur.com/EPmUxXB.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://i.imgur.com/EPmUxXB.png" height="300" width="400" /></a></div>
<br />
Now the same file, decrypted:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://i.imgur.com/SIMTWlV.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="http://i.imgur.com/SIMTWlV.png" height="300" width="400" /></a></div>
<br />
You can clearly see that the one on the bottom looks significantly less "random" – or more precisely, the distribution of bytes is not very uniform. However, if we compute the byte entropy of each, they are both very close to the theoretical maximum of 8 bits per byte – the mp3 has 7.968480 bits of entropy per byte, whereas the encrypted file has 7.999981 bits per byte.<br />
<br />
We can make this intuition more precise by turning to statistics. The <a href="http://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test">Pearson chi-squared test</a> (<span style="font-family: Times, Times New Roman, serif;">χ<sup>2</sup></span>) lets us compute a value for how much an observed distribution deviates from some ideal distribution. In this case, we expect the bytes in an encrypted file to be uniformly random, so we can compare with the uniform distribution by computing:<br />
<br />
<div style="text-align: center;">
<img src="http://i.imgur.com/bcrTdbY.png" height="50" style="-webkit-user-select: none;" width="164" /></div>
<div style="text-align: left;">
<br />
Here, <i>O<sub>i</sub></i> is the observed frequency of each byte, and <i>E<sub>i</sub></i> is the expected frequency, which for a uniform byte distribution with <i>n</i> samples will be (1/256)*<i>n.</i><br />
<br />
Similarly, the entropy of some observed data can be computed as:<br />
<br />
<div style="text-align: center;">
<img src="http://i.imgur.com/7rmofEj.png" height="48" style="-webkit-user-select: none;" width="242" /></div>
<div style="text-align: center;">
<br /></div>
<div style="text-align: left;">
Here, <i>p(x<sub>i</sub>)</i> is the observed relative frequency of each byte value in the data.<br />
<br />
Based on the work of Wang et al., if we find a function that reads a lot of high-entropy, highly random data, and writes a lot of high-entropy, non-random data, that's likely to be our guy!<br />
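Both statistics are easy to compute from a 256-entry byte histogram. The sketch below is my own illustration rather than code from PANDA; the function names are made up. It also shows how a distribution can keep its entropy close to 8 bits per byte while its χ² value explodes. For truly uniform random bytes, χ² should hover around 255, the number of degrees of freedom.

```python
import math
from collections import Counter


def byte_histogram(data):
    """Return a 256-entry list counting how often each byte value occurs."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


def chi_squared(hist):
    """Pearson chi-squared statistic against the uniform byte distribution."""
    n = sum(hist)
    expected = n / 256          # E_i = (1/256) * n
    return sum((observed - expected) ** 2 / expected for observed in hist)


def entropy(hist):
    """Shannon entropy in bits per byte; empty bins contribute nothing."""
    n = sum(hist)
    return -sum((c / n) * math.log2(c / n) for c in hist if c)


# A perfectly flat histogram: chi-squared is 0 and entropy is 8 bits.
flat = [4000] * 256
# A mildly skewed histogram with the same total: half the byte values are
# seen 3000 times and half 5000 times. Entropy stays at about 7.95 bits,
# but chi-squared jumps to 64000.
skewed = [3000] * 128 + [5000] * 128

for name, hist in (("flat", flat), ("skewed", skewed)):
    print(name, round(entropy(hist), 3), round(chi_squared(hist), 1))
```

This is why entropy alone cannot separate compressed from encrypted data, while χ² can: it is far more sensitive to small, consistent deviations from uniformity once the sample is large.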
<br />
<h3 style="text-align: left;">
Enter the PANDA</h3>
<br />
But enough theory. How do we actually gather the data we need in PANDA? We want some way of gathering statistics on the contents of the buffers read and written by each function in the replay. As it happens, PANDA has a plugin called <span style="font-family: Courier New, Courier, monospace;"><a href="https://github.com/moyix/panda/blob/wip/recondemo/qemu/panda_plugins/unigrams/unigrams.cpp">unigrams</a></span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: inherit;">that will get us the data we want.</span><br />
<span style="font-family: inherit;"><br />
The</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: Courier New, Courier, monospace;"><a href="https://github.com/moyix/panda/blob/wip/recondemo/qemu/panda_plugins/unigrams/unigrams.cpp">unigrams</a></span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: inherit;">plugin works by tracking every memory read and write made by the system. When it sees a read or write, it looks up the current process context (i.e., CR3 on x86), program counter, and the callsite of the parent function (this last is done with the help of the</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: Courier New, Courier, monospace;"><a href="https://github.com/moyix/panda/blob/wip/recondemo/qemu/panda_plugins/callstack_instr/callstack_instr.cpp">callstack_instr</a></span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: inherit;">plugin). Together, these three pieces of information allow us to put the individual memory access in context and separate out memory accesses made in different program contexts into coherent streams of data. So to gather the raw data we want, we can just run:</span><br />
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">x86_64-softmmu/qemu-system-x86_64 -m 1024 -replay spotify \</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> -panda-plugin x86_64-softmmu/panda_plugins/panda_callstack_instr.so \</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> -panda-plugin x86_64-softmmu/panda_plugins/panda_unigrams.so</span><br />
<span style="font-family: Times, 'Times New Roman', serif;"><br /></span>
<span style="font-family: inherit;">This produces two files, </span><span style="font-family: Courier New, Courier, monospace;">unigram_mem_read_report.bin</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: inherit;">and</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: Courier New, Courier, monospace;">unigram_mem_write_report.bin</span><span style="font-family: Times, Times New Roman, serif;">. </span><span style="font-family: inherit;">The format of these files isn't terribly interesting, but they can be parsed using the Python code found in the</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: Courier New, Courier, monospace;"><a href="https://github.com/moyix/panda/blob/wip/recondemo/scripts/unigram_hist.py">unigram_hist.py</a></span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: inherit;">script. Essentially, each file consists of many, many rows of data, each holding the (callsite, program counter, CR3) triple followed by an array of 256 integers giving the number of times each byte value was read or written at that point in the code.</span><br />
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<span style="font-family: inherit;">Armed with this data, we want to now go through each callsite and look for those that meet the following criteria:</span><br />
<br />
<ol style="text-align: left;">
<li><span style="font-family: inherit;">The function both reads and writes a lot of data, in roughly equal amounts.</span></li>
<li><span style="font-family: inherit;">The byte entropy of the data read is high, and its </span><span style="font-family: Times, Times New Roman, serif;">χ<sup>2</sup></span><span style="font-family: inherit;"> value (deviation from random) is low.</span></li>
<li><span style="font-family: inherit;">The byte entropy of the data written is high, and its </span><span style="font-family: Times, Times New Roman, serif;">χ<sup>2</sup></span><span style="font-family: inherit;"> value is high.</span></li>
</ol>
<span style="font-family: inherit;">This is precisely what the </span><a href="https://github.com/moyix/panda/blob/wip/recondemo/scripts/find_drm.py"><span style="font-family: Courier New, Courier, monospace;">find_drm.py</span></a><span style="font-family: inherit;"> script does. We can run it like so:</span></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">./find_drm.py unigram_mem_read_report.bin unigram_mem_write_report.bin</span></div>
<div style="text-align: left;">
<span style="font-family: inherit;"><br /></span></div>
<div style="text-align: left;">
<span style="font-family: inherit;">Among its </span><a href="http://laredo-13.mit.edu/~brendan/spotify_find_drm.txt" style="font-family: inherit;">output</a><span style="font-family: inherit;">, we find the following promising candidate:</span><br />
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<br />
<pre style="white-space: pre-wrap; word-wrap: break-word;">(00719b84 3f1ac2e0): 3 x 1 combinations
Read sizes: 44033, 701761, 701761
Write sizes: 701761
Read rand: 2.238299, 258.176922, 263.599258
Write rand: 142018.776009
Best input/output ratio (0 is best possible): 0.0</pre>
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<span style="font-family: inherit;">This function read two buffers of size </span><span style="font-family: inherit; white-space: pre-wrap;">701,761 bytes and wrote one of size </span><span style="font-family: inherit; white-space: pre-wrap;">701,761 bytes; given that we played 30 seconds of the song, this looks just about right. The randomness of the two large input buffers was quite high, with </span><span style="font-family: Times, Times New Roman, serif;">χ<sup>2</sup></span><span style="font-family: inherit;"> values of roughly 258 and 264 (recall that in the </span><span style="font-family: Times, Times New Roman, serif;">χ<sup>2</sup></span><span style="font-family: inherit;"> test, high numbers mean the data observed is less likely to be random), but the output buffer, with a value of roughly 142,019, was not very random.</span><br />
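To show how the three criteria listed earlier might be applied to one pair of read and write histograms, here is a minimal sketch. It is not the actual logic of find_drm.py, which considers combinations of several read and write buffers per callsite; the function name <code>looks_like_decryption</code> and every threshold are assumptions chosen for illustration.

```python
import math


def chi_squared(hist):
    """Pearson chi-squared statistic against the uniform byte distribution."""
    expected = sum(hist) / 256
    return sum((o - expected) ** 2 / expected for o in hist)


def entropy(hist):
    """Shannon entropy in bits per byte."""
    n = sum(hist)
    return -sum((c / n) * math.log2(c / n) for c in hist if c)


def looks_like_decryption(read_hist, write_hist,
                          min_bytes=100_000,     # assumed: "a lot of data"
                          size_tolerance=0.1,    # assumed: "roughly equal"
                          min_entropy=7.9,       # assumed: "high entropy"
                          chi2_cutoff=1000.0):   # assumed: random vs. not
    n_read, n_write = sum(read_hist), sum(write_hist)
    # Criterion 1: lots of data read and written, in roughly equal amounts.
    if min(n_read, n_write) < min_bytes:
        return False
    if abs(n_read - n_write) > size_tolerance * max(n_read, n_write):
        return False
    # Criteria 2 and 3 both require high byte entropy.
    if entropy(read_hist) < min_entropy or entropy(write_hist) < min_entropy:
        return False
    # Input should look random (low chi-squared); output should not.
    return chi_squared(read_hist) < chi2_cutoff <= chi_squared(write_hist)
```

With the values reported above, the large reads (about 258 and 264) fall below such a cutoff and the write (about 142,019) falls far above it, so this candidate would pass.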
<h3 style="text-align: left;">
Dumping the Data</h3>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">So how can we confirm our guess? Well, the easiest thing is to simply dump out the data seen at that point. If we go back up to the beginning of the <a href="http://laredo-13.mit.edu/~brendan/spotify_find_drm.txt">output of the script</a>, we have a list of all the (callsite, program counter, CR3) identifiers for reads and writes that matched our criteria. Looking through the writes for our candidate callsite (<span style="white-space: pre-wrap;">00719b84), we find it here:</span></span><br />
<span style="white-space: pre-wrap;"><br /></span>
<br />
<pre style="white-space: pre-wrap; word-wrap: break-word;">(00719b84 0042e2ed 3f1ac2e0): 701761 bytes
</pre>
<div>
<br /></div>
<span style="font-family: inherit;">We can now use another PANDA plugin, </span><span style="font-family: Courier New, Courier, monospace;">tapdump</span><span style="font-family: inherit;">, to dump out all the data flowing through that point in the program. First we create a text file named </span><span style="font-family: Courier New, Courier, monospace;">tap_points.txt</span><span style="font-family: inherit;"> in the QEMU directory, and put in it:</span><br />
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<br />
<pre style="white-space: pre-wrap; word-wrap: break-word;">00719b84 0042e2ed 3f1ac2e0</pre>
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<span style="font-family: inherit;">Next we run the replay again with the </span><span style="font-family: Courier New, Courier, monospace;">tapdump</span><span style="font-family: inherit;"> plugin enabled.</span><br />
<span style="font-family: Times, Times New Roman, serif;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">x86_64-softmmu/qemu-system-x86_64 -m 1024 -replay spotify \</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> -panda-plugin x86_64-softmmu/panda_plugins/panda_callstack_instr.so \</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> -panda-plugin x86_64-softmmu/panda_plugins/panda_tapdump.so</span><br />
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span></div>
<span style="font-family: inherit;">This produces two files, </span><span style="font-family: Courier New, Courier, monospace;">read_tap_buffers.txt.gz</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: inherit;">and</span><span style="font-family: Times, Times New Roman, serif;"> </span><span style="font-family: Courier New, Courier, monospace;">write_tap_buffers.txt.gz</span><span style="font-family: inherit;">, which contain the data read and written at the specified points. If you examine these with zless, you'll see many lines, each made up of a series of addresses followed by an index and a single byte value. Separating out the fields of one such line onto their own lines and annotating them gives: </span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">0000000082678e78 [Caller 13]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">000000008260dcc3 [Caller 12]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">[...]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">000000000071a1a5 [Caller 2]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">0000000000719b84 [Caller 1]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">000000000042e2ed [PC]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">000000003f1ac2e0 [Address space]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">000000000b256570 [Write address]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">269882976</span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"> [Index]</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"> 4f [Data]</span><br />
<div>
<br /></div>
<div>
<span style="font-family: inherit;">The extra callstack information is included so that, if necessary, more calling context can be used to pull out just the data we're interested in. In our case, however, just one level turns out to be enough. Finally, we want to turn this text file into a binary stream. In the scripts directory, there is a script called </span><span style="font-family: Courier New, Courier, monospace;"><a href="https://github.com/moyix/panda/blob/wip/recondemo/scripts/split_taps.py">split_taps.py</a></span><span style="font-family: inherit;"> which will go through a gzipped tapdump output file and separate out each distinct stream found in the file (based on our usual identifier of (callsite, program counter, CR3)).</span></div>
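To make the splitting step concrete, here is a minimal sketch of the idea in Python. It is not the actual split_taps.py; it assumes each line of the dump is whitespace-separated and ends with the fields shown above (caller 1, PC, address space, address, index, data byte in hex), and the function names are made up for illustration.<br />

```python
import gzip
from collections import defaultdict


def split_tap_lines(lines):
    """Group data bytes by (callsite, program counter, address space)."""
    streams = defaultdict(bytearray)
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        # Assumed layout, counting from the end of the line:
        #   ... caller1 pc address_space address index data
        caller, pc, asid = fields[-6], fields[-5], fields[-4]
        # The data byte is assumed to be hex; records are assumed to
        # appear in the order the bytes were read or written.
        streams[(caller, pc, asid)].append(int(fields[-1], 16))
    return streams


def split_tap_file(path, prefix):
    """Write one PREFIX.caller.pc.asid.dat file per stream in a gzipped dump."""
    with gzip.open(path, "rt") as f:
        streams = split_tap_lines(f)
    for (caller, pc, asid), data in streams.items():
        with open(f"{prefix}.{caller}.{pc}.{asid}.dat", "wb") as out:
            out.write(bytes(data))
```

Under these assumptions, calling split_tap_file("write_tap_buffers.txt.gz", "spotify") would produce one .dat file per (callsite, program counter, CR3) triple. Deeper callers are ignored here, since one level of calling context is enough in this case.<br />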
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: inherit;">So now we can run this on the writes seen at our candidate for the decryption function:</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">./split_taps.py write_tap_buffers.txt.gz spotify</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
And obtain spotify.0000000000719b84.000000000042e2ed.000000003f1ac2e0.dat, which contains the binary data written at program counter 0x0042e2ed, called from callsite 0x00719b84, inside of the process with CR3 0x3f1ac2e0. So, is this the audio we seek?<br />
<br />
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">$ file spotify.0000000000719b84.000000000042e2ed.000000003f1ac2e0.dat </span></div>
<br />
<div class="p1">
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">spotify.0000000000719b84.000000000042e2ed.000000003f1ac2e0.dat: Ogg data</span></div>
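If you don't have file handy, the same sanity check is easy to sketch by hand: every page in an Ogg stream begins with the four-byte capture pattern "OggS". The helper names below are hypothetical, and counting occurrences only approximates the number of pages, since the pattern could also appear by chance inside the payload.<br />

```python
OGG_CAPTURE_PATTERN = b"OggS"  # marks the start of every Ogg page


def looks_like_ogg(data):
    """Return True if the buffer starts with an Ogg page header."""
    return data[:4] == OGG_CAPTURE_PATTERN


def count_capture_patterns(data):
    """Rough page count: number of times the capture pattern occurs."""
    return data.count(OGG_CAPTURE_PATTERN)


def check_file(path):
    """Read a dumped stream from disk and report whether it looks like Ogg."""
    with open(path, "rb") as f:
        return looks_like_ogg(f.read(4))
```

A dump that passes this check is only probably Ogg data; listening to it is still the real test.<br />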
<br />
<span style="font-family: inherit;">This looks good! Of course, the proof of the pudding is in the eating, and the proof of the audio is in the listening, so do...</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">$ cvlc </span><span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;">spotify.0000000000719b84.000000000042e2ed.000000003f1ac2e0.dat</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">And you should hear a rather familiar tune :)</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h3 style="text-align: left;">
<span style="font-family: inherit;">Concluding Thoughts</span></h3>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">As I mentioned in the disclaimer, this by itself is just the starting point for what you would need to really break Spotify's DRM. It doesn't give you a way to obtain the key for each song and decrypt it wholesale. Instead, you would have to place a hook in the function identified by this process and pull it out as it's played, which limits it to realtime decryption (and Spotify's packing and anti-debugging may make it hard to place the hook in the first place!). Although I can certainly imagine more efficient processes, I think for now this is a nice balance between enabling piracy and showing off the power of PANDA.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">If you now want to get a better understanding of the function we found inside Spotify, you can create a memory dump, extract the unpacked Spotify binary (which is packed with Themida) <a href="https://code.google.com/p/volatility/wiki/CommandReference23#procmemdump">using Volatility</a>, and then load it up in IDA and go to </span>0x0042e2ed, which is the location where decrypted data is written out.<br />
<span style="font-family: inherit;"><br /></span>
<br />
<h3 style="text-align: left;">
<span style="font-family: inherit;">Postscript</span></h3>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">One may wonder what happens when the function that contains 0x</span><span style="font-family: inherit;">0042e2ed is called by others. As it turns out, this appears to be a generic decryption function that is used for other media throughout Spotify, including album art! It is left as an exercise to the reader to dump and examine the rest of the data that this function decrypts.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<h3 style="text-align: left;">
<span style="font-family: inherit;">References</span></h3>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">
[1] Steal This Movie: Automatically Bypassing DRM Protection in Streaming Media Services. Wang, R., Shoshitaishvili, Y., Kruegel, C., and Vigna, G. USENIX Security Symposium, Washington, D.C., 2013.</span><br />
<div>
<br /></div>
</div>
</div>
</div>
<h2 style="text-align: left;">PANDA, Reproducibility, and Open Science</h2>
<i>Posted by Brendan Dolan-Gavitt on 2014-01-28</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<b>tl;dr</b>: PANDA now supports detached replays (you don't need the underlying VM image to run a replay), and they can be shared at a new site called <a href="http://laredo-13.mit.edu:8080/rrshare/">PANDA Share</a>. Hooray for reproducibility!<br />
<br />
One of the most inspiring developments of the past few years has been the push for <a href="http://en.wikipedia.org/wiki/Open_science">open science</a>, the movement to ensure that scientific publications, data, and software are freely available to all. In computer science, a big part of this has been a trend towards making software and experimental data available once a paper has been published, so that others can verify experiments and "stand on the shoulders of giants" by extending the software. There have also been <a href="http://reproducibleresearch.net/index.php/Main_Page">initiatives</a> aimed at making sure that the results of experiments in computer science can be replicated.<br />
<br />
In the latest release of PANDA, our Platform for Architecture-Neutral Dynamic Analysis, we've taken an important step in ensuring that experiments in dynamic analysis can be freely shared and replicated: as of <a href="https://github.com/moyix/panda/commit/9139261d70b13837ddae1c5327e86229c1188ea9">commit 9139261d70</a>, PANDA creates and loads standalone record/replay logs. This means that you can create a recording of an execution and then share it with others, and they will be able to precisely duplicate the same execution on their own machine, down to the last instruction. Any of PANDA's plugins can be applied to such executions, allowing new analyses to be run on existing, shared executions.<br />
<br />
What does this enable? To start with, this makes it possible to share experimental data from research in dynamic analysis. In our paper <a href="http://dl.acm.org/citation.cfm?id=2516697">Tappan Zee (North) Bridge</a>, we performed many experiments that showed how to find useful points to hook in an OS; however, because these were based on executions that were tied to virtual machine disk images, we weren't able to share the data necessary to exactly reproduce our experiments (since that would require sharing a Windows VM with proprietary software). Now, however, we can simply share the <a href="http://laredo-13.mit.edu:8080/rrshare/tagged/tzb">detached recordings for the TZB experiments</a>, allowing anyone to verify, for example, that our plugins can find <a href="http://laredo-13.mit.edu:8080/rrshare/detail/22/">SSL master secrets in IE8 on Windows</a>. We also hope that collections of interesting recordings can form the basis of new benchmarks for dynamic analysis, allowing different implementations and algorithms to be directly compared by running them against a standard set of executions.<br />
<br />
Aside from the benefits to reproducibility of dynamic analyses, we hope that this will also permit the creation and sharing of interesting executions that can then be studied by the whole community. For example, we are releasing today <a href="http://laredo-13.mit.edu:8080/rrshare/detail/26/">a recording of the FBI-authored shellcode</a> that was <a href="http://www.wired.com/threatlevel/2013/09/freedom-hosting-fbi/">recently used</a> to identify Tor users connecting to sites hosted by Freedom Hosting. This means that anyone can re-run the recording and analyze every instruction executed by the shellcode to confirm for themselves the information that has appeared in <a href="http://tsyrklevich.net/tbb_payload.txt">public writeups</a>.<br />
<br />
To provide a central location for sharing interesting executions, we have created a site called <a href="http://laredo-13.mit.edu:8080/rrshare/">PANDA Share</a> where PANDA recordings can be uploaded. Each recording comes with a short description and the command line for PANDA needed to reproduce the execution. Right now, the repository contains the recordings of our <a href="http://laredo-13.mit.edu:8080/rrshare/tagged/tzb/">Tappan Zee Bridge experiments</a>, and the <a href="http://laredo-13.mit.edu:8080/rrshare/detail/26/">FBI shellcode recording</a>. We are planning to add many more soon, and hope that others will share their own!</div>
<h2 style="text-align: left;">Prebuilt VM for PANDA Now Available</h2>
<i>Posted by Brendan Dolan-Gavitt on 2013-10-04</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I have just created a <a href="http://amnesia.gtisc.gatech.edu/~moyix/pandavm.tar.bz2">prebuilt VirtualBox VM for testing PANDA</a>. It's a current Debian 7.1 install with the latest (as of 10/4/2013) version of PANDA and prerequisites installed. The username and password for the VM are "<span style="font-family: Courier New, Courier, monospace;">panda:panda</span>", with root password "<span style="font-family: Courier New, Courier, monospace;">panda</span>".<br />
<br />
Also included is a Debian i386 QCOW2 image (created by <a href="http://people.debian.org/~aurel32/qemu/">Aurelien Jarno</a>) that can be used to test PANDA. Once you have the VM booted and you're logged in, you can cd into the panda/qemu directory and do:<br />
<br />
<code>panda@pandavm:~/panda/qemu$ x86_64-softmmu/qemu-system-x86_64 \<br />
-m 256 -hda ~/qcow/debian_squeeze_i386_standard.qcow2 -monitor stdio</code><br />
<br />
<span style="font-family: inherit;">This will start up an instance of PANDA and boot the Debian image. From there you can create recordings and replay them with PANDA's various plugins; see <a href="https://github.com/moyix/panda/tree/master/docs">the documentation</a> for more details.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Hopefully this will make it easier for people to get started with PANDA!</span></div>
<h2 style="text-align: left;">Announcing PANDA: A Platform for Architecture-Neutral Dynamic Analysis</h2>
<i>Posted by Brendan Dolan-Gavitt on 2013-09-30</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I'm pleased to announce the initial release of a new open source dynamic analysis platform built on QEMU, named <a href="http://github.com/moyix/panda">PANDA (Platform for Architecture-Neutral Dynamic Analysis)</a>. It has a number of features that combine to make it a uniquely powerful platform for analyzing software as it executes:<br />
<br />
<ul style="text-align: left;">
<li><b>Record and Replay</b>: PANDA is capable of recording the non-deterministic inputs during a whole-system execution and later deterministically replaying them. This means that heavyweight analyses that would be too slow to run on a live execution can be decoupled to run on the replayed execution instead. We recently used this in <a href="http://www.cc.gatech.edu/~brendan/tzb_author.pdf">our 2013 ACM CCS paper</a> to monitor every memory access made by an OS and applications, which would not have been feasible without record and replay. Record and replay is currently supported for i386, x86_64, and ARM, with more architectures planned. For more details see the <a href="https://github.com/moyix/panda/blob/master/docs/record_replay.md">record and replay documentation</a>.</li>
<li><b>Android Support</b>: Thanks to excellent work by Josh Hodosh, PANDA can act as an Android emulator, running modern versions of Android. See the <a href="https://github.com/moyix/panda/blob/master/docs/Android.md">Android documentation</a> for more details.</li>
<li><b>Plugin Architecture</b>: Plugins can be written in C and C++. PANDA supports callbacks for many types of event within QEMU, making it easy to write an analysis plugin; for example, a simple system call tracer is ~60 lines of code. Check out the <a href="https://github.com/moyix/panda/blob/master/docs/PANDA.md">plugin documentation</a> for more information.</li>
<li><b>LLVM Execution</b>: Borrowed from <a href="http://dslab.epfl.ch/proj/s2e">S2E</a>, this execution mode translates guest code to <a href="http://llvm.org/">LLVM</a> and then JIT compiles it to native code; this means that plugins can analyze and transform the LLVM IR rather than working directly on native code. Unique to PANDA is the ability to also translate QEMU's helper functions (which are implemented in C and cover operations too complex to be handled in QEMU's native IR) to LLVM, meaning analyses in PANDA can be complete. This was recently used to implement <a href="http://link.springer.com/chapter/10.1007%2F978-3-642-37051-9_8">architecture-neutral dynamic taint analysis</a>.</li>
<li><b>Modern QEMU</b>: PANDA is based on QEMU 1.0.1, with some additional fixes and enhancements backported. Unlike platforms such as BitBlaze/TEMU, which use QEMU 0.9.1, this allows PANDA to support modern OSes such as Windows 8.</li>
</ul>
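To illustrate the record and replay idea from the list above, here is a toy sketch in Python. It has nothing to do with PANDA's actual implementation or log format; the class and function names are invented for this example. The point is only that if every non-deterministic input is logged during the live run, feeding the same log back in reproduces the execution exactly.<br />

```python
import random


class Recorder:
    """Wraps a non-deterministic input source and logs every value it yields."""

    def __init__(self, source):
        self.source = source
        self.log = []

    def read(self):
        value = self.source()
        self.log.append(value)
        return value


class Replayer:
    """Feeds back previously logged values instead of reading live input."""

    def __init__(self, log):
        self._values = iter(log)

    def read(self):
        return next(self._values)


def run(inputs, steps=5):
    """A stand-in for the guest: its state depends only on inputs.read()."""
    state = 0
    for _ in range(steps):
        state = (state * 31 + inputs.read()) % 65521
    return state


if __name__ == "__main__":
    recorder = Recorder(lambda: random.randrange(256))
    live_result = run(recorder)
    replayed_result = run(Replayer(recorder.log))
    print(live_result == replayed_result)
```

Because the replayed run is deterministic, a slow analysis can be attached to it without perturbing the execution being studied, which is the property the heavyweight plugins rely on.<br />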
<div>
If you want to get started, <a href="https://github.com/moyix/panda">check out the project on GitHub</a>, and read some of the documentation:</div>
<div>
<ul style="text-align: left;">
<li><a href="https://github.com/moyix/panda/blob/master/docs/compile.txt">Compiling PANDA</a></li>
<li><a href="https://github.com/moyix/panda/blob/master/docs/PANDA.md">The Plugin API</a></li>
<li><a href="https://github.com/moyix/panda/blob/master/docs/record_replay.md">Record and Replay</a></li>
<li><a href="https://github.com/moyix/panda/blob/master/docs/Android.md">Android</a></li>
<li><a href="https://github.com/moyix/panda/blob/master/docs/panda_ssltut.md">Finding SSL/TLS Master Secrets with PANDA (tutorial)</a></li>
</ul>
</div>
<div>
Thanks to all the people who have contributed to making PANDA a reality over the past year, including:</div>
<div>
<ul style="text-align: left;">
<li>Josh Hodosh</li>
<li>Ryan Whelan</li>
<li>Tim Leek</li>
<li>Michael Zhivich</li>
<li>Patrick Hulin</li>
<li>Anthony Eden</li>
<li>Sam Coe</li>
<li>Nathan VanBenschoten</li>
</ul>
</div>
</div>
<h2 style="text-align: left;">Virtuoso – Initial Code Release</h2>
<i>Posted by Brendan Dolan-Gavitt on 2012-02-07</i><br />
<div dir="ltr" style="text-align: left;" trbidi="on">
I've just gotten word that the Virtuoso source code has been approved by the sponsor for public release, so I've uploaded version 1.0 to the <a href="http://code.google.com/p/virtuoso/">Virtuoso Google Code site</a>! Thanks to Tim Leek at MIT Lincoln Laboratory for seeing this project through the lengthy release review process!<br />
<br />
Also on Google Code, you can find an <a href="http://code.google.com/p/virtuoso/wiki/Installation">installation guide</a> and a <a href="http://code.google.com/p/virtuoso/wiki/Walkthrough">walkthrough</a> to get you started.<br />
<br />
Check it out, and generate some memory analysis tools! If you run into trouble, you can shoot me an email and I'll do my best to help out, but keep in mind that this is a research project, and so there are still lots of rough edges. Enjoy!</div>
<h2 style="text-align: left;">What I Did on My Summer Vacation</h2>
<i>Posted by Brendan Dolan-Gavitt on 2011-09-06</i><br />
Over the summer I worked at <a href="http://research.microsoft.com/en-us/">Microsoft Research</a>, which has a fantastically smart bunch of people working on really cool and interesting problems. I just noticed that they've posted the video of my end-of-internship talk, <a href="http://research.microsoft.com/apps/video/default.aspx?id=152832">Monitoring Untrusted Modern Applications with Collective Record and Replay</a>. Please take a look if you're curious about what it might look like to try and monitor mobile apps in the wild with low overhead!
<h2 style="text-align: left;">Paper and Slides Available for "Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection"</h2>
<i>Posted by Brendan Dolan-Gavitt on 2011-05-28</i><br />
I've recently returned from Oakland, CA, where the <a href="http://www.ieee-security.org/TC/SP2011/">2<sup>5</sup> IEEE Symposium on Security and Privacy</a> was held. There were a lot of excellent talks, and it was great to catch up with others in the security community. 
Now that the conference is over, I'm happy to release the paper and slides of our work, "Virtuoso: Narrowing the Semantic Gap in Virtual Machine Introspection", which I have described in an <a href="http://moyix.blogspot.com/2011/03/automatically-generating-memory.html">earlier post</a>.<div><br /></div><div>The slides contain some animations, and so I've made them available in three formats:</div><div><ul><li><a href="http://www.cc.gatech.edu/~brendan/Virtuoso_Oakland.key">Keynote (iWork 2009)</a></li><li><a href="http://www.cc.gatech.edu/~brendan/Virtuoso_Oakland.pdf">PDF</a></li><li><a href="http://www.cc.gatech.edu/~brendan/Virtuoso_Oakland_notes.pdf">PDF with notes</a></li></ul>You can also get a copy of the <a href="http://www.cc.gatech.edu/~brendan/virtuoso.pdf">full paper here</a>. I'm also hoping to have the source ready for release soon; when it is available, you'll be able to find it on <a href="http://code.google.com/p/virtuoso/">Google Code under the name Virtuoso</a>.</div><div><br /></div><div>Once again, thanks to my most excellent co-authors at MIT Lincoln Labs and Georgia Tech for helping me see this project through!</div>