Friday, December 11, 2009

Thoughts on Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications

Authors: David Brumley (CMU), Pongsin Poosankam (CMU), Dawn Song (UC Berkeley), Jiang Zheng (U. Pittsburgh)

Venue: 2008 IEEE Symposium on Security and Privacy

The work describes how to automatically generate exploits for an unpatched binary program P given a patched version P'. This is important because it highlights that the patching process itself matters: staggering the rollout of updates over a long period gives attackers a window in which previously unknown exploits can be derived from the patch in minutes.

The work, however, only targets input validation checks. So, for example, if a buffer overflow was fixed by checking the input size or the pointer, the system the authors built works. If, on the other hand, the fix was to increase the buffer size, the system fails.

The authors generate exploits (user inputs that compromise a program's safety) in four main steps:
  1. "Identify the new sanitization checks added in P'." This is done using off-the-shelf binary differencing tools like EBDS. EBDS does syntactic analysis, not semantic: if a check in P is i > 10 while in P' it is i - 1 > 10, the latter might be reported as a new check by EBDS. The verification step later on weeds out these spurious differences.
  2. For each new check found, generate a candidate exploit which fails the new check in P' by:
    1. Calculating the boolean function F that captures the weakest constraints on the inputs that would fail these checks. This amounts to deciding which instructions in the program to include in the formula. They use three approaches:
      1. Dynamic: Given an execution trace of an input that reaches the new check, they take the constraints along that path and add the constraint that the input must fail the new check. This works when the new check is on a common path executed for all inputs. It is the quickest method and generates the easiest constraint functions to solve. In practice, the executed instructions are lifted from the trace into the Vine modeling language for easier analysis.
      2. Static: (See Creating vulnerability signatures using weakest preconditions) In this approach, the whole program is lifted to the Vine modeling language first. Then, the program's control flow graph is analyzed and the program is chopped so that it only includes paths that lead to the new checks. The constraint function is then basically the whole chopped CFG. This approach is more thorough and can produce exploits that would require more searching under the dynamic approach, but the generated formulas are usually much larger and more difficult to solve.
      3. Hybrid: Use the dynamic approach to reach a junction point in the execution closer to the new check, then use the static approach between that point and the new check. This approach creates smaller constraint functions so they are easier to solve but also allows wider coverage to find exploits.
        They automate the process by iterating through junction points until an exploit is found; the heuristic they use is to place these points at procedure boundaries. While the previous two approaches have been done by others before, the authors claim the hybrid approach as a novel contribution.
    2. Solve the constraint function F to find the candidate exploits using an off-the-shelf solver.
  3. Check that the candidate exploit is a real exploit by verifying that it violates the program's "safety policy", where a safety policy is a function that maps the program's state space to a boolean value of safe or unsafe. Examples include type safety, absence of buffer overflows, ... In the paper, the authors use policies that are enforceable by their execution monitor (dynamic taint analysis to detect information leaks and memory safety violations, dynamic type checking, return address integrity, ...)
  4. Generate additional exploit variants x' by solving the new constraint function F'(x') = F(x') and x' != x.
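
To make the shape of steps 2 and 4 concrete, here is a toy sketch. This is emphatically not the authors' system, which lifts binary code into formulas and hands them to a decision procedure: the "new check" below is a made-up length limit, and brute force stands in for the solver.

```python
# Toy model of steps 2 and 4. F(x) means: input x reaches the new
# sanitization check in P' AND fails it. The check itself (a length
# limit of 512) is invented for illustration; the real system derives
# F from binary code and solves it with a decision procedure.
def F(x: int) -> bool:
    reaches_check = x > 0            # path constraints up to the new check
    fails_check = not (x <= 512)     # negation of the added check
    return reaches_check and fails_check

def solve(pred, search_space):
    """Brute-force stand-in for the solver: first satisfying input, or None."""
    return next((x for x in search_space if pred(x)), None)

candidate = solve(F, range(1024))                                # step 2
variant = solve(lambda x: F(x) and x != candidate, range(1024))  # step 4
print(candidate, variant)  # 513 514
```

The point of the sketch is only the structure: the candidate exploit is any input satisfying F, and variants come from re-solving with the extra disequality constraint.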

If the solver fails to find an input, then the path described by the constraint function does not contain an exploit. If solving takes too much time, a timeout set by the authors makes the system move on to check a new path, for example by changing the junction (mix) point in the hybrid approach.
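
In outline, that move-on-after-timeout loop might look like this (the names and the stand-in solver are hypothetical; the real system builds formulas over the Vine IL):

```python
# Hypothetical outer loop for the hybrid approach. On "unsat" or
# "timeout" we simply advance to the next junction point (placed,
# per the authors' heuristic, at procedure boundaries).
def search_junction_points(junction_points, build_formula, solve):
    for point in junction_points:
        status, model = solve(build_formula(point))
        if status == "sat":
            return point, model  # candidate exploit found at this point
    return None

# Toy stand-ins: only the formula built at boundary "h" is solvable in time.
formulas = {"f": ("timeout", None), "g": ("unsat", None), "h": ("sat", 0xdeadbeef)}
result = search_junction_points(["f", "g", "h"], lambda p: p, lambda f: formulas[f])
print(result)  # ('h', 3735928559)
```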

For evaluation, the authors ran their system on 5 Microsoft updates. Some exploits worked with dynamic analysis, others with static analysis, and others required the hybrid approach. Exploit generation took from tens of seconds to under 3 minutes, and most of the exploits resulted in control hijacking.

The techniques also seem useful in other applications, such as automatic deviation detection: checking whether two programs behave the same, or, more concretely, whether compiler optimizations changed a program's behavior. This does not seem easily attainable to me, though, since the differences between two programs might be large enough that the techniques break down.

The solutions they propose to the automatic patch-based exploit generation problem are the obvious ones: obfuscation, patch encryption so that everyone can apply the patch simultaneously, and fast patch distribution using P2P. They mention that off-line systems would stay vulnerable, but that is equally true of the current patch distribution architecture.

The Likes:
  • The paper presents a cool attack
  • It works (at least for the cases shown)!

The Dislikes:
  • There was no real quantitative estimate of how important this is in practice or how well the system would fare in general. What percentage of bugs does it work on, and through which of those can a bad guy cause real damage?
  • There was too much notation and repetition
  • I did not understand anything in the part on dynamic analysis. The math notation looked horrible. The ideas are probably not too difficult to understand, but I didn't want to wade through them with no helping hand (i.e., a real explanation).
  • This is not quite about the paper, but there are no satisfactory solutions yet to the problem they raise. It would have been nice if they had described one, but that's not the goal here.
  • I wish there was a bit more education on the technical background.

Thursday, December 3, 2009

Thoughts on Dispatcher: Enabling Active Botnet Infiltration using Automatic Protocol Reverse-Engineering

Authors: Juan Caballero, Pongsin Poosankam, Christian Kreibich, Dawn Song

Venue: CCS 2009

The paper describes a tool that automatically reverse engineers the format of an unknown protocol. The tool can deduce the format and some of the field semantics of both sent and received protocol messages. It does this reverse engineering without requiring the binaries on both sides of the protocol, which makes it useful in situations where only the client (such as a bot binary) or only the server is available.

The ideas used are quite clever and simple. The main idea is to look at each byte in the buffer that constitutes a sent or received message and build a dependency chain for that byte: the sequence of instructions and data sources used to write it. The dependency chain ends when a source for an instruction is an immediate, a constant, another memory location, or unknown (such as the result of an arithmetic op). Then, by comparing the ultimate sources of contiguous destination bytes and checking whether they derive from the same contiguous set of source bytes, they can tell the position and length of the buffers that form the fields of a message. They do this hierarchically within the buffer, so they can also extract the hierarchy of fields.
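
A toy reconstruction of the field-boundary idea (a simplification of the dependency chains, with made-up data): assume taint tracking has already mapped each byte of an output buffer to the input offset it was derived from, or None if it came from a constant or immediate. Maximal runs of output bytes with contiguous sources become fields:

```python
# sources[i] = input offset that output byte i was derived from,
# or None if it came from a constant/immediate. Contiguous output
# bytes derived from contiguous input bytes form one field.
def infer_fields(sources):
    fields, start = [], 0
    for i in range(1, len(sources) + 1):
        contiguous = (
            i < len(sources)
            and sources[i] is not None
            and sources[i - 1] is not None
            and sources[i] == sources[i - 1] + 1
        )
        if not contiguous:
            fields.append((start, i - start))  # (offset, length) of a field
            start = i
    return fields

# bytes 0-3 copied from input 10..13, byte 4 from a constant, bytes 5-6 from 20..21
print(infer_fields([10, 11, 12, 13, None, 20, 21]))  # [(0, 4), (4, 1), (5, 2)]
```

Dispatcher applies this kind of grouping hierarchically, which is how the field tree, not just a flat list, falls out.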

They can infer some semantics by looking at where these fields are used. If a field is an argument to, or the return value of, a known function, they immediately get its type and some semantics, such as an IP address when it is passed to the connect function.

Furthermore, they are able to infer loops using static and dynamic analysis, which lets them identify fields that indicate offsets and lengths. By looking at repeated bytes, they are able to infer delimiters, and arrays are inferred by assuming that arrays are usually written inside loops.
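
The delimiter idea can be caricatured as follows; this is a much cruder stand-in for Dispatcher's analysis, using a made-up SMTP-like message, but it conveys the intuition that a byte value repeating throughout a message is a delimiter candidate:

```python
# Crude delimiter heuristic (illustrative only): any byte value that
# occurs at least `min_repeats` times in the message is a candidate.
from collections import Counter

def delimiter_candidates(msg: bytes, min_repeats: int = 3):
    counts = Counter(msg)
    return [bytes([b]) for b, n in counts.items() if n >= min_repeats]

print(delimiter_candidates(b"HELO a\r\nMAIL b\r\nRCPT c\r\n"))
```

A real analysis would also check that the candidate actually partitions the message into plausible fields, rather than counting occurrences alone.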

To handle encrypted messages, they use the intuition behind earlier work, ReFormat by Z. Wang et al.: functions with a disproportionate ratio of arithmetic instructions to other instruction types are probably cryptographic. They use this idea to infer which buffers are unencrypted and operate on those. With a minimum of 20 instructions for a function to be considered and a ratio threshold of 0.55, they achieved a false positive rate of 0.002%.
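
With the thresholds reported in the paper, the detection rule itself is simple. The sketch below uses an illustrative, not x86-accurate, instruction classification and invented traces:

```python
# ReFormat-style heuristic as described: flag a function as likely
# cryptographic if it executed at least 20 instructions and
# arithmetic/bitwise instructions exceed 55% of them.
ARITH = {"add", "sub", "xor", "rol", "ror", "mul", "shl", "shr"}

def looks_cryptographic(trace, min_insns=20, ratio=0.55):
    if len(trace) < min_insns:
        return False  # too few instructions to judge
    arith = sum(1 for op in trace if op in ARITH)
    return arith / len(trace) > ratio

rc4_like = ["xor", "add", "mov", "xor", "add"] * 5  # 25 insns, 80% arithmetic
parser_like = ["cmp", "jne", "mov", "add"] * 5      # 20 insns, 25% arithmetic
print(looks_cryptographic(rc4_like), looks_cryptographic(parser_like))  # True False
```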

For evaluation, the authors studied the MegaD command-and-control (C&C) protocol. They were able to infer the message formats used by MegaD and to rewrite a message indicating SMTP sending capabilities. They also analyzed other known open protocols and compared Dispatcher's results with those from Wireshark, claiming that Dispatcher outperforms Wireshark.

In terms of resiliency, the authors claim that they target fundamental properties to avoid becoming obsolete if malware adapts. However, they do confess that if malware can detect that it is being run in a virtual environment, then their technique can be circumvented.

The Likes:
  • The paper is very well written and clear.
  • The ideas are nice and clever, and they seem to work well.
  • They do not need both sides of a protocol
  • Can be used to infer correctness of messages when a protocol is known
  • Can make writing a Wireshark dissector easier
  • Might be used to send erroneous information to botnet masters (I don't know how useful this is. If you know that a machine is already compromised, you should fix the situation; you cannot take over the botnet.)
The Dislikes:
  • Can be circumvented by detecting a virtualized environment. I don't know how easy that is, but for serious malware it might not be a problem.
  • The authors claim that they target fundamental properties, but provide no evidence or proof that the properties they target really are fundamental. For example:
    • They use known functions to infer semantics; the bad guy might ship his own libraries so as to restrict the API used.
    • The thresholds used to detect encryption and decryption are empirical and seem easy to obfuscate.
    • Adaptive malware might change keyword values or restructure the program to make it harder to infer which functions are doing what.
    • The program might be written in a difficult-to-analyze way, where message fields are always results of arithmetic computations, or with complex loops full of jumps and gotos that resist analysis.
  • They claim they do better than Wireshark, but it's not clear why they drew that conclusion. The dissectors made different assumptions about what the delimiters are, but that does not mean Wireshark's are incorrect (or maybe I just did not understand what they meant).