Thursday, April 14, 2011

Thoughts on What's New About Cloud Computing Security

Authors: Yanpei Chen, Vern Paxson and Randy H. Katz
While the economic case for cloud computing is compelling, the security challenges it poses are equally striking. In this work we strive to frame the full space of cloud-computing security issues, attempting to separate justified concerns from possible over-reactions. We examine contemporary and historical perspectives from industry, academia, government, and “black hats”. We argue that few cloud computing security issues are fundamentally new or fundamentally intractable; often what appears “new” is so only relative to “traditional” computing of the past several years. Looking back further to the time-sharing era, many of these problems already received attention. On the other hand, we argue that two facets are to some degree new and fundamental to cloud computing: the complexities of multi-party trust considerations, and the ensuing need for mutual auditability.

My Summary:
This technical report provides many nice analogies and comparisons between mechanisms and problems that have been faced in the days of yore (e.g. Multics) and today's cloud computing environment. There are several examples of what problems are old and what are new. It has a nice compilation of anecdotes and references on how the Cloud has been broken or abused such as FBI raiding a Cloud data center because one of the customers might have been doing illegal stuff. They emphasize that one of the major problems faced today is auditing so that authorities know they have seized all evidence and so that customers know that only required evidence was seized and no more. Similarly, auditing is required to help customers and providers build mutual trust. The paper does not break new ground, but does nicely summarize many of the issues faced in cloud computing security. However, they gloss over regulatory issues such as enforcing HIPAA compliance in multi-tenant systems and the possibility of having to update these requirements.

Wednesday, April 13, 2011

Thoughts on Apiary: Easy-to-Use Desktop Application Fault Containment on Commodity Operating Systems

Authors: Shaya Potter and Jason Nieh

Desktop computers are often compromised by the interaction of untrusted data and buggy software. To address this problem, we present Apiary, a system that transparently contains application faults while retaining the usage metaphors of a traditional desktop environment. Apiary accomplishes this with three key mechanisms. It isolates applications in containers that integrate in a controlled manner at the display and file system. It introduces ephemeral containers that are quickly instantiated for single application execution, to prevent any exploit that occurs from persisting and to protect user privacy. It introduces the Virtual Layered File System to make instantiating containers fast and space efficient, and to make managing many containers no more complex than a single traditional desktop. We have implemented Apiary on Linux without any application or operating system kernel changes. Our results with real applications, known exploits, and a 24-person user study show that Apiary has modest performance overhead, is effective in limiting the damage from real vulnerabilities, and is as easy for users to use as a traditional desktop.

My Summary:
Apiary uses Linux containers to isolate sets of programs, which they term applications. Their main contribution is making container isolation useable. They do this in three main ways:

  1. Display integration so that the windows from each container show up in the integrated desktop environment. They use MetaVNC for this with a daemon running in each container.
  2. Container file system integration, which they call Virtual Layered File System (VLFS) that allows files that are the same within each container to be shared copy-on-write to a new layer that obscures the original file. All files are shared read-only between containers to begin with, but each layer can have its own private files in its own layer. Applications such as Firefox will have to be instantiated separately  for accessing secure websites such as banks and other general websites. VLFS is based on unioning file systems with some extensions to handles updates between layers and to handle deletes.
  3. A global application layer enables applications to be instantiated within their own containers from other containers. For example, Firefox can call /usr/bin/xpdf instantiated within its own container through this layer.
Apiary uses the notion of ephemeral containers for applications that do not need to store persistent state across executions. For example, an ephemeral container can be used to instantiate viewers and browsers such as xpdf and Firefox or programs such as virus-scanners.

Apiary has low overhead compared to Linux, because VLFS is fast and has no real computation required except for name lookups. Additionally containers are based on Linux Containers which are fast themselves. The authors did a usability study that showed that Apiary isn't too annoying. However, they did not consider the fact that it would be annoying to have multiple email clients and multiple Web browsers.

Apiary is an interesting idea, and is a step in the right direction for creating better isolation. It protects the user from buggy viewers such as music players and pdf viewers when exploited by malicious input files because these exploits cannot persist. But because ephemeral containers always have full read-only access to the filesystem, it seems that they would be able to exfiltrate sensitive data unless the containers are locked down with no network access. On the other hand, such exploits do not persist because ephemeral containers store no state between executions.

I can definitely see how VLFS can be useful in other similar systems. Sharing in Apiary is explicitly done through a separate container that has a special-purpose file manager. It seems to me this would be quite annoying. Is there an easier way?

Thoughts on TightLip: Keeping Applications from Spilling the Beans

Authors: Aydan R. Yumerefendi, Benjamin Mickle, and Landon P. Cox

Access control miscon gurations are widespread and can result in damaging breaches of con dentiality. This paper presents TightLip, a privacy management system that helps users de ne what data is sensitive and who is trusted to see it rather than forcing them to understand or predict how the interactions of their software packages can leak data. The key mechanism used by TightLip to detect and prevent breaches is the doppelganger process. Doppelgangers are sandboxed copy processes that inherit most, but not all, of the state of an original process. The operating system runs a doppelganger and its original in parallel and uses divergent process outputs to detect potential privacy leaks. Support for doppelgangers is compatible with legacy-code, requires minor modi cations to existing operating systems, and imposes negligible overhead for common workloads. SpecWeb99 results show that Apache running on a TightLip prototype exhibits a 5% slowdown in request rate and response time compared to an unmodi fied server environment.

My Summary:
Idea is to tag some files as secret using 1 bit in the inode attrs for a file (modified ext3). Then as processes run, if they read a secret file, the OS starts a doppleganger process. The doppleganger runs the same as the original process with one exception. Instead of getting the actual data from secret files, it obtains data from shadow copies of these files that have been scrubbed. These scrubbed copies do not have any of the sensitive information. Scrubbers are file-type specific, and so they cannot be used across any type of file. The execution of the original and the doppleganger processes are constantly compared to find out when they diverge. This is done by monitoring the stream of system calls they make. If the processes do diverge, then the execution of the original file depends on the contents of the secret files. If the original attempts to send data over the network, a security breach might occur. The user's security policy decides what to do in that case: replace the original process with the doppleganger, allow the security breach, or kill the process. All of these operations are done very efficiently because dopplegangers are new kernel objects that share much of the kernel state with the original process and they do not execute system calls.

Well written paper that's easy to read and understand. Cute idea (that I have been toying with in my head until I was pointed to the paper). Seems useful for a wide class of applications and has very low overhead.

TightLip suffers from a number of limitations. Most important: Large TCB (Kernel is trusted, they need type-specific trusted scrubbers), small probability of false negatives (if scrubbed data turns out to have similar properties as original data, e.g. parity), can have false positives (any divergence between doppleganger and original marks everything afterwards as tainted, similar to implicit flows). False positives are a minor problem when execution does not depend on value of secret data like FTP, but they can be a big problem when TightLip is used for a full system and for every application. It is also probably not feasible to use to protect against malware (malware can create many false positives to make the system unusable), but that was not the authors' intention anyway.

Wednesday, January 19, 2011

Thoughts on RON: Resilient Overlay Networks

Authors: David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, Robert Morris

Venue: Proc. 18th ACM SOSP, Banff, Canada, October 2001


The paper describes the design and implementation of an overlay network that can be used to subvert the underlying default IP routing. A RON is a set of nodes that cooperate to select the best overlay path to route traffic over given an application's requirements. The application links to the RON library and uses the library's functions to send and receive traffic. Each RON node monitors the connection quality to every other node in the network, and uses that information to best route traffic.

There is no authentication in RON, and all nodes have to implicitly trust each other. However, RON does provide the ability for node providers to specify complex policies on what traffic to accept (constrained by the lack of authentication). But without authentication, it would be difficult to bill any particular entity for traffic, an important aspect given that RON nodes need to be quite powerful.

Apparently, a route diversion of only one hop has been found to achieve quite a significant boost in performance and to solve most of the problems, and it was found that routing over RON has enabled connectivity recovery in less than 20s, much faster than BGP reconvergence.

RON does not scale well, and so RONs need to be limited in size to about 50. The scalability bottleneck is due to the fact that each node does quite a bit of monitoring on various paths and maintains a large database. However, there have been follow ups to the work that try to improve on the scalability.

Tuesday, January 18, 2011

Thoughts on Overlay Networks and the Future of the Internet

Authors: Dave Clark, Bill Lehr, Steve Bauer, Peyman Faratin, Rahul Sami, John Wroclawski

Venue: Communications and Strategies Journal, no. 63, 3rd quarter 2006, p1


The paper provides a good overview on overlays and attempts to provide a formal definition and taxonomy.

An overlay is a set of servers deployed across the Internet that:

    1.  Provide infrastructure to one or more applications,
    2. Take responsibility for the forwarding and handling of application data in ways that are different from or in competition with what is part of the basic Internet,
    3. Can be operated in an organized and coherent way by third parties (which may include collections of end-users).
  1. peer-to-peer e.g. Napster and Gnutella
  2. CDN e.g. Akamai
  3. Routing e.g. RON
  4. Security e.g. VPNs, Tor, Entropy
  5. Experimental e.g. PlanetLab, I3
  6. Other e.g. email, Skype, MBone
The authors assert that overlays do not follow the end-to-end principle because even though from the IP layer's point of view, overlay servers are simply end-nodes, from the application's point of view, they are considered infrastructure.

The paper discusses policy issues and the relationship between industry structure and overlays, asking several thought-provoking questions. It then goes into depth discussing the implications of CDN overlays, security overlays, and routing overlays.

One passage I really enjoyed was the description of why BGP is insufficient:
... Broadly speaking, BGP allows each ISP to express its policies for accepting, forwarding, and passing off packets using a variety of control knobs. BGP then performs a distributed computation to determine the "best" path along which packets from each source to each destination should be forwarded. 
This formulation raises two difficulties, one fundamental and one pragmatic. The first of these is that the notion of "best" is in fact insufficient to fully express the routing task. "Best" is a single dimensional concept, but routing is a multi-dimensional problem. Individual ISPs, in making their routing decisions, may choose to optimize a wide variety of properties. Among these might be 1) the cost of passing on a packet; 2) the distribution of traffic among different physical links within their infrastructure to maximize utilization and minimize congestion -  so-called traffic engineering; and 3) performance in some dimension, such as bandwidth available to the traffic or transmission delay across the ISP. Furthermore, because the management of each ISP chooses its own objectives, different ISPs may choose to optimize different quantities, leading to an overall path that captures no simple notion of "best", and rarely if ever is best for the user. 
A second, pragmatic problem with the current internet routing infrastructure is that it has evolved over time from one in which simple technical objectives dominated to one in which ISPs often wish to express complex policy requirements. For this reason the knobs - the methods available within BGP to control routing choices - have also evolved over time, and are presently somewhat haphazard and baroque. This compounds the fundamental problem by making it harder for ISPs to express precisely the policies they desire, even after those policies are known.
The paper overall is an easy, entertaining read and gives a nice overview of the issues surrounding overlays and their use and deployment in the Internet.

Thursday, March 4, 2010

Thoughts on Loose Source Routing as a Mechanism for Traffic Policies

Authors: Katerina Argyraki, David R. Cheriton

Venue: Proceedings of the ACM SIGCOMM workshop on Future directions in network architecture



  • Senders want to choose paths so they can route around failures/congestion and use "good" paths for some definition of good. These are transmit policies. **JN: Don't they care more about paths to them though?
  • Receivers want to filter incoming packets to mitigate DDoS attacks. These are receive policies.
  • Today can only control first/last hop which is insufficient.
  • Packet has WRAP shim layer between IP and payload.
  • WRAP header has two paths: forward and reverse.
  • Sender specifies the path through trusted relays and puts their IP addresses in the forward path. Reverse path is empty.
  • The sender also sets the IP dst field to the first relay's IP address and sets the src field to its IP address.
  • At each relay, the first hop is popped from the forward path and used as the IP dst address, and the outgoing interface's IP addr is used as the src address. The old src address is put in the reverse path.
  • This continues until destination is reached.
The authors claim this is better than IP's LSRR option because:
  1. They claim it is easier to do in hardware ** JN: Maybe only marginally if anything.
  2. it does not go into the IP options field which causes packets to go into the router's slow path usually. They claim the issue is that the IP options field is variable length. Conventional wire-speed filters can be used to filter these packets. ** JN: I'm guessing not based on the WRAP header though.
  3. The src IP address is not that of the original source IP but that of previous relay, and the receiver uses the recorded reverse path rather than reversing the forward path as is done in traditional LSRR. They claim this makes it harder to hijack communication. **JN: I might be missing something here, but this claim seems totally bogus. A bad guy can make a packet look whichever way it wants and just insert its address in the reverse path. I don't see how WRAP helps at all.
For transmit policies, either the provider can choose paths or the end-host can. Obviously. They say that they can use things like FBR (Feedback-based Routing).

For receive policies, they need an additional mechanism which is Active Internet Traffic Filtering (AITF) which becomes more accurate using WRAP.

Other things like MPLS and DiffServ  end up becoming sort of like virtual circuits (Stealth Virtual Circuits) and go against the motivation for choosing datagrams for IP over a connection-oriented protocol.

(See comments below by Katerina for rebuttals and clarifications)

Friday, December 11, 2009

Thoughts on Automatic Patch-Based Exploit Generation is Possible: Techniques and Implications

Authors: David Brumley (CMU), Pongsin Poosankam (CMU), Dawn Song(UC Berkeley), Jiang Zheng (U. Pittsburgh)

Venue: 2008 IEEE Symposium on Security and Privacy

The work describes how to generate exploits for an unpatched binary program P given a patched version P'. This is important because it highlights the fact that the patching process itself is important. Staggering updates over a long period of time can lead to exploits that were previously unknown and that can be generated in minutes.

The work, however, only targets input validation checks. So, for example, if a buffer overflow was fixed by checking the input size or the pointer, the system the author built works. If, on the other hand, it was fixed by increasing the buffer size, the system fails.

The authors generate exploits (user inputs to a program that compromises its safety) in four main steps:
  1. "Identify the new sanitization checks added in P'." This is done using off-the-shelf binary differencing tools like EBDS. EBDS does syntactic analysis, but not semantic. So if a check in P is i > 10 while in P' it's i - 1 > 10 the latter might be reported as a new check by EBDS. The verification step later on weeds out these semantic differences.
  2. For each new check found, generate a candidate exploit which fails the new check in P' by:
    1. Calculating the boolean function F that specifies the weakest constraints on the inputs that would fail these checks. This is done by basically trying to find which instructions in the program to include in the formula. They use three approaches:
      1. Dynamic: If we know the constraints on the inputs that exercise the check, they simply specify the constraints that lead to the new check and add the constraint that the input must fail the new check. This is the case if the new check is on a common path executed for all inputs. This is the quickest method, and generates the easiest constraint functions to solve. In practice, this is done by analyzing an execution trace, lifting the executed instructions into the Vine modeling language for easier analysis, then doing the analysis.
      2. Static: (See Creating vulnerability signatures using weakest preconditions) In this approach, the whole program is lifted to the Vine modeling language first. Then, the program's control flow graph is analyzed and the program is chopped so that it only includes paths that lead to the new checks. The constraint function is then basically the whole CFG. This approach is more thorough and will generate exploits that might require more searching in the dynamic approach, but the generated functions are usually much larger and more difficult to solve.
      3. Hybrid: Use the dynamic approach to reach a junction point in the execution closer to the new check, then use the static approach between that point and the new check. This approach creates smaller constraint functions so they are easier to solve but also allows wider coverage to find exploits.
        They automate the process by iterating through junction points until an exploit is found, where the heuristic they use for these points is to place them at procedure boundaries. While the past two approaches have been done by other before, the authors claim the hybrid approach as a novel contribution.
    2. Solve the constraint function F to find the candidate exploits using an off-the-shelf solver.
  3. Check that the candidate exploit is a real exploit by checking that it compromises the program's "safety policy", where "a safety policy" is a function that maps the program's state space to a boolean value of safe or unsafe. Examples include type checking, buffer overflows, ... In the paper, the authors use policies that are enforceable by their execution monitor (dynamic taint analysis to detect information leaks and memory safety issues, dynamic type checking, return address integrity, ...)
  4. Generate additional variants x' by solving the new constraint function F'(x') = F(x') and x' != x

If the solver fails to find an input, then the path described by the constraint function does not have an exploit. If it takes too much time, then the authors set a timeout so that the system moves on to check a new path by, for example, changing the mix (junction) point in the hybrid approach.

For evaluation, the authors ran their system on 5 MS updates. Some exploits worked with dynamic analysis, others worked with static analysis, and other required the hybrid approach. Exploits required between tens of seconds to less than 3 minutes, and most of them resulted in control hijacking.

The techniques seem to also be useful in other applications such as automatic deviation detection to check whether two programs behave the same, or, more concretely, to check if compiler optimizations affected the behavior of a program. But this doesn't seem to me to be easily attainable since the differences between two programs might be large enough that the techniques don't work.

The solutions to the automatic patch exploit generation problem they propose are obvious: obfuscation, patch encryption so that everyone can apply it simultaneously, and fast patch distribution using P2P. They mention that off-line systems would stay vulnerable, but this is the same issue with the current path distribution architecture.

The Likes:
  • The paper presents a cool attack
  • It works (at least for the cases shown)!

The Dislikes:
  • There was no real quantitative estimate of how important this is in practice or how well the system will fare in general. What is the percentage of bugs it does work on and through which a bad guy can cause real damage?
  • There was too much notation and repetition
  • I did not understand anything on dynamic analysis. The math notation looked horrible. I'm sure the ideas are probably not too difficult to understand, but I didn't want to wade through it with no helping hand (i.e. real explanation).
  • This is not quite about the paper, but there are no satisfactory solutions to the problem they raised yet. It would've been nice if they were able to describe one, but that's not the goal here.
  • I wish there was a bit more education on the technical background.