Thursday, April 14, 2011

Thoughts on What's New About Cloud Computing Security

Authors: Yanpei Chen, Vern Paxson and Randy H. Katz
URL: http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-5.html
Abstract:
While the economic case for cloud computing is compelling, the security challenges it poses are equally striking. In this work we strive to frame the full space of cloud-computing security issues, attempting to separate justified concerns from possible over-reactions. We examine contemporary and historical perspectives from industry, academia, government, and “black hats”. We argue that few cloud computing security issues are fundamentally new or fundamentally intractable; often what appears “new” is so only relative to “traditional” computing of the past several years. Looking back further to the time-sharing era, many of these problems already received attention. On the other hand, we argue that two facets are to some degree new and fundamental to cloud computing: the complexities of multi-party trust considerations, and the ensuing need for mutual auditability.

My Summary:
This technical report draws many nice analogies and comparisons between mechanisms and problems from the days of yore (e.g., Multics) and today's cloud computing environment, with several examples of which problems are old and which are genuinely new. It also has a nice compilation of anecdotes and references on how the cloud has been broken or abused, such as the FBI raiding a cloud data center because one of its customers might have been doing something illegal. The authors emphasize that one of the major problems today is auditing: authorities need to know they have seized all the evidence, and customers need to know that only the required evidence was seized and no more. Similarly, auditing is needed to help customers and providers build mutual trust. The paper does not break new ground, but it nicely summarizes many of the issues in cloud computing security. However, it glosses over regulatory issues, such as enforcing HIPAA compliance in multi-tenant systems and the possibility that such requirements will need to be updated.

Wednesday, April 13, 2011

Thoughts on Apiary: Easy-to-Use Desktop Application Fault Containment on Commodity Operating Systems

Authors: Shaya Potter and Jason Nieh
URL: http://www.usenix.org/event/atc10/tech/full_papers/Potter.pdf
Abstract:

Desktop computers are often compromised by the interaction of untrusted data and buggy software. To address this problem, we present Apiary, a system that transparently contains application faults while retaining the usage metaphors of a traditional desktop environment. Apiary accomplishes this with three key mechanisms. It isolates applications in containers that integrate in a controlled manner at the display and file system. It introduces ephemeral containers that are quickly instantiated for single application execution, to prevent any exploit that occurs from persisting and to protect user privacy. It introduces the Virtual Layered File System to make instantiating containers fast and space efficient, and to make managing many containers no more complex than a single traditional desktop. We have implemented Apiary on Linux without any application or operating system kernel changes. Our results with real applications, known exploits, and a 24-person user study show that Apiary has modest performance overhead, is effective in limiting the damage from real vulnerabilities, and is as easy for users to use as a traditional desktop.


My Summary:
Apiary uses Linux containers to isolate sets of programs, which the authors term applications. Their main contribution is making container isolation usable. They do this in three main ways:

  1. Display integration so that the windows from each container show up in the integrated desktop environment. They use MetaVNC for this with a daemon running in each container.
  2. Container file system integration via what they call the Virtual Layered File System (VLFS). Files that are common across containers are shared read-only; when a container modifies a shared file, the file is copied on write into a new private layer that shadows the original, and each container can also keep its own private files in its own layer. Separate instances of an application such as Firefox would be used for, say, secure websites such as banks versus general browsing. VLFS is based on unioning file systems, with some extensions to handle updates between layers and to handle deletes (sketched below).
  3. A global application layer that lets an application in one container instantiate another application in its own, separate container. For example, Firefox can invoke /usr/bin/xpdf through this layer, and xpdf is launched in its own container.
Apiary uses the notion of ephemeral containers for applications that do not need to store persistent state across executions. For example, an ephemeral container can be used to instantiate viewers and browsers such as xpdf and Firefox or programs such as virus-scanners.
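
To make the layering concrete, here is a minimal toy sketch of how a VLFS-style namespace with copy-on-write layers, whiteout deletes, and an ephemeral mode might behave. The class and method names are mine, invented for illustration; this is not Apiary's actual implementation or interface.

```python
# Toy sketch of a VLFS-style layered namespace (names invented; not Apiary's code).

class LayeredFS:
    def __init__(self, shared_layers, ephemeral=False):
        self.shared = shared_layers   # read-only dicts {path: data}, searched top-down
        self.private = {}             # this container's writable layer
        self.whiteouts = set()        # paths deleted in this container
        self.ephemeral = ephemeral

    def read(self, path):
        if path in self.whiteouts:
            raise FileNotFoundError(path)
        if path in self.private:      # a private copy shadows any shared version
            return self.private[path]
        for layer in self.shared:     # otherwise fall through the shared layers
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.whiteouts.discard(path)
        self.private[path] = data     # copy-on-write: shared layers are never modified

    def delete(self, path):
        self.private.pop(path, None)
        self.whiteouts.add(path)      # a whiteout hides the file in lower layers

    def teardown(self):
        if self.ephemeral:            # ephemeral containers keep nothing across runs
            self.private.clear()
            self.whiteouts.clear()
```

In this model, an ephemeral Firefox container is just the shared application layers plus an empty private layer that teardown() throws away, which is why exploits that write to disk cannot persist across executions.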

Apiary has low overhead compared to plain Linux: VLFS is fast, requiring little computation beyond name lookups, and the containers themselves are built on Linux Containers, which are lightweight. The authors' usability study showed that Apiary isn't too annoying to use. However, they did not consider how annoying it would be to manage multiple email clients and multiple web browsers.

Apiary is an interesting idea and a step in the right direction for better isolation. It protects the user from buggy viewers such as music players and PDF viewers that get exploited by malicious input files, because the exploits cannot persist. But because ephemeral containers always have full read-only access to the file system, it seems an exploited viewer could still exfiltrate sensitive data unless the container is locked down with no network access; the saving grace is only that such an exploit does not persist, since ephemeral containers store no state between executions.

I can definitely see how VLFS can be useful in other similar systems. Sharing in Apiary is explicitly done through a separate container that has a special-purpose file manager. It seems to me this would be quite annoying. Is there an easier way?


Thoughts on TightLip: Keeping Applications from Spilling the Beans

Authors: Aydan R. Yumerefendi, Benjamin Mickle, and Landon P. Cox
URL: http://www.cs.duke.edu/~lpcox/nsdi07/camera.pdf
Abstract: 

Access control misconfigurations are widespread and can result in damaging breaches of confidentiality. This paper presents TightLip, a privacy management system that helps users define what data is sensitive and who is trusted to see it rather than forcing them to understand or predict how the interactions of their software packages can leak data. The key mechanism used by TightLip to detect and prevent breaches is the doppelganger process. Doppelgangers are sandboxed copy processes that inherit most, but not all, of the state of an original process. The operating system runs a doppelganger and its original in parallel and uses divergent process outputs to detect potential privacy leaks. Support for doppelgangers is compatible with legacy-code, requires minor modifications to existing operating systems, and imposes negligible overhead for common workloads. SpecWeb99 results show that Apache running on a TightLip prototype exhibits a 5% slowdown in request rate and response time compared to an unmodified server environment.

My Summary:
The idea is to tag some files as secret using one bit in a file's inode attributes (in a modified ext3). When a running process reads a secret file, the OS starts a doppelganger process. The doppelganger runs exactly like the original with one exception: instead of reading the actual secret files, it reads scrubbed shadow copies that contain none of the sensitive information. Scrubbers are file-type specific, so they cannot be applied to arbitrary file types. The executions of the original and the doppelganger are continuously compared by monitoring the stream of system calls each makes. If they diverge, then the original's execution depends on the contents of the secret files, and if the original then attempts to send data over the network, a privacy breach might occur. The user's security policy decides what to do in that case: replace the original process with the doppelganger, allow the potential breach, or kill the process. All of this is done very efficiently because doppelgangers are new kernel objects that share much of their kernel state with the original process and do not actually execute system calls (see the sketch below).
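
Here is a deliberately simplified model of that control flow, reduced to comparing the output system calls of a toy "process" run twice, once on the real file and once on a scrubbed copy. The function names, the scrubber, and the trace representation are all invented for illustration; TightLip itself does this inside the kernel on a live process and its doppelganger.

```python
# Toy model of TightLip's divergence check (illustrative only; not the real interface).

def scrub(data):
    # Stand-in for a file-type-specific scrubber: blank out every byte.
    return b"\x00" * len(data)

def toy_program(file_data):
    """Stand-in 'process': returns the trace of output system calls it would make."""
    if b"secret" in file_data:                 # behaviour depends on the file contents
        return [("send", file_data)]           # would leak the file over the network
    return [("send", b"standard reply")]

def run_with_tightlip(secret_data, policy="swap"):
    original = toy_program(secret_data)        # the original sees the real file
    doppel = toy_program(scrub(secret_data))   # the doppelganger sees the scrubbed copy

    for (o_call, o_arg), (d_call, d_arg) in zip(original, doppel):
        if (o_call, o_arg) == (d_call, d_arg):
            continue                           # no divergence yet
        # Divergence: the original's behaviour depends on secret data.
        if o_call == "send":                   # network output => potential breach
            if policy == "swap":
                print("send replaced with doppelganger output:", d_arg)
            elif policy == "allow":
                print("leak permitted:", o_arg)
            else:                              # policy == "kill"
                print("process killed before leaking")
        return
    print("no divergence: output does not depend on the secret file")

run_with_tightlip(b"my secret password")       # takes the 'swap' branch
```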

Well-written paper that's easy to read and understand. Cute idea (one I had been toying with in my head until I was pointed to this paper). It seems useful for a wide class of applications and has very low overhead.

TightLip suffers from a number of limitations. The most important: a large TCB (the kernel is trusted, and the type-specific scrubbers must also be trusted); a small probability of false negatives (if the scrubbed data happens to share the relevant properties of the original data, e.g. parity); and false positives (any divergence between the doppelganger and the original marks everything afterwards as tainted, much like implicit flows). False positives are a minor problem when execution does not depend on the value of the secret data, as in an FTP server, but they can be a big problem if TightLip is applied to a full system and every application. It is also probably not feasible as a defense against malware, since malware can deliberately create false positives to make the system unusable, but that was not the authors' intention anyway.
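
The false-negative case is easy to construct: if the only property of the secret data the program actually branches on survives scrubbing, the doppelganger never diverges and the output still carries secret-derived bits. A contrived example, assuming a length-preserving, byte-zeroing scrubber:

```python
# Contrived false negative: the program branches only on the parity of the file
# length, which a byte-zeroing (length-preserving) scrubber does not change, so
# the original and the doppelganger behave identically and no divergence is seen.

def scrub(data):
    return b"\x00" * len(data)              # zeroes the content, keeps the length

def toy_program(file_data):
    parity = len(file_data) % 2             # one bit derived from the secret file
    return [("send", b"parity=%d" % parity)]

secret = b"top secret payload!"             # odd length
assert toy_program(secret) == toy_program(scrub(secret))
# No divergence, yet the 'send' still reveals one bit about the secret file.
```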


Wednesday, January 19, 2011

Thoughts on RON: Resilient Overlay Networks

Authors: David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, Robert Morris

Venue: Proc. 18th ACM SOSP, Banff, Canada, October 2001

Summary:

The paper describes the design and implementation of an overlay network that can be used to bypass the underlying default IP routing. A RON is a set of nodes that cooperate to select the best overlay path over which to route traffic, given an application's requirements. The application links against the RON library and uses its functions to send and receive traffic. Each RON node monitors the connection quality to every other node in the network and uses that information to route traffic over the best available path.
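
A minimal sketch of the core idea, assuming each node keeps a table of probed pairwise latencies and forwarding picks either the direct path or the best single-intermediate detour. The data and the scoring are made up for illustration; RON's real path selection combines latency, loss rate, and throughput according to the application's policy.

```python
# Toy one-hop overlay path selection in the spirit of RON (illustrative only).

latency_ms = {                               # pairwise probe results a node maintains
    ("A", "B"): 180, ("A", "C"): 40, ("C", "B"): 50,
    ("A", "D"): 90,  ("D", "B"): 120,
}

def best_path(src, dst, nodes):
    inf = float("inf")
    candidates = {(src, dst): latency_ms.get((src, dst), inf)}   # the direct path
    for hop in nodes:
        if hop in (src, dst):
            continue
        via = latency_ms.get((src, hop), inf) + latency_ms.get((hop, dst), inf)
        candidates[(src, hop, dst)] = via                        # one-hop detour
    return min(candidates.items(), key=lambda kv: kv[1])

print(best_path("A", "B", ["A", "B", "C", "D"]))
# (('A', 'C', 'B'), 90): the detour through C beats the 180 ms direct path.
```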

There is no authentication in RON, and all nodes have to implicitly trust each other. However, RON does provide the ability for node providers to specify complex policies on what traffic to accept (constrained by the lack of authentication). But without authentication, it would be difficult to bill any particular entity for traffic, an important aspect given that RON nodes need to be quite powerful.

Notably, diverting a route through just one intermediate hop turns out to yield a significant performance boost and to fix most path outages, and routing over RON recovered connectivity in under 20 seconds, much faster than BGP reconvergence.

RON does not scale well, so RONs need to be limited to roughly 50 nodes. The bottleneck is that every node actively probes the path to every other node and maintains a correspondingly large performance database, so the monitoring overhead grows quadratically with the number of nodes. However, there has been follow-up work that tries to improve scalability.

Tuesday, January 18, 2011

Thoughts on Overlay Networks and the Future of the Internet

Authors: Dave Clark, Bill Lehr, Steve Bauer, Peyman Faratin, Rahul Sami, John Wroclawski

Venue: Communications and Strategies Journal, no. 63, 3rd quarter 2006, p1

Summary:


The paper provides a good overview of overlays and attempts a formal definition and taxonomy.

Definition:
An overlay is a set of servers deployed across the Internet that:

    1. Provide infrastructure to one or more applications,
    2. Take responsibility for the forwarding and handling of application data in ways that are different from or in competition with what is part of the basic Internet,
    3. Can be operated in an organized and coherent way by third parties (which may include collections of end-users).
Taxonomy:
  1. peer-to-peer e.g. Napster and Gnutella
  2. CDN e.g. Akamai
  3. Routing e.g. RON
  4. Security e.g. VPNs, Tor, Entropy
  5. Experimental e.g. PlanetLab, I3
  6. Other e.g. email, Skype, MBone
The authors assert that overlays do not follow the end-to-end principle: from the IP layer's point of view, overlay servers are simply end nodes, but from the application's point of view they are infrastructure.

The paper discusses policy issues and the relationship between industry structure and overlays, asking several thought-provoking questions. It then goes into depth discussing the implications of CDN overlays, security overlays, and routing overlays.

One passage I really enjoyed was the description of why BGP is insufficient:
... Broadly speaking, BGP allows each ISP to express its policies for accepting, forwarding, and passing off packets using a variety of control knobs. BGP then performs a distributed computation to determine the "best" path along which packets from each source to each destination should be forwarded. 
This formulation raises two difficulties, one fundamental and one pragmatic. The first of these is that the notion of "best" is in fact insufficient to fully express the routing task. "Best" is a single dimensional concept, but routing is a multi-dimensional problem. Individual ISPs, in making their routing decisions, may choose to optimize a wide variety of properties. Among these might be 1) the cost of passing on a packet; 2) the distribution of traffic among different physical links within their infrastructure to maximize utilization and minimize congestion -  so-called traffic engineering; and 3) performance in some dimension, such as bandwidth available to the traffic or transmission delay across the ISP. Furthermore, because the management of each ISP chooses its own objectives, different ISPs may choose to optimize different quantities, leading to an overall path that captures no simple notion of "best", and rarely if ever is best for the user. 
A second, pragmatic problem with the current internet routing infrastructure is that it has evolved over time from one in which simple technical objectives dominated to one in which ISPs often wish to express complex policy requirements. For this reason the knobs - the methods available within BGP to control routing choices - have also evolved over time, and are presently somewhat haphazard and baroque. This compounds the fundamental problem by making it harder for ISPs to express precisely the policies they desire, even after those policies are known.
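
The "single-dimensional best" point is easy to see with a toy example: rank the same candidate paths by cost, by delay, or by bandwidth, and you get three different winners, so no single scalar ordering can serve every ISP and every application. (The numbers below are made up purely for illustration.)

```python
# Made-up numbers showing why one scalar "best" path is ill-defined when routing
# objectives are multi-dimensional.

paths = {
    "via ISP-1": {"cost": 1, "delay_ms": 120, "bandwidth_mbps": 10},
    "via ISP-2": {"cost": 3, "delay_ms": 30,  "bandwidth_mbps": 100},
    "via ISP-3": {"cost": 2, "delay_ms": 60,  "bandwidth_mbps": 400},
}

print(min(paths, key=lambda p: paths[p]["cost"]))            # cheapest: via ISP-1
print(min(paths, key=lambda p: paths[p]["delay_ms"]))        # lowest delay: via ISP-2
print(max(paths, key=lambda p: paths[p]["bandwidth_mbps"]))  # most bandwidth: via ISP-3
```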
The paper overall is an easy, entertaining read and gives a nice overview of the issues surrounding overlays and their use and deployment in the Internet.