Wednesday, May 13, 2009

Thoughts on The Collective: A Cache-Based System Management Architecture

Authors: Ramesh Chandra, Nickolai Zeldovich, Constantine Sapuntzakis, Monica S. Lam

BibTeX:

@INPROCEEDINGS{thecollective,
author = {Ramesh Chandra and Nickolai Zeldovich and Constantine Sapuntzakis and Monica S. Lam},
title = {The Collective: A Cache-Based System Management Architecture},
booktitle = {In Proc. 2nd Symposium on Networked Systems Design & Implementation (NSDI)},
year = {2005},
pages = {259--272}
}

Summary:

The Collective is a system that allows users to load virtual appliances from a cloud, cache them locally, and run them. There are two types of data: User and Appliance. User data is mutable by the user and is stored and backed up online as the user modifies it (this includes things like docs and profiles). Appliance data is immutable (except by an admin), and every time a VM is started with that appliance it runs from a pristine, unmodified copy.
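To make the User/Appliance split concrete, here is a toy sketch of my own (not code from the paper): reads fall through to an immutable appliance image, writes made during a session land in a separate overlay, and a reboot throws the overlay away. The class name CowDisk and the dict-of-blocks representation are assumptions made purely for illustration.

```python
class CowDisk:
    """Toy copy-on-write disk over an immutable base image (hypothetical)."""

    def __init__(self, base_blocks):
        self._base = base_blocks   # appliance image: block number -> bytes, never modified
        self._overlay = {}         # blocks written during the current session

    def read(self, block_no):
        # Prefer the session's own write, otherwise fall back to the pristine image.
        if block_no in self._overlay:
            return self._overlay[block_no]
        return self._base[block_no]

    def write(self, block_no, data):
        # Writes never touch the base image.
        self._overlay[block_no] = data

    def reboot(self):
        # Discard session writes so the next boot sees the pristine appliance.
        self._overlay.clear()


base = {0: b"boot", 1: b"kernel"}
disk = CowDisk(base)
disk.write(1, b"patched")
after_write = disk.read(1)
disk.reboot()
after_reboot = disk.read(1)
```

User data would live on a separate disk that is not reset this way; that part is left out of the sketch.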

The Collective simplifies deployment and management. All machines run the Virtual Appliance Transceiver (VAT) software, which contains the VMM and provides an interface for the user to log in, select an appliance, and access their data. The VAT self-updates without requiring any intervention from the user.

Appliances can be updated by an admin, and on the next reboot a user gets the updated image. Updates are tracked by versioning, using a simple numbering and directory hierarchy scheme. Downloaded appliances are stored in Copy-on-Write (COW) disk caches for large blocks, while small metadata is replicated. It is also possible to cache full appliance images so that a user can work offline, disconnected from the appliance repo.
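As a rough illustration of what a numbering-plus-directories scheme could look like, here is a minimal sketch. The layout (one numbered subdirectory per version under each appliance) and the names latest_version and office-desktop are my assumptions, not the paper's actual repository format.

```python
from pathlib import Path
import tempfile


def latest_version(appliance_dir):
    """Return the highest-numbered version subdirectory, or None if there is none."""
    versions = [p for p in Path(appliance_dir).iterdir()
                if p.is_dir() and p.name.isdigit()]
    # Compare numerically so that version 10 sorts after version 2.
    return max(versions, key=lambda p: int(p.name), default=None)


with tempfile.TemporaryDirectory() as repo:
    appliance = Path(repo) / "office-desktop"   # hypothetical appliance name
    for v in ("1", "2", "10"):
        (appliance / v).mkdir(parents=True)
    newest = latest_version(appliance).name
```

With a layout like this, "use the updated image on the next reboot" reduces to picking the highest version number at startup.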

The evaluation section is very well constructed and nicely sums up how well the system works. They found that prefetching works, and that using traces to decide what to prefetch can be a significant boon to performance. Interactive and I/O-intensive workloads perform badly, as expected. Maintaining, upgrading, and deploying appliances is easy and painless.
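One simple way to turn traces into a prefetch plan is to rank blocks by how many past sessions touched them and fetch the most common ones first. The sketch below is my own frequency heuristic under that assumption; the function name and the trace format (a list of block numbers per session) are hypothetical, and the paper's actual policy may well differ.

```python
from collections import Counter


def blocks_to_prefetch(traces, budget):
    """Pick up to `budget` blocks, favoring those seen in the most traces."""
    # Count each block once per trace so one chatty session cannot dominate.
    counts = Counter(block for trace in traces for block in set(trace))
    # Most frequent first; ties broken by block number for a stable order.
    ranked = sorted(counts, key=lambda b: (-counts[b], b))
    return ranked[:budget]


traces = [[0, 1, 5], [0, 1, 7], [0, 2, 5]]   # made-up block access traces
plan = blocks_to_prefetch(traces, 3)
```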

The Good:
I've been thinking about a system like this for a while, and they've taken it to completion and even started a company with it (moka5). I think the architecture is sound but is limited by the technology (bandwidth, virtualization, ...). I enjoyed reading the experience section because it was an evaluation of whether the system met its goals. I will definitely read their eval and experience sections again before writing my next paper.

The Bad:
Unfortunately, I/O-bound and interactive operations perform horribly. It's not clear if much can be done to remedy the situation, but since the Collective is meant to be used for interactive apps and to manage users' desktops, it seems that the Collective is unusable for its main purpose. I would really like to manage my machines this way, but I'm already sick of Windows running so slowly in a VM.

They have not mentioned much about security. They only said they use SSH to transfer data and perform authentication. I guess this was not a concern, and they assumed that the only important thing was to make sure that the appliance itself did not get compromised. I would argue that the important piece is for the user data not to be compromised. Maybe the assumption they run with is that if a VM is secured and patched, then the user data will be safe.
