Summary:
The paper presents a solution to the multipath routing and scalalability problems in the current Internet that uses BGP. The new solution solves these problems while allowing all BGP policies, exposing a large set of paths, making it possible to know what paths would be used, giving senders the choice in path selection, and reducing the size of the forwarding information base (FIB). The dataplane works as follows:
- Each network will label its routers and end-hosts with one or more vnodes.
- Each network exposes a vnode to its neighbors. When packets arrive from that neighbor, they will arrive to the exposed vnode.
- Each network constructs a set of paths it allows using its vnodes and the neighbor's vnodes. These paths are called pathlets.
- Each pathlet is labeled with a forwarding identifier (FID), and each (first hop vnode, FID) pair is globally unique. The first hop vnode is the first vnode in a pathlet identified by its FID.
- Pathlets are disseminated through the network (this is the control plane's job).
- A network can then proceed to construct more general pathlets that start at a vnode in the network but can contain vnodes in other non-neighboring networks.
- These "general" pathlets are of the form v1, FID1, ..., FIDn
- A sender constructs a path by finding a number of pathlets that together connect the sender to the receiver. The path looks like the following: v1, F1, ..., Fn where the Fi are FIDs.
- When the packet arrives at the router labeled v1 in the sender's provider, the router checks that v1 is the start of the pathlet identified with F1, and drops it if not.
- Otherwise, there are two choices. 1) If the FID1 is a "base" case FID that only connects local vnodes with neighboring vnodes, the router pops v1 and F1 from the path and replaces them with the vnode v2 at the end of the pathlet. 2) Otherwise if F1 is a "generalized" pathlet, then is replaces v1 and F1 with v2, FID2, ..., FIDm. So the packet now has path: v2, FID2, ..., FIDm, F2, ..., Fn.
- Each (vnode, FID) pair at the beginning of the path is either replaced with one vnode or with a vnode and a list of other FIDs, until the destination is reached.
Pathlet dissemination uses a path vector algorithm.
For a particular class of transit policies the authors call local transit (LT) policies, forwarding tables in the switches can be very small. These LT policies construct only "base" case pathlets that only have local vnodes or neighboring vnodes. Unfortunately, these policies cannot depend on a route's destination such as in BGP.
They come up with the notion of "policy emulation" to compare their architecture to others. They define "policy emulation" as: protocol P can emulate protocol Q if P can produce every outcome that Q can produce while using a similar amount of overhead. By their admission, this definition is limited since it has no notion of security or cost.
Pathlet routing can emulate BGP and IP strict routing with no modifications. To emulate MIRO and NIRA, the authors needed to extend the pathlet routing dissemination protocol to allow a router to pull pathlets from a non-neighboring vnode. On the other hand, pathlet routing cannot emulate "generalized" NIRA or Feedback Based Routing (FBR) because they use a packet's previous hops. In addition, FBR has both blacklisting and whitelisting that can represent some policies much more efficiently than pathlets. Pathlet rouing cannot emulate routing deflections and path splicing since they would require a large amount of control traffic to the senders.
The evaluation was alright. They cite a much smaller number for average AS-level path length than we expected (3.77) using CAIDA as-rank.caida.org. The maximum header size they use is 10 bytes which is very reasonable. I didn't look too deeply at the eval, and nothing popped out as weird or extraordinary.
The Good:
They describe a really cute way to do source routing while adhering to transit network policies. They do this while still not requiring a large overhead. While hinted at in the paper, pathlets and vnodes can be used for various network level services such as QoS. They also came up with the interesting notion protocol emulation.
The Bad:
There is a practicality side that has been completely overlooked: how do you actually put this in the real world. First, I don't know how to build such a thing easily in hardware. Also, the packet size changes between hops; how does this affect feasibility? What about security? Can the vnodes be attached to the neighbors?
[edit] Actually I don't really think the above is so important (after much more thinking). What's more important is the following:
- It is difficult (if not impossible) to create policies that depend on prefixes of the path
- The ASes can screw with a source by using a different path
- The receiver has no idea what path was followed (Can it even tell what sender sent something?)
I really did not like the fact that they needed an "extension" to the dissemination protocol. This seemed like a hack in order to allow pathlet routing to emulate NIRA and MIRO. Then again, they could probably come up with a nicer and more efficient dissemination protocol.
Yup, the paper is nice. I liked especially the vnode abstraction and hacks with it.
ReplyDeleteI would some things to "the bad" part. Since sources can select more paths and since price is not part of equation, this protocol ought to be prone to oscilations. Sources can certainly measure path quality and select their paths based on it. Given two transit AS to the same destination, most sources fluctuate beetween unless some sophisticated pricing/feedback mechanism is designed.
There is also an issue with failures/recovery. Right now if path fails, my packets are rerouted somewhere far away - sometimes I don't even know about it. Since we move to source routing the intermediate nodes are powerless to reroute the traffic.