Jad's Blog

Monday, April 27, 2009

Thoughts on The Omnivore's Dilemma

Author: Michael Pollan

Summary:
The author tries to follow four food chains: the industrial, the industrial organic, the pastoral, and the forest.

Industrial: Corn -> Cattle -> processing -> supermarket -> McDonald's -> Humans
Industrial Organic: Corn -> Cattle/Chicken -> supermarket -> Humans
Pastoral: Grass -> Animals -> Humans
Forest: Hunt/gather -> humans

The book describes origins of corn and its history, how food became cheap, and so on. It talks about each piece in the food chains above in details and talks about their history. The last part about hunting and gathering was particularly entertaining.

The Good:
The book inspired me to start my own garden. I've learned a load of interesting trivia and information. It is very entertaining and eye-opening.

The Bad:
At times the author is a little too judgmental and biased. But this is not too bad.

The Ugly:
None really.

Thoughts on Pathlet Routing

Authors: P. Brighten Godfrey, Igor Ganichev, Scott Shenker, Ion Stoica

Summary:
The paper presents a solution to the multipath routing and scalalability problems in the current Internet that uses BGP. The new solution solves these problems while allowing all BGP policies, exposing a large set of paths, making it possible to know what paths would be used, giving senders the choice in path selection, and reducing the size of the forwarding information base (FIB). The dataplane works as follows:

Each network will label its routers and end-hosts with one or more vnodes.
Each network exposes a vnode to its neighbors. When packets arrive from that neighbor, they will arrive to the exposed vnode.
Each network constructs a set of paths it allows using its vnodes and the neighbor's vnodes. These paths are called pathlets.
Each pathlet is labeled with a forwarding identifier (FID), and each (first hop vnode, FID) pair is globally unique. The first hop vnode is the first vnode in a pathlet identified by its FID.
Pathlets are disseminated through the network (this is the control plane's job).
A network can then proceed to construct more general pathlets that start at a vnode in the network but can contain vnodes in other non-neighboring networks.
These "general" pathlets are of the form v1, FID1, ..., FIDn
A sender constructs a path by finding a number of pathlets that together connect the sender to the receiver. The path looks like the following: v1, F1, ..., Fn where the Fi are FIDs.
When the packet arrives at the router labeled v1 in the sender's provider, the router checks that v1 is the start of the pathlet identified with F1, and drops it if not.
Otherwise, there are two choices. 1) If the FID1 is a "base" case FID that only connects local vnodes with neighboring vnodes, the router pops v1 and F1 from the path and replaces them with the vnode v2 at the end of the pathlet. 2) Otherwise if F1 is a "generalized" pathlet, then is replaces v1 and F1 with v2, FID2, ..., FIDm. So the packet now has path: v2, FID2, ..., FIDm, F2, ..., Fn.
Each (vnode, FID) pair at the beginning of the path is either replaced with one vnode or with a vnode and a list of other FIDs, until the destination is reached.

Note that a sender cannot cheat because only allowed paths will have valid FIDs, and packets wanting to use FIDs must arrive at the correct vnode.

Pathlet dissemination uses a path vector algorithm.

For a particular class of transit policies the authors call local transit (LT) policies, forwarding tables in the switches can be very small. These LT policies construct only "base" case pathlets that only have local vnodes or neighboring vnodes. Unfortunately, these policies cannot depend on a route's destination such as in BGP.

They come up with the notion of "policy emulation" to compare their architecture to others. They define "policy emulation" as: protocol P can emulate protocol Q if P can produce every outcome that Q can produce while using a similar amount of overhead. By their admission, this definition is limited since it has no notion of security or cost.

Pathlet routing can emulate BGP and IP strict routing with no modifications. To emulate MIRO and NIRA, the authors needed to extend the pathlet routing dissemination protocol to allow a router to pull pathlets from a non-neighboring vnode. On the other hand, pathlet routing cannot emulate "generalized" NIRA or Feedback Based Routing (FBR) because they use a packet's previous hops. In addition, FBR has both blacklisting and whitelisting that can represent some policies much more efficiently than pathlets. Pathlet rouing cannot emulate routing deflections and path splicing since they would require a large amount of control traffic to the senders.

The evaluation was alright. They cite a much smaller number for average AS-level path length than we expected (3.77) using CAIDA as-rank.caida.org. The maximum header size they use is 10 bytes which is very reasonable. I didn't look too deeply at the eval, and nothing popped out as weird or extraordinary.

The Good:
They describe a really cute way to do source routing while adhering to transit network policies. They do this while still not requiring a large overhead. While hinted at in the paper, pathlets and vnodes can be used for various network level services such as QoS. They also came up with the interesting notion protocol emulation.

The Bad:
There is a practicality side that has been completely overlooked: how do you actually put this in the real world. First, I don't know how to build such a thing easily in hardware. Also, the packet size changes between hops; how does this affect feasibility? What about security? Can the vnodes be attached to the neighbors?
[edit] Actually I don't really think the above is so important (after much more thinking). What's more important is the following:

It is difficult (if not impossible) to create policies that depend on prefixes of the path
The ASes can screw with a source by using a different path
The receiver has no idea what path was followed (Can it even tell what sender sent something?)

The Ugly:
I really did not like the fact that they needed an "extension" to the dissemination protocol. This seemed like a hack in order to allow pathlet routing to emulate NIRA and MIRO. Then again, they could probably come up with a nicer and more efficient dissemination protocol.

Monday, April 6, 2009

Thoughts on Why Cryptosystems Fail

Authors: Ross Anderson

BibTeX:

@inproceedings{168615,
author = {Anderson,, Ross},
title = {Why cryptosystems fail},
booktitle = {CCS '93: Proceedings of the 1st
ACM conference on Computer and communications security},
year = {1993},
isbn = {0-89791-629-8},
pages = {215--227},
location = {Fairfax, Virginia, United States},
doi = {http://doi.acm.org/10.1145/168588.168615},
publisher = {ACM},
address = {New York, NY, USA},
}

Summary:
The author uses automated teller machines (ATMs) as a driving example through the paper. He explains how ATMs work, and how various attackers have or could have managed to defraud banks by attacking ATMs. These attacks are allowed due to human error, negligence, or malignance, lack of quality control, lack of a feedback loop, incomplete standards, ... Mainly, it is not due to the weaknesses usually studied formally in universities or companies such as cryptanalysis. The author maintains that security systems should be treated similar to safety critical systems. Security systems should have certification levels that take a whole security system into account from the cryptography at its lowest levels to the training of employees to the treatment of a wide array of threats.

The Good:
The paper presents a strong point well, mainly that security systems should be studied by looking at the environment where it will be used. The paper includes many interesting examples from the banking industry and the analogy between secure systems and safety critical systems such as avionics helps drive the point home.

The Bad:
The author says that the right paradigm is to make a security officer's job less mechanical by following the aviation industry's paradigm: a properly trained crew is the first line of defense. I'm not sure what he means by that point, but it is difficult to train people well, and even then it is difficult to do quality control over their work. It seems we would want to build systems that have been designed keeping in mind that individual components themselves are probably insecure but when put together make the whole system secure. Perhaps this is what he meant.

The Ugly:
There were way too many examples. I would have much preferred that the author elaborated on the paradigm developed and how it applies to some examples, than just list the examples for a few pages. Many things are obvious too, but that's probably because this is an older paper and we now know more.

Tuesday, March 31, 2009

Thoughts on NIRA: A New Internet Routing Architecture

Authors: Xiaowei Yang (MIT LCS)

BibTeX:

@inproceedings{Yang:NIRA,
author = {Yang,, Xiaowei},
title = {NIRA: a new Internet routing architecture},
booktitle = {FDNA '03: Proceedings of the ACM SIGCOMM workshop on Future directions in network architecture},
year = {2003},
isbn = {1-58113-748-0},
pages = {301--312},
location = {Karlsruhe, Germany},
doi = {http://doi.acm.org/10.1145/944759.944768},
publisher = {ACM},
address = {New York, NY, USA},
}

Summary:
In the intro, the author argues that it is important to allow users to choose their domain-level routes. This fosters competition and innovation and drives costs down. Then she adds a disclaimer, saying this is not provable. NIRA is designed with the following in mind: scalability, robustness, efficiency, heterogeneous user choices, and practical provider compensation. Security is not mentioned.

NIRA provides "valley-free" routes that go up a sender's provider chain and down a receiver's provider chain going through a single provider that is sommon to both chains. In the Core of the Internet, packets cannot be pushed further up. Addresses are 128-bit hierarchical addresses, and each address identifies a unique route segment to the core. A full route is represented as a number of "segments" that are valley-free. Each valley-free segment can be represented by a pair of addresses. Routes that have peers (i.e. are not valley-free because there is no common provider in the up and down chains) are represented with multiple addresses.

The author argues that source routes are not terribly insecure when end-to-end authentication is used (where the attacker spoofs an address and inserts its real address in the return source route to get replies). Address-based authentication should not be used.

The Core forms a "routing region", users don't choose providers in the core, and so for most cases, a pair of addresses suffices to indicate the route. Additional optimizations include extended addressing (adding an address to a peering link) and multiple routing regions.

Addressing feasibility: Looked at the route table dumps from Route Views server. Used heurestics to get provider/customer relationships. Concluded it looks feasible.

TIPP (Topology Information Propagation Protocol) is used instead of a LS protocol to limit topology information propagation, but does not include dynamic conditions of interconnections.

NRRS (Name-to-Route Resolution Service) act like DNS and translate hierarchical names to hard-coded routes.

When a high level provider changes its top-level provider, all customers and customers' customers and so on need to change their addresses. However, AS birth and death rate is less than 15 per day and AS-level link birth and death rate is less than 50. See Chen et al. The Origin of Power Laws in Internet Topologies Revisited. So, rate is low and changes won't happen often.

Error handling is done by a reactive message from a router or a timeout.

Nothing interesting is said about provider compensation. Business relationships are either between directly connected domains or indirectly connected ones. With indirectly connected ones, the problem of authenticating senders and receivers is an issue that is punted.

What happens in the DoS or spam case when the receiver has to pay for routes? The author says this is a problem with user-empowerment in general. Incorrectly, she assumes that because BGP can be used to control which providers announce an edge domain's adress prefixes, the problem is solved today. In NIRA this is still a problem.

The Good:
The paper suggests an interesting addressing and routing architecture that allows senders to select routes and receivers to slightly determine the choice. The architecture requires a small aount of state and does not seem to consume too much packet overhead (although this was not specifically measured).

The Bad:
The case for user choice is not very convincing. It's not clear to me that user choice of route will lead to all the green pastures that were promised in the paper. Issues of route availability and errors were not handled very well. Timeouts are horrible. It is also not clear to me how a receiver can control the path that packets take towards him. It seems that if a receiver is going to be paying money, he wants to have as much control over the route choice as a sender. I think a provider can choose any path it likes to reach a sender, and neither the sender not the receiver necessarily know about it.

The Ugly:
There are many issues that are left open that I thought the paper was going to talk about. The most important of these is the issue of authenticating users and billing them. Another problem is attacking a receiver and making him pay money for worthless traffic. Then again, I guess this is an issue any architecture would face. But this is especially important here because there is an emphasis on the receiver paying money for receiving traffic.

Sunday, March 29, 2009

Recursive and Iterative designs in Verilog

Disclaimer: Note that the examples below for the unary_and have not been compiled and are just there for demonstration (probably have a few minor bugs). The paramatrizeable priority decoder example on the other hand is complete, and is in production use.

The Verilog language allows designers to write recursive modules using a generate block. For example:

module unary_and
#(parameter WIDTH = 32)
(input [WIDTH-1:0] in_and,
output            out_and)

generate
 if(WIDTH == 1) begin
   assign out_and = in_and;
 end
 else if(WIDTH == 2) begin
   assign out_and = in_and[0] & in_and[1];
 end
 else begin
   unary_and #(.WIDTH (WIDTH/2))
     unary_and_low
       (.in_and  (in_and[WIDTH/2-1:0]),
        .out_and (out_and_low));

   unary_and #(.WIDTH (WIDTH - WIDTH/2))
     unary_and_high
       (.in_and  (in_and[WIDTH-1:WIDTH/2]),
        .out_and (out_and_high));

   assign out_and = out_and_low & out_and_high;
 end
endgenerate
endmodule

Unfortunately, not all synthesis tools can handle such designs. One way to get around this problem is by using using two nested loops to build each level of the tree of the recursive modules. The outer loop iterates over levels, and the inner loop creates the nodes at each level. For simplicity, we change the parameter to be the number of levels to create. Here's an example:

module unary_and
#(parameter NUM_LEVELS = 5,
 parameter WIDTH = 2**NUM_LEVELS)
(input [WIDTH-1:0] in_and,
output            out_and)

generate
 genvar i,j;
 // start at the lowest level in the tree
 for (i=0; i < NUM_LEVELS; i=i+1) begin:gen_levels
   // do nodes at each level
   for (j=0; j < 2**(NUM_LEVELS-i-1); j=j+1) begin:gen_nodes
     wire in_low;
     wire in_high;
     wire out;

     // at the lowest level, use the module's inputs
     if (i==0) begin
       assign in_low   = in_and[j*2];
       assign in_high  = in_and[j*2 + 1];
     end
     // otherwise, use the outputs of the lower level
     else begin
       assign in_low   = gen_levels[i-1].gen_nodes[j*2];
       assign in_high  = gen_levels[i-1].gen_nodes[j*2 + 1];
     end
  
     assign out = in_low & in_high;

   end
 end
endgenerate

assign out_and = gen_levels[NUM_LEVELS-1].gen_nodes[0].out;

endmodule

A more useful example is the design of an efficient parametrizeable priority decoder. I have looked everywhere and couldn't find a good design that worked for various input widths. I've found one that used a recursive structure, but of course that didn't synthesize. The example below shows a complete priority decoder. There are two implementation styles. If STYLE is set to 0 (default), then the implementation builds a multiplexer tree that multiplexes the value to be output. The other style constructs one bit of the output at each level. Below I give two modules, the priority_encoder itself and a tester to test that it works.

/***************************************************               
* $Id$                                                            
*                                                                 
* Module: priority_encoder.v                                      
* Project: NF2.1                                                  
* Author: Jad Naous
* Description: parametrizable priority encoder.                   
*                                                                 
* Highest priority by default given to rightmost bit.             
*                                                                 
* Usually just specify the OUTPUT_WIDTH. INPUT_WIDTH              
* is optional. Defaults to 2**OUTPUT_WIDTH.                       
*                                                                 
***************************************************/              
module priority_encoder                                            
#(parameter OUTPUT_WIDTH = 8,                                    
 parameter RIGHT_TO_LEFT_PRIORITY = 1)                          

 (input  [0:(2**OUTPUT_WIDTH)-1]  unencoded_input,
  output [OUTPUT_WIDTH-1:0]       encoded_output,
  output                          valid);     

localparam INPUT_WIDTH = 2**OUTPUT_WIDTH;
localparam INPUT_VAL_WIDTH = INPUT_WIDTH*OUTPUT_WIDTH;
localparam STYLE = 0;

  generate
     genvar i,j;
     if(STYLE==0) begin
        for(i=0; i<OUTPUT_WIDTH; i=i+1) 
        begin:gen_levels
           for(j=0; j<INPUT_WIDTH/(2**(i+1)); j=j+1)
           begin:gen_nodes
              wire [OUTPUT_WIDTH-1:0] value;
              wire                    valid;

              wire [OUTPUT_WIDTH-1:0] left_val;
              wire                    left_vld;
              wire [OUTPUT_WIDTH-1:0] right_val;
              wire                    right_vld;

              if(i==0) begin
                 assign left_val    = j*2;
                 assign left_vld    = unencoded_input[j*2];
                 assign right_val   = j*2 + 1;
                 assign right_vld   = unencoded_input[j*2+1];
              end
              else begin
                 assign left_val    
                     = gen_levels[i-1].gen_nodes[j*2].value;
                 assign left_vld    
                     = gen_levels[i-1].gen_nodes[j*2].valid;
                 assign right_val   
                     = gen_levels[i-1].gen_nodes[j*2+1].value;
                 assign right_vld
                     = gen_levels[i-1].gen_nodes[j*2+1].valid;
              end // else: !if(i==0)

              assign value = (RIGHT_TO_LEFT_PRIORITY ?
                              (right_vld ? right_val : left_val) :
                              (left_vld ? left_val : right_val));
              assign valid = right_vld | left_vld;
           end // block: gen_nodes
        end // block: gen_levels

        assign encoded_output 
            = gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].value;
        assign valid
            = gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].valid;
     end // if (STYLE==0)
    
     else begin
         for(i=0; i<OUTPUT_WIDTH; i=i+1)
         begin:gen_levels
            for(j=0; j<INPUT_WIDTH/(2**(i+1)); j=j+1)
            begin:gen_nodes
               wire [i:0]              value;
               wire                    valid;

               if(i==0) begin
                  assign value = RIGHT_TO_LEFT_PRIORITY 
                                 ? unencoded_input[j*2+1] 
                                 : unencoded_input[j*2];
                  assign valid = unencoded_input[j*2+1] 
                                 | unencoded_input[j*2];
               end
               else begin
                  wire [i-1:0]  left_val;
                  wire          left_vld;
                  wire [i-1:0]  right_val;
                  wire          right_vld;
                  assign left_val    
                      = gen_levels[i-1].gen_nodes[j*2].value;
                  assign left_vld
                      = gen_levels[i-1].gen_nodes[j*2].valid;
                  assign right_val
                      = gen_levels[i-1].gen_nodes[j*2+1].value;
                  assign right_vld
                      = gen_levels[i-1].gen_nodes[j*2+1].valid;
                  assign value
                      = (RIGHT_TO_LEFT_PRIORITY ?
                         (right_vld ? {1'b1, right_val} 
                                    : {1'b0, left_val}) :
                          (left_vld ? {1'b0, left_val} 
                                    : {1'b1, right_val}));
                  assign valid = right_vld | left_vld;
               end // else: !if(i==0)
            end // block: gen_nodes
         end // block: gen_levels

         assign encoded_output
             = gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].value;
         assign valid
             = gen_levels[OUTPUT_WIDTH-1].gen_nodes[0].valid;
      end // else: !if(STYLE==0)
   endgenerate
endmodule   

// synthesise translate_off
module pri_encode_test ();
   reg [0:7] unencoded_input = 8'h0;
   wire [2:0] encoded_output;
   wire valid;
   
   priority_encoder
     #(.OUTPUT_WIDTH(3))
     priority_encoder
     (.valid (valid),
      .encoded_output (encoded_output),
      .unencoded_input (unencoded_input));

   initial begin
      unencoded_input[7] = 1'b1;
      unencoded_input[2] = 1'b1;
      unencoded_input[3] = 1'b1;

      #2 if(encoded_output != 7) $display("%t ERROR", $time);
      
      #2 unencoded_input = 0;
      unencoded_input[6] = 1'b1;
      unencoded_input[3] = 1'b1;
      unencoded_input[0]  = 1'b1;

      #2 if(encoded_output != 6) $display("%t ERROR", $time);
        
      #2 unencoded_input = 0;
      unencoded_input[0]  = 1'b1;

      #2 if(encoded_output != 0) $display("%t ERROR", $time);
   
      #2 unencoded_input = 0;
      unencoded_input[7] = 1'b1;

      #2 if(encoded_output != 7) $display("%t ERROR", $time);

      #2 unencoded_input = 255;

      #2 if(encoded_output != 7) $display("%t ERROR", $time);

      #2 $display("%t Test ended.", $time);
   end // initial begin
endmodule // pri_encode_test
// synthesise translate_on

Note that in synthesis, and surprisingly, style 0 turns out to be more efficient.