Quarterly Technical Report, January 2003
This quarter we focused on the Spines overlay network infrastructure
and on Wackamole N-way fail-over for servers and routers.
- The Spines overlay network infrastructure.
We developed end-to-end reliability on top of our hop-by-hop reliability
approach. We now have a complete socket capability, similar to a TCP socket,
that flows over the overlay end-to-end. As a by-product of our approach,
we can now provide a TCP-fair implementation of an efficient user-level reliable
We demonstrated that employing hop-by-hop reliability techniques
considerably reduces the average latency and jitter of reliable
communication while still being fair with external Internet traffic.
To deploy our protocols over the Internet, we considered networking
aspects such as congestion control, internal and external fairness,
flow control, and end-to-end reliability.
We showed that the benefit of hop-by-hop reliability far outweighs
the overhead of reliable overlay routing, which comes from factors
such as processing cost and CPU scheduling, and that it performs much
better than standard end-to-end TCP connections deployed
on the same overlay network.
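The intuition behind the latency benefit can be illustrated with a toy model. The sketch below is not taken from Spines or from our measurements; it assumes a path of identical hops with independent per-hop loss, and it assumes that each lost attempt costs one round trip over the span that has to repair it (the whole path for end-to-end recovery, a single hop for hop-by-hop recovery). The function names and parameters are hypothetical, and the model covers average latency only, not jitter or fairness.

```python
def end_to_end_latency(hops, hop_delay_ms, loss):
    """Expected latency when a loss anywhere is repaired across the whole path."""
    path_ok = (1.0 - loss) ** hops               # one attempt survives every hop
    failed_attempts = (1.0 - path_ok) / path_ok  # expected failures before success
    path_delay = hops * hop_delay_ms
    # Assumption: each failed attempt costs one end-to-end round trip.
    return path_delay + failed_attempts * 2 * path_delay


def hop_by_hop_latency(hops, hop_delay_ms, loss):
    """Expected latency when each hop repairs its own losses locally."""
    failed_attempts = loss / (1.0 - loss)        # expected failures on a single hop
    # Assumption: each failed attempt costs one round trip over that hop only.
    per_hop = hop_delay_ms + failed_attempts * 2 * hop_delay_ms
    return hops * per_hop


if __name__ == "__main__":
    # Hypothetical path: 5 overlay hops, 10 ms each, 2% loss per hop.
    print("end-to-end:", end_to_end_latency(5, 10.0, 0.02))
    print("hop-by-hop:", hop_by_hop_latency(5, 10.0, 0.02))
```

Under these assumptions the two strategies coincide for a single hop, and the gap in favor of hop-by-hop recovery grows with the number of hops and with the loss rate, because a local repair never pays for the full path round trip.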
We are getting ready to release Spines in the near future as an open-source
project. It should be available at Spines.org.
- Wackamole: N-Way fail-over infrastructure for servers and routers.
We have evaluated Wackamole's performance while varying the number of servers in the cluster
and tuning the timeouts of the Spread toolkit to optimize performance.
We are able to achieve N-way fail-over in a cluster within 12 seconds using
the standard timeouts of the Spread toolkit. This improves to under 2 seconds using
a tuned version of Spread suited to non-congested local area networks.
We have specified the Wackamole algorithm and proved its correctness; both the
specification and the proof can be found in the technical report below.
N-Way Fail-Over Infrastructure for Survivable Servers and Routers.
Technical Report CNDS-2002-5, December 2002.
Yair Amir, Ryan Caudy, Ashima Munjal, Theo Schlossnagle and Ciprian
Maintaining the availability of critical servers and routers is an important
concern for many organizations. At the lowest level, IP addresses represent the
global namespace by which services are accessible on the Internet.
We introduce Wackamole, a completely distributed software solution
based on a provably correct algorithm that negotiates the
assignment of IP addresses among the currently available servers upon
detection of faults. This reallocation ensures that at any given time
any public IP address of the server cluster is covered exactly once,
as long as at least one physical server survives the network fault.
The same technique is extended to support highly available routers.
The paper presents the design considerations,
algorithm specification and correctness proof, discusses
the practical usage for server clusters and for routers,
and evaluates the performance of the system.
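The "covered exactly once" property can be illustrated with a small sketch. This is not the Wackamole algorithm from the technical report; it is a simplified stand-in that assumes every surviving server sees the same membership list (as a group communication toolkit such as Spread would provide) and then applies the same deterministic rule locally. The function name, server names, and addresses are hypothetical.

```python
def assign_vips(vips, live_servers):
    """Map each virtual IP to exactly one live server.

    Assumption: every server calls this with the same membership view,
    so all of them compute the same assignment without further messages.
    """
    servers = sorted(live_servers)
    if not servers:
        return {}  # no survivor, so nothing can be covered
    # Round-robin over sorted inputs keeps the result deterministic.
    return {vip: servers[i % len(servers)]
            for i, vip in enumerate(sorted(vips))}


if __name__ == "__main__":
    vips = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]
    print(assign_vips(vips, {"web1", "web2", "web3"}))
    # After web2 fails, the survivors recompute and still cover every address.
    print(assign_vips(vips, {"web1", "web3"}))
```

Because the result is a mapping from address to server, no address can be claimed twice, and every address is claimed as long as one server survives. A plain round-robin like this may move more addresses than necessary when membership changes, which a production algorithm would want to avoid.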
We released Wackamole version 2.0.0 in
November 2002. The system is now supported on Linux, FreeBSD, Solaris 8, and Mac OS X.
One of the main improvements is the new support for N-way fail-over for routers.
So far, we have registered over 800 downloads of the software from our web site.
Plans for Next Quarter:
We plan to release the first version of Spines
as an overlay network research tool and make it available as open source.
Our focus for the next quarter will be on providing multicast
functionality, similar to IP Multicast, over overlay networks.
Questions or comments to:
webmaster (at) dsn.jhu.edu
TEL: (410) 516-5562
FAX: (410) 516-6134
Distributed Systems and Networks Lab
Computer Science Department
Johns Hopkins University
3400 N. Charles Street
Baltimore, MD 21218-2686