3.1 Neighbor Aliveness

One problem endemic to yoid that you won't find to nearly the same degree in IP multicast is that of detecting neighbor aliveness. In IP multicast (in either its native or tunneled modes), each router has a relatively small number of neighbors, independent of the number of multicast groups. The neighbors are relatively nearby, often on the same wire. Furthermore, routers are "predictable" boxes--they can be expected to answer a ping in a certain amount of time with high probability. As a result, each router can afford to ping its neighbors frequently (say one ping per second), and can quickly deduce that a router is no longer alive with a high probability of correctness.

Not so with yoid. Here, any given member has several neighbors for every tree it has joined. It is easy to imagine a typical host joining hundreds of trees, each with only sporadic transmissions. For instance, one tree for each stock of interest, one for each mailing list of interest, one for each news source of interest, and so on. One ping per second, say, across several hundred neighbors amounts to a lot of overhead.
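
To put rough numbers on this, here is a minimal back-of-the-envelope sketch. The function name and the specific figures (200 trees, 3 neighbors per tree, 4 router neighbors) are illustrative assumptions, not measurements, and the sketch assumes no neighbor is shared between trees.

```python
def pings_per_second(trees: int, neighbors_per_tree: int, rate: float) -> float:
    """Total pings one host sends per second, assuming each tree has
    its own distinct set of neighbors and each is pinged at `rate`."""
    return trees * neighbors_per_tree * rate


# Hypothetical yoid host: 200 trees, 3 neighbors per tree, 1 ping/s each.
yoid_host = pings_per_second(trees=200, neighbors_per_tree=3, rate=1.0)

# Hypothetical router: 4 neighbors in total, regardless of group count.
router = pings_per_second(trees=1, neighbors_per_tree=4, rate=1.0)

print(yoid_host, router)  # 600.0 4.0
```

Under these assumed figures the host sends two orders of magnitude more pings than the router, even though most of its trees carry almost no application data.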

The problem is made worse by the fact that it is harder to determine aliveness for a typical desktop PC than for a router. A typical PC may suddenly become overloaded (for instance because a new application is being loaded), and may disappear for seconds at a time. Mistaking a busy neighbor for a down neighbor is costly. As a result, conservatively speaking, it can take on the order of half a minute to determine that a down neighbor is in fact down. While this is going on, the tree is partitioned.
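
The conservative detection described above can be sketched as a counter of consecutive unanswered pings. This is only an illustration: the class name, the 5-second ping interval, and the threshold of 6 misses are assumptions chosen so that detection takes about half a minute.

```python
from dataclasses import dataclass


@dataclass
class DownDetector:
    ping_interval: float = 5.0  # seconds between pings (assumed value)
    misses_needed: int = 6      # consecutive misses before "down" (assumed value)
    missed: int = 0

    def on_ping_result(self, answered: bool) -> bool:
        """Record one ping outcome; return True once the neighbor is judged down."""
        # Any answer resets the count, so a briefly busy PC is forgiven.
        self.missed = 0 if answered else self.missed + 1
        return self.missed >= self.misses_needed

    def detection_time(self) -> float:
        """Worst-case seconds of silence before a neighbor is declared down."""
        return self.ping_interval * self.misses_needed


detector = DownDetector()
# A busy neighbor misses two pings, then answers: not declared down.
busy = [detector.on_ping_result(a) for a in (False, False, True)]
# A dead neighbor misses six in a row: declared down on the sixth.
dead = [detector.on_ping_result(False) for _ in range(6)]
```

A higher threshold makes it less likely that a busy neighbor is mistaken for a down one, at the price of a longer partition when the neighbor really is down.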

We're looking for two things here. First, a way to detect neighbor aliveness without generating huge volumes of traffic (relative to the amount of application data over the tree). Second, a way of dealing with the fact that neighbor-down detection can take a long time.

Before getting into possible solutions, I want to first point out that at least for one class of application--namely file distribution--aliveness detection (or lack thereof) isn't as much of a problem, for two reasons. First, transmission to/from each neighbor is continuous--either the file has been fully received/sent or the neighbor should be actively sending/receiving it. Lack of activity works as a signal that the neighbor is unavailable. Explicit pinging is therefore not necessary and so there is no extra overhead.
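
A minimal sketch of using transfer inactivity as the aliveness signal follows. The class, its field names, and the 10-second idle timeout are hypothetical; timestamps are passed in explicitly so the logic can be exercised without a clock.

```python
from dataclasses import dataclass


@dataclass
class TransferPeer:
    total_bytes: int
    done_bytes: int = 0
    last_activity: float = 0.0

    def on_data(self, nbytes: int, now: float) -> None:
        """Record that nbytes were sent to or received from this neighbor."""
        self.done_bytes += nbytes
        self.last_activity = now

    def suspect_down(self, now: float, idle_timeout: float = 10.0) -> bool:
        """True if the transfer is unfinished yet has been idle too long."""
        if self.done_bytes >= self.total_bytes:
            return False  # file complete, so silence is expected
        return now - self.last_activity > idle_timeout


peer = TransferPeer(total_bytes=100)
peer.on_data(40, now=1.0)
print(peer.suspect_down(now=5.0))   # False
print(peer.suspect_down(now=20.0))  # True
```

No ping is ever sent here. The data transfer itself plays that role, which is why there is no extra overhead.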

Second, order-minute lapses in transmission are, for many applications, not a serious problem (provided they are relatively infrequent). An example would be download of a large software package. The download takes many minutes in any event, so an occasional lapse in transmission is certainly not going to break the application and may not even be noticed. Even streaming audio or video, provided that enough of the stream is locally buffered, can survive lapses in transmission.

In what follows I outline several approaches that can be used alone or in combination, for attacking the following two scenarios. In both scenarios, application traffic is sporadic (not a steady predictable stream) and possibly infrequent. In the first scenario, occasional order-minute lapses are acceptable to the application, so we're primarily interested in reducing ping overhead. An example here would be mail distribution for a typical mailing list. In the second scenario, lapses are not acceptable--application data must be received very shortly after it is transmitted or the application breaks. An example here would be stock quotes for an active trader or certain internet games (understanding here that most games probably involve a pretty steady stream of traffic). I call the first scenario time-relaxed, and the second time-critical.

(The reader might be thinking that nobody in their right mind would use a network consisting of users' desktop PCs for distribution of time-sensitive stock information. I don't disagree, but to that I would repeat that one can always engineer a yoid infrastructure to look, topologically, like an IP or server infrastructure, using dedicated, reliable, and well-placed servers for yoid distribution. In other words, throwing money at the problem is a possible valid approach.)

Broadcast

The simplest approach is to distribute all application data using mesh broadcast. Because mesh broadcast can still work if one or more mesh neighbors go down, one can get away with pinging mesh neighbors less frequently than tree neighbors. Required ping frequency varies depending on the number of mesh neighbors, the probability that they will disconnect without notification, and the desired robustness of the application. If the volume of application data is similar to or less than the ping volume, then simply doing mesh broadcast is a win. This approach is good for the time-critical scenario as well as the time-relaxed scenario.
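
To illustrate why mesh broadcast tolerates a down neighbor, here is a small flooding sketch with duplicate suppression. The five-member mesh and the function name are made up for the example, and the sketch models only reachability, not the actual yoid mesh protocol.

```python
from collections import deque


def mesh_broadcast(mesh, source, down=frozenset()):
    """Flood from `source` over `mesh`; return the live members reached.

    `mesh` maps each member to its list of mesh neighbors. Members in
    `down` neither receive nor forward. The source is assumed alive.
    """
    reached = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in mesh[node]:
            if nbr in down or nbr in reached:
                continue  # skip dead members and duplicate deliveries
            reached.add(nbr)
            queue.append(nbr)
    return reached


# Hypothetical mesh: D is reachable from A through either B or C.
mesh = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

print(sorted(mesh_broadcast(mesh, "A", down={"B"})))  # ['A', 'C', 'D', 'E']
```

With B down, data from A still reaches D and E by way of C. On a tree A-B-D-E, the same failure would cut D and E off until B was detected as down.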

Use the Mesh for Member Down Notification

Another approach is to use mesh broadcast not to deliver content per se, but rather to quickly inform members when a node is detected as non-responding. This is described in Section 2.3 under the heading "YDP Sequence Numbering". The process of detecting that a node is down, broadcasting a notification about it, and having the recipients "turn on" alternative senders obviously takes time, but on the order of a few seconds, not half a minute. This means it should be adequate for many time-critical applications.

Aliveness Buddies

Both use-the-mesh approaches still generate per-tree ping traffic, which, while less than with a non-mesh approach, may still amount to a significant volume. An approach that is less sensitive to the number of trees a host may be attached to is the aliveness buddy approach. Here, each host selects a small number of other hosts (three or four), not necessarily among its current tree neighbors, as aliveness buddies. These hosts should be as nearby as possible.

Aliveness buddies ping each other continuously and can determine relatively quickly when one of them has gone down. They also tell each other who all their current neighbors (tree and mesh) are. When a host detects that its aliveness buddy has gone away, it tells all of the buddy's neighbors. A host with a very large number of neighbors (because it is attached to a very large number of trees) could have more aliveness buddies, and could tell each buddy about only a portion of its neighbors.
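
The bookkeeping behind aliveness buddies can be sketched as follows. The class, method names, and message text are hypothetical, and the sketch covers only the exchange of neighbor lists and the resulting notifications, not the pinging itself.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # This host's own tree and mesh neighbors, by name.
    neighbors: set = field(default_factory=set)
    # Buddy name -> the neighbor names that buddy reported to us.
    buddy_neighbors: dict = field(default_factory=dict)

    def learn_buddy_neighbors(self, buddy: "Host") -> None:
        """Store a copy of the neighbor list a buddy has told us about."""
        self.buddy_neighbors[buddy.name] = set(buddy.neighbors)

    def on_buddy_down(self, buddy_name: str) -> list:
        """Return (recipient, message) pairs to send once a buddy is judged down."""
        targets = sorted(self.buddy_neighbors.pop(buddy_name, set()))
        return [(t, f"{buddy_name} is down") for t in targets]


a = Host("a")
b = Host("b", neighbors={"x", "y"})
a.learn_buddy_neighbors(b)
print(a.on_buddy_down("b"))  # [('x', 'b is down'), ('y', 'b is down')]
```

The point of the design is that the ping load depends on the number of buddies, not on the number of trees. The neighbors x and y never ping b at all and still learn of its failure.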

This approach is not as useful for the time-critical scenario, because a host, even if it pings frequently, must still wait a significant amount of time to ensure that its buddy is really down.

Paul Francis
Fri Oct 1 11:06:22 JST 1999