Why Do We Need Locality?
OceanStore's object location scheme needs two qualities:
- We should be free to place object replicas near their access points.
- Finding a local object should be a local operation.
With the first, OceanStore can place objects near their access points. With the second, an object search will find these carefully placed replicas. This gives us locality, which has three main benefits.
- Obviously, there is the latency aspect -- we need to be
able to route quickly to objects wherever they are, with the shortest
total distance. When local hops (in our building for instance) are 1ms
and global ones are > 100ms, this can be the difference between a usable
and unusable system.
- The farther you
travel, the more likely it is that (1) your message will get lost or
corrupted and (2) a network partition will prevent
you from traversing the path at all. Thus, locality is really important
from an availability/reliability standpoint. If you have enough locality, you can survive even when disconnected from the rest of the world.
- Networks are typically characterized
by the amount of bisection bandwidth they have to offer. The more
locality you can achieve in your access patterns, the greater the number of
total activities you can have happening simultaneously. Although we
often talk about bandwidth as "free," it is not there yet.
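To make the latency point in the first bullet concrete, here is a minimal sketch that adds up per-hop latencies for two routes. The 1 ms and 100 ms figures come from the text (the latter is only a lower bound); the hop counts and the function name `route_latency_ms` are assumptions chosen for illustration, not measurements of any real system.

```python
# Illustrative only: per-hop latencies come from the text (local ~1 ms,
# global > 100 ms); the hop counts used below are assumed values.
LOCAL_HOP_MS = 1
GLOBAL_HOP_MS = 100  # lower bound; the text says "> 100ms"


def route_latency_ms(local_hops, global_hops):
    """Total one-way latency of a route, given its local and global hop counts."""
    return local_hops * LOCAL_HOP_MS + global_hops * GLOBAL_HOP_MS


# A route that stays inside the building versus one that leaves it twice.
local_route = route_latency_ms(local_hops=4, global_hops=0)
wide_route = route_latency_ms(local_hops=2, global_hops=2)
print(local_route, wide_route)
```

Under these assumed hop counts, the route that leaves the local area is dominated almost entirely by its global hops, even though it has the same total number of hops.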
OceanStore needs an object location system that allows objects to be placed
in arbitrary locations. Having three copies of the Soda Hall room
schedule does not give you good locality unless you can place one near
Soda Hall, and you may need many random copies before one ends up there.
Finding a local object must be a local operation if we want locality.
Locality of object location is hard to get, so there might be a tendency to give up and say that while one might pay a lot for the first object access, you store a pointer to the nearest copy and so do not pay as much for later accesses. Here are a couple of problems with this idea.
- First accesses might be relatively important. Obviously, this is true if objects are accessed only once. This may in fact be the common case if objects are cached at the client.
- This limits mobility of replicas. If objects move frequently, then the value of cached pointers is limited by the rate at which replicas move.
OceanStore, in order to be at its best, needs an object location and routing system that does not travel outside the local area unless it is necessary to do so. Most object location systems available (CAN and Chord, for example) do not make this a goal. The system developed by Plaxton, Rajaraman, and Richa (PRR) does have locality; however, it is too complicated to be usable, and does not allow for insertion of nodes into the network. Tapestry gives up the theoretical guarantees of PRR to build a practical system but tries to maintain the spirit of locality present in their work.
There may be other valid approaches to this problem; for example, you
might use two completely different methods of finding objects, one
which works in the local area, and one which works in the larger
network. OceanStore is exploring this two level structure, combining
Tapestry with attenuated Bloom filters.
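
The two-level idea can be sketched as a lookup that tries a local-area mechanism first and falls back to a wide-area one only on a miss. This is a hypothetical illustration, not OceanStore or Tapestry code: the function `locate`, the dictionaries, and the node names are all assumptions, and plain dictionaries stand in for both the local mechanism (attenuated Bloom filters in OceanStore's case) and the wide-area one (Tapestry).

```python
# Hypothetical sketch: every name here is an illustrative assumption,
# not an OceanStore or Tapestry API. Plain dicts stand in for both levels.

def locate(object_id, local_index, global_lookup):
    """Return (replica_location, level), trying the local area first."""
    location = local_index.get(object_id)
    if location is not None:
        return location, "local"
    # Leave the local area only when the local mechanism misses.
    location = global_lookup(object_id)
    if location is None:
        return None, "not found"
    return location, "global"


local_index = {"soda-hall-schedule": "node-in-soda-hall"}
wide_area = {
    "soda-hall-schedule": "node-far-away",
    "other-doc": "node-far-away",
}

print(locate("soda-hall-schedule", local_index, wide_area.get))
print(locate("other-doc", local_index, wide_area.get))
```

The point of the ordering is that a request for an object with a nearby replica never touches the wide-area mechanism, so finding a local object stays a local operation.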
Last modified on 07/06/2002 by Kris Hildrum.