An RDF store (unlike an rdbms store) usually indexes everything. This means that the store's indexes are often larger then the data that it contains. Such large indexes makes I/O expensive and caching difficult. Scaling vertically by adding more memory for caching and faster disks can make a big difference, but it can get very expensive very quickly.
The alternative is to scale horizontally. This can be done in one of two ways: by mirroring the indexes on other machines, or by partitioning the indexes to other machines. The first option, called clustering, can reduce the I/O load, but will still have difficult caching it. The second option, called federating, reduces the I/O load on each machine, and can allow each machine to specialize, making caching much more effective.
Federating RDF stores is now going to be a lot easier with Sesame 3.0. Sesame 3.0 will support federating multiple (distributed) Sesame Repositories into a unified store. This allows large indexes to be distributed on multiple machines that are connected over a network. The federation supports multiple ways of partitioning the data. It can be partitioned by predicate (property), by subject, or both. When properly setup the federation can effectively proxy queries to the specialized members and join queries among the distributed members. For large-scale RDF stores, federating is becoming a valuable solution for RDF architecture.
Instructions for setting up a read only Sesame Federation can be found here: