Cassandra - A Decentralized Structured Storage System

Authors: Avinash Lakshman & Prashant Malik (Facebook folks)

Date: 2011

Link: PDF


  1. Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure.
  2. Data model:
  3. Client API:
  4. Typically a read/write request for a key gets routed to any node in the Cassandra cluster:
  5. Cassandra partitions data across the cluster using consistent hashing, but with an order-preserving hash function.
  6. The authors describe the core distributed systems techniques used in Cassandra: partitioning, replication, membership, failure handling and scaling.
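
To make point 5 more concrete, here is a minimal sketch of consistent hashing with an order-preserving hash. It is not Cassandra's actual code. The names `order_preserving_token` and `Ring`, the 8-byte token width, and the rule of placing replicas on the next nodes clockwise are all assumptions made for illustration.

```python
import bisect


def order_preserving_token(key):
    """Map a key to a ring position that keeps byte-wise key order.

    Illustrative assumption: take the first 8 bytes of the UTF-8 key,
    zero-pad them, and read them as a big-endian integer. Keys that share
    an 8-byte prefix land on the same token.
    """
    raw = key.encode("utf-8")[:8].ljust(8, b"\x00")
    return int.from_bytes(raw, "big")


class Ring:
    """Hypothetical ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # Sort nodes by their position on the ring.
        self._entries = sorted((tok, name) for name, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def replicas(self, key, n=1):
        """Return the node owning the key plus the next n-1 nodes clockwise."""
        pos = order_preserving_token(key)
        # First node whose token is >= the key's position, wrapping around.
        start = bisect.bisect_left(self._tokens, pos) % len(self._entries)
        count = min(n, len(self._entries))
        return [
            self._entries[(start + i) % len(self._entries)][1]
            for i in range(count)
        ]


ring = Ring({
    "node-a": order_preserving_token("h"),
    "node-b": order_preserving_token("p"),
    "node-c": order_preserving_token("z"),
})
print(ring.replicas("apple", n=2))
print(ring.replicas("mango"))
```

Because the hash keeps key order, adjacent keys land on the same or neighbouring nodes, which helps range scans but can skew load when keys cluster. A uniform hash makes the opposite trade.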