Sandstore

A framework for building distributed storage architectures

Modular • Storage-Agnostic • Deploy in Minutes


What Is Sandstore?

Sandstore is a modular framework for building and experimenting with distributed storage architectures. Define your components, swap implementations freely, and explore how fundamental design decisions change the behavior of your system. The framework stays the same whether you are building file or object storage, and whether metadata is colocated with the data or kept separate. Spin up a real cluster in minutes with Docker and Kubernetes.

For Students

Study distributed systems concepts through a working implementation. Read the interfaces, follow the wiring, run the cluster, and see exactly how the pieces fit together.

For Researchers

Experiment with how architectural decisions change system behavior. Swap components, build new topologies, and test your hypotheses against a real running cluster.

For Engineers

Explore production distributed storage patterns in a clean, modular codebase. Understand the tradeoffs before committing to them in your own systems.

Core Features

Pluggable Components

Every layer is an interface with swappable implementations: metadata engine, consensus mechanism, transport, and chunk storage. Change any one of them without touching the rest.
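To make the idea of swappable implementations concrete, here is a minimal sketch in Go. The `ChunkStore` interface, `memStore`, `diskStore`, and `roundTrip` are hypothetical names invented for illustration; they are not taken from the Sandstore codebase, whose real interfaces will differ.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ChunkStore is a hypothetical stand-in for a chunk storage interface.
// It is NOT copied from Sandstore; it only illustrates the pattern.
type ChunkStore interface {
	Put(id string, data []byte) error
	Get(id string) ([]byte, error)
}

// ErrNotFound is returned by both implementations for a missing chunk.
var ErrNotFound = errors.New("chunk not found")

// memStore keeps chunks in a map, which is convenient for tests.
type memStore struct{ chunks map[string][]byte }

func newMemStore() *memStore { return &memStore{chunks: make(map[string][]byte)} }

func (m *memStore) Put(id string, data []byte) error {
	m.chunks[id] = append([]byte(nil), data...) // copy so callers cannot mutate stored bytes
	return nil
}

func (m *memStore) Get(id string) ([]byte, error) {
	data, ok := m.chunks[id]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), data...), nil
}

// diskStore writes each chunk to its own file under dir.
// A real store would also validate id before using it as a file name.
type diskStore struct{ dir string }

func (d *diskStore) Put(id string, data []byte) error {
	return os.WriteFile(filepath.Join(d.dir, id), data, 0o600)
}

func (d *diskStore) Get(id string) ([]byte, error) {
	data, err := os.ReadFile(filepath.Join(d.dir, id))
	if errors.Is(err, os.ErrNotExist) {
		return nil, ErrNotFound
	}
	return data, err
}

// roundTrip depends only on the interface, so either implementation works.
func roundTrip(s ChunkStore, id string, data []byte) (string, error) {
	if err := s.Put(id, data); err != nil {
		return "", err
	}
	got, err := s.Get(id)
	return string(got), err
}

func main() {
	dir, err := os.MkdirTemp("", "chunks")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// The caller's code is identical for both implementations.
	for _, s := range []ChunkStore{newMemStore(), &diskStore{dir: dir}} {
		got, err := roundTrip(s, "chunk-1", []byte("hello"))
		fmt.Println(got, err)
	}
}
```

The calling code never names a concrete type, which is what lets one implementation be replaced without touching the rest.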

Real Topologies

Start with a working hyperconverged architecture today. The framework is designed so that building a new topology means writing new implementations, not modifying existing ones.

Deploy in Minutes

One script brings up a 3-node cluster, waits for leader election, and runs smoke tests. Working Docker and Kubernetes configurations are included.

Storage-Agnostic

The interfaces make no assumptions about whether you are building file storage, object storage, or something else entirely. The topology is yours to define.

Modular Architecture

ControlPlaneOrchestrator

Owns metadata lifecycle, placement decisions, namespace operations, and consensus coordination.

DataPlaneOrchestrator

Owns chunk movement, replica fanout, read failover, and inbound chunk RPC handlers.

MetadataReplicator

Applies metadata changes through consensus. The current implementation uses durable Raft with a write-ahead log (WAL) and CRC protection.
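As a rough illustration of what CRC protection on a write-ahead log can mean, the sketch below frames each log record with its length and a checksum, then verifies the checksum on replay. The record layout, the choice of CRC-32C, and the names `encodeRecord`, `decodeRecord`, and `ErrCorrupt` are all assumptions made for this example; Sandstore's actual on-disk format may be quite different.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// Hypothetical record layout (an assumption, not Sandstore's format):
//   [4-byte payload length][4-byte CRC-32C of payload][payload]
const headerSize = 8

var crcTable = crc32.MakeTable(crc32.Castagnoli)

// ErrCorrupt signals a truncated record or a checksum mismatch.
var ErrCorrupt = errors.New("wal: corrupt record")

// encodeRecord frames a payload with its length and checksum.
func encodeRecord(payload []byte) []byte {
	buf := make([]byte, headerSize+len(payload))
	binary.LittleEndian.PutUint32(buf[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(buf[4:8], crc32.Checksum(payload, crcTable))
	copy(buf[headerSize:], payload)
	return buf
}

// decodeRecord reads one record from the front of buf. It returns the
// payload and the bytes that follow the record.
func decodeRecord(buf []byte) (payload, rest []byte, err error) {
	if len(buf) < headerSize {
		return nil, nil, ErrCorrupt
	}
	n := binary.LittleEndian.Uint32(buf[0:4])
	want := binary.LittleEndian.Uint32(buf[4:8])
	if uint64(n) > uint64(len(buf)-headerSize) {
		return nil, nil, ErrCorrupt // length field points past the end of the log
	}
	end := headerSize + int(n)
	payload = buf[headerSize:end]
	if crc32.Checksum(payload, crcTable) != want {
		return nil, nil, ErrCorrupt
	}
	return payload, buf[end:], nil
}

func main() {
	// Build a two-record log, then replay it.
	wal := append(encodeRecord([]byte("create /a")), encodeRecord([]byte("create /b"))...)
	for len(wal) > 0 {
		payload, rest, err := decodeRecord(wal)
		if err != nil {
			fmt.Println("stopping replay:", err)
			break
		}
		fmt.Printf("replayed %q\n", payload)
		wal = rest
	}

	// Flip one payload bit to simulate on-disk corruption.
	bad := encodeRecord([]byte("create /c"))
	bad[headerSize] ^= 0xFF
	_, _, err := decodeRecord(bad)
	fmt.Println("corrupted record:", err)
}
```

Checking the length before slicing matters as much as the checksum: a torn write can leave a header whose length field runs past the end of the file.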

ClusterService

Node discovery and membership management. Current implementation uses etcd.

Communicator

Inter-node transport. gRPC is the active implementation. HTTP is available as an alternative.

ChunkService

Low-level chunk storage and retrieval. Current implementation uses local disk.

The server layer depends only on the orchestrator interfaces. Every component beneath them is an extension point.
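Of the responsibilities listed for the DataPlaneOrchestrator, read failover is the easiest to show in a few lines. The sketch below tries replicas in order and returns the first successful read. The `replicaReader` type, `readWithFailover`, and the `flakyReader` test double are hypothetical and exist only for this example; the real orchestrator's policy (ordering, retries, timeouts) is not described here and may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// replicaReader is a hypothetical function type that fetches one chunk
// from one node. In a real system this would wrap an RPC call.
type replicaReader func(node, chunkID string) ([]byte, error)

var (
	ErrAllReplicasFailed = errors.New("all replicas failed")
	errNodeDown          = errors.New("node down")
)

// readWithFailover tries each replica in order and returns the data plus
// the node that served it. It fails only if every replica fails.
func readWithFailover(read replicaReader, nodes []string, chunkID string) ([]byte, string, error) {
	lastErr := errors.New("no replicas configured")
	for _, node := range nodes {
		data, err := read(node, chunkID)
		if err == nil {
			return data, node, nil
		}
		lastErr = fmt.Errorf("%s: %v", node, err)
	}
	return nil, "", fmt.Errorf("%w: last error: %v", ErrAllReplicasFailed, lastErr)
}

// flakyReader simulates a cluster in which the nodes listed in down
// refuse every read.
func flakyReader(down map[string]bool) replicaReader {
	return func(node, chunkID string) ([]byte, error) {
		if down[node] {
			return nil, errNodeDown
		}
		return []byte(chunkID + "@" + node), nil
	}
}

func main() {
	replicas := []string{"node-a", "node-b", "node-c"}

	// node-a is down, so the read should be served by the next replica.
	read := flakyReader(map[string]bool{"node-a": true})
	data, node, err := readWithFailover(read, replicas, "chunk-7")
	fmt.Println(string(data), node, err)

	// With every replica down the caller gets a wrapped sentinel error.
	allDown := flakyReader(map[string]bool{"node-a": true, "node-b": true, "node-c": true})
	_, _, err = readWithFailover(allDown, replicas, "chunk-7")
	fmt.Println(errors.Is(err, ErrAllReplicasFailed))
}
```

Passing the reader in as a function keeps the failover logic independent of the transport, in the same spirit as the Communicator interface.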

Get Started

Ready to explore distributed systems? Clone the repo, build the project, and start experimenting with a real distributed file system.

Start a 3-node cluster

git clone https://github.com/AnishMulay/sandstore
cd sandstore
docker compose -f deploy/docker/etcd/docker-compose.yaml up -d
./scripts/dev/run-smoke.sh

This boots a full 3-node Sandstore cluster on localhost, waits for leader election, and runs an end-to-end smoke test against it.

Contribute & Learn

Study the Code

Start with servers/node/wire_grpc_etcd.go to see how the current topology is assembled. Then read internal/orchestrators/interfaces.go to understand the extension points.

Build a Topology

Implement new versions of the orchestrator interfaces, write a wiring file, and you have a new architecture. No changes to the server layer or deploy tooling required.
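The shape of that workflow can be sketched in a single file. Everything below is a toy: the `ControlPlane` and `DataPlane` interfaces are heavily reduced, hypothetical stand-ins for the real orchestrator interfaces in internal/orchestrators/interfaces.go, and `wireToyTopology` only plays the role that a wiring file would play.

```go
package main

import "fmt"

// Hypothetical, reduced orchestrator interfaces. The real ones will
// have different methods and signatures.
type ControlPlane interface {
	Place(path string) (node string)
}

type DataPlane interface {
	Write(node, path string, data []byte) error
	Read(node, path string) ([]byte, error)
}

// Server sees only the two interfaces, never a concrete topology.
type Server struct {
	cp ControlPlane
	dp DataPlane
}

func (s *Server) Put(path string, data []byte) error {
	return s.dp.Write(s.cp.Place(path), path, data)
}

func (s *Server) Get(path string) ([]byte, error) {
	return s.dp.Read(s.cp.Place(path), path)
}

// hashPlacement is a toy control plane: it picks a node from the byte
// sum of the path, so the same path always maps to the same node.
type hashPlacement struct{ nodes []string }

func (h hashPlacement) Place(path string) string {
	sum := 0
	for i := 0; i < len(path); i++ {
		sum += int(path[i])
	}
	return h.nodes[sum%len(h.nodes)]
}

// memDataPlane is a toy data plane that stores bytes per node in memory.
type memDataPlane struct{ byNode map[string]map[string][]byte }

func (m *memDataPlane) Write(node, path string, data []byte) error {
	if m.byNode[node] == nil {
		m.byNode[node] = make(map[string][]byte)
	}
	m.byNode[node][path] = append([]byte(nil), data...)
	return nil
}

func (m *memDataPlane) Read(node, path string) ([]byte, error) {
	data, ok := m.byNode[node][path]
	if !ok {
		return nil, fmt.Errorf("%s not found on %s", path, node)
	}
	return data, nil
}

// wireToyTopology plays the role of a wiring file: it is the only place
// that names concrete implementations.
func wireToyTopology() *Server {
	return &Server{
		cp: hashPlacement{nodes: []string{"node-a", "node-b", "node-c"}},
		dp: &memDataPlane{byNode: make(map[string]map[string][]byte)},
	}
}

func main() {
	srv := wireToyTopology()
	if err := srv.Put("/docs/readme", []byte("hello")); err != nil {
		panic(err)
	}
	data, err := srv.Get("/docs/readme")
	fmt.Println(string(data), err)
}
```

A second topology would add new implementations and a second wiring function, leaving `Server` unchanged.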

Whether you are learning distributed systems for the first time or designing them professionally, Sandstore gives you something real to run, read, and modify.

View Issues and Contribute