# toy_sqlite **Repository Path**: mr_jianlong/toy_sqlite ## Basic Information - **Project Name**: toy_sqlite - **Description**: No description available - **Primary Language**: Python - **License**: Apache-2.0 - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2026-02-24 - **Last Updated**: 2026-04-29 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README # Toy SQLite A distributed SQLite database backed by Raft consensus protocol. ## Overview Toy SQLite implements a simple distributed key-value store that provides: - **Raft consensus** for consistent data replication - **Linearizable reads/writes** through the Raft leader - **Automatic leader election** on leader failure - **Dynamic cluster membership** - **SQLite-backed persistence** on each node This is a lightweight alternative to distributed databases like CockroachDB or TiDB for applications that need simple distributed persistence without complex infrastructure. ## Quick Start Get a single-node Toy SQLite instance running in 5 minutes: ### 1. Navigate to project directory ```bash cd /path/to/toy_sqlite ``` ### 2. Install dependencies ```bash pip install protobuf grpcio asyncio ``` ### 3. Start a single node ```bash python -m toy_sqlite \ --node-id node-1 \ --bind-address 127.0.0.1 \ --bind-port 9000 \ --data-dir ./data ``` You should see output like: ```log [node-1] INFO Node started, listening on 127.0.0.1:9000 [node-1] INFO Node started successfully ``` ### 4. Verify operation The node is now running and ready for cluster membership. To test with a full cluster, see the [Local Development](#local-development) section below. ## Architecture ```log ┌─────────────────────────────────────────────────────────────┐ │ Client API (KV) │ │ Get / Set / Delete / List │ └──────────────────────────┬──────────────────────────────────┘ │ ┌──────────────────────────▼──────────────────────────────────┐ │ Raft Consensus │ │ (Leader Election + Log Replication) │ └──────────────────────────┬──────────────────────────────────┘ │ ┌──────────────────────────▼──────────────────────────────────┐ │ State Machine │ │ (KV Storage) │ └──────────────────────────┬──────────────────────────────────┘ │ ┌──────────────────────────▼──────────────────────────────────┐ │ SQLite Storage │ │ (Logs + Metadata + KV Store) │ └─────────────────────────────────────────────────────────────┘ ``` ## Installation ### Requirements - **Python** 3.12+ (required by pyproject.toml) - **Dependencies**: protobuf, grpcio, asyncio (managed via pyproject.toml) ### Setup (Project Directory) All installation and setup commands should be run from the **project root directory** (`/path/to/toy_sqlite`): ```bash # Install dependencies pip install protobuf grpcio asyncio # Compile protobuf definitions protoc --python_out=src/toy_sqlite proto/*.proto ``` **Note**: The data directory will be automatically created on first run at `./data` (by default). ## Local Development For development and testing, run a 3-node local cluster on your machine. ### Setup 3-Node Cluster Each node will bind to a different port on localhost. This verifies Raft consensus behavior. #### Terminal 1: Start Node 1 ```bash cd /path/to/toy_sqlite python -m toy_sqlite \ --node-id node-1 \ --bind-address 127.0.0.1 \ --bind-port 9000 \ --seed-nodes "127.0.0.1:9001,127.0.0.1:9002" \ --data-dir ./data/node-1 ``` #### Terminal 2: Start Node 2 ```bash cd /path/to/toy_sqlite python -m toy_sqlite \ --node-id node-2 \ --bind-address 127.0.0.1 \ --bind-port 9001 \ --seed-nodes "127.0.0.1:9000,127.0.0.1:9002" \ --data-dir ./data/node-2 ``` #### Terminal 3: Start Node 3 ```bash cd /path/to/toy_sqlite python -m toy_sqlite \ --node-id node-3 \ --bind-address 127.0.0.1 \ --bind-port 9002 \ --seed-nodes "127.0.0.1:9000,127.0.0.1:9001" \ --data-dir ./data/node-3 ``` ### Port Allocation For local development: | Node | Port | Seed Nodes | | ------ | ---- | ------------------------------ | | node-1 | 9000 | 127.0.0.1:9001, 127.0.0.1:9002 | | node-2 | 9001 | 127.0.0.1:9000, 127.0.0.1:9002 | | node-3 | 9002 | 127.0.0.1:9000, 127.0.0.1:9001 | **Key Points:** - Each node has a unique port - Seed nodes include all other cluster members - Unique data directories prevent conflicts ### Verify Cluster When all three nodes are running, you should see: - **Election messages**: Nodes voting and electing a leader - **Heartbeat messages**: Leader sending heartbeats to followers - **One leader elected**: Check logs for "Leader elected" message ## Production Deployment Deploying to production requires planning for cluster size, network configuration, and operational procedures. ### Cluster Size Requirements For fault tolerance, use an odd number of nodes: | Cluster Size | Fault Tolerance | Majority Required | | ------------ | --------------- | ----------------- | | 3 nodes | 1 node failure | 2 nodes (67%) | | 5 nodes | 2 node failures | 3 nodes (60%) | | 7 nodes | 3 node failures | 4 nodes (57%) | **Recommendation**: Start with 3 nodes for production minimum, expand to 5 for higher availability. ### Host Configuration Example Deploy nodes across separate physical hosts: | Host | Node IP | Port | Data Directory | | ------ | ------- | -------- | -------------- | -------------------------- | | host-1 | node-1 | 10.0.1.1 | 9000 | /var/lib/toy_sqlite/node-1 | | host-2 | node-2 | 10.0.1.2 | 9000 | /var/lib/toy_sqlite/node-2 | | host-3 | node-3 | 10.0.1.3 | 9000 | /var/lib/toy_sqlite/node-3 | ### Node Startup Commands (Production) **On host-1 (10.0.1.1):** ```bash python -m toy_sqlite \ --node-id node-1 \ --bind-address 0.0.0.0 \ --bind-port 9000 \ --seed-nodes "10.0.1.2:9000,10.0.1.3:9000" \ --data-dir /var/lib/toy_sqlite/node-1 \ --log-level INFO ``` **On host-2 (10.0.1.2):** ```bash python -m toy_sqlite \ --node-id node-2 \ --bind-address 0.0.0.0 \ --bind-port 9000 \ --seed-nodes "10.0.1.1:9000,10.0.1.3:9000" \ --data-dir /var/lib/toy_sqlite/node-2 \ --log-level INFO ``` **On host-3 (10.0.1.3):** ```bash python -m toy_sqlite \ --node-id node-3 \ --bind-address 0.0.0.0 \ --bind-port 9000 \ --seed-nodes "10.0.1.1:9000,10.0.1.2:9000" \ --data-dir /var/lib/toy_sqlite/node-3 \ --log-level INFO ``` ### Network Configuration **Firewall Ports:** ```bash # Open port 9000 for Raft communication (adjust if using different port) sudo ufw allow 9000/tcp ``` **Bind Address:** - Use `0.0.0.0` for production (listen on all interfaces) - Use `127.0.0.1` for local development only ### Configuration Recommendations | Environment | Election Timeout | Heartbeat | Snapshot Threshold | | ------------ | ---------------- | --------- | ------------------ | | Local Dev | 150-300ms | 50ms | 1000 | | Same Data | 300-600ms | 100ms | 5000 | | Cross-Region | 1000-2000ms | 500ms | 10000 | ### Data Persistence **Data Directory Requirements:** - Use dedicated disk/volume for data directory - ~1GB per 1M key-value pairs (depending on value size) - Automated snapshots keep disk usage bounded **Backup Procedure:** 1. Stop node gracefully (SIGTERM) 2. Copy entire data directory 3. Restart node 4. Verify cluster health ### Rolling Upgrade To upgrade nodes without downtime: 1. Upgrade follower nodes one at a time 2. Verify each node rejoins cluster 3. Upgrade leader last (it will step down) 4. Confirm new leader elected ### Production Checklist - [ ] Odd number of nodes (3, 5, or 7) - [ ] Firewalls configured for Raft ports - [ ] Persistent volumes mounted for data directories - [ ] Monitoring configured for leader changes - [ ] Election timeouts tuned for network latency - [ ] Backup/restore procedure tested - [ ] Log rotation configured ## Docker Deployment Use Docker for easy testing and containerized deployment. ### Dockerfile Create `Dockerfile` in the project root: ```dockerfile FROM python:3.12-slim WORKDIR /app # Install system dependencies RUN apt-get update && apt-get install -y \ build-essential \ protobuf-compiler \ && rm -rf /var/lib/apt/lists/* # Copy project files COPY pyproject.toml ./ COPY src/ ./src/ COPY proto/ ./proto/ # Install Python dependencies RUN pip install --no-cache-dir protobuf grpcio asyncio # Compile protobuf definitions RUN protoc --python_out=src/toy_sqlite proto/*.proto # Create data directory RUN mkdir -p /data # Set working directory for node WORKDIR /app # Default command (can be overridden) CMD ["python", "-m", "toy_sqlite", "--bind-address", "0.0.0.0", "--data-dir", "/data"] ``` ### docker-compose.yml Create `docker-compose.yml` for a 3-node cluster: ```yaml version: "3.8" services: node-1: build: . command: > python -m toy_sqlite --node-id node-1 --bind-address 0.0.0.0 --bind-port 9000 --seed-nodes "node-2:9000,node-3:9000" --data-dir /data ports: - "9000:9000" volumes: - node1-data:/data networks: - toy-sqlite-net node-2: build: . command: > python -m toy_sqlite --node-id node-2 --bind-address 0.0.0.0 --bind-port 9000 --seed-nodes "node-1:9000,node-3:9000" --data-dir /data ports: - "9001:9000" volumes: - node2-data:/data networks: - toy-sqlite-net node-3: build: . command: > python -m toy_sqlite --node-id node-3 --bind-address 0.0.0.0 --bind-port 9000 --seed-nodes "node-1:9000,node-2:9000" --data-dir /data ports: - "9002:9000" volumes: - node3-data:/data networks: - toy-sqlite-net volumes: node1-data: node2-data: node3-data: networks: toy-sqlite-net: driver: bridge ``` ### Running with Docker **Build and start cluster:** ```bash cd /path/to/toy_sqlite docker-compose up -d ``` **View logs:** ```bash # All nodes docker-compose logs -f # Specific node docker-compose logs -f node-1 ``` **Stop cluster:** ```bash docker-compose down ``` **Verify operation:** 1. Logs should show cluster forming 2. One node should be elected leader 3. Heartbeats visible in logs ## Verification and Testing After deploying your cluster, verify it's working correctly. ### Cluster Connectivity Check ```bash # Check node logs for connection messages grep "Node started" logs/node-1.log # Verify nodes can reach each other telnet 9000 ``` ### Leader Election Verification Node logs indicate leader选举: ``` [node-1] INFO Leader elected: node-1, Term: 1 ``` Or programmatically: ```python from toy_sqlite.node import ToySQLiteNode from toy_sqlite.config import Config, NodeConfig # Connect to any node config = NodeConfig.from_args({ "node_id": "client", "bind_port": 9999, "seed_nodes": ["127.0.0.1:9000"] }).config node = ToySQLiteNode(config) await node.start() # Check leader leader, state = node.kv_service.get_leader_info() print(f"Current leader: {leader}") print(f"My state: {state}") await node.stop() ``` ### Sample Operations Test basic read/write operations: ```python from toy_sqlite.node import ToySQLiteNode from toy_sqlite.config import Config, NodeConfig config = NodeConfig.from_args({ "node_id": "client", "bind_port": 9999, "seed_nodes": ["127.0.0.1:9000"] }).config node = ToySQLiteNode(config) await node.start() # Write key-value node.kv_service.set("test-key", "test-value") # Read back value = node.kv_service.get("test-key") print(f"Read value: {value}") # Output: test-value # List keys keys = node.kv_service.list() print(f"All keys: {keys}") # Delete node.kv_service.delete("test-key") await node.stop() ``` ## Configuration Reference ### Command-Line Options All commands should be run from the **project root directory**. | Option | Default | Description | Recommended Values | | ------------------------ | ----------- | ----------------------------------------------------------- | ----------------------------- | | `--node-id` | `node-1` | Unique identifier for this node (must be unique in cluster) | Use hostname or readable ID | | `--bind-address` | `127.0.0.1` | Address to bind for incoming connections | `0.0.0.0` for production | | `--bind-port` | `9000` | Port to bind for incoming connections | Use standard or assigned port | | `--seed-nodes` | | Comma-separated seed node addresses (host:port) | All other cluster members | | `--election-timeout-min` | `150` | Minimum election timeout (milliseconds) | 150-2000 (adjust for network) | | `--election-timeout-max` | `300` | Maximum election timeout (milliseconds) | 2x-3x of min value | | `--heartbeat-interval` | `50` | Heartbeat interval (milliseconds) | 50-500 (1/10 of min timeout) | | `--snapshot-threshold` | `1000` | Log entries threshold for automatic snapshot | 1000-10000 (higher for prod) | | `--data-dir` | `./data` | Directory for persistent data (SQLite DB) | Use dedicated volume path | | `--log-level` | `INFO` | Log level (DEBUG/INFO/WARNING/ERROR) | `INFO` for production | ### Configuration Interdependencies - **Election Timeout vs Network Latency** - Election timeout should be 10-20x your typical network latency - Higher latency → increase election timeout range - **Heartbeat Interval vs Election Timeout** - Heartbeat should be 1/10 to 1/5 of minimum election timeout - Too fast → excessive network traffic - Too slow → leader fails to maintain control - **Snapshot Threshold vs Disk Space** - Higher threshold = larger logs before compaction - Lower threshold = more frequent compaction overhead - Balance between disk space and CPU usage ### Common Configuration Patterns **Development (single machine, 3 nodes):** ```bash python -m toy_sqlite \ --node-id node-1 \ --bind-address 127.0.0.1 \ --bind-port 9000 \ --seed-nodes "127.0.0.1:9001,127.0.0.1:9002" \ --data-dir ./data/node-1 \ --log-level DEBUG ``` **Production (multiple hosts):** ```bash python -m toy_sqlite \ --node-id db-primary-1 \ --bind-address 0.0.0.0 \ --bind-port 9000 \ --seed-nodes "db-primary-2:9000,db-primary-3:9000" \ --data-dir /var/lib/toy_sqlite/db-primary-1 \ --log-level INFO \ --election-timeout-min 500 \ --election-timeout-max 1000 \ --heartbeat-interval 200 ``` **High Write Volume:** ```bash python -m toy_sqlite \ --node-id high-throughput-1 \ --bind-address 0.0.0.0 \ --bind-port 9000 \ --seed-nodes "..." \ --data-dir /ssd/toy_sqlite/node-1 \ --snapshot-threshold 5000 ``` ## Troubleshooting ### Node won't start **Symptom**: "Address already in use" error **Solution**: Another process is using the port. Check with: ```bash lsof -i :9000 ``` **Fix**: Stop conflicting process or use different `--bind-port`. ### No leader elected **Symptom**: All nodes showing as follower/candidate **Causes**: - Network partition preventing communication - Election timeout too aggressive for network latency - Nodes can't reach majority (2/3 of cluster) **Solution**: 1. Check network connectivity between nodes 2. Increase election timeout range (try 500-1000ms instead of 150-300ms) 3. Ensure all nodes are running 4. Check logs for vote granting/rejections ### Write hangs forever **Symptom**: SET/DELETE operations never return **Cause**: Current node is not the leader **Solution**: ```python leader, state = node.kv_service.get_leader_info() print(f"Current leader: {leader}, My state: {state}") ``` **Fix**: If not leader, either: - Connect node to join cluster - Redirect operations to the leader node - Wait for current leader to write or step down ### Inconsistent data between nodes **Symptom**: Different values on different nodes **Causes**: - Network partition occurred but writes continued - Snapshot corruption - Manually modified SQLite database **Solution**: 1. Identify which partition is the current leader 2. Nodes in minority partition should be restarted with clean data 3. Verify snapshot integrity 4. Restore from backup if needed ### Logs grow indefinitely **Symptom**: Disk space filling up **Cause**: Snapshot threshold too high or automatic snapshots disabled **Solution**: 1. Reduce `--snapshot-threshold` (try 500-1000 instead of 1000+) 2. Verify snapshot creation in logs: look for "Snapshot created" 3. Manually trigger snapshot if needed: ```bash # Stop node, restart with lower threshold to trigger snapshot ``` ### Connection refused from peers **Symptom**: "Connection refused" or "Peer unreachable" **Causes**: - Seed node addresses incorrect - Firewall blocking Raft port - Wrong bind address (using 127.0.0.1 instead of 0.0.0.0) **Solution**: 1. Verify seed node addresses match actual listening addresses 2. Open firewall for Raft port 3. Use `--bind-address 0.0.0.0` for multi-host deployments ## API The node provides a key-value API: ```python from toy_sqlite.node import ToySQLiteNode from toy_sqlite.config import Config # Create node config = Config( node_id="node-1", bind_address="127.0.0.1", bind_port=9000, seed_nodes=["127.0.0.1:9001"], ) node = ToySQLiteNode(config) await node.start() # Get value (linearizable read) value = node.kv_service.get("my-key") # Set value (goes through Raft consensus) node.kv_service.set("my-key", "my-value") # Delete value node.kv_service.delete("my-key") # List all keys keys = node.kv_service.list() await node.stop() ``` ## Features - **Linearizable consistency**: All reads and writes go through Raft - **Automatic leader election**: Elects new leader on failure - **Log replication**: Replicates writes to majority before committing - **Dynamic membership**: Nodes can join/leave the cluster - **Snapshots**: Compacts log automatically when it grows large - **SQLite persistence**: Data persists across restarts ## Design Decisions - **Raft over Paxos**: Raft is simpler to implement correctly - **Asyncio + Protobuf**: Standard Python async networking with compact serialization - **Single-threaded per node**: Simpler architecture with no locking issues - **Key-value API only**: SQL support is a non-goal for MVP ## Limitations - Key-value interface only (no SQL queries) - Single-key operations (no multi-key transactions) - Requires majority for writes (minimum 3 nodes recommended) - Network partition handling may pause writes ## Testing ```bash # Run tests pytest tests/ # Run specific test pytest tests/test_protobuf.py -v ``` ## Implementation Status - [x] Project structure and configuration - [x] Protobuf message definitions - [x] SQLite storage layer - [x] Raft consensus core - [x] Network communication layer - [x] KV API and state machine - [x] Cluster membership - [x] Snapshot management - [x] Main node integration - [ ] Comprehensive test suite - [ ] Performance benchmarks ## Monitoring ### Key Metrics - **Current Term**: Increments on each election - **Commit Index**: Highest committed log entry - **Last Applied**: Highest entry applied to state machine - **Leader ID**: Current cluster leader ### Log Messages - `[...] Leader elected` - New leader elected - `[...] Snapshot created` - Log compaction occurred - `[...] Vote granted` - Node voted in election ### Health Checks ```python # Check if node is leader if node.kv_service.is_leader(): print("Node is leader") # Get cluster state leader, state = node.kv_service.get_leader_info() ``` ## License Apache License 2.0 ## References - [Raft Paper](https://raft.github.io/raft.pdf) - The Raft Consensus Algorithm - [Raft GitHub](https://github.com/ongardie/raft.github.io) - Raft resources and links