Make plan

This commit is contained in:
parent 2a2afa5f69 / commit 23a075382f
3 changed files with 386 additions and 0 deletions

PHILOSOPHY.md (new file, 157 lines)
# Project Philosophy

## Principles

1. **Explicit over clever** — no magic helpers, no macros that hide control
   flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
   understand what a function does without chasing through layers of
   indirection.

2. **Pure functions** — isolate decision logic from IO. A function that takes
   data and returns data is testable, composable, and easy to reason about.
   Keep it that way. Don't sneak in network calls or logging.

3. **Linear flow** — avoid callbacks, deep nesting, and async gymnastics where
   possible. A handler should read like a sequence of steps: look up the
   record, pick a volume, build the response.

4. **Minimize shared state** — pass values explicitly. The handler reads the
   healthy volume set as a snapshot, then works with that snapshot. Don't hold
   locks across IO. Don't reach into globals.

5. **Minimize indirection** — don't hide logic behind abstractions that exist
   "in case we need to swap the implementation later." We won't. A three-line
   function inline is better than a trait with one implementor.

## Applying the principles: separate decisions from execution

Every request handler does two things: **decides** what should happen, then
**executes** IO to make it happen. These should be separate functions.

A decision is a pure function. It takes data in, returns a description of what
to do. It doesn't call the network, doesn't touch the database, doesn't log.
It can be tested with `assert_eq!` and nothing else.

Execution is the messy part — HTTP calls, SQLite writes, error recovery. It
reads the decision and carries it out. It's tested with integration tests.
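As a minimal sketch of the split (the names here are illustrative, not the project's real types): the decision half returns a plain value, and the execution half pattern-matches on it to perform IO.

```rust
// Illustrative only — not the project's actual types.

/// The decision: pure data describing what IO should happen next.
#[derive(Debug, PartialEq)]
enum Action {
    Store { volumes: Vec<String> },
    Reject { reason: String },
}

/// Decide: data in, data out. Testable with `assert_eq!` alone.
fn decide(candidates: &[String], needed: usize) -> Action {
    if candidates.len() >= needed {
        Action::Store { volumes: candidates[..needed].to_vec() }
    } else {
        Action::Reject {
            reason: format!("need {needed} volumes, have {}", candidates.len()),
        }
    }
}

// The execution half would match on `Action` and do the HTTP calls /
// DB writes. It carries the decision out; it never makes it.
```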

## Where this applies today

### Already pure

**`hasher.rs`** — the entire module is pure. `Ring` is a data structure.
`get_volumes` and `key_path` are deterministic functions of their inputs. No
IO, no state mutation beyond construction. This is the gold standard for the
project.

**`db.rs` query functions** — `get`, `list_keys`, `all_records` take a
`&Connection` and return data. The connection is injected, not owned. The
functions don't decide what to do with the data — they just retrieve it.

**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
transformations between JSON strings and `Vec<String>`.

### Mixed (decision + execution interleaved)

**`server.rs::put_key`** — this handler does four things in one function:

1. *Decide* which volumes to write to (pure — ring lookup)
2. *Execute* fan-out PUTs to nginx (IO)
3. *Decide* whether to rollback based on results (pure — check which succeeded)
4. *Execute* rollback DELETEs and/or index write (IO)

Steps 1 and 3 could be extracted as pure functions:

```rust
// Pure: given a key and ring, compute the placement plan
struct PutPlan {
    path: String,
    target_volumes: Vec<String>,
}

fn plan_put(ring: &Ring, key: &str, replication: usize) -> Result<PutPlan, AppError> {
    let path = Ring::key_path(key);
    let target_volumes = ring.get_volumes(key, replication);
    if target_volumes.len() < replication {
        return Err(AppError::VolumeError(...));
    }
    Ok(PutPlan { path, target_volumes })
}

// Pure: given fan-out results, decide what to do next
enum PutOutcome {
    AllSucceeded { volumes: Vec<String> },
    NeedsRollback { succeeded: Vec<String> },
}

fn evaluate_put_results(results: &[(String, Result<(), String>)]) -> PutOutcome { ... }
```
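One possible body for `evaluate_put_results`, sketched under the assumption that any failed PUT means the write must be rolled back (the enum is repeated so the sketch stands alone):

```rust
#[derive(Debug, PartialEq)]
enum PutOutcome {
    AllSucceeded { volumes: Vec<String> },
    NeedsRollback { succeeded: Vec<String> },
}

// Pure: partition the fan-out results; any failure triggers rollback.
fn evaluate_put_results(results: &[(String, Result<(), String>)]) -> PutOutcome {
    let succeeded: Vec<String> = results
        .iter()
        .filter(|(_, r)| r.is_ok())
        .map(|(volume, _)| volume.clone())
        .collect();
    if succeeded.len() == results.len() {
        PutOutcome::AllSucceeded { volumes: succeeded }
    } else {
        PutOutcome::NeedsRollback { succeeded }
    }
}
```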

**`server.rs::get_key`** — the "pick a healthy volume" logic is a pure
function hiding inside an async handler:

```rust
// Pure: given a record's volumes and the healthy set, pick one
fn pick_healthy_volume<'a>(
    record_volumes: &'a [String],
    healthy: &HashSet<String>,
) -> Option<&'a str> {
    record_volumes.iter().find(|v| healthy.contains(*v)).map(|v| v.as_str())
}
```
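Once extracted, the decision is exercised with data and `assert_eq!` alone (the function is repeated here so the sketch is self-contained):

```rust
use std::collections::HashSet;

// Pure: given a record's volumes and the healthy set, pick one.
fn pick_healthy_volume<'a>(
    record_volumes: &'a [String],
    healthy: &HashSet<String>,
) -> Option<&'a str> {
    record_volumes.iter().find(|v| healthy.contains(*v)).map(|v| v.as_str())
}

// A unit test needs no server and no Docker, just data.
fn check_pick() {
    let record = vec!["vol2".to_string(), "vol1".to_string()];
    let healthy: HashSet<String> = ["vol1".to_string()].into_iter().collect();
    // vol2 is listed first but unhealthy, so vol1 is picked.
    assert_eq!(pick_healthy_volume(&record, &healthy), Some("vol1"));
    // No healthy replica at all: the handler would answer 503.
    assert_eq!(pick_healthy_volume(&record, &HashSet::new()), None);
}
```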

### Intentionally impure

**`volume.rs`** — this is an IO boundary. It wraps `reqwest` and talks to
nginx. There's no decision logic here to extract; it's a thin adapter. Testing
it means mocking HTTP. That's fine.

**`health.rs`** — a side-effecting loop. It polls volumes and mutates shared
state. No pure core to extract. Keep it simple.

**`db.rs` writer thread** — the batch-and-commit loop is inherently stateful.
The `execute_cmd` function is close to pure (it takes a connection and a
command, returns a result), but it mutates the database. The batching logic
(drain channel, group into transaction) is a state machine. Not worth
abstracting further.

## Guidelines

1. **If a function takes only data and returns only data, it's pure.** Keep it
   that way. Don't sneak in logging, metrics, or "just one network call."

2. **If a handler has an `if` or `match` that decides between outcomes, that
   decision can probably be a pure function.** Extract it. Name it. Test it.

3. **IO boundaries should be thin.** `volume.rs` is a good example: format URL,
   make request, check status, return bytes. No business logic.

4. **Don't over-abstract.** A three-line pure function inline in a handler is
   fine. Extract it when it gets complex enough to need its own tests, or when
   the same decision appears in multiple places (e.g., rebuild and rebalance
   both need "compute desired placement").

5. **Shared state should be read-only snapshots when possible.** The handler
   reads `healthy_volumes` and `ring` under a read lock, then releases it
   before doing IO. This keeps the critical section small and makes the
   decision logic operate on a snapshot, not live-mutating state.

6. **Errors are data.** `AppError` is a value, not an exception. Functions
   return `Result`, handlers pattern-match on it. The `IntoResponse` impl is
   the only place where errors become HTTP responses — one place, one mapping.
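The snapshot pattern in guideline 5 can be sketched with std's `RwLock` (illustrative; the project's actual state type may differ):

```rust
use std::collections::HashSet;
use std::sync::RwLock;

// Hypothetical shared state as a handler would see it.
struct AppState {
    healthy_volumes: RwLock<HashSet<String>>,
}

// Take a snapshot: clone under the read lock, then release it
// immediately. All decision logic runs on the clone, so no lock
// is ever held across IO.
fn snapshot_healthy(state: &AppState) -> HashSet<String> {
    state.healthy_volumes.read().unwrap().clone()
}
```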

## Anti-patterns to avoid

- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
  decisions, handles errors, and formats the response. Break it up.

- **Stringly-typed errors in business logic** — `volume.rs` uses `String` errors
  because it's an IO boundary and the strings are for logging. Decision
  functions should use typed errors.

- **Hidden state reads** — if a function needs the healthy volume set, pass it
  in. Don't reach into a global or lock a mutex inside a "pure" function.

- **Testing IO to test logic** — if you need a Docker container running to test
  whether "pick a healthy volume" works correctly, the logic isn't separated
  from the IO.

TESTING.md (new file, 169 lines)
# Testing Strategy

## Layers

### 1. Unit tests (no IO)

Pure logic that can be tested with `cargo test`, no containers needed.

**`hasher.rs`** (already done)

- Deterministic placement for the same key
- All volumes appear when requesting full replication
- Even distribution across volumes
- Correct key path format

**`db.rs`** (TODO)

- Open an in-memory SQLite (`:memory:`)
- Test put/get/delete/list_keys/all_records round-trip
- Test upsert behavior (put same key twice)
- Test soft delete (deleted flag)
- Test bulk_put

**Pure decision functions** (TODO, after refactor)

- Given a record and a set of healthy volumes, which volume to redirect to?
- Given fan-out results (list of Ok/Err), which volumes succeeded? Should we rollback?
- Given current vs desired volume placement, what needs to move?
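The last of these ("what needs to move?") is a natural pure function. A minimal sketch, assuming a per-key diff of current vs desired volume sets (names are hypothetical):

```rust
use std::collections::HashSet;

// Hypothetical plan for one key: where to copy the blob, where to delete it.
#[derive(Debug, PartialEq)]
struct MovePlan {
    copy_to: Vec<String>,
    delete_from: Vec<String>,
}

// Pure diff: desired minus current = copies; current minus desired = deletes.
fn plan_moves(current: &[String], desired: &[String]) -> MovePlan {
    let cur: HashSet<&str> = current.iter().map(|s| s.as_str()).collect();
    let des: HashSet<&str> = desired.iter().map(|s| s.as_str()).collect();
    let mut copy_to: Vec<String> =
        desired.iter().filter(|v| !cur.contains(v.as_str())).cloned().collect();
    let mut delete_from: Vec<String> =
        current.iter().filter(|v| !des.contains(v.as_str())).cloned().collect();
    copy_to.sort();
    delete_from.sort();
    MovePlan { copy_to, delete_from }
}
```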

### 2. Volume client tests (mock HTTP)

Use a lightweight in-process HTTP server (e.g. `axum` itself or `wiremock`) to
test `volume.rs` without real nginx.

- PUT blob + .key sidecar → verify both requests made
- GET blob → verify body returned
- DELETE blob → verify both blob and .key deleted
- DELETE non-existent → verify 404 is treated as success
- Health check → respond 200 → healthy; timeout → unhealthy

### 3. Integration tests (real nginx)

Full end-to-end with Docker containers. These are slower but catch real issues.

#### Setup

```yaml
# docker-compose.test.yml
services:
  vol1:
    image: nginx:alpine
    volumes:
      - ./tests/nginx.conf:/etc/nginx/conf.d/default.conf
      - vol1_data:/data
    ports: ["3101:80"]

  vol2:
    image: nginx:alpine
    volumes:
      - ./tests/nginx.conf:/etc/nginx/conf.d/default.conf
      - vol2_data:/data
    ports: ["3102:80"]

  vol3:
    image: nginx:alpine
    volumes:
      - ./tests/nginx.conf:/etc/nginx/conf.d/default.conf
      - vol3_data:/data
    ports: ["3103:80"]

volumes:
  vol1_data:
  vol2_data:
  vol3_data:
```

```nginx
# tests/nginx.conf
server {
    listen 80;
    root /data;

    location / {
        dav_methods PUT DELETE;
        create_full_put_path on;
        autoindex on;
        autoindex_format json;
    }
}
```

```toml
# tests/test_config.toml
[server]
port = 3100
replication_factor = 2
virtual_nodes = 100

[database]
path = "/tmp/mkv-test/index.db"

[[volumes]]
url = "http://localhost:3101"

[[volumes]]
url = "http://localhost:3102"

[[volumes]]
url = "http://localhost:3103"
```

#### Test cases

**Happy path**

1. PUT `/hello` with body `"world"` → 201
2. HEAD `/hello` → 200, Content-Length: 5
3. GET `/hello` → 302 to a volume URL
4. Follow redirect → body is `"world"`
5. GET `/` → list contains `"hello"`
6. DELETE `/hello` → 204
7. GET `/hello` → 404

**Replication verification**

1. PUT `/replicated` → 201
2. Read SQLite directly, verify 2 volumes listed
3. GET blob from both volume URLs directly, verify identical content

**Volume failure**

1. PUT `/failtest` → 201
2. Stop vol1 container
3. GET `/failtest` → should still 302 to vol2 (healthy replica)
4. PUT `/new-during-failure` → should fail if replication_factor can't be met,
   or succeed on remaining volumes depending on ring placement
5. Restart vol1

**Rebuild**

1. PUT several keys
2. Delete the SQLite database
3. Run `mkv rebuild`
4. GET all keys → should all still work

**Rebalance**

1. PUT several keys with 3 volumes
2. Add a 4th volume to config
3. Run `mkv rebalance --dry-run` → verify output
4. Run `mkv rebalance` → verify keys migrated
5. GET all keys → should all work

#### Running integration tests

```bash
# Start volumes
docker compose -f docker-compose.test.yml up -d

# Wait for nginx to be ready
sleep 2

# Run integration tests
cargo test --test integration

# Tear down
docker compose -f docker-compose.test.yml down -v
```

The integration test binary (`tests/integration.rs`) starts the mkv server
in-process on a random port, runs all test cases sequentially (shared state),
then shuts down.

### 4. What we don't test

- Performance / benchmarks (follow-up)
- TLS, auth (not implemented)
- Concurrent writers racing on the same key (SQLite serializes this correctly by design)
- Blobs > available RAM (streaming is a follow-up)

TODO.md (new file, 60 lines)
# mkv TODO

## Done

- [x] SQLite schema + pragmas + ReadPool + WriterHandle (`db.rs`)
- [x] Config parsing with validation (`config.rs`)
- [x] Error types with axum IntoResponse (`error.rs`)
- [x] Consistent hash ring with virtual nodes (`hasher.rs`)
- [x] Volume HTTP client with .key sidecar writes (`volume.rs`)
- [x] Health checker background task (`health.rs`)
- [x] HTTP handlers: GET (302 redirect), PUT (fan-out), DELETE, HEAD, list (`server.rs`)
- [x] CLI with clap subcommands (`main.rs`)
- [x] Wiring: state, routes, health checker startup

## Remaining

### Core features

- [ ] **Rebuild tool** (`tools/rebuild.rs`)
  - Scan all volumes for `.key` sidecar files (via nginx autoindex JSON)
  - Read each `.key` file to recover the original key name
  - Verify the blob exists alongside each `.key`
  - Reconstruct SQLite index, merging replicas (same key on multiple volumes)
  - Uses `db::bulk_put` for efficient batch inserts

- [ ] **Rebalance tool** (`tools/rebalance.rs`)
  - Load current index, compute desired placement via new ring config
  - Diff current vs desired volumes per key
  - Copy blobs to new volumes, delete from old ones
  - Update SQLite records
  - `--dry-run` flag to preview without acting
  - Progress output: keys moved, bytes transferred, errors

### Refactoring

- [ ] **Extract pure decision logic from handlers** (see PHILOSOPHY.md)
  - `put_key`: separate "compute placement" from "execute IO"
  - `get_key`: separate "pick healthy volume" from "build redirect"
  - Make fan-out results processing a pure function

- [ ] **Typed volume errors** — replace `String` errors in `volume.rs` with a proper enum
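One possible shape for that enum (a sketch; the variant names are guesses, not the final design):

```rust
use std::fmt;

// Hypothetical typed replacement for the String errors in volume.rs.
#[derive(Debug)]
enum VolumeError {
    Unreachable { url: String },
    UnexpectedStatus { url: String, status: u16 },
    Timeout { url: String },
}

impl fmt::Display for VolumeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            VolumeError::Unreachable { url } => write!(f, "volume {url} unreachable"),
            VolumeError::UnexpectedStatus { url, status } => {
                write!(f, "volume {url} returned {status}")
            }
            VolumeError::Timeout { url } => write!(f, "volume {url} timed out"),
        }
    }
}
```

Typed variants let the decision layer match on the failure kind, while `Display` still yields the log-friendly strings the IO boundary uses today.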

### Testing

- [ ] **Unit tests for `db.rs`** — CRUD with an in-memory SQLite
- [ ] **Integration test harness** (see TESTING.md)
  - Docker Compose with 3 nginx volume containers
  - Rust integration tests that spin up the full server
  - PUT/GET/DELETE/HEAD happy path
  - Failure test: kill a volume mid-test, verify reads still work from replica
  - Rebalance test: add 4th volume, rebalance, verify all keys accessible
- [ ] **Property-based tests for hasher** — verify ring invariants under add/remove

### Polish

- [ ] **Graceful shutdown** — drain in-flight requests, flush writer
- [ ] **Metrics** — request count, latency histograms, volume error rates
- [ ] **Request ID / tracing spans** — per-request trace context
- [ ] **Config reload** — SIGHUP to reload config without restart
- [ ] **Rate limiting on list endpoint** — prefix scan can be expensive