diff --git a/PHILOSOPHY.md b/PHILOSOPHY.md
new file mode 100644
index 0000000..40c90a7
--- /dev/null
+++ b/PHILOSOPHY.md
@@ -0,0 +1,157 @@

# Project Philosophy

## Principles

1. **Explicit over clever** — no magic helpers, no macros that hide control
   flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
   understand what a function does without chasing through layers of
   indirection.

2. **Pure functions** — isolate decision logic from IO. A function that takes
   data and returns data is testable, composable, and easy to reason about.
   Keep it that way. Don't sneak in network calls or logging.

3. **Linear flow** — avoid callbacks, deep nesting, and async gymnastics where
   possible. A handler should read like a sequence of steps: look up the
   record, pick a volume, build the response.

4. **Minimize shared state** — pass values explicitly. The handler reads the
   healthy volume set as a snapshot, then works with that snapshot. Don't hold
   locks across IO. Don't reach into globals.

5. **Minimize indirection** — don't hide logic behind abstractions that exist
   "in case we need to swap the implementation later." We won't. A three-line
   function inline is better than a trait with one implementor.

## Applying the principles: separate decisions from execution

Every request handler does two things: **decides** what should happen, then
**executes** IO to make it happen. These should be separate functions.

A decision is a pure function. It takes data in, returns a description of what
to do. It doesn't call the network, doesn't touch the database, doesn't log.
It can be tested with `assert_eq!` and nothing else.

Execution is the messy part — HTTP calls, SQLite writes, error recovery. It
reads the decision and carries it out. It's tested with integration tests.

## Where this applies today

### Already pure

**`hasher.rs`** — the entire module is pure. `Ring` is a data structure.
`get_volumes` and `key_path` are deterministic functions of their inputs. No
IO, no state mutation beyond construction. This is the gold standard for the
project.

**`db.rs` query functions** — `get`, `list_keys`, `all_records` take a
`&Connection` and return data. The connection is injected, not owned. The
functions don't decide what to do with the data — they just retrieve it.

**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
transformations between JSON strings and `Vec<String>`.

### Mixed (decision + execution interleaved)

**`server.rs::put_key`** — this handler interleaves four steps in one function:

1. *Decide* which volumes to write to (pure — ring lookup)
2. *Execute* fan-out PUTs to nginx (IO)
3. *Decide* whether to rollback based on results (pure — check which succeeded)
4. *Execute* rollback DELETEs and/or index write (IO)

Steps 1 and 3 could be extracted as pure functions:

```rust
// Pure: given a key and ring, compute the placement plan
struct PutPlan {
    path: String,
    target_volumes: Vec<String>,
}

fn plan_put(ring: &Ring, key: &str, replication: usize) -> Result<PutPlan, AppError> {
    let path = Ring::key_path(key);
    let target_volumes = ring.get_volumes(key, replication);
    if target_volumes.len() < replication {
        return Err(AppError::VolumeError(...));
    }
    Ok(PutPlan { path, target_volumes })
}

// Pure: given fan-out results, decide what to do next
enum PutOutcome {
    AllSucceeded { volumes: Vec<String> },
    NeedsRollback { succeeded: Vec<String> },
}

fn evaluate_put_results(results: &[(String, Result<(), String>)]) -> PutOutcome {
    // One possible body: treat anything short of full success as a rollback.
    let succeeded: Vec<String> = results
        .iter()
        .filter(|(_, result)| result.is_ok())
        .map(|(volume, _)| volume.clone())
        .collect();
    if succeeded.len() == results.len() {
        PutOutcome::AllSucceeded { volumes: succeeded }
    } else {
        PutOutcome::NeedsRollback { succeeded }
    }
}
```

**`server.rs::get_key`** — the "pick a healthy volume" logic is a pure
function hiding inside an async handler:

```rust
// Pure: given a record's volumes and the healthy set, pick one
fn pick_healthy_volume<'a>(
    record_volumes: &'a [String],
    healthy: &HashSet<String>,
) -> Option<&'a str> {
    record_volumes.iter().find(|v| healthy.contains(*v)).map(|v| v.as_str())
}
```

### Intentionally impure

**`volume.rs`** — this is an IO boundary. It wraps `reqwest` and talks to
nginx. There's no decision logic here to extract; it's a thin adapter. Testing
it means mocking HTTP. That's fine.

**`health.rs`** — a side-effecting loop. It polls volumes and mutates shared
state. No pure core to extract. Keep it simple.

**`db.rs` writer thread** — the batch-and-commit loop is inherently stateful.
The `execute_cmd` function is close to pure (it takes a connection and a
command, returns a result), but it mutates the database. The batching logic
(drain channel, group into transaction) is a state machine. Not worth
abstracting further.

## Guidelines

1. **If a function takes only data and returns only data, it's pure.** Keep it
   that way. Don't sneak in logging, metrics, or "just one network call."

2. **If a handler has an `if` or `match` that decides between outcomes, that
   decision can probably be a pure function.** Extract it. Name it. Test it.

3. **IO boundaries should be thin.** `volume.rs` is a good example: format URL,
   make request, check status, return bytes. No business logic.

4. **Don't over-abstract.** A three-line pure function inline in a handler is
   fine. Extract it when it gets complex enough to need its own tests, or when
   the same decision appears in multiple places (e.g., rebuild and rebalance
   both need "compute desired placement").

5. **Shared state should be read-only snapshots when possible.** The handler
   reads `healthy_volumes` and `ring` under a read lock, then releases it
   before doing IO.
   This keeps the critical section small and makes the
   decision logic operate on a snapshot, not live-mutating state.

6. **Errors are data.** `AppError` is a value, not an exception. Functions
   return `Result`, handlers pattern-match on it. The `IntoResponse` impl is
   the only place where errors become HTTP responses — one place, one mapping.

## Anti-patterns to avoid

- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
  decisions, handles errors, and formats the response. Break it up.

- **Stringly-typed errors in business logic** — `volume.rs` uses `String` errors
  because it's an IO boundary and the strings are for logging. Decision
  functions should use typed errors.

- **Hidden state reads** — if a function needs the healthy volume set, pass it
  in. Don't reach into a global or lock a mutex inside a "pure" function.

- **Testing IO to test logic** — if you need a Docker container running to test
  whether "pick a healthy volume" works correctly, the logic isn't separated
  from the IO.

diff --git a/TESTING.md b/TESTING.md
new file mode 100644
index 0000000..e03b29e
--- /dev/null
+++ b/TESTING.md
@@ -0,0 +1,169 @@

# Testing Strategy

## Layers

### 1. Unit tests (no IO)

Pure logic that can be tested with `cargo test`, no containers needed.

**`hasher.rs`** (already done)
- Deterministic placement for the same key
- All volumes appear when requesting full replication
- Even distribution across volumes
- Correct key path format

**`db.rs`** (TODO)
- Open an in-memory SQLite (`:memory:`)
- Test put/get/delete/list_keys/all_records round-trip
- Test upsert behavior (put same key twice)
- Test soft delete (deleted flag)
- Test bulk_put

**Pure decision functions** (TODO, after refactor)
- Given a record and a set of healthy volumes, which volume to redirect to?
- Given fan-out results (list of Ok/Err), which volumes succeeded? Should we
  rollback?
- Given current vs desired volume placement, what needs to move?

### 2. Volume client tests (mock HTTP)

Use a lightweight HTTP server in-process (e.g. `axum` itself or `wiremock`) to
test `volume.rs` without real nginx.

- PUT blob + .key sidecar → verify both requests made
- GET blob → verify body returned
- DELETE blob → verify both blob and .key deleted
- DELETE non-existent → verify 404 is treated as success
- Health check → respond 200 → healthy; timeout → unhealthy

### 3. Integration tests (real nginx)

Full end-to-end with Docker containers. These are slower but catch real issues.

#### Setup

```yaml
# docker-compose.test.yml
services:
  vol1:
    image: nginx:alpine
    volumes:
      - ./tests/nginx.conf:/etc/nginx/conf.d/default.conf
      - vol1_data:/data
    ports: ["3101:80"]

  vol2:
    image: nginx:alpine
    volumes:
      - ./tests/nginx.conf:/etc/nginx/conf.d/default.conf
      - vol2_data:/data
    ports: ["3102:80"]

  vol3:
    image: nginx:alpine
    volumes:
      - ./tests/nginx.conf:/etc/nginx/conf.d/default.conf
      - vol3_data:/data
    ports: ["3103:80"]

volumes:
  vol1_data:
  vol2_data:
  vol3_data:
```

```nginx
# tests/nginx.conf
server {
    listen 80;
    root /data;

    location / {
        dav_methods PUT DELETE;
        create_full_put_path on;
        autoindex on;
        autoindex_format json;
    }
}
```

```toml
# tests/test_config.toml
[server]
port = 3100
replication_factor = 2
virtual_nodes = 100

[database]
path = "/tmp/mkv-test/index.db"

[[volumes]]
url = "http://localhost:3101"

[[volumes]]
url = "http://localhost:3102"

[[volumes]]
url = "http://localhost:3103"
```

#### Test cases

**Happy path**
1. PUT `/hello` with body `"world"` → 201
2. HEAD `/hello` → 200, Content-Length: 5
3. GET `/hello` → 302 to a volume URL
4. Follow redirect → body is `"world"`
5. GET `/` → list contains `"hello"`
6. DELETE `/hello` → 204
7. GET `/hello` → 404

**Replication verification**
1. PUT `/replicated` → 201
2. Read SQLite directly, verify 2 volumes listed
3. GET blob from both volume URLs directly, verify identical content

**Volume failure**
1. PUT `/failtest` → 201
2. Stop vol1 container
3. GET `/failtest` → should still 302 to vol2 (healthy replica)
4. PUT `/new-during-failure` → should fail if replication_factor can't be met,
   or succeed on remaining volumes depending on ring placement
5. Restart vol1

**Rebuild**
1. PUT several keys
2. Delete the SQLite database
3. Run `mkv rebuild`
4. GET all keys → should all still work

**Rebalance**
1. PUT several keys with 3 volumes
2. Add a 4th volume to config
3. Run `mkv rebalance --dry-run` → verify output
4. Run `mkv rebalance` → verify keys migrated
5. GET all keys → should all work

#### Running integration tests

```bash
# Start volumes
docker compose -f docker-compose.test.yml up -d

# Wait for nginx to be ready
sleep 2

# Run integration tests
cargo test --test integration

# Tear down
docker compose -f docker-compose.test.yml down -v
```

The integration test binary (`tests/integration.rs`) starts the mkv server
in-process on a random port, runs all test cases sequentially (shared state),
then shuts down.

### 4. What we don't test

- Performance / benchmarks (follow-up)
- TLS, auth (not implemented)
- Concurrent writers racing on the same key (SQLite serializes this correctly
  by design)
- Blobs > available RAM (streaming is a follow-up)

diff --git a/TODO.md b/TODO.md
new file mode 100644
index 0000000..a65f78f
--- /dev/null
+++ b/TODO.md
@@ -0,0 +1,60 @@

# mkv TODO

## Done

- [x] SQLite schema + pragmas + ReadPool + WriterHandle (`db.rs`)
- [x] Config parsing with validation (`config.rs`)
- [x] Error types with axum IntoResponse (`error.rs`)
- [x] Consistent hash ring with virtual nodes (`hasher.rs`)
- [x] Volume HTTP client with .key sidecar writes (`volume.rs`)
- [x] Health checker background task (`health.rs`)
- [x] HTTP handlers: GET (302 redirect), PUT (fan-out), DELETE, HEAD, list (`server.rs`)
- [x] CLI with clap subcommands (`main.rs`)
- [x] Wiring: state, routes, health checker startup

## Remaining

### Core features

- [ ] **Rebuild tool** (`tools/rebuild.rs`)
  - Scan all volumes for `.key` sidecar files (via nginx autoindex JSON)
  - Read each `.key` file to recover the original key name
  - Verify the blob exists alongside each `.key`
  - Reconstruct SQLite index, merging replicas (same key on multiple volumes)
  - Uses `db::bulk_put` for efficient batch inserts

- [ ] **Rebalance tool** (`tools/rebalance.rs`)
  - Load current index, compute desired placement via new ring config
  - Diff current vs desired volumes per key
  - Copy blobs to new volumes, delete from old ones
  - Update SQLite records
  - `--dry-run` flag to preview without acting
  - Progress output: keys moved, bytes transferred, errors

### Refactoring

- [ ] **Extract pure decision logic from handlers** (see PHILOSOPHY.md)
  - `put_key`: separate "compute placement" from "execute IO"
  - `get_key`: separate "pick healthy volume" from "build redirect"
  - Make fan-out results processing a pure function

- [ ] **Typed volume errors** — replace `String` errors
  in `volume.rs` with a proper enum

### Testing

- [ ] **Unit tests for `db.rs`** — CRUD with an in-memory SQLite
- [ ] **Integration test harness** (see TESTING.md)
  - Docker Compose with 3 nginx volume containers
  - Rust integration tests that spin up the full server
  - PUT/GET/DELETE/HEAD happy path
  - Failure test: kill a volume mid-test, verify reads still work from replica
  - Rebalance test: add 4th volume, rebalance, verify all keys accessible
- [ ] **Property-based tests for hasher** — verify ring invariants under add/remove

### Polish

- [ ] **Graceful shutdown** — drain in-flight requests, flush writer
- [ ] **Metrics** — request count, latency histograms, volume error rates
- [ ] **Request ID / tracing spans** — per-request trace context
- [ ] **Config reload** — SIGHUP to reload config without restart
- [ ] **Rate limiting on list endpoint** — prefix scan can be expensive
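
The rebalance tool's "diff current vs desired volumes per key" step is exactly the kind of pure decision function PHILOSOPHY.md argues for — worth extracting so it can be unit-tested without Docker. A minimal sketch; `placement_diff` is a hypothetical helper name, not existing code:

```rust
use std::collections::HashSet;

// Pure: given one key's current and desired volume placement, compute
// which volumes need a copy of the blob and which need a delete.
// (Sketch for the rebalance tool — name and signature are assumptions.)
fn placement_diff(current: &[String], desired: &[String]) -> (Vec<String>, Vec<String>) {
    let cur: HashSet<&String> = current.iter().collect();
    let des: HashSet<&String> = desired.iter().collect();
    // Volumes the key should be on but isn't yet → copy targets.
    let to_copy = desired.iter().filter(|v| !cur.contains(*v)).cloned().collect();
    // Volumes the key is on but shouldn't be → delete targets.
    let to_delete = current.iter().filter(|v| !des.contains(*v)).cloned().collect();
    (to_copy, to_delete)
}
```

The execution half (copy blobs, delete blobs, update SQLite) then just carries out the returned lists, and `--dry-run` becomes "print the diff, skip execution."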