From 71abb1ed7d3271e4fe1650c51455eb9ced93d204 Mon Sep 17 00:00:00 2001
From: Silas Brack
Date: Sat, 7 Mar 2026 16:16:35 +0100
Subject: [PATCH] Move philosophy into README

---
 PHILOSOPHY.md | 102 ------------------------------------------------
 README.md     | 105 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 105 insertions(+), 102 deletions(-)
 delete mode 100644 PHILOSOPHY.md

diff --git a/PHILOSOPHY.md b/PHILOSOPHY.md
deleted file mode 100644
index 683056d..0000000
--- a/PHILOSOPHY.md
+++ /dev/null
@@ -1,102 +0,0 @@
-# Project Philosophy
-
-## Principles
-
-1. **Explicit over clever** — no magic helpers, no macros that hide control
-   flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
-   understand what a function does without chasing through layers of
-   indirection.
-
-2. **Pure functions** — isolate decision logic from IO. A function that takes
-   data and returns data is testable, composable, and easy to reason about.
-   Keep it that way. Don't sneak in network calls or logging.
-
-3. **Linear flow** — avoid callbacks, deep nesting, and async gymnastics where
-   possible. A handler should read like a sequence of steps: look up the
-   record, pick a volume, build the response.
-
-4. **Minimize shared state** — pass values explicitly. Don't hold locks across
-   IO. Don't reach into globals.
-
-5. **Minimize indirection** — don't hide logic behind abstractions that exist
-   "in case we need to swap the implementation later." We won't. A three-line
-   function inline is better than a trait with one implementor.
-
-## Applying the principles: separate decisions from execution
-
-Every request handler does two things: **decides** what should happen, then
-**executes** IO to make it happen. These should be separate functions.
-
-A decision is a pure function. It takes data in, returns a description of what
-to do. It doesn't call the network, doesn't touch the database, doesn't log.
-It can be tested with `assert_eq!` and nothing else.
-
-Execution is the messy part — HTTP calls, SQLite writes, error recovery. It
-reads the decision and carries it out. It's tested with integration tests.
-
-## Where this applies today
-
-### Already pure
-
-**`hasher.rs`** — the entire module is pure. `volumes_for_key` is a
-deterministic function of its inputs. No IO, no state mutation. This is the
-gold standard for the project.
-
-**`rebalance.rs::plan_rebalance`** — takes a slice of records and returns a
-list of moves. Pure decision logic, tested with unit tests.
-
-**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
-transformations between JSON strings and `Vec`.
-
-### Mixed (decision + execution interleaved)
-
-**`server.rs::put_key`** — this handler does three things in one function:
-
-1. *Decide* which volumes to write to (pure — `volumes_for_key`)
-2. *Execute* fan-out PUTs to nginx (IO)
-3. *Decide* whether to rollback based on results (pure — check which succeeded)
-4. *Execute* rollback DELETEs and/or index write (IO)
-
-Steps 1 and 3 could be extracted as pure functions if they grow more complex.
-
-### Intentionally impure
-
-**`rebuild.rs`** — walks nginx autoindex and bulk-inserts into SQLite. The IO
-is the whole point; there's no decision logic worth extracting.
-
-**`db.rs`** — wraps SQLite behind `Arc>` with
-`spawn_blocking` to avoid blocking the tokio runtime. The mutex serializes all
-access; `SQLITE_OPEN_NO_MUTEX` disables SQLite's internal locking since the
-application mutex handles it.
-
-## Guidelines
-
-1. **If a function takes only data and returns only data, it's pure.** Keep it
-   that way. Don't sneak in logging, metrics, or "just one network call."
-
-2. **If a handler has an `if` or `match` that decides between outcomes, that
-   decision can probably be a pure function.** Extract it. Name it. Test it.
-
-3. **IO boundaries should be thin.** Format URL, make request, check status,
-   return bytes. No business logic.
-
-4. **Don't over-abstract.** A three-line pure function inline in a handler is
-   fine. Extract it when it gets complex enough to need its own tests, or when
-   the same decision appears in multiple places (e.g., rebuild and rebalance
-   both use `volumes_for_key`).
-
-5. **Errors are data.** `AppError` is a value, not an exception. Functions
-   return `Result`, handlers pattern-match on it. The `IntoResponse` impl is
-   the only place where errors become HTTP responses — one place, one mapping.
-
-## Anti-patterns to avoid
-
-- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
-  decisions, handles errors, and formats the response. Break it up.
-
-- **Hidden state reads** — if a function needs data, pass it in. Don't reach
-  into a global or lock a mutex inside a "pure" function.
-
-- **Testing IO to test logic** — if you need a Docker container running to test
-  whether volume selection works correctly, the logic isn't separated from the
-  IO.
diff --git a/README.md b/README.md
index ac77fe3..8fc89de 100644
--- a/README.md
+++ b/README.md
@@ -150,3 +150,108 @@ Volume servers should be on a private network that clients cannot reach directly
 - Data at rest (blobs are plain files on disk)
 - Malicious keys (no input sanitization beyond what nginx enforces on paths)
 - Index tampering (SQLite file has no integrity protection)
+
+
+# Development
+
+## Principles
+
+1. **Explicit over clever** — no magic helpers, no macros that hide control
+   flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
+   understand what a function does without chasing through layers of
+   indirection.
+
+2. **Pure functions** — isolate decision logic from IO. A function that takes
+   data and returns data is testable, composable, and easy to reason about.
+   Keep it that way. Don't sneak in network calls or logging.
+
+3. **Linear flow** — avoid callbacks, deep nesting, and async gymnastics where
+   possible. A handler should read like a sequence of steps: look up the
+   record, pick a volume, build the response.
+
+4. **Minimize shared state** — pass values explicitly. Don't hold locks across
+   IO. Don't reach into globals.
+
+5. **Minimize indirection** — don't hide logic behind abstractions that exist
+   "in case we need to swap the implementation later." We won't. A three-line
+   function inline is better than a trait with one implementor.
+
+## Applying the principles: separate decisions from execution
+
+Every request handler does two things: **decides** what should happen, then
+**executes** IO to make it happen. These should be separate functions.
+
+A decision is a pure function. It takes data in, returns a description of what
+to do. It doesn't call the network, doesn't touch the database, doesn't log.
+It can be tested with `assert_eq!` and nothing else.
+
+Execution is the messy part — HTTP calls, SQLite writes, error recovery. It
+reads the decision and carries it out. It's tested with integration tests.
+
+## Where this applies today
+
+### Already pure
+
+**`hasher.rs`** — the entire module is pure. `volumes_for_key` is a
+deterministic function of its inputs. No IO, no state mutation. This is the
+gold standard for the project.
+
+**`rebalance.rs::plan_rebalance`** — takes a slice of records and returns a
+list of moves. Pure decision logic, tested with unit tests.
+
+**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
+transformations between JSON strings and `Vec`.
+
+### Mixed (decision + execution interleaved)
+
+**`server.rs::put_key`** — this handler does four things in one function:
+
+1. *Decide* which volumes to write to (pure — `volumes_for_key`)
+2. *Execute* fan-out PUTs to nginx (IO)
+3. *Decide* whether to rollback based on results (pure — check which succeeded)
+4. *Execute* rollback DELETEs and/or index write (IO)
+
+Steps 1 and 3 could be extracted as pure functions if they grow more complex.
+
+### Intentionally impure
+
+**`rebuild.rs`** — walks nginx autoindex and bulk-inserts into SQLite. The IO
+is the whole point; there's no decision logic worth extracting.
+
+**`db.rs`** — wraps SQLite behind `Arc>` with
+`spawn_blocking` to avoid blocking the tokio runtime. The mutex serializes all
+access; `SQLITE_OPEN_NO_MUTEX` disables SQLite's internal locking since the
+application mutex handles it.
+
+## Guidelines
+
+1. **If a function takes only data and returns only data, it's pure.** Keep it
+   that way. Don't sneak in logging, metrics, or "just one network call."
+
+2. **If a handler has an `if` or `match` that decides between outcomes, that
+   decision can probably be a pure function.** Extract it. Name it. Test it.
+
+3. **IO boundaries should be thin.** Format URL, make request, check status,
+   return bytes. No business logic.
+
+4. **Don't over-abstract.** A three-line pure function inline in a handler is
+   fine. Extract it when it gets complex enough to need its own tests, or when
+   the same decision appears in multiple places (e.g., rebuild and rebalance
+   both use `volumes_for_key`).
+
+5. **Errors are data.** `AppError` is a value, not an exception. Functions
+   return `Result`, handlers pattern-match on it. The `IntoResponse` impl is
+   the only place where errors become HTTP responses — one place, one mapping.
+
+## Anti-patterns to avoid
+
+- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
+  decisions, handles errors, and formats the response. Break it up.
+
+- **Hidden state reads** — if a function needs data, pass it in. Don't reach
+  into a global or lock a mutex inside a "pure" function.
+
+- **Testing IO to test logic** — if you need a Docker container running to test
+  whether volume selection works correctly, the logic isn't separated from the
+  IO.
+
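
The decide/execute split described above can be sketched for the rollback decision in `put_key` (step 3 in the "Mixed" section). This is a minimal illustration, not code from the repository: `PutResult`, `PutOutcome`, and `decide_put_outcome` are hypothetical names, and the all-or-nothing policy (any failed replica triggers a rollback of the ones that succeeded) is an assumption about how the handler behaves.

```rust
/// Result of one fan-out PUT to a single volume.
/// Hypothetical type for this sketch; the real handler may track more detail.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PutResult {
    pub volume: String,
    pub ok: bool,
}

/// A description of what the handler should do next. It is data, not an action.
#[derive(Debug, PartialEq, Eq)]
pub enum PutOutcome {
    /// Every replica was written: record these volumes in the index.
    Commit { volumes: Vec<String> },
    /// At least one write failed: delete the replicas that did succeed.
    Rollback { delete_from: Vec<String> },
}

/// Pure decision: no network, no database, no logging.
/// Assumes an all-or-nothing policy, which is an assumption of this sketch.
/// Note that an empty slice yields `Commit` with no volumes; a real handler
/// would probably reject that case before fanning out.
pub fn decide_put_outcome(results: &[PutResult]) -> PutOutcome {
    let succeeded: Vec<String> = results
        .iter()
        .filter(|r| r.ok)
        .map(|r| r.volume.clone())
        .collect();

    if succeeded.len() == results.len() {
        PutOutcome::Commit { volumes: succeeded }
    } else {
        PutOutcome::Rollback { delete_from: succeeded }
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Small helper to keep the test inputs readable.
    fn result(volume: &str, ok: bool) -> PutResult {
        PutResult { volume: volume.to_string(), ok }
    }

    #[test]
    fn all_writes_ok_commits_every_volume() {
        let results = [result("vol-a", true), result("vol-b", true)];
        assert_eq!(
            decide_put_outcome(&results),
            PutOutcome::Commit {
                volumes: vec!["vol-a".to_string(), "vol-b".to_string()],
            }
        );
    }

    #[test]
    fn partial_failure_rolls_back_only_the_successes() {
        let results = [result("vol-a", true), result("vol-b", false)];
        assert_eq!(
            decide_put_outcome(&results),
            PutOutcome::Rollback {
                delete_from: vec!["vol-a".to_string()],
            }
        );
    }
}
```

With a shape like this, the execute half of the handler would only `match` on the returned `PutOutcome` and issue either the rollback DELETEs or the index write. The unit tests need nothing beyond `assert_eq!`, while the IO path stays covered by integration tests.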