Correct philosophy document
This commit is contained in:
parent
984953502b
commit
e67f476ca4
1 changed files with 23 additions and 78 deletions
101
PHILOSOPHY.md
101
PHILOSOPHY.md
|
|
@ -15,9 +15,8 @@
|
|||
possible. A handler should read like a sequence of steps: look up the
|
||||
record, pick a volume, build the response.
|
||||
|
||||
4. **Minimize shared state** — pass values explicitly. The handler reads the
|
||||
healthy volume set as a snapshot, then works with that snapshot. Don't hold
|
||||
locks across IO. Don't reach into globals.
|
||||
4. **Minimize shared state** — pass values explicitly. Don't hold locks across
|
||||
IO. Don't reach into globals.
|
||||
|
||||
5. **Minimize indirection** — don't hide logic behind abstractions that exist
|
||||
"in case we need to swap the implementation later." We won't. A three-line
|
||||
|
|
@ -39,14 +38,12 @@ reads the decision and carries it out. It's tested with integration tests.
|
|||
|
||||
### Already pure
|
||||
|
||||
**`hasher.rs`** — the entire module is pure. `Ring` is a data structure.
|
||||
`get_volumes` and `key_path` are deterministic functions of their inputs. No
|
||||
IO, no state mutation beyond construction. This is the gold standard for the
|
||||
project.
|
||||
**`hasher.rs`** — the entire module is pure. `volumes_for_key` is a
|
||||
deterministic function of its inputs. No IO, no state mutation. This is the
|
||||
gold standard for the project.
|
||||
|
||||
**`db.rs` query functions** — `get`, `list_keys`, `all_records` take a
|
||||
`&Connection` and return data. The connection is injected, not owned. The
|
||||
functions don't decide what to do with the data — they just retrieve it.
|
||||
**`rebalance.rs::plan_rebalance`** — takes a slice of records and returns a
|
||||
list of moves. Pure decision logic, tested with unit tests.
|
||||
|
||||
**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
|
||||
transformations between JSON strings and `Vec<String>`.
|
||||
|
|
@ -55,65 +52,22 @@ transformations between JSON strings and `Vec<String>`.
|
|||
|
||||
**`server.rs::put_key`** — this handler does three things in one function:
|
||||
|
||||
1. *Decide* which volumes to write to (pure — ring lookup)
|
||||
1. *Decide* which volumes to write to (pure — `volumes_for_key`)
|
||||
2. *Execute* fan-out PUTs to nginx (IO)
|
||||
3. *Decide* whether to rollback based on results (pure — check which succeeded)
|
||||
4. *Execute* rollback DELETEs and/or index write (IO)
|
||||
|
||||
Steps 1 and 3 could be extracted as pure functions:
|
||||
|
||||
```rust
|
||||
// Pure: given a key and ring, compute the placement plan
|
||||
struct PutPlan {
|
||||
path: String,
|
||||
target_volumes: Vec<String>,
|
||||
}
|
||||
|
||||
fn plan_put(ring: &Ring, key: &str, replication: usize) -> Result<PutPlan, AppError> {
|
||||
let path = Ring::key_path(key);
|
||||
let target_volumes = ring.get_volumes(key, replication);
|
||||
if target_volumes.len() < replication {
|
||||
return Err(AppError::VolumeError(...));
|
||||
}
|
||||
Ok(PutPlan { path, target_volumes })
|
||||
}
|
||||
|
||||
// Pure: given fan-out results, decide what to do next
|
||||
enum PutOutcome {
|
||||
AllSucceeded { volumes: Vec<String> },
|
||||
NeedsRollback { succeeded: Vec<String> },
|
||||
}
|
||||
|
||||
fn evaluate_put_results(results: &[(String, Result<(), String>)]) -> PutOutcome { ... }
|
||||
```
|
||||
|
||||
**`server.rs::get_key`** — the "pick a healthy volume" logic is a pure
|
||||
function hiding inside an async handler:
|
||||
|
||||
```rust
|
||||
// Pure: given a record's volumes and the healthy set, pick one
|
||||
fn pick_healthy_volume<'a>(
|
||||
record_volumes: &'a [String],
|
||||
healthy: &HashSet<String>,
|
||||
) -> Option<&'a str> {
|
||||
record_volumes.iter().find(|v| healthy.contains(*v)).map(|v| v.as_str())
|
||||
}
|
||||
```
|
||||
Steps 1 and 3 could be extracted as pure functions if they grow more complex.
|
||||
|
||||
### Intentionally impure
|
||||
|
||||
**`volume.rs`** — this is an IO boundary. It wraps `reqwest` and talks to
|
||||
nginx. There's no decision logic here to extract; it's a thin adapter. Testing
|
||||
it means mocking HTTP. That's fine.
|
||||
**`rebuild.rs`** — walks nginx autoindex and bulk-inserts into SQLite. The IO
|
||||
is the whole point; there's no decision logic worth extracting.
|
||||
|
||||
**`health.rs`** — a side-effecting loop. It polls volumes and mutates shared
|
||||
state. No pure core to extract. Keep it simple.
|
||||
|
||||
**`db.rs` writer thread** — the batch-and-commit loop is inherently stateful.
|
||||
The `execute_cmd` function is close to pure (it takes a connection and a
|
||||
command, returns a result), but it mutates the database. The batching logic
|
||||
(drain channel, group into transaction) is a state machine. Not worth
|
||||
abstracting further.
|
||||
**`db.rs`** — wraps SQLite behind `Arc<Mutex<Connection>>` with
|
||||
`spawn_blocking` to avoid blocking the tokio runtime. The mutex serializes all
|
||||
access; `SQLITE_OPEN_NO_MUTEX` disables SQLite's internal locking since the
|
||||
application mutex handles it.
|
||||
|
||||
## Guidelines
|
||||
|
||||
|
|
@ -123,20 +77,15 @@ abstracting further.
|
|||
2. **If a handler has an `if` or `match` that decides between outcomes, that
|
||||
decision can probably be a pure function.** Extract it. Name it. Test it.
|
||||
|
||||
3. **IO boundaries should be thin.** `volume.rs` is a good example: format URL,
|
||||
make request, check status, return bytes. No business logic.
|
||||
3. **IO boundaries should be thin.** Format URL, make request, check status,
|
||||
return bytes. No business logic.
|
||||
|
||||
4. **Don't over-abstract.** A three-line pure function inline in a handler is
|
||||
fine. Extract it when it gets complex enough to need its own tests, or when
|
||||
the same decision appears in multiple places (e.g., rebuild and rebalance
|
||||
both need "compute desired placement").
|
||||
both use `volumes_for_key`).
|
||||
|
||||
5. **Shared state should be read-only snapshots when possible.** The handler
|
||||
reads `healthy_volumes` and `ring` under a read lock, then releases it
|
||||
before doing IO. This keeps the critical section small and makes the
|
||||
decision logic operate on a snapshot, not live-mutating state.
|
||||
|
||||
6. **Errors are data.** `AppError` is a value, not an exception. Functions
|
||||
5. **Errors are data.** `AppError` is a value, not an exception. Functions
|
||||
return `Result`, handlers pattern-match on it. The `IntoResponse` impl is
|
||||
the only place where errors become HTTP responses — one place, one mapping.
|
||||
|
||||
|
|
@ -145,13 +94,9 @@ abstracting further.
|
|||
- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
|
||||
decisions, handles errors, and formats the response. Break it up.
|
||||
|
||||
- **Stringly-typed errors in business logic** — `volume.rs` uses `String` errors
|
||||
because it's an IO boundary and the strings are for logging. Decision
|
||||
functions should use typed errors.
|
||||
|
||||
- **Hidden state reads** — if a function needs the healthy volume set, pass it
|
||||
in. Don't reach into a global or lock a mutex inside a "pure" function.
|
||||
- **Hidden state reads** — if a function needs data, pass it in. Don't reach
|
||||
into a global or lock a mutex inside a "pure" function.
|
||||
|
||||
- **Testing IO to test logic** — if you need a Docker container running to test
|
||||
whether "pick a healthy volume" works correctly, the logic isn't separated
|
||||
from the IO.
|
||||
whether volume selection works correctly, the logic isn't separated from the
|
||||
IO.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue