# Project Philosophy

## Principles

1. **Explicit over clever**: no magic helpers, no macros that hide control
   flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
   understand what a function does without chasing through layers of
   indirection.

2. **Pure functions**: isolate decision logic from IO. A function that takes
   data and returns data is testable, composable, and easy to reason about.
   Keep it that way. Don't sneak in network calls or logging.

3. **Linear flow**: avoid callbacks, deep nesting, and async gymnastics where
   possible. A handler should read like a sequence of steps: look up the
   record, pick a volume, build the response.

4. **Minimize shared state**: pass values explicitly. The handler reads the
   healthy volume set as a snapshot, then works with that snapshot. Don't hold
   locks across IO. Don't reach into globals.

5. **Minimize indirection**: don't hide logic behind abstractions that exist
   "in case we need to swap the implementation later." We won't. A three-line
   function inline is better than a trait with one implementor.

## Applying the principles: separate decisions from execution

Every request handler does two things: **decides** what should happen, then
**executes** IO to make it happen. These should be separate functions.

A decision is a pure function. It takes data in, returns a description of what
to do. It doesn't call the network, doesn't touch the database, doesn't log.
It can be tested with `assert_eq!` and nothing else.

Execution is the messy part: HTTP calls, SQLite writes, error recovery. It
reads the decision and carries it out. It's tested with integration tests.

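As a rough illustration of the split, here is a minimal sketch for a read path. The names `GetDecision`, `decide_get`, and `respond`, and the status codes, are assumptions made for this example, not the project's actual types. `respond` is a stand-in for the execution side: in a real handler this is where the HTTP response would be built and any IO would happen.

```rust
use std::collections::HashSet;

/// Hypothetical description of what the handler should do next.
#[derive(Debug, PartialEq)]
enum GetDecision {
    Redirect { volume: String, path: String },
    NotFound,
    NoHealthyVolume,
}

/// Pure decision: data in, description out. No IO, no logging.
fn decide_get(
    record_volumes: Option<&[String]>,
    path: &str,
    healthy: &HashSet<String>,
) -> GetDecision {
    // No record for this key means there is nothing to serve.
    let Some(volumes) = record_volumes else {
        return GetDecision::NotFound;
    };
    // First volume in record order that is currently healthy wins.
    match volumes.iter().find(|v| healthy.contains(*v)) {
        Some(volume) => GetDecision::Redirect {
            volume: volume.clone(),
            path: path.to_string(),
        },
        None => GetDecision::NoHealthyVolume,
    }
}

/// Stand-in for execution: maps the decision to (status, location).
/// The status codes are illustrative assumptions.
fn respond(decision: GetDecision) -> (u16, Option<String>) {
    match decision {
        GetDecision::Redirect { volume, path } => (302, Some(format!("{volume}/{path}"))),
        GetDecision::NotFound => (404, None),
        GetDecision::NoHealthyVolume => (503, None),
    }
}
```

Because `decide_get` only touches its arguments, every branch can be covered with plain `assert_eq!` calls and no running services.
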
## Where this applies today

### Already pure

**`hasher.rs`**: the entire module is pure. `Ring` is a data structure.
`get_volumes` and `key_path` are deterministic functions of their inputs. No
IO, no state mutation beyond construction. This is the gold standard for the
project.

**`db.rs` query functions**: `get`, `list_keys`, `all_records` take a
`&Connection` and return data. The connection is injected, not owned. The
functions don't decide what to do with the data; they just retrieve it.

**`db.rs` encode/parse**: `parse_volumes` and `encode_volumes` are pure
transformations between JSON strings and `Vec<String>`.

### Mixed (decision + execution interleaved)

**`server.rs::put_key`**: this handler does four things in one function:

1. *Decide* which volumes to write to (pure: ring lookup)
2. *Execute* fan-out PUTs to nginx (IO)
3. *Decide* whether to roll back based on results (pure: check which succeeded)
4. *Execute* rollback DELETEs and/or index write (IO)

Steps 1 and 3 could be extracted as pure functions:

```rust
// Pure: given a key and ring, compute the placement plan
struct PutPlan {
    path: String,
    target_volumes: Vec<String>,
}

fn plan_put(ring: &Ring, key: &str, replication: usize) -> Result<PutPlan, AppError> {
    let path = Ring::key_path(key);
    let target_volumes = ring.get_volumes(key, replication);
    if target_volumes.len() < replication {
        return Err(AppError::VolumeError(...));
    }
    Ok(PutPlan { path, target_volumes })
}

// Pure: given fan-out results, decide what to do next
enum PutOutcome {
    AllSucceeded { volumes: Vec<String> },
    NeedsRollback { succeeded: Vec<String> },
}

fn evaluate_put_results(results: &[(String, Result<(), String>)]) -> PutOutcome { ... }
```

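The body of `evaluate_put_results` is left open above. One possible shape, offered as a sketch and not as the project's implementation, assumes that any single failed PUT means the write must be rolled back on every volume that did succeed:

```rust
#[derive(Debug, PartialEq)]
enum PutOutcome {
    AllSucceeded { volumes: Vec<String> },
    NeedsRollback { succeeded: Vec<String> },
}

/// Assumed policy: every fan-out PUT must succeed, otherwise roll back
/// the volumes that did accept the write.
fn evaluate_put_results(results: &[(String, Result<(), String>)]) -> PutOutcome {
    // Collect the names of the volumes whose PUT returned Ok.
    let succeeded: Vec<String> = results
        .iter()
        .filter(|(_, result)| result.is_ok())
        .map(|(volume, _)| volume.clone())
        .collect();

    if succeeded.len() == results.len() {
        PutOutcome::AllSucceeded { volumes: succeeded }
    } else {
        PutOutcome::NeedsRollback { succeeded }
    }
}
```

Note that an empty `results` slice yields `AllSucceeded` with no volumes under this sketch; it relies on the planning step having already rejected plans with too few targets.
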
**`server.rs::get_key`**: the "pick a healthy volume" logic is a pure
function hiding inside an async handler:

```rust
// Pure: given a record's volumes and the healthy set, pick one
fn pick_healthy_volume<'a>(
    record_volumes: &'a [String],
    healthy: &HashSet<String>,
) -> Option<&'a str> {
    record_volumes.iter().find(|v| healthy.contains(*v)).map(|v| v.as_str())
}
```

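Once extracted, the function can be tested without a server, a database, or nginx. The sketch below repeats the function so the block compiles on its own; the test module and volume names are made up for illustration.

```rust
use std::collections::HashSet;

// Same function as above, repeated so this block is self-contained.
fn pick_healthy_volume<'a>(
    record_volumes: &'a [String],
    healthy: &HashSet<String>,
) -> Option<&'a str> {
    record_volumes.iter().find(|v| healthy.contains(*v)).map(|v| v.as_str())
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn picks_first_healthy_in_record_order() {
        let record = vec!["vol-a".to_string(), "vol-b".to_string(), "vol-c".to_string()];
        let healthy = HashSet::from(["vol-c".to_string(), "vol-b".to_string()]);
        // vol-a is unhealthy, so the next volume in record order is chosen.
        assert_eq!(pick_healthy_volume(&record, &healthy), Some("vol-b"));
    }

    #[test]
    fn returns_none_when_nothing_is_healthy() {
        let record = vec!["vol-a".to_string()];
        assert_eq!(pick_healthy_volume(&record, &HashSet::new()), None);
    }
}
```
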
### Intentionally impure

**`volume.rs`**: this is an IO boundary. It wraps `reqwest` and talks to
nginx. There's no decision logic here to extract; it's a thin adapter. Testing
it means mocking HTTP. That's fine.

**`health.rs`**: a side-effecting loop. It polls volumes and mutates shared
state. No pure core to extract. Keep it simple.

**`db.rs` writer thread**: the batch-and-commit loop is inherently stateful.
The `execute_cmd` function is close to pure (it takes a connection and a
command, returns a result), but it mutates the database. The batching logic
(drain channel, group into transaction) is a state machine. Not worth
abstracting further.

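For readers unfamiliar with the drain-and-batch pattern, the "drain channel" step might look roughly like the sketch below. It uses `std::sync::mpsc`, and the helper name `next_batch` and the `max_batch` parameter are assumptions for this example; the actual writer thread may use a different channel type and different limits.

```rust
use std::sync::mpsc::Receiver;

/// Block until one command arrives, then drain whatever else is already
/// queued, stopping once the batch holds `max_batch` commands.
/// Returns None when the channel is closed and empty.
/// A batch always contains at least the one command that woke the loop.
fn next_batch<T>(rx: &Receiver<T>, max_batch: usize) -> Option<Vec<T>> {
    // recv() blocks; it errors only when all senders are gone and the
    // queue is empty, which is the signal to shut the writer down.
    let first = rx.recv().ok()?;
    let mut batch = vec![first];

    while batch.len() < max_batch {
        match rx.try_recv() {
            Ok(cmd) => batch.push(cmd),
            // Queue empty or senders gone: commit what we have.
            Err(_) => break,
        }
    }
    Some(batch)
}

// A writer loop would then wrap each batch in one transaction, e.g.:
//
//   while let Some(batch) = next_batch(&rx, 64) {
//       // begin transaction, run execute_cmd for each command, commit
//   }
```
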
## Guidelines

1. **If a function takes only data and returns only data, it's pure.** Keep it
   that way. Don't sneak in logging, metrics, or "just one network call."

2. **If a handler has an `if` or `match` that decides between outcomes, that
   decision can probably be a pure function.** Extract it. Name it. Test it.

3. **IO boundaries should be thin.** `volume.rs` is a good example: format URL,
   make request, check status, return bytes. No business logic.

4. **Don't over-abstract.** A three-line pure function inline in a handler is
   fine. Extract it when it gets complex enough to need its own tests, or when
   the same decision appears in multiple places (e.g., rebuild and rebalance
   both need "compute desired placement").

5. **Shared state should be read-only snapshots when possible.** The handler
   reads `healthy_volumes` and `ring` under a read lock, then releases it
   before doing IO. This keeps the critical section small and makes the
   decision logic operate on a snapshot, not live-mutating state.

6. **Errors are data.** `AppError` is a value, not an exception. Functions
   return `Result`, handlers pattern-match on it. The `IntoResponse` impl is
   the only place where errors become HTTP responses: one place, one mapping.

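The snapshot pattern from guideline 5 can be sketched as follows. This assumes a `std::sync::RwLock` and a hypothetical `AppState` struct; the project's real state type and lock (for example an async lock) may differ.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Hypothetical shared state, for illustration only.
struct AppState {
    healthy_volumes: RwLock<HashSet<String>>,
}

/// Clone the healthy set while holding the read lock.
/// The guard is a temporary, so the lock is released as soon as this
/// statement finishes, before the caller does any IO.
fn healthy_snapshot(state: &AppState) -> HashSet<String> {
    state
        .healthy_volumes
        .read()
        .expect("healthy_volumes lock poisoned")
        .clone()
}
```

The caller then passes the returned `HashSet` into its decision functions. Later writes by the health checker do not affect a snapshot that has already been taken, at the cost of one clone per request.
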
## Anti-patterns to avoid

- **God handler**: a 100-line async fn that reads the DB, calls volumes, makes
  decisions, handles errors, and formats the response. Break it up.

- **Stringly-typed errors in business logic**: `volume.rs` uses `String` errors
  because it's an IO boundary and the strings are for logging. Decision
  functions should use typed errors.

- **Hidden state reads**: if a function needs the healthy volume set, pass it
  in. Don't reach into a global or lock a mutex inside a "pure" function.

- **Testing IO to test logic**: if you need a Docker container running to test
  whether "pick a healthy volume" works correctly, the logic isn't separated
  from the IO.
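
To make the typed-error point concrete, here is a small sketch of a decision function that returns an error enum instead of a `String`. `PlacementError` and `check_replication` are hypothetical names invented for this example and are not part of the project's `AppError`.

```rust
/// Hypothetical typed error for a placement decision.
#[derive(Debug, PartialEq)]
enum PlacementError {
    NotEnoughVolumes { wanted: usize, available: usize },
}

/// Pure check: are there enough candidate volumes for the requested
/// replication factor? On success, returns the first `replication` candidates.
fn check_replication(
    candidates: &[String],
    replication: usize,
) -> Result<&[String], PlacementError> {
    if candidates.len() < replication {
        return Err(PlacementError::NotEnoughVolumes {
            wanted: replication,
            available: candidates.len(),
        });
    }
    Ok(&candidates[..replication])
}
```

A caller can match on `NotEnoughVolumes` and read its fields directly, and a test can compare the whole error with `assert_eq!`, neither of which works well when the error is a formatted string.
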