Move philosophy into README
parent 0c7e217135
commit 71abb1ed7d
2 changed files with 105 additions and 102 deletions
102
PHILOSOPHY.md
@@ -1,102 +0,0 @@
# Project Philosophy

## Principles

1. **Explicit over clever** — no magic helpers, no macros that hide control
flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
understand what a function does without chasing through layers of
indirection.

2. **Pure functions** — isolate decision logic from IO. A function that takes
data and returns data is testable, composable, and easy to reason about.
Keep it that way. Don't sneak in network calls or logging.

3. **Linear flow** — avoid callbacks, deep nesting, and async gymnastics where
possible. A handler should read like a sequence of steps: look up the
record, pick a volume, build the response.

4. **Minimize shared state** — pass values explicitly. Don't hold locks across
IO. Don't reach into globals.

5. **Minimize indirection** — don't hide logic behind abstractions that exist
"in case we need to swap the implementation later." We won't. A three-line
function inline is better than a trait with one implementor.

## Applying the principles: separate decisions from execution

Every request handler does two things: **decides** what should happen, then
**executes** IO to make it happen. These should be separate functions.

A decision is a pure function. It takes data in, returns a description of what
to do. It doesn't call the network, doesn't touch the database, doesn't log.
It can be tested with `assert_eq!` and nothing else.

Execution is the messy part — HTTP calls, SQLite writes, error recovery. It
reads the decision and carries it out. It's tested with integration tests.

## Where this applies today

### Already pure

**`hasher.rs`** — the entire module is pure. `volumes_for_key` is a
deterministic function of its inputs. No IO, no state mutation. This is the
gold standard for the project.

**`rebalance.rs::plan_rebalance`** — takes a slice of records and returns a
list of moves. Pure decision logic, tested with unit tests.

**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
transformations between JSON strings and `Vec<String>`.

### Mixed (decision + execution interleaved)

**`server.rs::put_key`** — this handler does four things in one function:

1. *Decide* which volumes to write to (pure — `volumes_for_key`)
2. *Execute* fan-out PUTs to nginx (IO)
3. *Decide* whether to roll back based on results (pure — check which succeeded)
4. *Execute* rollback DELETEs and/or index write (IO)

Steps 1 and 3 could be extracted as pure functions if they grow more complex.

### Intentionally impure

**`rebuild.rs`** — walks nginx autoindex and bulk-inserts into SQLite. The IO
is the whole point; there's no decision logic worth extracting.

**`db.rs`** — wraps SQLite behind `Arc<Mutex<Connection>>` with
`spawn_blocking` to avoid blocking the tokio runtime. The mutex serializes all
access; `SQLITE_OPEN_NO_MUTEX` disables SQLite's internal locking since the
application mutex handles it.

## Guidelines

1. **If a function takes only data and returns only data, it's pure.** Keep it
that way. Don't sneak in logging, metrics, or "just one network call."

2. **If a handler has an `if` or `match` that decides between outcomes, that
decision can probably be a pure function.** Extract it. Name it. Test it.

3. **IO boundaries should be thin.** Format URL, make request, check status,
return bytes. No business logic.

4. **Don't over-abstract.** A three-line pure function inline in a handler is
fine. Extract it when it gets complex enough to need its own tests, or when
the same decision appears in multiple places (e.g., rebuild and rebalance
both use `volumes_for_key`).

5. **Errors are data.** `AppError` is a value, not an exception. Functions
return `Result`; handlers pattern-match on it. The `IntoResponse` impl is
the only place where errors become HTTP responses — one place, one mapping.

## Anti-patterns to avoid

- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
decisions, handles errors, and formats the response. Break it up.

- **Hidden state reads** — if a function needs data, pass it in. Don't reach
into a global or lock a mutex inside a "pure" function.

- **Testing IO to test logic** — if you need a Docker container running to test
whether volume selection works correctly, the logic isn't separated from the
IO.
105
README.md
@@ -150,3 +150,108 @@ Volume servers should be on a private network that clients cannot reach directly
- Data at rest (blobs are plain files on disk)
- Malicious keys (no input sanitization beyond what nginx enforces on paths)
- Index tampering (SQLite file has no integrity protection)

# Development

## Principles

1. **Explicit over clever** — no magic helpers, no macros that hide control
flow, no trait gymnastics. Code reads top-to-bottom. A new reader should
understand what a function does without chasing through layers of
indirection.

2. **Pure functions** — isolate decision logic from IO. A function that takes
data and returns data is testable, composable, and easy to reason about.
Keep it that way. Don't sneak in network calls or logging.

3. **Linear flow** — avoid callbacks, deep nesting, and async gymnastics where
possible. A handler should read like a sequence of steps: look up the
record, pick a volume, build the response.

4. **Minimize shared state** — pass values explicitly. Don't hold locks across
IO. Don't reach into globals.

5. **Minimize indirection** — don't hide logic behind abstractions that exist
"in case we need to swap the implementation later." We won't. A three-line
function inline is better than a trait with one implementor.

## Applying the principles: separate decisions from execution

Every request handler does two things: **decides** what should happen, then
**executes** IO to make it happen. These should be separate functions.

A decision is a pure function. It takes data in, returns a description of what
to do. It doesn't call the network, doesn't touch the database, doesn't log.
It can be tested with `assert_eq!` and nothing else.

Execution is the messy part — HTTP calls, SQLite writes, error recovery. It
reads the decision and carries it out. It's tested with integration tests.
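
As a rough illustration of the split described above, here is a minimal sketch. The names `ReadPlan`, `plan_read`, and `execute_read` are hypothetical and do not come from the project; writing to a generic `Write` stands in for building a real HTTP response.

```rust
use std::io::{self, Write};

// Hypothetical description of "what to do"; not a type from the project.
#[derive(Debug, PartialEq)]
enum ReadPlan {
    NotFound,
    RedirectTo(String),
}

/// Decision: pure. Takes the volumes found for a key (if any) and returns a plan.
fn plan_read(volumes: Option<&[String]>, key: &str) -> ReadPlan {
    match volumes.and_then(|v| v.first()) {
        Some(volume) => ReadPlan::RedirectTo(format!("{volume}/{key}")),
        None => ReadPlan::NotFound,
    }
}

/// Execution: carries out the plan. Here the IO is just a write to `out`.
fn execute_read(plan: &ReadPlan, out: &mut impl Write) -> io::Result<()> {
    match plan {
        ReadPlan::NotFound => writeln!(out, "404 Not Found"),
        ReadPlan::RedirectTo(url) => writeln!(out, "302 Found -> {url}"),
    }
}

fn main() -> io::Result<()> {
    let volumes = vec!["http://vol1".to_string(), "http://vol2".to_string()];
    let plan = plan_read(Some(volumes.as_slice()), "photo.jpg");
    execute_read(&plan, &mut io::stdout())
}
```

The decision can be checked with `assert_eq!` on `ReadPlan` values alone; only `execute_read` needs anything resembling an integration test.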

## Where this applies today

### Already pure

**`hasher.rs`** — the entire module is pure. `volumes_for_key` is a
deterministic function of its inputs. No IO, no state mutation. This is the
gold standard for the project.

**`rebalance.rs::plan_rebalance`** — takes a slice of records and returns a
list of moves. Pure decision logic, tested with unit tests.

**`db.rs` encode/parse** — `parse_volumes` and `encode_volumes` are pure
transformations between JSON strings and `Vec<String>`.

### Mixed (decision + execution interleaved)

**`server.rs::put_key`** — this handler does four things in one function:

1. *Decide* which volumes to write to (pure — `volumes_for_key`)
2. *Execute* fan-out PUTs to nginx (IO)
3. *Decide* whether to roll back based on results (pure — check which succeeded)
4. *Execute* rollback DELETEs and/or index write (IO)

Steps 1 and 3 could be extracted as pure functions if they grow more complex.
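
If step 3 were ever extracted, it might look something like the sketch below. `PutPlan` and `plan_after_puts` are hypothetical names, and the all-or-nothing policy (commit only when every write succeeded) is an assumption; the actual `put_key` may apply a different rule.

```rust
// Hypothetical plan type; the real handler may represent this differently.
#[derive(Debug, PartialEq)]
enum PutPlan {
    /// Every volume accepted the write: record these volumes in the index.
    Commit { volumes: Vec<String> },
    /// At least one write failed: delete from the volumes that succeeded.
    Rollback { delete_from: Vec<String> },
}

/// Pure decision over (volume, succeeded) pairs collected by the fan-out step.
/// Assumed policy: commit only if there was at least one write and all succeeded.
fn plan_after_puts(results: &[(String, bool)]) -> PutPlan {
    let succeeded: Vec<String> = results
        .iter()
        .filter(|(_, ok)| *ok)
        .map(|(volume, _)| volume.clone())
        .collect();

    if !results.is_empty() && succeeded.len() == results.len() {
        PutPlan::Commit { volumes: succeeded }
    } else {
        PutPlan::Rollback { delete_from: succeeded }
    }
}

fn main() {
    let results = vec![("vol1".to_string(), true), ("vol2".to_string(), false)];
    // The IO layer would match on this value and issue DELETEs or an index write.
    println!("{:?}", plan_after_puts(&results));
}
```

The partial-failure case, which is awkward to provoke against real volume servers, becomes a one-line unit test on plain data.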

### Intentionally impure

**`rebuild.rs`** — walks nginx autoindex and bulk-inserts into SQLite. The IO
is the whole point; there's no decision logic worth extracting.

**`db.rs`** — wraps SQLite behind `Arc<Mutex<Connection>>` with
`spawn_blocking` to avoid blocking the tokio runtime. The mutex serializes all
access; `SQLITE_OPEN_NO_MUTEX` disables SQLite's internal locking since the
application mutex handles it.

## Guidelines

1. **If a function takes only data and returns only data, it's pure.** Keep it
that way. Don't sneak in logging, metrics, or "just one network call."

2. **If a handler has an `if` or `match` that decides between outcomes, that
decision can probably be a pure function.** Extract it. Name it. Test it.

3. **IO boundaries should be thin.** Format URL, make request, check status,
return bytes. No business logic.

4. **Don't over-abstract.** A three-line pure function inline in a handler is
fine. Extract it when it gets complex enough to need its own tests, or when
the same decision appears in multiple places (e.g., rebuild and rebalance
both use `volumes_for_key`).

5. **Errors are data.** `AppError` is a value, not an exception. Functions
return `Result`; handlers pattern-match on it. The `IntoResponse` impl is
the only place where errors become HTTP responses — one place, one mapping.
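
A small sketch of guideline 5, using only the standard library. The `AppError` variants shown are hypothetical (the text names the type but not its cases), and `status_code` merely stands in for the `IntoResponse` impl so the example runs without a web framework.

```rust
use std::collections::HashMap;

// Hypothetical variants; the project's `AppError` may look different.
#[derive(Debug, PartialEq)]
enum AppError {
    NotFound,
    VolumeUnavailable(String),
}

/// Stand-in for the `IntoResponse` impl: the single error-to-status mapping.
fn status_code(err: &AppError) -> u16 {
    match err {
        AppError::NotFound => 404,
        AppError::VolumeUnavailable(_) => 502,
    }
}

/// Returns the error as an ordinary value instead of panicking or logging.
fn lookup(index: &HashMap<String, Vec<String>>, key: &str) -> Result<Vec<String>, AppError> {
    index.get(key).cloned().ok_or(AppError::NotFound)
}

fn main() {
    let mut index = HashMap::new();
    index.insert("a".to_string(), vec!["vol1".to_string()]);

    // The handler pattern-matches on the Result; the mapping lives in one place.
    match lookup(&index, "missing") {
        Ok(volumes) => println!("200 {volumes:?}"),
        Err(err) => println!("{} {err:?}", status_code(&err)),
    }
}
```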

## Anti-patterns to avoid

- **God handler** — a 100-line async fn that reads the DB, calls volumes, makes
decisions, handles errors, and formats the response. Break it up.

- **Hidden state reads** — if a function needs data, pass it in. Don't reach
into a global or lock a mutex inside a "pure" function.

- **Testing IO to test logic** — if you need a Docker container running to test
whether volume selection works correctly, the logic isn't separated from the
IO.
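
To make the "hidden state reads" and "testing IO to test logic" points concrete, here is a minimal sketch. `pick_volume` is a hypothetical helper, and its byte-sum selection is a toy rule chosen for brevity; it is not how `volumes_for_key` works. The point is only that the lock is taken at the boundary and the function receives plain data.

```rust
use std::sync::Mutex;

/// Pure: the caller passes the volume list in, so no lock or global is touched here.
/// The byte-sum rule is a toy stand-in, not the project's hashing scheme.
fn pick_volume<'a>(volumes: &'a [String], key: &str) -> Option<&'a String> {
    if volumes.is_empty() {
        return None;
    }
    let sum: usize = key.bytes().map(usize::from).sum();
    volumes.get(sum % volumes.len())
}

fn main() {
    let shared = Mutex::new(vec!["vol0".to_string(), "vol1".to_string()]);

    // Lock briefly at the boundary, copy the data out, and release the lock
    // before any decision logic (or IO) runs.
    let snapshot: Vec<String> = shared.lock().unwrap().clone();

    println!("{:?}", pick_volume(&snapshot, "a"));
}
```

Because `pick_volume` depends only on its arguments, volume selection can be tested with a literal `Vec<String>` and no running services.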