Improve typing and errors, clean up

Silas Brack 2026-03-07 15:24:05 +01:00
parent 07490efc28
commit ec408aff29
5 changed files with 205 additions and 22 deletions

README.md Normal file

@@ -0,0 +1,132 @@
# mkv
Distributed key-value store for blobs. Thin index server (Rust + SQLite) in front of nginx volume servers. Inspired by [minikeyvalue](https://github.com/geohot/minikeyvalue).
## Usage
```bash
# Start the index server (replicates to 2 of 3 volumes)
mkv -d /tmp/index.db -v http://vol1:8080,http://vol2:8080,http://vol3:8080 -r 2 serve -p 3000
# Store a file
curl -X PUT -d "contents" http://localhost:3000/path/to/key
# Retrieve (returns 302 redirect to nginx)
curl -L http://localhost:3000/path/to/key
# Check existence and size
curl -I http://localhost:3000/path/to/key
# Delete
curl -X DELETE http://localhost:3000/path/to/key
# List keys (with optional prefix filter)
curl "http://localhost:3000/?prefix=path/to/"
```
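The same calls can be made from a script. The following is a minimal sketch using only Python's standard library; the helper names (`key_url`, `make_request`, `list_url`, `roundtrip`) are hypothetical and not part of mkv, and the percent-encoding of keys is an assumption about how the server decodes paths.

```python
import urllib.request
from typing import Optional
from urllib.parse import quote

BASE = "http://localhost:3000"  # index server address from the usage example


def key_url(key: str, base: str = BASE) -> str:
    """Build the URL for a key, percent-encoding unsafe characters but keeping '/'."""
    return f"{base}/{quote(key.lstrip('/'), safe='/')}"


def make_request(method: str, key: str, body: Optional[bytes] = None) -> urllib.request.Request:
    """Prepare (but do not send) a PUT, GET, HEAD, or DELETE request for a key."""
    return urllib.request.Request(key_url(key), data=body, method=method)


def list_url(prefix: str = "", base: str = BASE) -> str:
    """Build the listing URL, with an optional prefix filter."""
    return f"{base}/?prefix={quote(prefix, safe='/')}" if prefix else f"{base}/"


def roundtrip(key: str, body: bytes) -> bytes:
    """Store a blob, then read it back. Needs a running index server and volumes."""
    urllib.request.urlopen(make_request("PUT", key, body)).close()
    # urlopen follows the 302 redirect to the volume, like `curl -L`.
    with urllib.request.urlopen(make_request("GET", key)) as resp:
        return resp.read()
```

Calling `roundtrip("path/to/key", b"contents")` against a running server corresponds to the PUT and `curl -L` lines above.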
### Operations
```bash
# Rebuild index by scanning all volumes (disaster recovery)
mkv -d /tmp/index.db -v http://vol1:8080,http://vol2:8080,http://vol3:8080 -r 2 rebuild
# Rebalance after adding/removing volumes (preview with --dry-run)
mkv -d /tmp/index.db -v http://vol1:8080,http://vol2:8080,http://vol3:8080,http://vol4:8080 -r 2 rebalance --dry-run
mkv -d /tmp/index.db -v http://vol1:8080,http://vol2:8080,http://vol3:8080,http://vol4:8080 -r 2 rebalance
```
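Conceptually, a dry run only has to compare where each key currently lives with where it should live. The sketch below illustrates that idea as a set difference; it is not mkv's actual rebalance code, and `plan_moves` and its dict-based inputs are hypothetical.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, List[str]]  # key -> volumes holding (or meant to hold) it


def plan_moves(current: Placement, desired: Placement) -> Tuple[list, list]:
    """Return (copies, removals) as lists of (key, volume) pairs.

    Nothing is transferred here; the result is only a plan, which is
    roughly what a --dry-run preview would report.
    """
    copies, removals = [], []
    for key in sorted(desired):
        have = set(current.get(key, []))
        want = set(desired[key])
        copies += [(key, v) for v in sorted(want - have)]    # replicas to create
        removals += [(key, v) for v in sorted(have - want)]  # replicas to drop afterwards
    return copies, removals
```

A real migration would presumably perform the copies before the removals so that a key never drops below its replication factor.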
### Volume servers
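Any nginx built with the WebDAV module (`ngx_http_dav_module`) works. The volumes must be reachable at the URLs passed to `-v`; the usage examples assume port 8080, so change `listen` or map the port accordingly: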
```nginx
server {
    listen 80;
    root /data;

    location / {
        dav_methods PUT DELETE;
        create_full_put_path on;
        autoindex on;
        autoindex_format json;
    }
}
```
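With `autoindex_format json`, nginx returns each directory as a JSON array of entries with `name`, `type`, and (for files) `size` fields, which is what makes a volume scannable. The sketch below shows how such listings could be walked recursively to recover keys; `walk_volume` and the injected `fetch` callable are hypothetical, this is not mkv's rebuild code, and names that need URL escaping are not handled.

```python
import json
from typing import Callable, Iterator, Tuple


def walk_volume(fetch: Callable[[str], str], path: str = "/") -> Iterator[Tuple[str, int]]:
    """Yield (key, size) for every file under `path`, recursing into directories.

    `fetch(path)` must return the JSON body nginx serves for that directory.
    """
    for entry in json.loads(fetch(path)):
        if entry["type"] == "directory":
            yield from walk_volume(fetch, f"{path}{entry['name']}/")
        elif entry["type"] == "file":
            # Blobs are stored at /{key}, so the path minus the leading slash is the key.
            yield (path + entry["name"]).lstrip("/"), entry["size"]
```

In practice `fetch` would wrap an HTTP GET against one volume; passing it in as a parameter keeps the walk testable without a network.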
## What it does
- **HTTP API** — PUT, GET (302 redirect), DELETE, HEAD, LIST with prefix filtering
- **Replication** — fan-out writes to N volumes concurrently, all-or-nothing with rollback
- **Consistent hashing** — stable volume assignment; adding/removing a volume only moves ~1/N of keys
- **Rebuild** — reconstructs the SQLite index by scanning nginx autoindex on all volumes
- **Rebalance** — migrates data to correct volumes after topology changes, with `--dry-run` preview
- **Key-as-path** — blobs stored at `/{key}` on nginx, no content-addressing or sidecar files
- **Single binary** — no config files, everything via CLI flags
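The volume assignment described above (score every volume for a key, sort, take the top N) is commonly known as rendezvous or highest-random-weight hashing. The sketch below illustrates it with SHA-256; the exact hash input (volume URL, a NUL separator, then the key) is an assumption, and `score` and `pick_volumes` are hypothetical names rather than mkv's API.

```python
import hashlib
from typing import List


def score(key: str, volume: str) -> bytes:
    """Score one (key, volume) pair. The input layout is assumed, not taken from mkv."""
    return hashlib.sha256(f"{volume}\0{key}".encode()).digest()


def pick_volumes(key: str, volumes: List[str], replicas: int) -> List[str]:
    """Return the `replicas` volumes with the highest score for this key."""
    return sorted(volumes, key=lambda v: score(key, v), reverse=True)[:replicas]


volumes = ["http://vol1:8080", "http://vol2:8080", "http://vol3:8080"]
placement = pick_volumes("path/to/key", volumes, 2)  # two distinct volume URLs
```

Because each volume's score depends only on the key and that volume, adding a volume never reorders the existing ones. A key changes placement only when the new volume ranks among its top `replicas`, which is why topology changes move only a fraction of the keys.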
## What it doesn't do
- **Checksums** — no integrity verification; bit rot goes undetected
- **Auth** — no access control; anyone who can reach the server can read/write/delete
- **Encryption** — blobs stored as plain files on nginx
- **Streaming / range requests** — entire blob must fit in memory
- **Metadata** — no EXIF, tags, or content types; key path is all you get
- **Versioning** — PUT overwrites; no history
- **Compression** — blobs stored as-is
## Comparison to minikeyvalue
mkv is a ground-up rewrite of [minikeyvalue](https://github.com/geohot/minikeyvalue) in Rust.

| | mkv | minikeyvalue |
|--|-----|--------------|
| Language | Rust | Go |
| Index | SQLite (WAL mode) | LevelDB |
| Storage paths | key-as-path (`/{key}`) | content-addressed (md5 + base64) |
| GET behavior | Index lookup, 302 redirect | HEAD to volume first, then 302 redirect |
| PUT overwrite | Allowed | Forbidden (returns 403) |
| Hash function | SHA-256 per volume, sort by score | MD5 per volume, sort by score |
| MD5 of values | No | Yes (stored in index) |
| Health checker | No | No (checks per-request via HEAD) |
| Subvolumes | No | Yes (configurable fan-out directories) |
| Soft delete | No (hard delete) | Yes (UNLINK + DELETE two-phase) |
| S3 API | No | Partial (list, multipart upload) |
| App code | ~600 lines | ~1,000 lines |
| Tests | 17 (unit + integration) | 1 |
### Performance (10k keys, 1 KB values, 100 concurrent requests)
Tested on the same machine with shared nginx volumes:

| Operation | mkv | minikeyvalue |
|-----------|-----|--------------|
| PUT | 10,000 req/s | 10,500 req/s |
| GET (full round-trip) | 7,000 req/s | 6,500 req/s |
| GET (index only) | 15,800 req/s | 13,800 req/s |
| DELETE | 13,300 req/s | 13,600 req/s |

Both are bottlenecked by nginx volume I/O. The index layer (SQLite) can sustain 378,000 writes/sec in isolation.
## Security
mkv assumes a **trusted network**. There is no built-in authentication, authorization, or encryption. This is the same security model as minikeyvalue — neither system is designed for direct exposure to the public internet.
### Trust model
The index server and volume servers (nginx) are expected to live on the same private network. GET requests return a 302 redirect to a volume URL, so clients must be able to reach the volumes directly. Anyone who can reach the index server can read, write, and delete any key. Anyone who can reach a volume can read any blob.
### Deploying with auth
Put a reverse proxy in front of the index server and handle authentication there:
- **Basic auth or API keys** at the reverse proxy for simple setups
- **mTLS** for machine-to-machine access
- **OAuth / JWT** validation at the proxy for multi-user setups
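Volume servers need protection too, because GET redirects clients to a volume URL. Either keep the volumes on a private network that clients cannot reach and have the reverse proxy follow the redirect on the client's behalf, or expose them and use nginx's `secure_link` module to validate signed redirect URLs.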
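For the `secure_link` option, whatever issues the redirect has to append a token that nginx can recompute. Nothing above says mkv signs its redirects, so the sketch below only shows what a signer would compute. It assumes the volume is configured with `secure_link $arg_md5,$arg_expires;` and `secure_link_md5 "$secure_link_expires$uri <secret>";`. The function name `sign_path` and the secret are hypothetical.

```python
import base64
import hashlib


def sign_path(uri: str, expires: int, secret: str) -> str:
    """Return `uri` with `md5` and `expires` query arguments appended.

    Mirrors the assumed nginx expression "$secure_link_expires$uri <secret>":
    an MD5 digest, base64url-encoded with the '=' padding stripped.
    """
    digest = hashlib.md5(f"{expires}{uri} {secret}".encode()).digest()
    token = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return f"{uri}?md5={token}&expires={expires}"
```

The expression on the nginx side and the string hashed here must match byte for byte, including the space before the secret, or the volume will reject the link.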
### What neither mkv nor minikeyvalue protects against
- Unauthorized reads/writes (no auth)
- Data in transit (no TLS unless the proxy adds it)
- Data at rest (blobs are plain files on disk)
- Malicious keys (no input sanitization beyond what nginx enforces on paths)
- Index tampering (SQLite file has no integrity protection)