## Storage

This crate implements a storage access layer. It defines a generic interface for storing and retrieving data from a storage backend. Higher-level crates such as `query-ops` should use this crate to access the storage. This crate helps decouple the query execution logic from the underlying storage implementation.

### Public API

| Item | Kind | Description |
|------|------|-------------|
| `Storage` | trait | Backend-agnostic interface for storing and retrieving rows. Required methods: `create_relation`, `arity`, `scan_iter`, and `transaction`. The rest (`scan`, `scan_where`, `insert`, `delete`) have default implementations. |
| `Transaction` | trait | Atomic batch of inserts and deletes against a `Storage`. `insert` returns a pending `RowId`; `commit` consumes the boxed transaction and returns a `CommittedTx`; dropping without committing rolls back. |
| `CommittedTx` | struct | Result of a successful `Transaction::commit`. Resolves pending `RowId`s returned during the transaction to their post-commit form via `resolve`. Empty for KV adapters where pending equals real; populated for `geomerge`. |
| `StorageError` | enum | Error type returned by every fallible method. Variants: `RelationNotFound`, `RelationExists`, `ArityMismatch`, `Validation`, `Decode`, `Unsupported`, and `Backend`. |
| `CodecError` | enum | Wire-format failure reported as `StorageError::Decode`. Variants describe truncation, unknown tags, length overruns, and UTF-8 errors. |
| `RowStream<'a>` | type alias | `Box<dyn Iterator<Item = Result<(RowId, Vec<Value>), StorageError>> + 'a>`. The value yielded by `Storage::scan_iter` and `Storage::scan_where`. |
| `RowId` | struct | Opaque, backend-assigned row identifier. Bytes are inline up to 36 bytes (covers every encoding the workspace produces today) and spill to the heap otherwise. Construct with `RowId::new(bytes)` or `RowId::from(u64)`. |
| `Value` | enum | Single cell value. Variants: `Int(i64)`, `Str(String)`, and `Id(RowId)`. `Value::Id` is the foreign-key reference used by `geomerge` and any future referencing backend. |
| `Table` | struct | Positional input relation with fixed arity. Produced from a backend scan by `scan_as_table`. Consumed by `query-ops` operators. |
| `scan_as_table(&dyn Storage, &str) -> Result<Table, StorageError>` | function | Materialize a relation from a `Storage` backend into a `Table` for query-language operators. Row IDs are dropped; only cell values remain. |
| `MemoryStorage` | struct | In-process backend kept in `HashMap`s. Always available; useful for tests and snapshot oracles. |
| `adapters::sqlite::SqliteStorage` | struct (feat) | `SQLite`-backed `Storage`, behind the `sqlite` feature. Uses `rusqlite` with bundled libsqlite3; supports a single connection with native write transactions. |
| `adapters::redb::RedbStorage` | struct (feat) | Single-file B-tree backed `Storage`, behind the `redb` feature. Wraps `redb::WriteTransaction` for native atomic commits. |
| `adapters::fjall::FjallStorage` | struct (feat) | LSM-tree backed `Storage`, behind the `fjall` feature. Each relation gets a partition; transactions buffer inserts and apply them on commit. |
| `adapters::lmdb::LmdbStorage` | struct (feat) | mmap'd B-tree backed `Storage`, behind the `lmdb` feature. Wraps `heed`'s `RwTxn` for native atomic commits. |
| `adapters::geomerge::GeomergeStorage` | struct (feat) | CRDT-backed `Storage` over the workspace's `geomerge` crate, behind the `geomerge` feature. Wraps `geomerge::Transaction` and resolves pending row IDs via `CommittedTx`. Deletion is not supported (append-only log). |

Data types and their relationships:

*(Diagram: Types)*

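The `RowId` row in the table says bytes are stored inline up to 36 bytes and spill to the heap otherwise. The sketch below shows one way such a layout could look. It is a hypothetical illustration only: `SketchId`, `INLINE_CAP`, and the enum representation are assumptions made here, not the crate's actual `RowId` internals. The `From<u64>` impl uses big-endian bytes, following the Notes section's statement about KV adapters.

```rust
/// Inline capacity taken from the table above (36 bytes).
pub const INLINE_CAP: usize = 36;

/// Hypothetical stand-in for an opaque row ID; not the crate's `RowId`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SketchId {
    /// Short IDs live inside the value: the first `len` bytes of `buf` are valid.
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    /// Longer IDs spill to a heap allocation.
    Heap(Box<[u8]>),
}

impl SketchId {
    /// Build an ID from raw bytes, choosing inline or heap storage by length.
    pub fn new(bytes: &[u8]) -> Self {
        if bytes.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..bytes.len()].copy_from_slice(bytes);
            SketchId::Inline { len: bytes.len() as u8, buf }
        } else {
            SketchId::Heap(bytes.into())
        }
    }

    /// View the ID as bytes regardless of where they are stored.
    pub fn as_bytes(&self) -> &[u8] {
        match self {
            SketchId::Inline { len, buf } => &buf[..*len as usize],
            SketchId::Heap(b) => &b[..],
        }
    }

    /// True when the bytes are stored without a heap allocation.
    pub fn is_inline(&self) -> bool {
        matches!(self, SketchId::Inline { .. })
    }
}

impl From<u64> for SketchId {
    /// Assumption: encode the integer as 8 big-endian bytes.
    fn from(n: u64) -> Self {
        SketchId::new(&n.to_be_bytes())
    }
}
```

With this layout an 8-byte integer ID never allocates, while an ID longer than 36 bytes falls back to the heap variant. Callers only ever see the bytes through `as_bytes`, which matches the idea that row IDs stay opaque.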
### Example

The example below opens an in-memory backend, declares a relation, inserts two rows inside a single transaction, then reads the result.

```rust
use storage::value::Value;
use storage::{MemoryStorage, Storage, StorageError};

fn i(x: i64) -> Value {
    Value::Int(x)
}

fn main() -> Result<(), StorageError> {
    let mut storage = MemoryStorage::new();
    storage.create_relation("edge", 2)?;

    let (a, b) = {
        let mut tx = storage.transaction()?;
        let a = tx.insert("edge", vec![i(1), i(2)])?;
        let b = tx.insert("edge", vec![i(2), i(3)])?;
        let committed = tx.commit()?;
        // For KV backends pending IDs equal real IDs, so resolve is the identity.
        (committed.resolve(&a), committed.resolve(&b))
    };

    let rows = storage.scan("edge")?;
    assert_eq!(rows, vec![(a, vec![i(1), i(2)]), (b, vec![i(2), i(3)])]);
    Ok(())
}
```

Swapping `MemoryStorage` for any other adapter (for example `adapters::sqlite::SqliteStorage::open(":memory:")?`) needs no other code changes.

How a backend is used (logically):

*(Diagram: Workflow)*

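The example's call to `committed.resolve(...)` is the identity for KV backends but a real lookup for `geomerge`. A minimal sketch of that idea follows, assuming a plain map from pending ID bytes to final ID bytes. `SketchCommitted` and its methods are hypothetical names for illustration, and the fall-back-to-input behaviour is an assumption, not a documented guarantee of `CommittedTx`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a commit result; not the crate's `CommittedTx`.
#[derive(Debug, Default)]
pub struct SketchCommitted {
    /// Pending ID bytes -> final ID bytes. Empty when pending IDs are already final.
    remap: HashMap<Vec<u8>, Vec<u8>>,
}

impl SketchCommitted {
    /// Commit result for a backend whose pending IDs are already the real ones.
    pub fn identity() -> Self {
        Self::default()
    }

    /// Commit result for a backend that assigns final IDs only at commit time.
    pub fn with_remap(pairs: impl IntoIterator<Item = (Vec<u8>, Vec<u8>)>) -> Self {
        Self { remap: pairs.into_iter().collect() }
    }

    /// Return the final ID for `pending`; IDs without a mapping come back unchanged.
    pub fn resolve(&self, pending: &[u8]) -> Vec<u8> {
        self.remap
            .get(pending)
            .cloned()
            .unwrap_or_else(|| pending.to_vec())
    }
}
```

An empty map makes `resolve` the identity, which is why caller code can call it unconditionally and stay the same across backends.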
### Run the Tests

```sh
cargo test -p storage --all-features
```

### Notes

- **Opaque row IDs.** A `RowId` is a backend-assigned byte sequence; callers do not interpret the bytes. KV adapters use big-endian `u64`; the `geomerge` adapter encodes a `(CommitHash, counter)` pair. Hand a `RowId` back to the same backend to reference an existing row.
- **Pending row IDs.** `Transaction::insert` may return a pending `RowId` that the backend cannot stabilize until commit; this is the case for `geomerge`, where the final ID depends on the resulting `CommitHash`. Resolve such IDs through the `CommittedTx` returned by `commit`. For all KV backends the pending ID is already the real one and `CommittedTx::resolve` is the identity.
- **Streaming first.** `scan_iter` is the primary scan operation; `scan` defaults to collecting it. In-memory and LSM backends stream natively; B-tree and SQL backends materialize a `Vec` internally and yield from it to avoid self-referential iterators.
- **Atomic transactions.** Adapters with native write transactions (LMDB, redb, `SQLite`, `geomerge`) wrap the engine's transaction directly. Adapters without one (memory, fjall) buffer pending operations and apply them on commit. Dropping a transaction without calling `commit` rolls back any pending operations.
- **Deletion support.** Most adapters implement `delete`. The `geomerge` adapter does not: its append-only commit log returns `StorageError::Unsupported("row deletion")`.
- **Geomerge is alpha.** The upstream `geomerge` crate is prototype-status and its API may change without notice; treat breakage in `adapters::geomerge` as expected churn rather than regression.
- **Feature gates.** `MemoryStorage` is always available. Every other adapter is feature-gated (`lmdb`, `redb`, `fjall`, `sqlite`, and `geomerge`) so callers only pay for what they need.
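
The buffer-then-apply behaviour described under "Atomic transactions" can be modelled in a few lines. This is a toy sketch under simplifying assumptions (inserts only, `i64` cells, no row IDs); `ToyStore` and `ToyTx` are hypothetical names and do not reflect how `MemoryStorage` or the fjall adapter are actually written.

```rust
use std::collections::HashMap;

/// Toy store: relation name -> rows. Hypothetical; not `MemoryStorage`.
#[derive(Debug, Default)]
pub struct ToyStore {
    relations: HashMap<String, Vec<Vec<i64>>>,
}

/// Toy transaction that only records intended inserts until `commit`.
pub struct ToyTx<'a> {
    store: &'a mut ToyStore,
    pending: Vec<(String, Vec<i64>)>,
}

impl ToyStore {
    /// Start a transaction that borrows the store exclusively.
    pub fn transaction(&mut self) -> ToyTx<'_> {
        ToyTx { store: self, pending: Vec::new() }
    }

    /// Snapshot of the committed rows of `relation` (empty if unknown).
    pub fn rows(&self, relation: &str) -> Vec<Vec<i64>> {
        self.relations.get(relation).cloned().unwrap_or_default()
    }
}

impl ToyTx<'_> {
    /// Buffer an insert; the store is not touched yet.
    pub fn insert(&mut self, relation: &str, row: Vec<i64>) {
        self.pending.push((relation.to_string(), row));
    }

    /// Consume the transaction and apply every buffered insert in order.
    pub fn commit(self) {
        let ToyTx { store, pending } = self;
        for (relation, row) in pending {
            store.relations.entry(relation).or_default().push(row);
        }
    }
}
```

Because `commit` takes `self` by value and nothing is written before it runs, dropping a `ToyTx` without committing simply discards the buffer. In this model rollback needs no extra code, which is the appeal of the buffering approach for engines without native write transactions.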