Storage Abstraction Layer
This crate implements a storage access layer.
It defines a generic interface for storing and retrieving data from a storage backend.
Higher-level crates such as query-ops should access storage through this crate.
This crate helps decouple the query execution logic from the underlying storage implementation.
Public API
| Item | Kind | Description |
|---|---|---|
| Storage | trait | Backend-agnostic interface for storing and retrieving rows. Required methods: create_relation, arity, scan_iter, and transaction. The rest (scan, scan_where, insert, delete) have default implementations. |
| Transaction | trait | Atomic batch of inserts and deletes against a Storage. insert returns a pending RowId; commit consumes the boxed transaction and returns a CommittedTx; dropping without committing rolls back. |
| CommittedTx | struct | Result of a successful Transaction::commit. Resolves pending RowIds returned during the transaction to their post-commit form via resolve. Empty for KV adapters where pending equals real; populated for geomerge. |
| StorageError | enum | Error type returned by every fallible method. Variants: RelationNotFound, RelationExists, ArityMismatch, Validation, Decode, Unsupported, and Backend. |
| CodecError | enum | Wire-format failure reported as StorageError::Decode. Variants describe truncation, unknown tags, length overruns, and UTF-8 errors. |
| RowStream<'a> | type alias | Box<dyn Iterator<Item = Result<(RowId, Vec<Value>), StorageError>> + 'a>. The value yielded by Storage::scan_iter and Storage::scan_where. |
| RowId | struct | Opaque, backend-assigned row identifier. Bytes are inline up to 36 bytes (covers every encoding the workspace produces today) and spill to the heap otherwise. Construct with RowId::new(bytes) or RowId::from(u64). |
| Value | enum | Single cell value. Variants: Int(i64), Str(String), and Id(RowId). Value::Id is the foreign-key reference used by geomerge and any future referencing backend. |
| Table | struct | Positional input relation with fixed arity. Produced from a backend scan by scan_as_table. Consumed by query-ops operators. |
| scan_as_table(&dyn Storage, &str) -> Result<Table, StorageError> | function | Materialize a relation from a Storage backend into a Table for query-language operators. Row IDs are dropped; only cell values remain. |
| MemoryStorage | struct | In-process backend kept in HashMap. Always available; useful for tests and snapshot oracles. |
| adapters::sqlite::SqliteStorage | struct (feat) | SQLite-backed Storage, behind the sqlite feature. Uses rusqlite with bundled libsqlite3; supports a single connection with native write transactions. |
| adapters::redb::RedbStorage | struct (feat) | Single-file B-tree backed Storage, behind the redb feature. Wraps redb::WriteTransaction for native atomic commits. |
| adapters::fjall::FjallStorage | struct (feat) | LSM-tree backed Storage, behind the fjall feature. Each relation gets a partition; transactions buffer inserts and apply them on commit. |
| adapters::lmdb::LmdbStorage | struct (feat) | mmap'd B-tree backed Storage, behind the lmdb feature. Wraps heed's RwTxn for native atomic commits. |
| adapters::geomerge::GeomergeStorage | struct (feat) | CRDT-backed Storage over the workspace's geomerge crate, behind the geomerge feature. Wraps geomerge::Transaction and resolves pending row IDs via CommittedTx. Deletion is not supported (append-only log). Construct with from_theory, from_store, or with_relations (synthesizes a theory from (name, Vec<ColumnKind>) for callers that lack a typed schema). |
| adapters::geomerge::ColumnKind | enum (feat) | Primitive column type fed to GeomergeStorage::with_relations: Int maps to geomerge PrimInt, String maps to PrimString. Exists so callers can synthesize a theory without depending on geolog-lang::ir directly. |
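To make the RowId and Value rows of the table more concrete, the following is a minimal, self-contained sketch rather than the crate's actual code. ToyRowId and ToyValue are hypothetical stand-ins invented for illustration; the inline-versus-heap storage described above is omitted and a plain Vec<u8> is used instead. The sketch shows the big-endian u64 construction and notes that big-endian bytes compare in the same order as the integers they encode, which is a general property of the encoding and not a documented guarantee of the crate.

```rust
/// Hypothetical stand-in for the crate's `RowId`: an opaque byte sequence.
/// The real type keeps short IDs inline; this sketch simply uses a `Vec<u8>`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Hash)]
struct ToyRowId(Vec<u8>);

impl ToyRowId {
    /// Builds an ID from raw bytes (analogous to `RowId::new(bytes)`).
    fn new(bytes: &[u8]) -> Self {
        ToyRowId(bytes.to_vec())
    }

    /// Exposes the raw bytes; callers should treat them as opaque.
    fn as_bytes(&self) -> &[u8] {
        &self.0
    }
}

impl From<u64> for ToyRowId {
    /// KV-style encoding: eight big-endian bytes.
    fn from(n: u64) -> Self {
        ToyRowId(n.to_be_bytes().to_vec())
    }
}

/// Hypothetical stand-in for the crate's `Value` enum.
#[derive(Debug, Clone, PartialEq)]
enum ToyValue {
    Int(i64),
    Str(String),
    Id(ToyRowId),
}

fn main() {
    let a = ToyRowId::from(1u64);
    let b = ToyRowId::from(256u64);

    // The u64 is laid out most-significant byte first.
    assert_eq!(a.as_bytes(), &[0u8, 0, 0, 0, 0, 0, 0, 1][..]);

    // Lexicographic byte order matches numeric order for big-endian IDs.
    assert!(a < b);

    // A cell can reference another row through the `Id` variant.
    let row = vec![
        ToyValue::Int(7),
        ToyValue::Str("x".to_string()),
        ToyValue::Id(a.clone()),
    ];
    assert_eq!(row[2], ToyValue::Id(ToyRowId::new(&[0, 0, 0, 0, 0, 0, 0, 1])));
}
```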
Data types and their relationships:
Example
The example below opens an in-memory backend, declares a relation, inserts two rows inside a single transaction, then reads the result.
use storage::value::Value;
use storage::{MemoryStorage, Storage, StorageError};
fn i(x: i64) -> Value {
    Value::Int(x)
}
fn main() -> Result<(), StorageError> {
    let mut storage = MemoryStorage::new();
    storage.create_relation("edge", 2)?;
    let (a, b) = {
        let mut tx = storage.transaction()?;
        let a = tx.insert("edge", vec![i(1), i(2)])?;
        let b = tx.insert("edge", vec![i(2), i(3)])?;
        let committed = tx.commit()?;
        // For KV backends pending IDs equal real IDs, so resolve is the identity.
        (committed.resolve(&a), committed.resolve(&b))
    };
    let rows = storage.scan("edge")?;
    assert_eq!(rows, vec![(a, vec![i(1), i(2)]), (b, vec![i(2), i(3)])]);
    Ok(())
}
Note that MemoryStorage can be swapped for any other adapter (for example adapters::sqlite::SqliteStorage::open(":memory:")?) without changing the rest of the code.
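The reason a backend swap leaves calling code untouched is that callers depend only on the trait. The sketch below illustrates that idea with a cut-down, hypothetical trait (ToyStorage) and two toy backends; none of these names exist in the crate, and the row and error types are simplified to (u64, Vec<i64>) and String. One backend streams its rows directly, the other materializes a Vec first, and a default scan method collects whichever stream it is given.

```rust
use std::collections::BTreeMap;

type Row = (u64, Vec<i64>);
type ToyRowStream<'a> = Box<dyn Iterator<Item = Result<Row, String>> + 'a>;

/// Hypothetical, cut-down analogue of a storage trait.
trait ToyStorage {
    /// Required: stream the rows of a relation.
    fn scan_iter<'a>(&'a self, relation: &str) -> Result<ToyRowStream<'a>, String>;

    /// Provided: collect the stream into a `Vec`, stopping at the first error.
    fn scan(&self, relation: &str) -> Result<Vec<Row>, String> {
        self.scan_iter(relation)?.collect()
    }
}

/// Toy backend holding one relation; streams by borrowing its rows.
struct VecBackend {
    name: String,
    rows: Vec<Row>,
}

impl ToyStorage for VecBackend {
    fn scan_iter<'a>(&'a self, relation: &str) -> Result<ToyRowStream<'a>, String> {
        if relation != self.name {
            return Err(format!("relation not found: {}", relation));
        }
        Ok(Box::new(self.rows.iter().cloned().map(Ok::<Row, String>)))
    }
}

/// Toy backend that materializes a `Vec` first and yields from it.
struct MapBackend {
    relations: BTreeMap<String, BTreeMap<u64, Vec<i64>>>,
}

impl ToyStorage for MapBackend {
    fn scan_iter<'a>(&'a self, relation: &str) -> Result<ToyRowStream<'a>, String> {
        let rel = self
            .relations
            .get(relation)
            .ok_or_else(|| format!("relation not found: {}", relation))?;
        let rows: Vec<Row> = rel.iter().map(|(id, cells)| (*id, cells.clone())).collect();
        Ok(Box::new(rows.into_iter().map(Ok::<Row, String>)))
    }
}

/// Caller code written once against the trait object.
fn count_rows(storage: &dyn ToyStorage, relation: &str) -> Result<usize, String> {
    Ok(storage.scan(relation)?.len())
}

fn main() {
    let vec_backend = VecBackend {
        name: "edge".to_string(),
        rows: vec![(0, vec![1, 2]), (1, vec![2, 3])],
    };

    let mut edge = BTreeMap::new();
    edge.insert(0u64, vec![1, 2]);
    edge.insert(1u64, vec![2, 3]);
    let mut relations = BTreeMap::new();
    relations.insert("edge".to_string(), edge);
    let map_backend = MapBackend { relations };

    // The same caller works with either backend and sees the same rows.
    assert_eq!(count_rows(&vec_backend, "edge"), Ok(2));
    assert_eq!(count_rows(&map_backend, "edge"), Ok(2));
    assert_eq!(vec_backend.scan("edge"), map_backend.scan("edge"));
    assert!(count_rows(&map_backend, "missing").is_err());
}
```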
How a backend is used (logically):
Run the Tests
cargo test -p storage --all-features
Notes
- Opaque row IDs. A RowId is a backend-assigned byte sequence; callers do not interpret the bytes. KV adapters use a big-endian u64; the geomerge adapter encodes a (CommitHash, counter) pair. Hand a RowId back to the same backend to reference an existing row.
- Pending row IDs. Transaction::insert may return a pending RowId that the backend cannot stabilize until commit; this is the case for geomerge, where the final ID depends on the resulting CommitHash. Resolve such IDs through the CommittedTx returned by commit. For all KV backends the pending ID is already the real one and CommittedTx::resolve is the identity.
- Streaming first. scan_iter is the primary scan operation; scan defaults to collecting it. In-memory and LSM backends stream natively; B-tree and SQL backends materialize a Vec internally and yield from it to avoid self-referential iterators.
- Atomic transactions. Backends with native write-transaction support (LMDB, Redb, SQLite, and geomerge) use their transaction API directly. Adapters without native transaction support (MemoryStorage and Fjall) implement Transaction with an internal buffer of pending operations that are applied on commit. Dropping a transaction without calling commit rolls back any pending operations.
- Deletion support. Most adapters implement delete. The geomerge adapter does not: because its commit log is append-only, delete returns StorageError::Unsupported("row deletion").
- ⚠️ Geomerge is alpha. The upstream geomerge crate is prototype-status and its API can change without notice; treat breakage in adapters::geomerge as expected churn rather than regression.
- Feature gates. MemoryStorage is always available. Every other adapter is feature-gated (lmdb, redb, fjall, sqlite, and geomerge) so callers only pay for what they need.
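
The notes on pending row IDs and buffered transactions can be illustrated with a small, self-contained sketch. It is not the crate's implementation: ToyStore, ToyTx, and ToyCommitted are hypothetical names, row IDs are plain u64 values, and the store hands out real IDs starting at 100 purely so that pending and real IDs visibly differ. Inserts are buffered, applied only on commit, and discarded when the transaction is dropped; resolve falls back to the identity when no mapping was recorded, mirroring the KV case described above.

```rust
use std::collections::HashMap;

/// Toy store: committed rows plus a counter for the next real row ID.
struct ToyStore {
    rows: Vec<(u64, Vec<i64>)>,
    next_id: u64,
}

/// Buffered transaction: nothing touches the store until `commit`.
struct ToyTx<'a> {
    store: &'a mut ToyStore,
    pending: Vec<Vec<i64>>,
}

/// Maps pending IDs handed out during the transaction to real IDs.
struct ToyCommitted {
    resolved: HashMap<u64, u64>,
}

impl ToyStore {
    fn transaction(&mut self) -> ToyTx<'_> {
        ToyTx { store: self, pending: Vec::new() }
    }
}

impl<'a> ToyTx<'a> {
    /// Buffers the row and returns a pending ID (its index in the buffer).
    fn insert(&mut self, row: Vec<i64>) -> u64 {
        self.pending.push(row);
        (self.pending.len() - 1) as u64
    }

    /// Consumes the transaction, applies the buffer, and records the
    /// pending-to-real mapping.
    fn commit(self) -> ToyCommitted {
        let ToyTx { store, pending } = self;
        let mut resolved = HashMap::new();
        for (pending_id, row) in pending.into_iter().enumerate() {
            let real = store.next_id;
            store.next_id += 1;
            store.rows.push((real, row));
            resolved.insert(pending_id as u64, real);
        }
        ToyCommitted { resolved }
    }
}

impl ToyCommitted {
    /// Returns the real ID; unknown pending IDs resolve to themselves.
    fn resolve(&self, pending: u64) -> u64 {
        *self.resolved.get(&pending).unwrap_or(&pending)
    }
}

fn main() {
    let mut store = ToyStore { rows: Vec::new(), next_id: 100 };

    // Dropping without commit discards the buffered insert (rollback).
    {
        let mut tx = store.transaction();
        tx.insert(vec![9, 9]);
    }
    assert!(store.rows.is_empty());

    // Committing applies the buffer and resolves the pending ID.
    let mut tx = store.transaction();
    let pending = tx.insert(vec![1, 2]);
    let committed = tx.commit();
    assert_eq!(pending, 0);
    assert_eq!(committed.resolve(pending), 100);
    assert_eq!(store.rows, vec![(100, vec![1, 2])]);
}
```

Because ToyTx has no Drop implementation, rollback here is simply the buffer going out of scope; a backend with native transactions would presumably rely on its own abort path instead.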