17 KiB
Geomerge Public API and Features
A reading note on Geomerge's current library API, REPL surface, and implemented storage behavior.
Short Answer
Geomerge is currently a Rust storage-engine prototype for compiled Geolog theories.
The implemented core is:
compiled FlatTheory
-> Store with empty tables and compiled laws
-> Add operations or transactions
-> schema validation
-> law validation
-> committed table state
It is not yet a full database engine. Native version control, large-scale concurrency, conflict resolution, general query execution, deletes, and updates are not implemented in the current API.
Terminology
Geomerge: A storage-engine prototype for compiled Geolog theories. It loads a compiled Geolog theory, creates tables, accepts row additions, checks laws, and can persist and reload store state.
Geolog: The larger database and logic system that Geomerge is intended to support. Geomerge consumes compiled Geolog IR data.
Compiled Theory: A ready-to-load description of tables and laws. In code, this is FlatTheory.
FlatTheory: The top-level IR object Geomerge consumes. It contains a list of table definitions and a list of law definitions.
IR: Intermediate representation. It is the structured data format shared by geolog-lang and geomerge, rather than the source syntax a user
might write by hand.
Path: A stable name for a table or law, such as Graphs, G.V, or Hom.E.foreignKeys. Geomerge uses paths in public operations because paths
are more portable than internal table ids.
Schema: The column layout of a table. It says how many columns a row has, what type each column expects, and whether the table has a primary key.
Table: A stored relation. Geomerge tables are columnar internally, but users usually think of them as collections of rows.
RowId: A generated identifier for a row. Entity references are represented with row ids.
CellValue: A value stored in one table cell. The current supported values are entity ids, integers, and strings.
Op: A store mutation operation. Currently the only operation is Op::Add, which inserts one row into one table.
Batch: A group of operations submitted together with Store::apply_batch. The whole batch commits only if schema and law validation succeed.
Transaction: A preview-store mutation that commits only after validation. Transactions are useful when later inserted rows need row ids from earlier inserted rows.
Law: A constraint from the compiled theory. Geomerge checks laws after proposed changes and rejects changes that make a law fail.
Binding: A set of variable assignments found while matching a law antecedent against current table rows.
Violation: A failed law check. Current violations are either missing consequent atoms or unsatisfied equalities.
Persistence: Encoding a store to bytes and decoding it later. Geomerge persists table data, schemas, row ids, and source law entries.
Crate Layout
The workspace has two main crates:
geolog-lang: shared Geolog IR definitionsgeomerge: storage engine, table layer, validation, persistence, and REPL
The geomerge crate exports:
ir: re-exported Geolog IR types fromgeolog-langops: mutation operation typespersist: binary store persistencerepl: command parsing and REPL execution helperssolver: law compilation, binding, matching, and validationstore: the mainStoreAPItable: table storage, cell values, row ids, and validation errorstransaction: owned transaction wrapper
The most important public modules for library users are ir, store, table, ops, and persist.
Architecture Diagram
flowchart TD
A[Compiled FlatTheory JSON] --> B[geolog-lang IR]
B --> C[Store::try_from_theory]
C --> D[Table Registry]
C --> E[Law Compiler]
D --> F[Table]
F --> G[Columnar Cell Storage]
F --> H[Generated Row IDs]
F --> I[Schema and Primary-Key Validation]
E --> J[Compiled Laws]
J --> K[Law Validator]
L[Op::Add Batch] --> M[Store::apply_batch]
N[Closure Transaction] --> O[Store::transact]
P[Owned Transaction] --> Q[OwnedTransaction::commit]
M --> R[Preview Store Clone]
O --> R
Q --> R
R --> I
R --> K
K --> S{Validation Result}
I --> S
S -->|ok| T[Committed Store]
S -->|error| U[Original Store Unchanged]
T --> V[Store and Table Accessors]
T --> W[persist::pst::encode_store]
W --> X[Store Bytes]
X --> Y[persist::pst::decode_store]
Y --> T
Z[REPL Commands] --> L
Z --> N
Z --> W
The key architectural point is the preview-store boundary. Mutations are applied to a clone, schema and law checks run on that clone, and the original store is replaced only after validation succeeds.
IR API
The IR crate defines the data shape Geomerge consumes.
Important types:
FlatTheory: a compiled theory containing table entries and law entriesTableEntry: a table path plus schemaLawEntry: a law path plus law definitionSchema: column types and an optional primary keyColType: entity, primitive, or tuple column typePrimType:intorstringPath: dotted table and law paths such asGraphs,G.V, orHom.E.foreignKeysLaw: variables, antecedent proposition, and consequent propositionProp: atom, equality, conjunction, or disjunction at the IR levelTerm: literal, variable, projection, or constructor at the IR level
The IR is broader than the current store and solver implementation. For example, tuple columns, disjunction, projections, and constructors exist in the IR, but they are not fully supported by the current storage and validation path.
Store API
Store is the central API.
The usual construction path is:
let theory: geomerge::ir::FlatTheory = serde_json::from_str(input)?;
let mut store = geomerge::store::Store::try_from_theory(theory)?;
Store::try_from_theory builds one empty table for each table in the theory and compiles the theory laws into solver form.
Main accessors:
Store::new: empty store constructionStore::try_from_theory: store construction fromFlatTheoryStore::tables: table iteratorStore::table_count: table countStore::resolve_table: table path to table oid lookupStore::table: table lookup by oidStore::table_mut: mutable table lookup by oidStore::table_at: table lookup by pathStore::table_at_mut: mutable table lookup by pathStore::laws: compiled law accessStore::dump: debug rendering of all tablesStore::check_laws: full law validation over current store contents
Store::insert_table is public, but schema-driven table creation through Store::try_from_theory is the intended normal path.
Public API Examples
These examples show the API shape. A concrete theory can require related rows to be inserted together so that laws hold at commit time.
Loading a Theory
use geomerge::{
ir::FlatTheory,
store::Store,
};
fn load_theory(input: &str) -> Result<Store, Box<dyn std::error::Error>> {
let theory: FlatTheory = serde_json::from_str(input)?;
let store = Store::try_from_theory(theory)?;
Ok(store)
}
This is the normal entry point for library use. The theory supplies table schemas and laws; the resulting store starts with empty tables.
Adding Rows with apply_batch
use geomerge::{
ir::Path,
store::Store,
table::CellValue,
};
fn add_graph(store: &mut Store) -> Result<u64, Box<geomerge::store::StoreIntError>> {
let graphs = store
.table_at(&Path::from("Graphs"))
.expect("Graphs table exists");
let op = graphs.add(vec![]);
let row_ids = store.apply_batch(vec![op])?;
Ok(row_ids[0])
}
fn add_vertex(
store: &mut Store,
graph_id: u64,
) -> Result<u64, Box<geomerge::store::StoreIntError>> {
let vertices = store
.table_at(&Path::from("G.V"))
.expect("G.V table exists");
let op = vertices.add(vec![CellValue::Id(graph_id)]);
let row_ids = store.apply_batch(vec![op])?;
Ok(row_ids[0])
}
Table::add constructs an operation. Store::apply_batch is the call that validates and commits it.
Adding Related Rows in One Transaction
use geomerge::{
ir::Path,
store::{Store, StoreIntError},
table::CellValue,
};
fn add_graph_with_vertices(store: &mut Store) -> Result<(u64, u64, u64), Box<StoreIntError>> {
store.transact(|preview| {
let graph_id = preview
.table_at_mut(&Path::from("Graphs"))
.expect("Graphs table exists")
.append_row_validated(vec![])?;
let v0 = preview
.table_at_mut(&Path::from("G.V"))
.expect("G.V table exists")
.append_row_validated(vec![CellValue::Id(graph_id)])?;
let v1 = preview
.table_at_mut(&Path::from("G.V"))
.expect("G.V table exists")
.append_row_validated(vec![CellValue::Id(graph_id)])?;
Ok((graph_id, v0, v1))
})
}
This style is useful when later rows need row ids produced earlier in the same transaction.
Reading Table Contents
use geomerge::{
ir::Path,
store::Store,
};
fn print_table_cells(store: &Store) {
let table = store
.table_at(&Path::from("G.V"))
.expect("G.V table exists");
for row_idx in 0..table.row_count() {
let row_id = table.row_id_at(row_idx).expect("row id exists");
let graph = table.cell_at(row_idx, 0).expect("graph column exists");
println!("row #{row_id}: graph={graph}");
}
}
The table API exposes physical row indexes, generated row ids, and per-column cell access. It does not expose a general query language.
Explicit Law Checking
use geomerge::store::{Store, StoreIntError};
fn validate_store(store: &Store) -> Result<(), Box<StoreIntError>> {
store.check_laws()
}
Store::apply_batch and Store::transact already call check_laws before committing. Calling it directly is useful after manual store construction
or debugging.
Persistence Round Trip
use geomerge::{
persist::pst::{decode_store, encode_store},
store::Store,
};
fn round_trip(store: &Store) -> Result<Store, geomerge::persist::error::PersisError> {
let bytes = encode_store(store)?;
decode_store(&bytes)
}
Decoded stores reconstruct table state and recompile laws from the persisted law entries.
Mutation API
The mutation operation type currently has one variant:
Op::Add {
table: ir::Path,
values: Vec<CellValue>,
}
The table path is used instead of an internal table id, which keeps operations stable across stores.
The most direct insertion flow is:
let table = store.table_at(&geomerge::ir::Path::from("G.V")).unwrap();
let op = table.add(vec![geomerge::table::CellValue::Id(0)]);
let row_ids = store.apply_batch(vec![op])?;
Store::apply_batch validates the entire batch first. It then applies operations to a cloned preview store, checks all laws, and commits the preview
only if validation succeeds. On validation failure, the original store is unchanged.
The returned Vec<RowId> is in the same order as the input operations.
Table API
Tables are columnar stores. A table owns:
- its
Path - its
Schema - generated row ids
- one column vector per schema column
Rows are identified by generated RowId values. Row ids are managed by the store layer and are exposed as entity references.
Main table methods:
Table::new: table construction from path and schemaTable::schema: schema accessTable::path: path accessTable::row_count: row countTable::row_id_at: row id at a physical row indexTable::cell_at: cell lookup by physical row index and column indexTable::dump: debug renderingTable::validate: schema validation for a candidate rowTable::primary_key_values: primary-key extraction from candidate valuesTable::validate_new_row: schema and primary-key validationTable::append_row_validated: direct validated appendTable::add: operation construction forStore::apply_batch
Library code should prefer Store::apply_batch or Store::transact for durable changes, because those paths also check laws.
Cell Values and Validation
The supported cell values are:
CellValue::Id(RowId)
CellValue::Int(i64)
CellValue::Str(String)
Validation checks:
- column count
- entity column values as
CellValue::Id - integer columns as
CellValue::Int - string columns as
CellValue::Str - tuple column rejection
- primary-key duplication
- unknown table paths in batch application
Primary-key behavior:
None: no primary-key uniqueness checkSome(columns): uniqueness over listed columnsSome([]): singleton table, meaning at most one row
Transactions
Geomerge has two transaction styles.
Store::transact takes a closure over a preview store:
store.transact(|preview| {
let table = preview.table_at_mut(&Path::from("Graphs")).unwrap();
let row_id = table.append_row_validated(vec![])?;
Ok(row_id)
})?;
The closure can perform multiple direct table appends. After the closure succeeds, Geomerge checks laws on the preview and commits only if they hold.
Store::into_transaction consumes a store and returns an OwnedTransaction.
Its commit method returns either:
Ok((output, committed_store))
Err((error, original_store))
That shape is useful when the caller needs ownership of the original store after a failed transaction.
One implementation detail to watch: Store::apply_batch checks primary-key conflicts within the whole batch before applying. The closure-based
transaction path relies on table-level validated appends and has a TODO around primary-key checking at the store transaction layer.
Law Solver
The solver compiles IR laws into a restricted execution form.
Supported compiled proposition forms:
- atom
- equality
- conjunction
Unsupported or incomplete forms include:
- disjunction
- projected terms
- constructed terms
- tuple values
Runtime law checking has two phases:
- Antecedent binding: scan relevant tables and produce variable bindings.
- Consequent validation: check that required atoms and equalities hold for each binding.
Law violations identify:
- missing consequent atoms
- unsatisfied consequent equalities
- the law involved
- the binding that triggered the failure
The solver is direct and full-store oriented. It is not incremental.
Persistence API
The persistence API is under persist::pst.
Main functions:
encode_store(&Store) -> Result<Vec<u8>, PersisError>decode_store(&[u8]) -> Result<Store, PersisError>
The persisted data includes:
- format magic and version
- next table oid
- table metadata
- table schemas
- row ids
- column payloads
- source law entries
On decode, Geomerge reconstructs tables and recompiles laws from the persisted law entries.
REPL API
The binary starts the REPL. The REPL commands are:
/help
/exit
/quit
load-schema <path>;
load-store <path>;
list-schema;
add <table> values (...), (...);
dump-table <table>;
dump-store;
persist <path>;
begin transact; name = add <table> values (...); ... commit;
REPL transaction bindings let one inserted row refer to another row inserted earlier in the same transaction:
begin transact;
g = add Graphs values ();
v = add G.V values (g);
commit;
For non-transactional add, entity values use #id syntax. Inside transaction blocks, entity columns can use either #id or a previous binding
name.
Key Features
Implemented features:
- compiled theory loading from JSON-compatible IR
- schema-defined table creation
- columnar row storage
- generated row ids
- entity references through row ids
- primitive int and string cells
- table path lookup
- validated row insertion
- batch add operations
- atomic batch commit after validation
- closure-based preview transactions
- owned transactions with original-store recovery on failure
- primary-key validation for batch adds and validated table appends
- law compilation for atoms, equality, and conjunction
- full-store law checking
- binary persistence and reload
- REPL schema loading, insertion, dumping, transactions, persistence, and reload
Not implemented or limited:
- deletes
- updates
- general query execution
- incremental validation
- native version-control semantics
- merge and conflict-resolution semantics
- concurrency control
- tuple cell insertion
- disjunction in compiled laws
- projected or constructed terms in compiled laws
- ad hoc indexes for efficient law checking
Practical Mental Model
Geomerge currently behaves like:
compiled Geolog theory
-> typed table registry
-> validated row additions
-> full-store law checks
-> persisted store snapshots
The most relevant integration boundary for an incremental engine such as DBSP is law checking. Today, Geomerge validates laws by scanning the current store. A future incremental layer could maintain violation queries and let the store ask whether any violations exist after a proposed transaction.
Changelog
- May 13, 2026 -- First version of this document.