# Geomerge Public API and Features A reading note on Geomerge's current library API, REPL surface, and implemented storage behavior. --- ## Short Answer Geomerge is currently a Rust storage-engine prototype for compiled Geolog theories. The implemented core is: ```text compiled FlatTheory -> Store with empty tables and compiled laws -> Add operations or transactions -> schema validation -> law validation -> committed table state ``` It is not yet a full database engine. Native version control, large-scale concurrency, conflict resolution, general query execution, deletes, and updates are not implemented in the current API. --- ## Terminology **Geomerge**: The storage-engine prototype in this repository. It loads a compiled Geolog theory, creates tables, accepts row additions, checks laws, and can persist and reload store state. **Geolog**: The larger database and logic system that Geomerge is intended to support. In this repo, Geolog mainly appears through compiled IR data that Geomerge can load. **Compiled Theory**: A ready-to-load description of tables and laws. In code, this is `FlatTheory`. **FlatTheory**: The top-level IR object Geomerge consumes. It contains a list of table definitions and a list of law definitions. **IR**: Intermediate representation. It is the structured data format shared by `geolog-lang` and `geomerge`, rather than the source syntax a user might write it by hand. **Path**: A stable name for a table or law, such as `Graphs`, `G.V`, or `Hom.E.foreignKeys`. Geomerge uses paths in public operations because paths are more portable than internal table ids. **Schema**: The column layout of a table. It says how many columns a row has, what type each column expects, and whether the table has a primary key. **Table**: A stored relation. Geomerge tables are columnar internally, but users usually think of them as collections of rows. **RowId**: A generated identifier for a row. Entity references are represented with row ids. **CellValue**: A value stored in one table cell. The current supported values are entity ids, integers, and strings. **Op**: A store mutation operation. Currently the only operation is `Op::Add`, which inserts one row into one table. **Batch**: A group of operations submitted together with `Store::apply_batch`. The whole batch commits only if schema and law validation succeed. **Transaction**: A preview-store mutation that commits only after validation. Transactions are useful when later inserted rows need row ids from earlier inserted rows. **Law**: A constraint from the compiled theory. Geomerge checks laws after proposed changes and rejects changes that make a law fail. **Binding**: A set of variable assignments found while matching a law antecedent against current table rows. **Violation**: A failed law check. Current violations are either missing consequent atoms or unsatisfied equalities. **Persistence**: Encoding a store to bytes and decoding it later. Geomerge persists table data, schemas, row ids, and source law entries. --- ## Crate Layout The workspace has two main crates: - `geolog-lang`: shared Geolog IR definitions - `geomerge`: storage engine, table layer, validation, persistence, and REPL The `geomerge` crate exports: - `ir`: re-exported Geolog IR types from `geolog-lang` - `ops`: mutation operation types - `persist`: binary store persistence - `repl`: command parsing and REPL execution helpers - `solver`: law compilation, binding, matching, and validation - `store`: the main `Store` API - `table`: table storage, cell values, row ids, and validation errors - `transaction`: owned transaction wrapper The most important public modules for library users are `ir`, `store`, `table`, `ops`, and `persist`. --- ## Architecture Diagram ```mermaid flowchart TD A[Compiled FlatTheory JSON] --> B[geolog-lang IR] B --> C[Store::try_from_theory] C --> D[Table Registry] C --> E[Law Compiler] D --> F[Table] F --> G[Columnar Cell Storage] F --> H[Generated Row IDs] F --> I[Schema and Primary-Key Validation] E --> J[Compiled Laws] J --> K[Law Validator] L[Op::Add Batch] --> M[Store::apply_batch] N[Closure Transaction] --> O[Store::transact] P[Owned Transaction] --> Q[OwnedTransaction::commit] M --> R[Preview Store Clone] O --> R Q --> R R --> I R --> K K --> S{Validation Result} I --> S S -->|ok| T[Committed Store] S -->|error| U[Original Store Unchanged] T --> V[Store and Table Accessors] T --> W[persist::pst::encode_store] W --> X[Store Bytes] X --> Y[persist::pst::decode_store] Y --> T Z[REPL Commands] --> L Z --> N Z --> W ``` The key architectural point is the preview-store boundary. Mutations are applied to a clone, schema and law checks run on that clone, and the original store is replaced only after validation succeeds. --- ## IR API The IR crate defines the data shape Geomerge consumes. Important types: - `FlatTheory`: a compiled theory containing table entries and law entries - `TableEntry`: a table path plus schema - `LawEntry`: a law path plus law definition - `Schema`: column types and an optional primary key - `ColType`: entity, primitive, or tuple column type - `PrimType`: `int` or `string` - `Path`: dotted table and law paths such as `Graphs`, `G.V`, or `Hom.E.foreignKeys` - `Law`: variables, antecedent proposition, and consequent proposition - `Prop`: atom, equality, conjunction, or disjunction at the IR level - `Term`: literal, variable, projection, or constructor at the IR level The IR is broader than the current store and solver implementation. For example, tuple columns, disjunction, projections, and constructors exist in the IR, but they are not fully supported by the current storage and validation path. --- ## Store API `Store` is the central API. The usual construction path is: ```rust let theory: geomerge::ir::FlatTheory = serde_json::from_str(input)?; let mut store = geomerge::store::Store::try_from_theory(theory)?; ``` `Store::try_from_theory` builds one empty table for each table in the theory and compiles the theory laws into solver form. Main accessors: - `Store::new`: empty store construction - `Store::try_from_theory`: store construction from `FlatTheory` - `Store::tables`: table iterator - `Store::table_count`: table count - `Store::resolve_table`: table path to table oid lookup - `Store::table`: table lookup by oid - `Store::table_mut`: mutable table lookup by oid - `Store::table_at`: table lookup by path - `Store::table_at_mut`: mutable table lookup by path - `Store::laws`: compiled law access - `Store::dump`: debug rendering of all tables - `Store::check_laws`: full law validation over current store contents `Store::insert_table` is public, but the project notes indicate schema-driven table creation is the intended normal path. --- ## Public API Examples These examples show the API shape. A concrete theory can require related rows to be inserted together so that laws hold at commit time. ### Loading a Theory ```rust use geomerge::{ ir::FlatTheory, store::Store, }; fn load_theory(input: &str) -> Result> { let theory: FlatTheory = serde_json::from_str(input)?; let store = Store::try_from_theory(theory)?; Ok(store) } ``` This is the normal entry point for library use. The theory supplies table schemas and laws; the resulting store starts with empty tables. ### Adding Rows with `apply_batch` ```rust use geomerge::{ ir::Path, store::Store, table::CellValue, }; fn add_graph(store: &mut Store) -> Result> { let graphs = store .table_at(&Path::from("Graphs")) .expect("Graphs table exists"); let op = graphs.add(vec![]); let row_ids = store.apply_batch(vec![op])?; Ok(row_ids[0]) } fn add_vertex( store: &mut Store, graph_id: u64, ) -> Result> { let vertices = store .table_at(&Path::from("G.V")) .expect("G.V table exists"); let op = vertices.add(vec![CellValue::Id(graph_id)]); let row_ids = store.apply_batch(vec![op])?; Ok(row_ids[0]) } ``` `Table::add` constructs an operation. `Store::apply_batch` is the call that validates and commits it. ### Adding Related Rows in One Transaction ```rust use geomerge::{ ir::Path, store::{Store, StoreIntError}, table::CellValue, }; fn add_graph_with_vertices(store: &mut Store) -> Result<(u64, u64, u64), Box> { store.transact(|preview| { let graph_id = preview .table_at_mut(&Path::from("Graphs")) .expect("Graphs table exists") .append_row_validated(vec![])?; let v0 = preview .table_at_mut(&Path::from("G.V")) .expect("G.V table exists") .append_row_validated(vec![CellValue::Id(graph_id)])?; let v1 = preview .table_at_mut(&Path::from("G.V")) .expect("G.V table exists") .append_row_validated(vec![CellValue::Id(graph_id)])?; Ok((graph_id, v0, v1)) }) } ``` This style is useful when later rows need row ids produced earlier in the same transaction. ### Reading Table Contents ```rust use geomerge::{ ir::Path, store::Store, }; fn print_table_cells(store: &Store) { let table = store .table_at(&Path::from("G.V")) .expect("G.V table exists"); for row_idx in 0..table.row_count() { let row_id = table.row_id_at(row_idx).expect("row id exists"); let graph = table.cell_at(row_idx, 0).expect("graph column exists"); println!("row #{row_id}: graph={graph}"); } } ``` The table API exposes physical row indexes, generated row ids, and per-column cell access. It does not expose a general query language. ### Explicit Law Checking ```rust use geomerge::store::{Store, StoreIntError}; fn validate_store(store: &Store) -> Result<(), Box> { store.check_laws() } ``` `Store::apply_batch` and `Store::transact` already call `check_laws` before committing. Calling it directly is useful after manual store construction or debugging. ### Persistence Round Trip ```rust use geomerge::{ persist::pst::{decode_store, encode_store}, store::Store, }; fn round_trip(store: &Store) -> Result { let bytes = encode_store(store)?; decode_store(&bytes) } ``` Decoded stores reconstruct table state and recompile laws from the persisted law entries. --- ## Mutation API The mutation operation type currently has one variant: ```rust Op::Add { table: ir::Path, values: Vec, } ``` The table path is used instead of an internal table id, which keeps operations stable across stores. The most direct insertion flow is: ```rust let table = store.table_at(&geomerge::ir::Path::from("G.V")).unwrap(); let op = table.add(vec![geomerge::table::CellValue::Id(0)]); let row_ids = store.apply_batch(vec![op])?; ``` `Store::apply_batch` validates the entire batch first. It then applies operations to a cloned preview store, checks all laws, and commits the preview only if validation succeeds. On validation failure, the original store is unchanged. The returned `Vec` is in the same order as the input operations. --- ## Table API Tables are columnar stores. A table owns: - its `Path` - its `Schema` - generated row ids - one column vector per schema column Rows are identified by generated `RowId` values. Row ids are managed by the store layer and are exposed as entity references. Main table methods: - `Table::new`: table construction from path and schema - `Table::schema`: schema access - `Table::path`: path access - `Table::row_count`: row count - `Table::row_id_at`: row id at a physical row index - `Table::cell_at`: cell lookup by physical row index and column index - `Table::dump`: debug rendering - `Table::validate`: schema validation for a candidate row - `Table::primary_key_values`: primary-key extraction from candidate values - `Table::validate_new_row`: schema and primary-key validation - `Table::append_row_validated`: direct validated append - `Table::add`: operation construction for `Store::apply_batch` Library code should prefer `Store::apply_batch` or `Store::transact` for durable changes, because those paths also check laws. --- ## Cell Values and Validation The supported cell values are: ```rust CellValue::Id(RowId) CellValue::Int(i64) CellValue::Str(String) ``` Validation checks: - column count - entity column values as `CellValue::Id` - integer columns as `CellValue::Int` - string columns as `CellValue::Str` - tuple column rejection - primary-key duplication - unknown table paths in batch application Primary-key behavior: - `None`: no primary-key uniqueness check - `Some(columns)`: uniqueness over listed columns - `Some([])`: singleton table, meaning at most one row --- ## Transactions Geomerge has two transaction styles. `Store::transact` takes a closure over a preview store: ```rust store.transact(|preview| { let table = preview.table_at_mut(&Path::from("Graphs")).unwrap(); let row_id = table.append_row_validated(vec![])?; Ok(row_id) })?; ``` The closure can perform multiple direct table appends. After the closure succeeds, Geomerge checks laws on the preview and commits only if they hold. `Store::into_transaction` consumes a store and returns an `OwnedTransaction`. Its `commit` method returns either: ```text Ok((output, committed_store)) Err((error, original_store)) ``` That shape is useful when the caller needs ownership of the original store after a failed transaction. One implementation detail to watch: `Store::apply_batch` checks primary-key conflicts within the whole batch before applying. The closure-based transaction path relies on table-level validated appends and has a TODO around primary-key checking at the store transaction layer. --- ## Law Solver The solver compiles IR laws into a restricted execution form. Supported compiled proposition forms: - atom - equality - conjunction Unsupported or incomplete forms include: - disjunction - projected terms - constructed terms - tuple values Runtime law checking has two phases: 1. Antecedent binding: scan relevant tables and produce variable bindings. 2. Consequent validation: check that required atoms and equalities hold for each binding. Law violations identify: - missing consequent atoms - unsatisfied consequent equalities - the law involved - the binding that triggered the failure The solver is direct and full-store oriented. It is not incremental. --- ## Persistence API The persistence API is under `persist::pst`. Main functions: - `encode_store(&Store) -> Result, PersisError>` - `decode_store(&[u8]) -> Result` The persisted data includes: - format magic and version - next table oid - table metadata - table schemas - row ids - column payloads - source law entries On decode, Geomerge reconstructs tables and recompiles laws from the persisted law entries. --- ## REPL API The binary starts the REPL. The REPL commands are: ```text /help /exit /quit load-schema ; load-store ; list-schema; add values (...), (...); dump-table
; dump-store; persist ; begin transact; name = add
values (...); ... commit; ``` REPL transaction bindings let one inserted row refer to another row inserted earlier in the same transaction: ```text begin transact; g = add Graphs values (); v = add G.V values (g); commit; ``` For non-transactional `add`, entity values use `#id` syntax. Inside transaction blocks, entity columns can use either `#id` or a previous binding name. --- ## Key Features Implemented features: - compiled theory loading from JSON-compatible IR - schema-defined table creation - columnar row storage - generated row ids - entity references through row ids - primitive int and string cells - table path lookup - validated row insertion - batch add operations - atomic batch commit after validation - closure-based preview transactions - owned transactions with original-store recovery on failure - primary-key validation for batch adds and validated table appends - law compilation for atoms, equality, and conjunction - full-store law checking - binary persistence and reload - REPL schema loading, insertion, dumping, transactions, persistence, and reload Not implemented or limited: - deletes - updates - general query execution - incremental validation - native version-control semantics - merge and conflict-resolution semantics - concurrency control - tuple cell insertion - disjunction in compiled laws - projected or constructed terms in compiled laws - ad hoc indexes for efficient law checking --- ## Practical Mental Model Geomerge currently behaves like: ```text compiled Geolog theory -> typed table registry -> validated row additions -> full-store law checks -> persisted store snapshots ``` The most relevant integration boundary for an incremental engine such as DBSP is law checking. Today, Geomerge validates laws by scanning the current store. A future incremental layer could maintain violation queries and let the store ask whether any violations exist after a proposed transaction. --- ## Changelog * **May 13, 2026** -- First version of this document.