Add notes about Subduction project
hassan/004-subduction-notes.md
## Subduction Project Notes

### Overview

**Subduction** is a peer-to-peer synchronization protocol and implementation for CRDTs (Conflict-free Replicated Data Types) that enables efficient
synchronization of encrypted, partitioned data between peers without requiring a central server.

- **Repository:** https://github.com/inkandswitch/subduction
- **Developed by:** Ink & Switch
- **License:** MIT OR Apache-2.0
- **Language:** Rust (with WebAssembly bindings)
- **Status:** Early release preview (unstable API)
- **Version:** 0.5.0

---
### Core Purpose & Features

#### Key Capabilities

1. **Efficient Sync Protocol** — Uses Sedimentree for history sharding, diffing, and incremental synchronization
2. **Encryption-Friendly** — Works with encrypted data partitions without requiring decryption during sync
3. **Decentralized** — True peer-to-peer synchronization via pluggable transports
4. **Multi-Platform** — Runs natively and in WebAssembly (browser & Node.js), and ships a CLI tool
5. **Automerge Integration** — Not tied to a particular CRDT, but originally designed for Automerge documents

#### Design Principles

- **no_std Compatible** — Core logic works without the standard library
- **Transport Agnostic** — Protocol-independent message format
- **Policy Separation** — Authentication separate from authorization
- **Subscription-Based** — Efficient update forwarding
- **Content-Addressed** — All data identified by BLAKE3 hash
- **Idempotent** — Receiving the same data twice is safe
- **Compile-Time Validation** — Types make invalid states unrepresentable
- **Newtypes for Domain Concepts** — Prevent mixing different semantic types
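
The content-addressed and idempotent principles reinforce each other: when the key is derived from the bytes, storing the same blob twice is a no-op. The following is a minimal sketch of that idea, not the project's storage code. `BlobStore` and `Digest` are hypothetical names, and the standard library's `DefaultHasher` stands in for BLAKE3 so the example has no dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Newtype for a content digest, so it cannot be mixed up with other integers.
/// Assumption: a 64-bit std hash stands in for a 32-byte BLAKE3 digest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Digest(u64);

impl Digest {
    /// Derives the digest from the content alone.
    pub fn of(bytes: &[u8]) -> Self {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        Digest(hasher.finish())
    }
}

/// Hypothetical in-memory store keyed by content digest.
#[derive(Default)]
pub struct BlobStore {
    blobs: HashMap<Digest, Vec<u8>>,
}

impl BlobStore {
    /// Stores a blob and reports whether it was new.
    /// Storing identical bytes again changes nothing.
    pub fn put(&mut self, bytes: &[u8]) -> (Digest, bool) {
        let digest = Digest::of(bytes);
        let is_new = !self.blobs.contains_key(&digest);
        if is_new {
            self.blobs.insert(digest, bytes.to_vec());
        }
        (digest, is_new)
    }

    /// Number of distinct blobs held.
    pub fn len(&self) -> usize {
        self.blobs.len()
    }
}
```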

---

### Architecture & Components

The project is organized as a Rust workspace with 16 member crates:

#### Core Layer (3 crates)

| Crate               | Description                                                                                                      |
|---------------------|------------------------------------------------------------------------------------------------------------------|
| `sedimentree_core`  | Core data partitioning scheme using depth-based hierarchical layers for efficient metadata-based synchronization |
| `subduction_crypto` | Cryptographic types: Ed25519-signed payloads with type-state pattern for verification status                     |
| `subduction_core`   | Main synchronization protocol implementation with pluggable storage, connections, and policies                   |

#### Storage Layer (1 crate)

| Crate                    | Description                                              |
|--------------------------|----------------------------------------------------------|
| `sedimentree_fs_storage` | Filesystem-based persistent storage for Sedimentree data |

#### Transport Layer (3 crates)

| Crate                      | Description                                                         |
|----------------------------|---------------------------------------------------------------------|
| `subduction_http_longpoll` | HTTP long-poll transport for restrictive network environments       |
| `subduction_iroh`          | Iroh (QUIC) transport for direct P2P connections with NAT traversal |
| `subduction_websocket`     | WebSocket transport for browser and Node.js environments            |

#### Integration Layer (4 crates)

| Crate                       | Description                                                        |
|-----------------------------|--------------------------------------------------------------------|
| `automerge_sedimentree`     | Adapter for synchronizing Automerge CRDT documents via Sedimentree |
| `subduction_keyhive`        | Integration with Keyhive access control system                     |
| `subduction_keyhive_policy` | Keyhive-based authorization policy for connections                 |
| `bijou64`                   | Utility crate (64-bit optimizations)                               |

#### WebAssembly Bindings (4 crates)

| Crate                        | Description                                       |
|------------------------------|---------------------------------------------------|
| `sedimentree_wasm`           | Wasm bindings for Sedimentree                     |
| `subduction_wasm`            | Wasm bindings for browser and Node.js             |
| `automerge_sedimentree_wasm` | Wasm wrapper for Automerge + Sedimentree          |
| `automerge_subduction_wasm`  | Full sync stack (Automerge + Subduction) for Wasm |

#### Tools (1 crate)

| Crate            | Description                                                  |
|------------------|--------------------------------------------------------------|
| `subduction_cli` | Command-line tool for running sync servers and managing data |

---
### Technical Architecture

#### Key Abstractions

##### FutureForm Trait

Enables portable async code across native Rust (Tokio) and WebAssembly (single-threaded):

- Provides two implementations: `Sendable` (Send + Sync) for multi-threaded and `Local` for single-threaded environments
- Uses macro-based code generation to support both forms

##### Generic Parameters on Subduction

```rust
// Signature only; fields are elided.
pub struct Subduction<'a, F: FutureForm, S: Storage<F>, C: Connection<F>,
                      P: ConnectionPolicy<F> + StoragePolicy<F>,
                      M: DepthMetric, const N: usize>
```

Because storage, connection, and policy are generic parameters, the configuration is fixed at compile time and calls are statically dispatched rather than routed through trait objects.

##### Policy Traits (Capability-Based Access Control)

| Trait              | Purpose                                                                       |
|--------------------|-------------------------------------------------------------------------------|
| `ConnectionPolicy` | Authorization at the connection level (is this peer allowed?)                 |
| `StoragePolicy`    | Authorization at the document level (can this peer read/write this document?) |
| `OpenPolicy`       | Permissive default (allows everything)                                        |
| `KeyhivePolicy`    | Real authorization with Keyhive integration                                   |
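
The split between connection-level and document-level checks can be sketched as two small traits. This is a simplified, synchronous illustration only: the real traits are generic over `FutureForm`, and the method names, `PeerId`, `DocumentId`, `AllowList`, and `can_write` below are assumptions made for the example rather than the crate's API.

```rust
/// Hypothetical identifier newtypes; the real crates define their own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PeerId(pub [u8; 32]);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DocumentId(pub [u8; 32]);

/// Connection-level question: may this peer talk to us at all?
/// (The method name is an assumption, not the crate's API.)
pub trait ConnectionPolicy {
    fn allow_connection(&self, peer: PeerId) -> bool;
}

/// Document-level question: may this peer read or write this document?
pub trait StoragePolicy {
    fn allow_read(&self, peer: PeerId, doc: DocumentId) -> bool;
    fn allow_write(&self, peer: PeerId, doc: DocumentId) -> bool;
}

/// Permissive default: allows everything.
pub struct OpenPolicy;

impl ConnectionPolicy for OpenPolicy {
    fn allow_connection(&self, _peer: PeerId) -> bool {
        true
    }
}

impl StoragePolicy for OpenPolicy {
    fn allow_read(&self, _peer: PeerId, _doc: DocumentId) -> bool {
        true
    }
    fn allow_write(&self, _peer: PeerId, _doc: DocumentId) -> bool {
        true
    }
}

/// Hypothetical restrictive policy: listed peers may connect and read,
/// but only listed writers may write.
pub struct AllowList {
    pub peers: Vec<PeerId>,
    pub writers: Vec<PeerId>,
}

impl ConnectionPolicy for AllowList {
    fn allow_connection(&self, peer: PeerId) -> bool {
        self.peers.contains(&peer)
    }
}

impl StoragePolicy for AllowList {
    fn allow_read(&self, peer: PeerId, _doc: DocumentId) -> bool {
        self.peers.contains(&peer)
    }
    fn allow_write(&self, peer: PeerId, _doc: DocumentId) -> bool {
        self.writers.contains(&peer)
    }
}

/// A write is accepted only if both layers agree.
pub fn can_write<P>(policy: &P, peer: PeerId, doc: DocumentId) -> bool
where
    P: ConnectionPolicy + StoragePolicy,
{
    policy.allow_connection(peer) && policy.allow_write(peer, doc)
}
```

Keeping the two traits separate means a node can swap the document-level rules (for example, an open policy in tests and a Keyhive-backed one in deployment) without touching the connection gate.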
##### Cryptographic Types

| Type                   | Description                                               |
|------------------------|-----------------------------------------------------------|
| `Signed<T>`            | Payload with Ed25519 signature (unverified)               |
| `VerifiedSignature<T>` | Witness that signature has been verified                  |
| `VerifiedMeta<T>`      | Witness that signature is valid AND blob matches metadata |

The type-state pattern prevents "verify and forget" bugs at compile time.
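
To illustrate why the type-state pattern rules out such bugs, here is a minimal sketch in which the payload is unreachable until verification has produced a witness value. It reuses the names `Signed` and `VerifiedSignature` from the table, but the fields, methods, and the toy keyed checksum (`toy_tag`, standing in for Ed25519 and not secure) are all assumptions made for the example.

```rust
/// Unverified payload plus a tag. The payload field is private and has no
/// accessor, so it cannot be read before verification.
pub struct Signed<T> {
    payload: T,
    tag: u64,
}

/// Witness that verification succeeded. Its only constructor is
/// `Signed::verify`, so holding one proves the check ran.
pub struct VerifiedSignature<T> {
    payload: T,
}

/// Toy keyed checksum standing in for Ed25519. NOT secure; illustration only.
fn toy_tag(key: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(key, |acc, byte| acc.rotate_left(5) ^ u64::from(*byte))
}

impl<T: AsRef<[u8]>> Signed<T> {
    /// Attaches a tag computed with `key` to the payload.
    pub fn seal(key: u64, payload: T) -> Self {
        let tag = toy_tag(key, payload.as_ref());
        Signed { payload, tag }
    }

    /// Consumes the unverified value; the payload is only handed back
    /// wrapped in the witness type, and only if the tag matches.
    pub fn verify(self, key: u64) -> Option<VerifiedSignature<T>> {
        if toy_tag(key, self.payload.as_ref()) == self.tag {
            Some(VerifiedSignature { payload: self.payload })
        } else {
            None
        }
    }
}

impl<T> VerifiedSignature<T> {
    /// Access is possible only through the verified witness.
    pub fn payload(&self) -> &T {
        &self.payload
    }
}
```

A function that takes `VerifiedSignature<T>` as its argument cannot be called with data that skipped the check, which is the compile-time guarantee the table describes.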
##### NonceCache

- Replay protection for handshake protocol
- Time-based bucket system (4 buckets × 3 min = 12 min window)
- Lazy garbage collection, no background task needed
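
The bucket scheme above can be sketched as follows. This is an illustration under stated assumptions, not the crate's `NonceCache`: nonces are modelled as `u128`, time is passed in as seconds so the example is deterministic, and the method name `check_and_insert` is hypothetical. Expired buckets are cleared lazily on each call, so no background task is needed.

```rust
use std::collections::HashSet;

const BUCKETS: usize = 4;
const BUCKET_SECS: u64 = 3 * 60; // 4 buckets x 3 min = 12 min window

/// Sketch of a time-bucketed replay cache (names are assumptions).
pub struct NonceCache {
    /// Which 3-minute epoch each slot currently holds.
    epochs: [u64; BUCKETS],
    /// Nonces seen during that epoch.
    buckets: [HashSet<u128>; BUCKETS],
}

impl NonceCache {
    pub fn new() -> Self {
        NonceCache {
            epochs: [0; BUCKETS],
            buckets: Default::default(),
        }
    }

    /// Returns true if the nonce is fresh (and records it),
    /// false if it was already seen inside the window.
    pub fn check_and_insert(&mut self, nonce: u128, now_secs: u64) -> bool {
        let epoch = now_secs / BUCKET_SECS;

        // Lazy garbage collection: drop buckets that fell out of the window.
        for i in 0..BUCKETS {
            if epoch.saturating_sub(self.epochs[i]) >= BUCKETS as u64 {
                self.buckets[i].clear();
            }
        }

        // A hit in any live bucket means this is a replay.
        if self.buckets.iter().any(|bucket| bucket.contains(&nonce)) {
            return false;
        }

        // Reuse the slot for the current epoch, resetting it if needed.
        let slot = (epoch % BUCKETS as u64) as usize;
        if self.epochs[slot] != epoch {
            self.buckets[slot].clear();
            self.epochs[slot] = epoch;
        }
        self.buckets[slot].insert(nonce);
        true
    }
}
```

In this sketch a nonce is remembered for at most 12 minutes; once its bucket is recycled, rejecting a stale handshake has to rely on a separate timestamp check.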
#### Sedimentree Data Structure

Organizes CRDT data into depth-stratified layers based on content hash:

| Depth    | Contains                                     |
|----------|----------------------------------------------|
| Depth 0  | All commits                                  |
| Depth 1  | Commits whose hash has 1+ leading zero bytes |
| Depth 2  | Commits whose hash has 2+ leading zero bytes |
| Depth 3+ | Further filtering                            |
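
Read literally, the table defines a commit's depth as the number of leading zero bytes in its hash. A minimal sketch of that rule follows, assuming a 32-byte digest; the function names are hypothetical, and the real crate abstracts the rule behind the `DepthMetric` parameter, so this is only one possible metric. For a uniformly distributed hash, each extra leading zero byte keeps about 1 in 256 commits, which is why deeper layers are much smaller.

```rust
/// Depth of a commit: the number of leading zero bytes in its 32-byte hash.
/// Assumption: this mirrors the table above, not the crate's actual metric.
pub fn depth(hash: &[u8; 32]) -> usize {
    hash.iter().take_while(|&&byte| byte == 0).count()
}

/// A commit of depth `d` appears in every layer from 0 through `d`.
pub fn in_layer(hash: &[u8; 32], layer: usize) -> bool {
    depth(hash) >= layer
}
```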
**Benefits:**

- Efficient sync through fingerprint-based reconciliation
- Compare summaries at higher depths first
- Drill down only where differences exist
- ~75% bandwidth reduction via 8-byte SipHash fingerprints instead of 32-byte digests
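
The fingerprint idea can be sketched as follows: each side advertises short fingerprints instead of full digests, and only items whose fingerprint the other side lacks need to be sent. This is an illustration, not the project's reconciliation algorithm. The function names are hypothetical, and the standard library's `DefaultHasher` stands in for the keyed SipHash the notes mention. The 75% figure follows from the sizes alone: 1 − 8/32 = 0.75.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

pub type Digest = [u8; 32];

/// 8-byte fingerprint of a 32-byte digest.
/// Assumption: std's DefaultHasher stands in for the project's SipHash.
pub fn fingerprint(digest: &Digest) -> u64 {
    let mut hasher = DefaultHasher::new();
    digest.hash(&mut hasher);
    hasher.finish()
}

/// Local digests whose fingerprints the remote did not advertise.
pub fn missing_on_remote(local: &[Digest], remote_fps: &HashSet<u64>) -> Vec<Digest> {
    local
        .iter()
        .filter(|digest| !remote_fps.contains(&fingerprint(digest)))
        .copied()
        .collect()
}

/// Fraction of summary bytes saved by sending fingerprints instead of digests.
pub fn bandwidth_saving(fingerprint_bytes: usize, digest_bytes: usize) -> f64 {
    1.0 - fingerprint_bytes as f64 / digest_bytes as f64
}
```

Shorter fingerprints trade a small collision risk for bandwidth, so a real protocol needs some way to detect or tolerate two digests sharing a fingerprint.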
#### Connection Lifecycle

```
Client ─(TCP/WebSocket)─→ Server
Client ─(Signed Challenge)─→ Server
Server ─(Signed Response)─→ Client
Both verify signatures and check ConnectionPolicy
            ↓
Authenticated connection established
```

---
### Protocol Layers

| Layer   | Component   | Description                                                              |
|---------|-------------|--------------------------------------------------------------------------|
| Layer 1 | Transport   | WebSocket, HTTP Long-Poll, or Iroh/QUIC                                  |
| Layer 2 | Connection  | Handshake with mutual Ed25519 authentication, policy-based authorization |
| Layer 3 | Sync        | Batch sync (pull), Incremental sync (push), Subscriptions                |
| Layer 4 | Application | Automerge, custom CRDTs, or other data structures                        |

---
### Sync Protocols

#### Batch Sync

- **Type:** Pull-based (request/response)
- **Mechanism:** Uses fingerprint-based reconciliation for compact diffs
- **Delivery:** Guaranteed
- **Use Cases:** Initial sync, reconnection, consistency checks

#### Incremental Sync

- **Type:** Push-based (fire-and-forget)
- **Mechanism:** Immediate updates for active editing
- **Delivery:** Best-effort
- **Use Cases:** Real-time collaboration

#### Subscriptions

- Optional per-document subscription
- Updates forwarded only to authorized subscribed peers
- Efficient batch update forwarding
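
The forwarding rule above (only peers that are both subscribed and authorized receive an update) can be sketched with a per-document subscriber set. This is a hypothetical illustration: the `Subscriptions` type, the integer IDs, and the choice to skip the sender are assumptions, not the crate's design.

```rust
use std::collections::{HashMap, HashSet};

pub type PeerId = u32;
pub type DocId = u32;

/// Hypothetical per-document subscription table.
#[derive(Default)]
pub struct Subscriptions {
    by_doc: HashMap<DocId, HashSet<PeerId>>,
}

impl Subscriptions {
    pub fn subscribe(&mut self, peer: PeerId, doc: DocId) {
        self.by_doc.entry(doc).or_default().insert(peer);
    }

    pub fn unsubscribe(&mut self, peer: PeerId, doc: DocId) {
        if let Some(peers) = self.by_doc.get_mut(&doc) {
            peers.remove(&peer);
        }
    }

    /// Peers that should receive an update to `doc`: subscribed, authorized,
    /// and (an assumption of this sketch) not the peer that sent it.
    pub fn recipients(
        &self,
        doc: DocId,
        sender: PeerId,
        is_authorized: impl Fn(PeerId, DocId) -> bool,
    ) -> Vec<PeerId> {
        let mut out: Vec<PeerId> = self
            .by_doc
            .get(&doc)
            .into_iter()
            .flatten()
            .copied()
            .filter(|&peer| peer != sender && is_authorized(peer, doc))
            .collect();
        out.sort_unstable(); // deterministic order for callers and tests
        out
    }
}
```

Checking authorization at forwarding time, rather than only at subscribe time, means a peer whose access is revoked stops receiving updates without having to be unsubscribed first.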
#### Reconnection

- Automatic detection and recovery
- Batch sync used to catch up after missed incremental updates

---
### Security Model

#### Trust Assumptions

**Trusted:**

- Ed25519 signatures are unforgeable
- BLAKE3 is collision-resistant and preimage-resistant
- TLS provides transport encryption when used
- Private keys remain secret
- Local storage is not compromised

**Not Trusted:**

- Network (attacker can observe, delay, drop, inject)
- Peers (may be malicious, compromised, buggy)
- Clocks (tolerate ±10 minutes drift)
- Server operators (may attempt unauthorized access)
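
Because clocks are untrusted, a timestamp supplied by a peer can only be compared against local time within a tolerance. A minimal sketch of such a check with the ±10 minute bound is shown below. It assumes that handshake messages carry a timestamp in seconds and that the bound is inclusive; neither detail is stated in these notes, and the function name is hypothetical.

```rust
/// Maximum tolerated difference between a peer's clock and ours.
const MAX_DRIFT_SECS: u64 = 10 * 60;

/// Accepts a claimed timestamp if it is within 10 minutes of local time,
/// in either direction. Assumption: the bound is inclusive.
pub fn timestamp_acceptable(claimed_secs: u64, now_secs: u64) -> bool {
    claimed_secs.abs_diff(now_secs) <= MAX_DRIFT_SECS
}
```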
#### Security Goals

1. **Authentication** — Know who you're talking to
2. **Integrity** — Detect message tampering
3. **Replay Protection** — Reject replayed handshakes
4. **Authorization** — Enforce access control per document
5. **Confidentiality** — Data encrypted at rest and in transit

---
### CLI Tool (`subduction_cli`)

#### Commands

| Command  | Description                                     |
|----------|-------------------------------------------------|
| `server` | Starts a sync node with configurable transports |
| `purge`  | Deletes all stored data                         |

#### Features

- Multi-transport support (WebSocket, HTTP long-poll, Iroh)
- Flexible key management (command-line, file-based, ephemeral)
- Peer connection configuration
- Metrics export (Prometheus)
- Reverse proxy support
- NixOS and Home Manager integration

---
### Technologies & Dependencies

#### Core Rust

| Dependency           | Purpose                          |
|----------------------|----------------------------------|
| `tokio`              | Async runtime for native targets |
| `futures`            | Async abstractions               |
| `ed25519-dalek`      | Ed25519 signatures               |
| `blake3`             | Content hashing                  |
| `serde` + `ciborium` | CBOR serialization               |

#### Networking

| Dependency          | Purpose                         |
|---------------------|---------------------------------|
| `async-tungstenite` | WebSocket implementation        |
| `axum`              | HTTP server framework           |
| `iroh`              | QUIC protocol and NAT traversal |
| `hyper`             | HTTP client/server              |

#### WebAssembly

| Dependency             | Purpose                |
|------------------------|------------------------|
| `wasm-bindgen`         | JS↔Wasm FFI            |
| `wasm-bindgen-futures` | Async support for Wasm |
| `wasm-tracing`         | Logging for browser    |

#### Testing & Quality

| Tool         | Purpose                |
|--------------|------------------------|
| `bolero`     | Property-based fuzzing |
| `criterion`  | Benchmarking           |
| `playwright` | E2E testing for Wasm   |

---
### Project Structure

```
subduction/
├── sedimentree_core/                           # Core partitioning & metadata
├── sedimentree_fs_storage/                     # Filesystem storage
├── sedimentree_wasm/                           # Wasm bindings
├── subduction_crypto/                          # Signed types & crypto
├── subduction_core/                            # Sync protocol
├── subduction_{http_longpoll,iroh,websocket}/  # Transports
├── subduction_wasm/                            # Full Wasm bindings
├── subduction_keyhive/                         # Keyhive types
├── subduction_keyhive_policy/                  # Keyhive authorization
├── automerge_sedimentree/                      # Automerge integration
├── automerge_*_wasm/                           # Automerge Wasm bindings
├── subduction_cli/                             # CLI server tool
├── design/                                     # Protocol documentation
│   ├── security/                               # Threat model
│   └── sync/                                   # Sync protocols
└── HACKING.md                                  # Contributor guide
```

---
### Applications

Subduction's architecture makes it suitable for a wide range of decentralized, collaborative applications:

#### Real-Time Collaborative Editing

- **Document Editors** — Google Docs-style collaborative editing without central servers
- **Code Editors** — Pair programming and collaborative coding environments
- **Design Tools** — Multi-user design applications (like Figma) with local-first architecture
- **Whiteboards** — Real-time collaborative diagramming and brainstorming tools

#### Local-First Applications

- **Note-Taking Apps** — Obsidian-like applications with seamless multi-device sync
- **Task Management** — Todo lists and project management tools that work offline
- **Personal Knowledge Bases** — Roam Research or Notion alternatives with true data ownership
- **Journaling Apps** — Private journals with end-to-end encryption and cross-device sync

#### Decentralized Social & Communication

- **Chat Applications** — End-to-end encrypted messaging without central servers
- **Social Networks** — Federated or fully decentralized social platforms
- **Forums & Discussion Boards** — Community platforms with no single point of failure
- **Email Alternatives** — Decentralized messaging systems

#### Data Synchronization Infrastructure

- **Database Replication** — Syncing distributed databases across nodes
- **Configuration Management** — Distributing configuration across a fleet of servers
- **CDN-like Content Distribution** — Efficient content propagation across edge nodes
- **IoT Device Sync** — Synchronizing state across IoT devices and gateways

#### Privacy-Focused Applications

- **Healthcare Records** — Patient-controlled medical records with selective sharing
- **Financial Data** — Personal finance apps with encrypted cloud backup
- **Legal Documents** — Secure document sharing for legal proceedings
- **Whistleblower Platforms** — Secure, anonymous document sharing

#### Gaming & Virtual Worlds

- **Multiplayer Game State** — Synchronizing game world state without dedicated servers
- **Virtual Worlds** — Decentralized metaverse applications
- **Persistent Worlds** — MMO-style games with community-run infrastructure

#### Developer Tools

- **Distributed Version Control** — Git-like systems with better merge semantics
- **API Mocking & Testing** — Sharing API state across development teams
- **Feature Flags** — Distributed feature flag management
- **Configuration Sync** — Developer environment configuration sharing

#### Enterprise Applications

- **Offline-First Field Apps** — Applications for workers in low-connectivity environments
- **Multi-Region Deployment** — Consistent state across globally distributed systems
- **Compliance & Audit** — Tamper-evident logs with cryptographic verification
- **Disaster Recovery** — Resilient data storage with no single point of failure

#### Research & Academic

- **Collaborative Research** — Sharing datasets and findings across institutions
- **Lab Notebooks** — Electronic lab notebooks with provenance tracking
- **Reproducible Science** — Versioned, content-addressed research artifacts

---
### Why Subduction Over Alternatives?

| Feature                 | Subduction         | Traditional Sync | Blockchain            |
|-------------------------|--------------------|------------------|-----------------------|
| Decentralized           | Yes                | No               | Yes                   |
| Efficient Bandwidth     | Yes (fingerprints) | Varies           | No (full replication) |
| Works with Encryption   | Yes                | Varies           | Limited               |
| Real-time Updates       | Yes                | Yes              | No (consensus delay)  |
| Offline Support         | Yes                | Limited          | No                    |
| No Token/Cryptocurrency | Yes                | Yes              | Usually No            |
| Flexible Authorization  | Yes (Keyhive)      | Centralized      | Smart contracts       |

---
### Development & Deployment

#### Build Requirements

- Rust 1.90+
- For Wasm: wasm-pack
- For browser testing: Node.js 22+, pnpm

#### Deployment Options

- Standalone binary
- Nix flake (with NixOS/Home Manager modules)
- Docker containers (via CLI)
- System services (systemd, launchd)

#### Reverse Proxy Support

- Caddy integration
- Custom TLS/HTTPS setup

---
### Testing Strategy

| Type                 | Tool/Approach                         |
|----------------------|---------------------------------------|
| Unit Tests           | Standard `#[test]` inline in modules  |
| Property-Based Tests | `bolero` for fuzz testing             |
| E2E Tests            | Playwright tests for Wasm bindings    |
| Integration Tests    | Round-trip transport connection tests |

---
### Current Status & Roadmap

- **Version:** 0.5.0 (core crates)
- **Maturity:** Early release preview with unstable API
- **Production Use:** NOT recommended at this time
- **Active Development:** Regular updates and bug fixes

#### Known Limitations

- API is unstable and may change
- Documentation is still evolving
- Some edge cases may not be fully handled
- Performance optimization ongoing

---
### Getting Started

#### Basic Usage (Rust)

```rust
// Example usage with Automerge.
// Illustrative only: `storage`, `transport`, `policy`, and `document_id` are
// placeholders, and the exact constructor and method names may differ in the
// unstable 0.5.0 API.
use automerge_sedimentree::AutomergeSedimentree;
use subduction_core::Subduction;

// Initialize with your chosen transport and storage
let subduction = Subduction::new(storage, transport, policy);

// Sync documents
subduction.sync(document_id).await?;
```

#### CLI Server

```bash
# Start a WebSocket sync server
subduction server --websocket-port 8080

# Start with multiple transports
subduction server --websocket-port 8080 --http-port 8081

# Connect to peers
subduction server --peer ws://other-node:8080
```

---
### References

- [Ink & Switch Research](https://www.inkandswitch.com/)
- [Local-First Software](https://www.inkandswitch.com/local-first/)
- [Automerge](https://automerge.org/)
- [CRDTs](https://crdt.tech/)
- [Iroh](https://iroh.computer/)

## Changelog

* **Mar 4, 2026** -- The first version was created.