DATA MODEL
PROVIDERS
PolyDB ships 15 providers that are callable today, covering relational, document, key-value, vector, graph, and other data models; a 16th, Transaction, is coming soon. Each provider exposes a consistent MCP tool interface and runs identically across all three storage backends. Providers share one connection, so you can mix and match them freely. SQL operations get full PostgreSQL ACID; cross-model atomic transactions are not yet available.
1. SQL — Relational
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The SQL provider exposes a full relational interface: schema creation, DML, and parameterised queries. It sits directly on top of the backend's native SQL engine, so you get dialect-correct DDL and full JOIN support. Use it whenever your data has a clear schema and you need the expressiveness of SQL: aggregations, window functions, complex predicates.
Key operations
- query_sql
query_sql is read-only — it executes parameterised SELECT, WITH (CTEs), EXPLAIN, and SHOW statements against the tenant schema. Write operations (INSERT, UPDATE, DELETE) are rejected; use the dedicated provider tools instead (e.g. store_document, set_keyvalue, store_vector). DDL (CREATE TABLE, etc.) is issued via the backend's query-builder AST, not as raw SQL strings.
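The read-only contract described above can be sketched with Python's built-in sqlite3 module. This is an illustration of the idea only, not PolyDB's implementation: the run_read_only helper, the orders table, and the naive first-keyword check are all assumptions made for the example, and SQLite itself does not support SHOW.

```python
import sqlite3

# Statement keywords the text lists as accepted by query_sql.
READ_ONLY_KEYWORDS = ("SELECT", "WITH", "EXPLAIN", "SHOW")


def run_read_only(conn, sql, params=()):
    """Hypothetical guard: execute sql only if its first keyword is read-only."""
    stripped = sql.lstrip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    if first_word not in READ_ONLY_KEYWORDS:
        raise PermissionError(f"{first_word or 'empty'} statements are rejected")
    # Parameters travel separately from the SQL text and are never interpolated.
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
# Setup is done directly on the connection, outside the read-only helper.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders (region, total) VALUES (?, ?)",
    [("eu", 10.0), ("eu", 5.0), ("us", 7.5)],
)

rows = run_read_only(
    conn,
    "SELECT region, SUM(total) FROM orders WHERE total > ? GROUP BY region ORDER BY region",
    (6.0,),
)
print(rows)  # [('eu', 10.0), ('us', 7.5)]
```

A keyword check like this is deliberately simplistic; a production guard would need to parse the statement, since a WITH clause can wrap a write in some dialects.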
When to use vs alternatives
Prefer SQL when your data is structured and schema-stable. Reach for Document when fields vary per record or evolve frequently. Use Analytics instead of raw SQL GROUP BY when you need multi-dimensional OLAP slicing.
2. Document — NoSQL JSON
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Document provider stores arbitrary JSON documents in named collections, with MongoDB-compatible filter operators ($eq, $gt, $lt, $in, $regex). Documents are indexed by a generated _id and stored in the backend's native JSONB column, so filter queries are backend-accelerated. No schema declaration is needed: just insert and query.
Key operations
- store_document
- search_documents
- get_document
- delete_document
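To make the filter operators concrete, here is a minimal in-memory matcher for the five operators named above. It is a sketch of the usual MongoDB-style semantics, not the provider's code: the matches function and the sample collection are hypothetical, and the real provider evaluates filters in the backend rather than in Python.

```python
import re

# One predicate per operator named in the text; semantics are assumed.
OPERATORS = {
    "$eq": lambda value, arg: value == arg,
    "$gt": lambda value, arg: value is not None and value > arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$in": lambda value, arg: value in arg,
    "$regex": lambda value, arg: isinstance(value, str) and re.search(arg, value) is not None,
}


def matches(document, query):
    """Return True when every field condition in query holds for document."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:  # a bare value is shorthand for $eq
            return False
    return True


collection = [
    {"_id": "1", "name": "Ada", "age": 36, "plan": "pro"},
    {"_id": "2", "name": "Bob", "age": 24, "plan": "free"},
    {"_id": "3", "name": "Alan", "age": 41},  # no "plan" field: shapes may vary
]

query = {"age": {"$gt": 30}, "name": {"$regex": "^A"}}
hits = [doc["_id"] for doc in collection if matches(doc, query)]
print(hits)  # ['1', '3']
```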
When to use vs alternatives
Use Document for semi-structured or heterogeneous data — user profiles, event payloads, product catalogues with varying attributes. If the shape is uniform and query patterns are clear, SQL will give better query performance. For full-text search over document fields, combine with Full-Text.
3. Key-Value — Redis-style cache
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Key-Value provider offers a Redis-style set/get/delete/exists interface with optional namespacing and per-key TTL expiry. Keys are arbitrary strings; values are JSON-serialisable objects. Expired keys are lazily purged on access. Use it for caching computed results, storing session tokens, feature flags, or any data with a natural lifetime.
Key operations
- set_keyvalue
- get_keyvalue
- delete_keyvalue
- list_keyvalues
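Lazy expiry means a key is only checked against its TTL, and removed, when something reads it. The toy store below illustrates that behaviour under stated assumptions: the TTLStore class, its method names, and the "default" namespace are invented for the example and say nothing about PolyDB's actual storage layout.

```python
import time


class TTLStore:
    """Toy key-value store with namespaces and lazy expiry (purge happens on read)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # (namespace, key) -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None, namespace="default"):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[(namespace, key)] = (value, expires_at)

    def get(self, key, namespace="default"):
        entry = self._data.get((namespace, key))
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[(namespace, key)]  # lazily purge the expired key
            return None
        return value


# A fake clock keeps the example deterministic.
now = [0.0]
store = TTLStore(clock=lambda: now[0])
store.set("session:42", {"user": "ada"}, ttl_seconds=30)
before = store.get("session:42")
now[0] = 31.0  # advance past the TTL
after = store.get("session:42")
print(before, after)  # {'user': 'ada'} None
```

The trade-off of lazy purging is that keys nobody reads again stay on disk until some other cleanup removes them.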
When to use vs alternatives
Choose Key-Value when lookup is always by exact key and TTL matters. For richer querying (find by value, range scans), use Document. For time-series metrics use Time Series instead.
4. Vector — AI embeddings & similarity search
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Vector provider stores high-dimensional embedding vectors in a FAISS index backed by the configured storage engine. It supports cosine and L2 similarity search, metadata filtering, and batch upsert. Vectors are identified by a vector_id string and carry a JSON metadata payload. Designed for RAG pipelines, semantic search, recommendation engines, and any workflow that involves embedding models.
Key operations
- store_vector
- search_vectors
- delete_vector
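The two similarity measures and the metadata filter can be illustrated with a brute-force search in plain Python. This deliberately swaps the FAISS index for a linear scan so the example has no dependencies; the search function, the where argument, and the sample index are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(index, query, k=2, where=None):
    """Brute-force top-k by cosine similarity with an exact-match metadata filter."""
    where = where or {}
    candidates = [
        (vector_id, cosine_similarity(vector, query))
        for vector_id, (vector, metadata) in index.items()
        if all(metadata.get(key) == value for key, value in where.items())
    ]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)[:k]


# vector_id -> (embedding, metadata); two dimensions keep the numbers readable.
index = {
    "doc-a": ([1.0, 0.0], {"lang": "en"}),
    "doc-b": ([0.0, 1.0], {"lang": "en"}),
    "doc-c": ([1.0, 1.0], {"lang": "de"}),
}

top = search(index, [1.0, 0.1], k=2, where={"lang": "en"})
top_ids = [vector_id for vector_id, _ in top]
print(top_ids)  # ['doc-a', 'doc-b']
```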
When to use vs alternatives
Vector is the right choice whenever you need nearest-neighbour retrieval over embedding space. For structured lookup by ID, use Key-Value. Combine with Memory when building LLM agent memory that requires semantic recall.
5. Graph — Nodes, edges & traversal
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Graph provider exposes a property-graph model powered by NetworkX. Nodes carry typed properties; edges carry directional relationships with their own properties. Traversal operations support BFS/DFS, shortest-path, neighbour enumeration, and subgraph extraction. Graph state is persisted to the backing store between sessions.
Key operations
- add_graph_node
- add_graph_edge
- query_graph
- delete_graph_node
- delete_graph_edge
Traversal and shortest-path queries are expressed through query_graph (supports neighbour enumeration, BFS/DFS, path queries).
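As an illustration of what a shortest-path traversal does, here is a breadth-first search over directed edges using only the standard library. The provider itself is described as using NetworkX; this sketch does not call it, and the shortest_path function and the dependency edges are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over directed edges; returns a node list or None."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)

    previous = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


edges = [("app", "api"), ("api", "db"), ("app", "cache"), ("cache", "db"), ("db", "disk")]
path = shortest_path(edges, "app", "disk")
print(path)  # ['app', 'api', 'db', 'disk']
```

Because edges are directional, the reverse query from "disk" to "app" finds no path.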
When to use vs alternatives
Use Graph for relationship-heavy data — social networks, knowledge graphs, dependency trees, access control hierarchies. For simpler parent–child relationships that only need depth-1 traversal, a Document with embedded references is lighter. For recommendation graphs that rely on vector similarity, combine Graph with Vector.
6. Stream — Event streams
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Stream provider implements an append-only event log with named streams. Producers publish JSON event payloads; consumers read from a stream with a configurable limit and optional offset cursor. Events carry monotonically increasing sequence numbers and wall-clock timestamps. Use it for activity feeds, audit logs, change data capture, and lightweight pub/sub within a single PolyDB instance.
Key operations
- publish_stream
- consume_stream
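The publish/consume cycle with an offset cursor can be sketched as a small in-memory log. The EventLog class, the after_seq parameter name, and the choice to start sequence numbers at 1 are assumptions for the example; the real tools may name and number things differently.

```python
import time


class EventLog:
    """Toy append-only log: one list per named stream, sequence numbers start at 1."""

    def __init__(self):
        self._streams = {}

    def publish(self, stream, payload):
        events = self._streams.setdefault(stream, [])
        event = {"seq": len(events) + 1, "ts": time.time(), "payload": payload}
        events.append(event)
        return event["seq"]

    def consume(self, stream, after_seq=0, limit=10):
        """Return up to limit events whose sequence number is greater than after_seq."""
        events = self._streams.get(stream, [])
        # seq == list index + 1, so slicing from after_seq skips everything already read.
        return events[after_seq:after_seq + limit]


log = EventLog()
for action in ("login", "view", "purchase", "logout"):
    log.publish("activity", {"action": action})

first_page = log.consume("activity", limit=2)
cursor = first_page[-1]["seq"]  # remember where this consumer stopped
second_page = log.consume("activity", after_seq=cursor, limit=2)
actions = [event["payload"]["action"] for event in second_page]
print(actions)  # ['purchase', 'logout']
```

Since events are never removed, a consumer can replay the stream at any time by resetting its cursor to 0.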
When to use vs alternatives
Choose Stream for ordered, time-sequenced events where consumers need replay. For durable task queues with acknowledgement, use Document with a status field. For metrics derived from events, pipe into Time Series.
7. Spatial — Geospatial data
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Spatial provider stores geometric shapes as WKT or GeoJSON with arbitrary attribute payloads. Queries include bounding-box intersection, radius/nearby search (metres), and contains/intersects predicates. Geometry operations are handled by Shapely for correctness; results are returned with the original WKT and all attributes. Suitable for store locators, delivery zones, asset tracking, and environmental datasets.
Key operations
- store_spatial
- search_spatial_nearby
- search_spatial_bbox
- delete_spatial
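For point data, a radius search in metres and a bounding-box filter reduce to a few lines of arithmetic. The sketch below uses the haversine formula from the standard library rather than Shapely, handles points only, and uses hypothetical function names and sample coordinates.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby(features, lat, lon, radius_m):
    """Return (name, distance) pairs within radius_m, closest first."""
    hits = [(f["name"], haversine_m(lat, lon, f["lat"], f["lon"])) for f in features]
    return sorted((h for h in hits if h[1] <= radius_m), key=lambda h: h[1])


def in_bbox(features, min_lon, min_lat, max_lon, max_lat):
    """Return names of points inside the axis-aligned bounding box."""
    return [
        f["name"] for f in features
        if min_lon <= f["lon"] <= max_lon and min_lat <= f["lat"] <= max_lat
    ]


stores = [
    {"name": "trafalgar", "lat": 51.5080, "lon": -0.1281},
    {"name": "covent-garden", "lat": 51.5117, "lon": -0.1240},
    {"name": "greenwich", "lat": 51.4769, "lon": -0.0005},
]

within_1km = [name for name, _ in nearby(stores, 51.5080, -0.1281, 1000)]
print(within_1km)  # ['trafalgar', 'covent-garden']
```

Polygons and intersects/contains predicates need a real geometry library; this only covers the simplest nearby and bounding-box cases.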
When to use vs alternatives
Use Spatial when geometry queries are a first-class concern (radius search, polygon intersection). For simple lat/lng storage without geometric operations, a Document with a coordinates field is sufficient. Combine with Time Series for moving-asset tracking.
8. Time Series — Metrics & temporal data
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Time Series provider stores named numeric metrics with ISO 8601 timestamps and free-form tag dictionaries. Range queries return raw data points or aggregated series (avg, sum, min, max, count) with optional downsampling intervals. Tags enable multi-dimensional filtering; for example, querying CPU usage by host and region simultaneously.
Key operations
- store_timeseries
- query_timeseries
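Downsampling with tag filtering amounts to bucketing points by a fixed interval and reducing each bucket. The following sketch shows that logic in plain Python; the downsample function, its parameters, and the point layout are assumptions, not the provider's request schema.

```python
from collections import defaultdict
from datetime import datetime, timezone

# The five aggregations named in the text.
REDUCERS = {
    "avg": lambda values: sum(values) / len(values),
    "sum": sum,
    "min": min,
    "max": max,
    "count": len,
}


def downsample(points, interval_seconds, aggregate="avg", tags=None):
    """Filter points by exact tag match, then aggregate into fixed-width buckets."""
    tags = tags or {}
    buckets = defaultdict(list)
    for point in points:
        if any(point["tags"].get(key) != value for key, value in tags.items()):
            continue
        epoch = datetime.fromisoformat(point["ts"]).timestamp()
        bucket_start = int(epoch // interval_seconds) * interval_seconds
        buckets[bucket_start].append(point["value"])
    reduce = REDUCERS[aggregate]
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc).isoformat(), reduce(values))
        for start, values in sorted(buckets.items())
    ]


points = [
    {"ts": "2024-01-01T00:00:10+00:00", "value": 10.0, "tags": {"host": "a", "region": "eu"}},
    {"ts": "2024-01-01T00:00:50+00:00", "value": 30.0, "tags": {"host": "a", "region": "eu"}},
    {"ts": "2024-01-01T00:01:20+00:00", "value": 50.0, "tags": {"host": "a", "region": "eu"}},
    {"ts": "2024-01-01T00:00:30+00:00", "value": 99.0, "tags": {"host": "b", "region": "eu"}},
]

# One-minute averages for host "a" in region "eu".
series = downsample(points, 60, "avg", tags={"host": "a", "region": "eu"})
print(series)
```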
When to use vs alternatives
Use Time Series for any numeric metric that needs time-range queries and aggregation — infrastructure monitoring, IoT sensor data, business KPIs. For events with rich payloads (non-numeric), use Stream. For multidimensional BI-style roll-ups, use Analytics.
9. Analytics — OLAP cubes
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Analytics provider builds OLAP-style cubes over fact tables with named dimensions and measures. Cube definitions declare the fact source and how dimensions map to columns; query operations slice and dice using one or more dimensions and aggregate measures with SUM, AVG, COUNT, MIN, or MAX. Results are returned as tabular JSON ready for charting libraries.
Key operations
- create_analytics_cube
- query_analytics
- delete_analytics_cube
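A cube definition is essentially a mapping from friendly dimension and measure names to fact-table columns, and a query is a group-by over the chosen dimensions. The sketch below shows that idea on a list of dictionaries; the cube layout, the query_cube function, and the column names are hypothetical and do not reflect the real create_analytics_cube schema.

```python
from collections import defaultdict

AGGREGATES = {
    "SUM": sum,
    "AVG": lambda values: sum(values) / len(values),
    "COUNT": len,
    "MIN": min,
    "MAX": max,
}


def query_cube(facts, cube, dimensions, measure, aggregate="SUM"):
    """Group fact rows by the requested dimensions and aggregate one measure."""
    dim_columns = [cube["dimensions"][name] for name in dimensions]
    measure_column = cube["measures"][measure]
    groups = defaultdict(list)
    for row in facts:
        key = tuple(row[column] for column in dim_columns)
        groups[key].append(row[measure_column])
    reduce = AGGREGATES[aggregate]
    return [
        {**dict(zip(dimensions, key)), measure: reduce(values)}
        for key, values in sorted(groups.items())
    ]


# Assumed cube definition: dimension/measure name -> fact-table column.
cube = {
    "dimensions": {"region": "region_code", "quarter": "qtr"},
    "measures": {"revenue": "amount"},
}
facts = [
    {"region_code": "eu", "qtr": "Q1", "amount": 100},
    {"region_code": "eu", "qtr": "Q1", "amount": 50},
    {"region_code": "eu", "qtr": "Q2", "amount": 70},
    {"region_code": "us", "qtr": "Q1", "amount": 30},
]

by_region = query_cube(facts, cube, ["region"], "revenue")
by_region_quarter = query_cube(facts, cube, ["region", "quarter"], "revenue")
print(by_region)  # [{'region': 'eu', 'revenue': 220}, {'region': 'us', 'revenue': 30}]
```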
When to use vs alternatives
Use Analytics when you need pre-modelled dimensional aggregation — revenue by region and quarter, churn by cohort. For ad-hoc SQL aggregation, use the SQL provider directly. For numeric metric time-ranges, use Time Series.
10. S3 — Object storage
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The S3 provider exposes a bucket/key object storage API, mirroring the AWS S3 interface. Start by creating a bucket with create_bucket, then write objects with put_s3_object. Objects are stored with arbitrary metadata and MIME type. Operations cover bucket lifecycle (create, delete, list) plus object put, get, delete, and list with prefix filtering.
Key operations
- create_bucket
- delete_bucket
- list_buckets
- put_s3_object
- get_s3_object
- list_s3_objects
- delete_s3_object
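The create-then-put workflow and prefix listing can be modelled with a nested dictionary. This is an in-memory stand-in written for illustration: the ObjectStore class and its argument names are assumptions and should not be read as the signatures of the MCP tools listed above.

```python
class ObjectStore:
    """Toy in-memory bucket/key store; not the PolyDB implementation."""

    def __init__(self):
        self._buckets = {}

    def create_bucket(self, bucket):
        if bucket in self._buckets:
            raise ValueError(f"bucket {bucket!r} already exists")
        self._buckets[bucket] = {}

    def put_object(self, bucket, key, body, content_type="application/octet-stream", metadata=None):
        # Raises KeyError when the bucket was never created.
        self._buckets[bucket][key] = {
            "body": body,
            "content_type": content_type,
            "metadata": dict(metadata or {}),
        }

    def get_object(self, bucket, key):
        return self._buckets[bucket][key]

    def list_objects(self, bucket, prefix=""):
        return sorted(key for key in self._buckets[bucket] if key.startswith(prefix))


s3 = ObjectStore()
s3.create_bucket("reports")
for key in ("2023/q4.pdf", "2024/q1.pdf", "2024/q2.pdf"):
    s3.put_object("reports", key, b"%PDF-1.7", "application/pdf", {"owner": "finance"})

listing = s3.list_objects("reports", prefix="2024/")
print(listing)  # ['2024/q1.pdf', '2024/q2.pdf']
```

Keys are flat strings; the "2024/" folder exists only as a shared prefix, which is why listing is done by prefix rather than by directory.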
When to use vs alternatives
Use S3 for large, opaque objects identified by a key — uploaded files, rendered reports, ML model artefacts. For small binary blobs that need metadata queries, use Blob. For structured data at scale, use Iceberg.
11. Memory — LLM agent memory
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Memory provider implements a structured memory store for LLM agents, automatically classifying each interaction into episodic (specific events), semantic (general facts), or procedural (how-to knowledge) memory types. It tracks session context, supports semantic recall via embedding similarity, and provides recency-weighted retrieval so agents can surface the most relevant context for a given query.
Key operations
- store_memory
- recall_memory
- delete_memory
- store_knowledge
- search_knowledge
- delete_knowledge
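One common way to combine similarity with recency is to scale each similarity score by an exponential decay on the memory's age. The source does not say which weighting PolyDB uses, so the formula, the half-life parameter, and the precomputed similarity scores below are all assumptions chosen to illustrate the general idea.

```python
def recall(memories, similarity, now, half_life_seconds=3600.0, k=2):
    """Rank memories by similarity scaled by an exponential recency decay."""

    def score(memory):
        age = max(0.0, now - memory["stored_at"])
        # The weight halves every half_life_seconds (assumed weighting scheme).
        return similarity[memory["id"]] * 0.5 ** (age / half_life_seconds)

    return sorted(memories, key=score, reverse=True)[:k]


memories = [
    {"id": "m1", "type": "semantic", "text": "User prefers metric units", "stored_at": 0.0},
    {"id": "m2", "type": "episodic", "text": "User asked about shipping to Oslo", "stored_at": 7200.0},
    {"id": "m3", "type": "procedural", "text": "How to reset a password", "stored_at": 3600.0},
]
# Stand-in for embedding similarity between the query and each memory.
similarity = {"m1": 0.9, "m2": 0.6, "m3": 0.3}

ranked = recall(memories, similarity, now=7200.0)
ranked_ids = [memory["id"] for memory in ranked]
print(ranked_ids)  # ['m2', 'm1']
```

Here the older but more similar memory m1 (0.9 decayed to 0.225) ranks below the fresh m2 (0.6), which shows how recency can outweigh raw similarity.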
When to use vs alternatives
Use Memory for any AI agent that needs persistent cross-session recall. For a simple conversation log without classification, use Document. For semantic similarity retrieval without session structure, use Vector directly.
12. Temporal — Versioned data & as-of queries
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Temporal provider stores every write as a new immutable version rather than overwriting in place. Any past state can be retrieved with an as-of timestamp query. The complete version history for any entity is available as an ordered list. This enables full audit trails, configuration rollback, and compliance with data retention requirements, all without any application-level versioning logic.
Key operations
- store_temporal
- query_temporal_at
- query_temporal_history
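An as-of query is a search for the latest version written at or before the requested timestamp. The sketch below keeps an ordered version list per entity and uses binary search to answer it; the VersionedStore class and its method names are hypothetical, and timestamps are assumed to be ISO 8601 strings in one format and time zone so that they sort lexicographically.

```python
import bisect


class VersionedStore:
    """Toy append-only version store: writes never overwrite earlier versions."""

    def __init__(self):
        self._history = {}  # entity_id -> list of (timestamp, value), oldest first

    def store(self, entity_id, value, timestamp):
        versions = self._history.setdefault(entity_id, [])
        if versions and timestamp < versions[-1][0]:
            raise ValueError("timestamps must be non-decreasing")
        versions.append((timestamp, value))

    def at(self, entity_id, timestamp):
        """Return the value that was current at timestamp, or None if none existed."""
        versions = self._history.get(entity_id, [])
        timestamps = [ts for ts, _ in versions]
        index = bisect.bisect_right(timestamps, timestamp)
        return versions[index - 1][1] if index else None

    def history(self, entity_id):
        return list(self._history.get(entity_id, []))


prices = VersionedStore()
prices.store("sku-1", {"price": 10}, "2024-01-01T00:00:00Z")
prices.store("sku-1", {"price": 12}, "2024-03-01T00:00:00Z")

mid_february = prices.at("sku-1", "2024-02-15T00:00:00Z")
print(mid_february)  # {'price': 10}
```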
When to use vs alternatives
Use Temporal for any data where "what did this look like at time T?" is a valid query — configuration, pricing, access policies, compliance records. For event streams where ordering matters more than entity identity, use Stream. Iceberg also provides time-travel but at the table level rather than per-entity.
13. Full-Text — FTS5/BM25 search
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Full-Text provider builds inverted indexes using FTS5 (SQLite) or tsvector (PostgreSQL/CockroachDB) and ranks results with BM25 relevance scoring. Documents are indexed with an ID and one or more text fields. Queries support phrase matching, boolean operators, and prefix search. Highlights with match snippets are optionally returned alongside results.
Key operations
- index_fulltext
- search_fulltext
- delete_fulltext
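To show what BM25 ranking rewards, here is a compact pure-Python scorer. It replaces the FTS5 and tsvector engines named above with a whitespace tokenizer and a linear scan, uses the commonly cited defaults k1 = 1.5 and b = 0.75, and its function names and sample tickets are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def bm25_rank(documents, query, k1=1.5, b=0.75):
    """Score each document against the query with BM25; best match first."""
    tokens = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokens)
    avg_len = sum(len(terms) for terms in tokens.values()) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for terms in tokens.values() for term in set(terms))

    scores = {}
    for doc_id, terms in tokens.items():
        counts = Counter(terms)
        score = 0.0
        for term in tokenize(query):
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(terms) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


tickets = {
    "t1": "printer driver crashes after update",
    "t2": "refund request for damaged printer",
    "t3": "password reset email never arrives",
}

ranking = bm25_rank(tickets, "printer driver")
ranked_ticket_ids = [doc_id for doc_id, _ in ranking]
print(ranked_ticket_ids)  # ['t1', 't2']
```

The ticket matching both terms ranks first, and "driver" contributes more than "printer" because it appears in fewer documents.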
When to use vs alternatives
Use Full-Text for keyword and phrase search over human-readable content — articles, support tickets, product descriptions. For semantic/concept search (find documents by meaning not exact words), use Vector. Combine both for a hybrid search pipeline.
14. Blob — Binary large objects
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Blob provider stores arbitrary binary data alongside a structured metadata envelope (MIME type, size, checksums, custom tags). Unlike the S3 provider, Blob is optimised for smaller objects that need queryable metadata, for example finding all images tagged with a given label or listing all PDFs uploaded by a user. Binary content is base64-encoded in transit and stored as BYTEA or BLOB natively.
Key operations
- store_blob
- get_blob
- delete_blob
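The metadata envelope and base64 transport encoding can be sketched with the standard library. The field names in the envelope, the choice of SHA-256 as the checksum, and the helper functions are assumptions for illustration; the source only says that MIME type, size, checksums, and tags are recorded.

```python
import base64
import hashlib


def make_envelope(data, mime_type, tags=()):
    """Wrap raw bytes in a metadata envelope with base64-encoded content."""
    return {
        "mime_type": mime_type,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # assumed checksum algorithm
        "tags": sorted(tags),
        "content_b64": base64.b64encode(data).decode("ascii"),
    }


def open_envelope(envelope):
    """Decode the content and verify it against the recorded checksum."""
    data = base64.b64decode(envelope["content_b64"])
    if hashlib.sha256(data).hexdigest() != envelope["sha256"]:
        raise ValueError("checksum mismatch")
    return data


def find_by_tag(envelopes, tag):
    """Metadata query: IDs of all blobs carrying the given tag."""
    return [blob_id for blob_id, envelope in envelopes.items() if tag in envelope["tags"]]


blobs = {
    "logo": make_envelope(b"\x89PNG\r\n\x1a\n", "image/png", tags=["brand", "image"]),
    "invoice": make_envelope(b"%PDF-1.7", "application/pdf", tags=["finance"]),
}

images = find_by_tag(blobs, "image")
print(images)  # ['logo']
```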
When to use vs alternatives
Use Blob when you need both binary storage and metadata queries on the same object. For large, opaque files where key-based access is sufficient, use S3. For storing generated text artefacts, Document is lighter weight.
15. Iceberg — Apache Iceberg tables
OSS: SQLite, Stoolap, CockroachDB · Cloud: Neon (CockroachDB on Large)
The Iceberg provider exposes Apache Iceberg table semantics: schema evolution, snapshot-based time-travel, partition pruning, and metadata-layer management. Tables accumulate snapshots on each write; any previous snapshot can be queried as a read-only view. Schema changes (add/drop/rename column) are tracked in the metadata layer without rewriting data files, making it suitable for long-lived analytical tables in a data lake architecture.
Key operations
- create_iceberg_table
- append_iceberg
- get_iceberg_snapshot_as_of
- add_iceberg_column
- expire_iceberg_snapshots
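The two ideas that matter most here, snapshots that accumulate on each append and schema changes that touch only metadata, can be modelled in a few lines. This toy is far simpler than the Apache Iceberg specification: the ToyTable class is hypothetical, "data files" are plain lists, and old snapshots are read through the current schema, which is one simplification among several.

```python
class ToyTable:
    """Toy model of snapshot time-travel and metadata-only schema evolution."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.files = []      # append-only "data files": each is a list of row dicts
        self.snapshots = []  # (timestamp, number of files visible at that snapshot)

    def append(self, rows, timestamp):
        self.files.append([dict(row) for row in rows])
        self.snapshots.append((timestamp, len(self.files)))

    def add_column(self, name):
        self.columns.append(name)  # metadata-only change: existing files are untouched

    def scan(self, as_of=None):
        """Read the latest snapshot, or the newest snapshot at or before as_of."""
        visible = len(self.files)
        if as_of is not None:
            visible = 0
            for timestamp, file_count in self.snapshots:
                if timestamp <= as_of:
                    visible = file_count
        return [
            {column: row.get(column) for column in self.columns}
            for data_file in self.files[:visible]
            for row in data_file
        ]


table = ToyTable(["id", "amount"])
table.append([{"id": 1, "amount": 10}], timestamp=1)
table.add_column("currency")
table.append([{"id": 2, "amount": 5, "currency": "EUR"}], timestamp=2)

as_of_first = table.scan(as_of=1)
latest = table.scan()
print(as_of_first)  # [{'id': 1, 'amount': 10, 'currency': None}]
```

Rows written before the column existed simply read back with a missing value, which is why no data file has to be rewritten.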
When to use vs alternatives
Use Iceberg for large analytical datasets that need schema evolution without downtime and point-in-time queries over the whole table. For per-entity version history, use Temporal. For OLAP aggregations over existing tables, use Analytics.
16. Transaction — Cross-model ACID
Coming soon
Cross-model ACID transactions, which wrap operations from SQL, Document, Key-Value, Vector, and other providers in a single commit-or-rollback boundary, are under active development. Single-operation SQL transactions are available today via query_sql. Full multi-provider transaction MCP tools will ship in an upcoming release.