🦀 Built with Rust

AI-powered personal cloud data storage.

Folders are out. Introducing FoldDB.

Schema-based storage, intelligent ingestion, and real-time processing.
Build smarter data applications with less code.

50μs Query Latency
100% Type Safe
0 Runtime Deps
fold_db — zsh
$ datafold_cli ingest data.json
Schema auto-generated: user_events
42 records ingested
$ datafold_cli llm-query 'Show purchases over $50'
Querying with AI...
Found 7 results in schema user_events
$ datafold_cli smart-folder-scan ~/Documents
12 files detected, 3 schemas recommended
$
Download

Get FoldDB in seconds

Download pre-built binaries for your platform or build from source.

Install with one command
curl -fsSL https://raw.githubusercontent.com/shiba4life/fold_db/master/install.sh | sh

Auto-detects macOS (Apple Silicon / Intel) and Linux x86_64
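
As a rough illustration of what this detection amounts to, the Rust sketch below maps an operating system and CPU architecture to one of the three supported builds. The function names are hypothetical, and labeling builds by Rust target triple is an assumption; the actual install.sh may detect and name things differently.

```rust
/// Map an OS / architecture pair to a supported build.
/// Inputs use the spelling of `std::env::consts::OS` and `std::env::consts::ARCH`.
/// The returned target triples are an assumption about how builds are labeled.
pub fn supported_target(os: &str, arch: &str) -> Option<&'static str> {
    match (os, arch) {
        ("macos", "aarch64") => Some("aarch64-apple-darwin"), // Apple Silicon
        ("macos", "x86_64") => Some("x86_64-apple-darwin"),   // Intel Mac
        ("linux", "x86_64") => Some("x86_64-unknown-linux-gnu"),
        _ => None, // anything else: build from source instead
    }
}

/// Resolve the build for the machine this code was compiled for.
pub fn current_target() -> Option<&'static str> {
    supported_target(std::env::consts::OS, std::env::consts::ARCH)
}
```

A `None` result corresponds to the platforms not listed above, where building from source is the fallback.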

or download for your platform

macOS Apple Silicon

M1, M2, M3, M4

macOS Intel

x86_64

Linux

x86_64

Or build from source:

cargo install --git https://github.com/shiba4life/fold_db --bin datafold_cli
View all releases
Features

Everything you need for modern data applications

From AI-powered ingestion to distributed queries, FoldDB handles the complexity so you can focus on building.

🤖

AI-Powered Ingestion

Drop any JSON and let AI automatically generate schemas, map fields, and structure your data. No manual schema definitions required.

let result = ingestion
    .process_json_ingestion(any_json_data)
    .await?;
// Schema auto-generated ✨
💬

Natural Language Queries

Ask questions in plain English. AI interprets your intent and returns structured results.

Real-Time Processing

Event-driven architecture with automatic transform execution as data flows through.
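
To make "automatic transform execution" concrete, here is a minimal sketch of the idea: transforms are registered against a source field and run whenever that field is written. The `Pipeline` type and its methods are hypothetical and are not FoldDB's API; this version handles only numeric values and does not chain derived writes into further transforms.

```rust
use std::collections::HashMap;

/// A transform derives an output value from a written value (hypothetical shape).
type Transform = Box<dyn Fn(f64) -> f64>;

/// Toy event-driven store: writing a field triggers the transforms registered on it.
#[derive(Default)]
pub struct Pipeline {
    /// source field -> list of (output field, transform)
    transforms: HashMap<String, Vec<(String, Transform)>>,
    /// current value of every field, written or derived
    pub values: HashMap<String, f64>,
}

impl Pipeline {
    /// Register `f` to run on every write to `source`, storing its result in `output`.
    pub fn on_write(&mut self, source: &str, output: &str, f: impl Fn(f64) -> f64 + 'static) {
        self.transforms
            .entry(source.to_string())
            .or_default()
            .push((output.to_string(), Box::new(f)));
    }

    /// Store a value, then run each transform registered for that field.
    pub fn write(&mut self, field: &str, value: f64) {
        self.values.insert(field.to_string(), value);
        if let Some(registered) = self.transforms.get(field) {
            for (output, f) in registered {
                self.values.insert(output.clone(), f(value));
            }
        }
    }
}
```

The caller never invokes a transform directly: registering it once with `on_write` is enough for later calls to `write` to keep the derived field up to date.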

🌐

Distributed P2P

Built-in peer discovery and networking. Scale horizontally without infrastructure changes.

🔐

Fine-Grained Permissions

Trust-based access control at the field level. Multi-tenant isolation out of the box.
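
One way to picture field-level, trust-based access is a per-field policy that states how "far" a caller may be from the data owner and still read the field. The sketch below assumes that model: `FieldPolicy`, `max_read_distance`, and `readable_fields` are hypothetical names, not FoldDB's actual types.

```rust
use std::collections::HashMap;

/// Per-field read policy (hypothetical model): the largest trust distance
/// that is still allowed to read the field. Distance 0 is the owner.
pub struct FieldPolicy {
    pub max_read_distance: u32,
}

/// Return the names of the fields a caller at `caller_distance` may read,
/// sorted alphabetically so the result is deterministic.
pub fn readable_fields<'a>(
    policies: &'a HashMap<String, FieldPolicy>,
    caller_distance: u32,
) -> Vec<&'a str> {
    let mut names: Vec<&str> = policies
        .iter()
        .filter(|(_, policy)| caller_distance <= policy.max_read_distance)
        .map(|(name, _)| name.as_str())
        .collect();
    names.sort();
    names
}
```

Because the check runs per field, two callers querying the same record can receive different subsets of it.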

☁️

Serverless Ready

S3-backed storage and DynamoDB support. Deploy to AWS Lambda with zero modifications.

📊

Dynamic Schemas

Schemas evolve with your data. Add fields, update validation, all without migrations.

🔌

Extensible Adapters

Plugin system for Twitter, Reddit, webhooks, and custom data sources.
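
A plugin system like this is commonly built around a shared trait that every data source implements. The sketch below shows that pattern under assumed names (`SourceAdapter`, `RawRecord`, `WebhookAdapter`, `collect_all`); FoldDB's real adapter interface may look different.

```rust
/// A record pulled from an external source, kept as raw JSON text.
pub struct RawRecord {
    pub source: String,
    pub json: String,
}

/// Interface each data source implements (hypothetical).
pub trait SourceAdapter {
    /// Short identifier for the source, used to tag records.
    fn name(&self) -> &str;
    /// Return the records that arrived since the last call.
    fn fetch(&mut self) -> Vec<RawRecord>;
}

/// Example adapter that buffers payloads handed over by a webhook handler.
pub struct WebhookAdapter {
    pending: Vec<String>,
}

impl WebhookAdapter {
    pub fn new() -> Self {
        WebhookAdapter { pending: Vec::new() }
    }

    /// Queue one JSON payload for the next `fetch`.
    pub fn push(&mut self, payload: &str) {
        self.pending.push(payload.to_string());
    }
}

impl SourceAdapter for WebhookAdapter {
    fn name(&self) -> &str {
        "webhook"
    }

    fn fetch(&mut self) -> Vec<RawRecord> {
        // Drain the buffer so each payload is delivered exactly once.
        self.pending
            .drain(..)
            .map(|json| RawRecord { source: "webhook".to_string(), json })
            .collect()
    }
}

/// Poll every registered adapter and gather its records for ingestion.
pub fn collect_all(adapters: &mut [Box<dyn SourceAdapter>]) -> Vec<RawRecord> {
    adapters.iter_mut().flat_map(|adapter| adapter.fetch()).collect()
}
```

Adding a custom source then means writing one more `impl SourceAdapter` block; the polling code stays unchanged.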

Code Examples

Developer-first API design

Clean, intuitive APIs that get out of your way. Ship faster with less boilerplate.

Ingest any data with AI

Just send your JSON data — FoldDB automatically analyzes it, generates an optimized schema, and stores it.

  • Automatic schema generation
  • Intelligent field mapping
  • Type inference and validation
  • Async processing with progress tracking
rust
use datafold::{IngestionCore, IngestionConfig};
use serde_json::json;

let config = IngestionConfig::from_env()?;
let ingestion = IngestionCore::new(config)?;

// Drop any JSON — AI handles the rest
let data = json!({
    "user": "alice",
    "action": "purchase",
    "total": 99.99
});

let result = ingestion
    .process_json_ingestion(data).await?;
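
For intuition about the type inference step, the sketch below derives a field-to-type map from a flat record like the one above. It is a deliberately simple, rule-based stand-in: FoldDB's ingestion is described as AI-driven, and the `Json`, `FieldType`, and `infer_schema` names here are hypothetical (a real implementation would more likely work on `serde_json::Value`).

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON scalar (hypothetical; std-only for the sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Field types a generated schema could record (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FieldType {
    Boolean,
    Number,
    Text,
    Unknown,
}

/// Infer one field type per key of a flat record.
/// A null value carries no type information, so it maps to `Unknown`.
pub fn infer_schema(record: &[(&str, Json)]) -> BTreeMap<String, FieldType> {
    record
        .iter()
        .map(|(name, value)| {
            let field_type = match value {
                Json::Bool(_) => FieldType::Boolean,
                Json::Number(_) => FieldType::Number,
                Json::Text(_) => FieldType::Text,
                Json::Null => FieldType::Unknown,
            };
            (name.to_string(), field_type)
        })
        .collect()
}
```

Applied to the record above, `user` and `action` would be inferred as text and `total` as a number.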

Query with natural language

Ask questions in plain English and get structured results.

  • Natural language understanding
  • Schema-aware query planning
  • Automatic result formatting
rust
use datafold::DataFoldNode;

let node = DataFoldNode::new_with_defaults().await?;

// Natural language query
let response = node.ai_query(
    "Show me all purchases over $50"
).await?;

for item in response.results {
    println!("{}: ${}", item.user, item.total);
}

Deploy to AWS Lambda

First-class serverless support with DynamoDB multi-tenant isolation.

  • DynamoDB multi-tenant backend
  • S3 sync for persistence
  • User isolation out of the box
rust
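// LambdaStorage, DynamoDbConfig, and ExplicitTables are assumed to be
// exported from the same module as LambdaConfig and LambdaContext.
use datafold::lambda::{
    DynamoDbConfig, ExplicitTables, LambdaConfig, LambdaContext, LambdaStorage,
};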

let config = LambdaConfig::new(
    LambdaStorage::DynamoDb(DynamoDbConfig {
        region: "us-east-1".to_string(),
        tables: ExplicitTables::from_prefix("MyApp"),
        auto_create: true,
    }),
);

LambdaContext::init(config).await?;
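
Multi-tenant isolation on a shared table is often achieved by scoping every key to a user. The sketch below shows that general pattern only: the naming scheme, the `#` separator, and all three function names are assumptions, not a description of how FoldDB lays out its DynamoDB tables.

```rust
/// Derive a physical table name from an app prefix such as "MyApp"
/// (hypothetical naming scheme).
pub fn table_name(prefix: &str, logical: &str) -> String {
    format!("{}-{}", prefix, logical)
}

/// Build a partition key that scopes a record to one user.
pub fn partition_key(user_id: &str, record_key: &str) -> String {
    format!("{}#{}", user_id, record_key)
}

/// Check that a stored key belongs to `user_id` before returning its item.
/// Requiring the separator right after the id stops "ali" from matching "alice#...".
pub fn belongs_to(user_id: &str, key: &str) -> bool {
    key.strip_prefix(user_id)
        .map_or(false, |rest| rest.starts_with('#'))
}
```

With keys shaped this way, a query for one user can be restricted to that user's key prefix, so another tenant's records are never scanned.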

Type-safe frontend client

Full TypeScript support with intelligent caching and standardized error handling.

  • 100% type coverage
  • Intelligent request caching
  • Automatic retry with backoff
typescript
import { schemaClient } from '@datafold/client';

const response = await schemaClient.getSchemas();

if (response.success) {
  const schemas = response.data;
  schemas.forEach(s => console.log(s.name));
}
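
"Automatic retry with backoff" usually means waiting longer after each failed attempt. The client itself is TypeScript; the sketch below uses Rust only to match the other examples on this page, and it illustrates the general technique (exponential backoff with a cap), not the client's actual implementation. `backoff_delay` and `retry` are hypothetical names.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(base: Duration, attempt: u32, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}

/// Call `op` until it succeeds or `max_attempts` calls have been made,
/// sleeping with exponential backoff between failures.
/// `op` always runs at least once; the last error is returned on exhaustion.
pub fn retry<T, E>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                std::thread::sleep(backoff_delay(base, attempt, max));
                attempt += 1;
            }
        }
    }
}
```

Production clients typically add random jitter to each delay so that many clients do not retry in lockstep; that is left out here for brevity.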
Quick Start

Up and running in under 2 minutes

01

Install

Install via Cargo or download a pre-built binary.

cargo install datafold
02

Configure

Set your OpenRouter API key for AI features (optional).

export OPENROUTER_API_KEY="sk-..."
03

Launch

Start the HTTP server with the web UI.

datafold_http_server --port 9001
04

Build

Open localhost:9001 in your browser and start ingesting data.

open http://localhost:9001  # macOS; on Linux, use xdg-open
CLI

Full control from your terminal

Ingest data, run queries, and manage schemas without leaving the command line.

Check status and explore

Verify your FoldDB instance is running and browse available schemas.

  • Pretty-printed JSON output with -p
  • List all loaded schemas
  • Check node health and configuration
terminal
$ datafold_cli status -p
{
  "status": "ok",
  "schemas_loaded": 5,
  "data_root": "~/.datafold/data"
}

$ datafold_cli schema-list -p
[
  "tweets", "contacts", "notes",
  "purchases", "bookmarks"
]

Ingest any data

Feed JSON files directly, or let Smart Folders auto-detect and ingest data from directories.

  • Single-file JSON ingestion
  • Smart Folder scanning and recommendations
  • Batch ingest with --all-recommended
terminal
$ datafold_cli ingest data.json
Schema generated: user_events
42 records ingested

$ datafold_cli smart-folder-scan ~/Documents
12 files detected, 3 schemas recommended

$ datafold_cli smart-folder-ingest ~/Documents --all-recommended
Ingested 3 schemas from ~/Documents

Query your data

Run structured queries, full-text search, or ask questions in natural language with AI.

  • Field-level structured queries
  • Full-text search across schemas
  • AI-powered natural language queries
terminal
$ datafold_cli query tweets --fields text,author
{ "text": "Hello world", "author": "alice" }
{ "text": "Rust is great", "author": "bob" }

$ datafold_cli search "machine learning"
4 results across 2 schemas

$ datafold_cli llm-query 'Show me recent purchases over $50'
Querying with AI...
Found 7 results in schema purchases
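
A field-level query such as `--fields text,author` is essentially a projection: each record is reduced to the requested fields. The sketch below shows that operation on in-memory records; the `Record` alias and `project` function are hypothetical and say nothing about how the CLI implements it.

```rust
use std::collections::BTreeMap;

/// A record as a map from field name to value (simplified to strings here).
pub type Record = BTreeMap<String, String>;

/// Keep only the comma-separated `fields` in each record.
/// Requested fields that a record lacks are skipped rather than treated as errors.
pub fn project(records: &[Record], fields: &str) -> Vec<Record> {
    let wanted: Vec<&str> = fields.split(',').map(str::trim).collect();
    records
        .iter()
        .map(|record| {
            wanted
                .iter()
                .filter_map(|field| {
                    record
                        .get(*field)
                        .map(|value| (field.to_string(), value.clone()))
                })
                .collect::<Record>()
        })
        .collect()
}
```

Projecting a tweet with `text`, `author`, and `timestamp` onto `"text,author"` yields a record with just the first two fields, matching the shape of the output above.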

Manage schemas

Load, approve, and inspect schemas directly from the CLI.

  • Load schemas from JSON files
  • Approve pending schemas
  • Inspect schema definitions
terminal
$ datafold_cli schema-load
Loaded schema: tweets (pending approval)

$ datafold_cli schema-approve tweets
Schema "tweets" approved

$ datafold_cli schema-get tweets -p
{
  "name": "tweets",
  "fields": ["text", "author", "timestamp"],
  "status": "approved"
}
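
The load, approve, inspect flow above implies a small lifecycle: a schema starts out pending and becomes approved once. A minimal sketch of that state machine follows. The `Schema` and `SchemaState` types are hypothetical, and treating a second approval as an error is an assumption rather than documented CLI behavior.

```rust
/// Lifecycle states suggested by the CLI output ("pending approval", "approved").
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SchemaState {
    Pending,
    Approved,
}

/// A loaded schema with its field names and current state (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Schema {
    pub name: String,
    pub fields: Vec<String>,
    pub state: SchemaState,
}

impl Schema {
    /// Loading always produces a schema that still needs approval.
    pub fn load(name: &str, fields: &[&str]) -> Self {
        Schema {
            name: name.to_string(),
            fields: fields.iter().map(|f| f.to_string()).collect(),
            state: SchemaState::Pending,
        }
    }

    /// Move a pending schema to approved; approving twice is reported as an error.
    pub fn approve(&mut self) -> Result<(), String> {
        match self.state {
            SchemaState::Pending => {
                self.state = SchemaState::Approved;
                Ok(())
            }
            SchemaState::Approved => {
                Err(format!("schema \"{}\" is already approved", self.name))
            }
        }
    }
}
```

Modeling the state as an enum keeps the set of valid transitions explicit and lets the compiler flag any state that is left unhandled.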
Registry

Global Schema Registry

Browse the shared schema registry. Click on a schema to view its complete definition including fields and topology structures.

Architecture

Built for scale and flexibility

Clients
  • Web UI
  • REST API
  • TypeScript SDK
  • CLI

FoldDB Core
  • AI Ingestion
  • Query Engine
  • Transform Pipeline
  • Schema Manager

Storage
  • Sled (Local)
  • DynamoDB
  • S3
  • P2P Network

Ready to build something amazing?

Join developers building the next generation of data applications with FoldDB.