A Virtual Filesystem in PostgreSQL

Agents Need a Filesystem They Can't Escape From

An agent inside chat A calls writeFile('/notes.md', '...'). The system has to make that path real, durable across a process restart, and invisible to chat B — all while another tool call in the same turn might be reading /.user/memory/MEMORY.md from a per-user scope and a third call is listing /skills/onboarding/references/. The LLM expects POSIX. The backend has Postgres, a connection pool, and a tenant model that says nothing leaks across chats.

Three pressures shape the design. Per-chat isolation, because two chats sharing a user must not see each other's files. Transactional durability, because a crashed write must leave zero half-written rows behind. And a single unified tree, because the agent's tools assume one namespace where /.agent, /.user, the chat root, and a handful of skill sandboxes all coexist.

The rest of this post is the mechanism. A two-table schema that pretends to be an inode table, a readdir that runs on prefix matching, a write path that reaches the root in one transaction, a scoping wrapper that turns string concatenation into tenant isolation, a mount router that composes the unified tree, and a writeFile interceptor that validates JSONL before any bytes hit Postgres.

Two Tables Pretending to Be an Inode Table

The base storage is a single Postgres schema with two tables. fs_entries is keyed by the full POSIX path and carries the per-node metadata. fs_chunks carries the file payload, sliced into 1 MiB BYTEA rows and tied back by foreign key.

sql

CREATE TABLE agent.fs_entries (
  path           TEXT PRIMARY KEY,
  type           TEXT NOT NULL,
  mode           INT  NOT NULL,
  size           BIGINT NOT NULL,
  mtime          BIGINT NOT NULL,
  symlink_target TEXT
);

CREATE TABLE agent.fs_chunks (
  path        TEXT NOT NULL,
  chunk_index INT  NOT NULL,
  data        BYTEA NOT NULL,
  PRIMARY KEY (path, chunk_index),
  FOREIGN KEY (path) REFERENCES agent.fs_entries(path)
    ON DELETE CASCADE ON UPDATE CASCADE
);

There is no parent_id. There is no inode number. There is no recursion table. The hierarchy is encoded entirely in the path string and reconstructed by prefix matching at query time.

Classic inode schema

A traditional inode-style design separates the name from the link.

inodes(id, type, mode, size, mtime) — one row per file or directory
dirents(parent_id, name, child_id) — names live in the parent's directory listing
data_blocks(inode_id, block_index, bytes) — payload tied to the inode
readdir is WHERE parent_id = ?; rename is one row update
hierarchy is a real foreign-key graph; orphan rows are detectable

Path-keyed schema

The Limerence schema folds name and identity into the same column.

fs_entries(path PRIMARY KEY, type, mode, size, mtime) — full path is the identity
fs_chunks(path, chunk_index, data) — payload tied by the same string
readdir is a LIKE pair on the path column; rename is a cascading update
hierarchy is implicit; the path string is the only structure that exists
one primary key per file, one foreign key from chunks, nothing else

The simplification is load-bearing. Every other beat in the system — readdir, recursive delete, scope prefixing, mount routing, JSONL validation — is built on the assumption that a path is just a string and the table is a flat key-value store.
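To make the cost side of that simplification concrete, here is a minimal TypeScript sketch of what a directory rename means when the path is the identity. The in-memory Map and the name renameSubtree are illustrative assumptions, not part of PostgresFs; in the real schema the equivalent work would be done by an UPDATE on fs_entries, with ON UPDATE CASCADE carrying the new paths into fs_chunks.

```typescript
// Hypothetical in-memory stand-in for agent.fs_entries: path -> entry type.
type EntryType = "file" | "directory";

// Rename a node by rewriting every key at or under `from`.
// Returns the number of keys rewritten, which grows with the subtree size.
function renameSubtree(
  entries: Map<string, EntryType>,
  from: string,
  to: string,
): number {
  const hits: string[] = [];
  entries.forEach((_type, path) => {
    // Match the node itself or anything below it; "/docs" must not match "/docs-old".
    if (path === from || path.startsWith(from + "/")) hits.push(path);
  });
  for (const oldPath of hits) {
    const type = entries.get(oldPath) as EntryType;
    entries.delete(oldPath);
    entries.set(to + oldPath.slice(from.length), type);
  }
  return hits.length;
}

const entries = new Map<string, EntryType>([
  ["/docs", "directory"],
  ["/docs/a.md", "file"],
  ["/docs/sub", "directory"],
  ["/docs/sub/b.md", "file"],
  ["/docs-old", "directory"],
]);
const rewritten = renameSubtree(entries, "/docs", "/notes");
```

An inode schema would touch one dirent row for the same rename; here all four keys under /docs are rewritten, while /docs-old is left alone because the match requires a slash boundary.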

Key Takeaway

There is no parent column. The tree is encoded in the path string and the hierarchy is enforced by prefix matching on the primary key. Everything downstream — scoping, mounts, isolation — composes by concatenating prefixes onto that one string before it ever reaches the SQL layer.

`readdir` Is a Prefix Query With a "No Further Slashes" Clause

Without a parent_id, listing the immediate children of a directory becomes a question about strings: which paths start with this directory and contain no further slashes after that point. The implementation is exactly that, in two LIKE clauses.

sql

SELECT path, type FROM agent.fs_entries
WHERE path LIKE $1 || '%'
  AND path != $1
  AND path NOT LIKE $1 || '%/%'

The LIKE $1 || '%' selects every descendant of the directory. This only respects directory boundaries if $1 ends in a slash; a bare /a would also match /ab. The path != $1 excludes the directory's own row from its own listing. The NOT LIKE $1 || '%/%' filters out anything deeper than one level: a path under the directory that contains another / after the prefix is a grandchild, not a child, and gets dropped. Recursive operations use the same shape without that last clause: rm -r is one DELETE FROM fs_entries WHERE path = $1 OR path LIKE $1 || '/%' (here $1 is the bare directory path and the query appends the slash itself), and the cascade on fs_chunks reaps the bytes.

That is why the schema can refuse to model parents. Hierarchy is not a graph the database walks. It is a regular language that two LIKE clauses already recognize.
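The filter is easy to check outside the database. The sketch below is an in-memory TypeScript equivalent of the WHERE clause, plus a helper that escapes LIKE metacharacters, since _ and % in a directory name would otherwise act as wildcards. Both function names are hypothetical, and the sketch assumes stored paths carry no trailing slash while the directory argument does.

```typescript
// Escape LIKE metacharacters so a directory named "my_dir" or "100%" is
// matched literally. Backslash is the default LIKE escape character in Postgres.
function escapeLike(literal: string): string {
  return literal.replace(/[\\%_]/g, (ch) => "\\" + ch);
}

// In-memory equivalent of the readdir WHERE clause.
// `dir` is expected to end with "/", mirroring a $1 that carries its slash.
function listChildren(paths: string[], dir: string): string[] {
  return paths.filter((path) => {
    // path LIKE dir || '%'  AND  path != dir
    if (!path.startsWith(dir) || path === dir) return false;
    // path NOT LIKE dir || '%/%': no further slash after the prefix
    return !path.slice(dir.length).includes("/");
  });
}

const samplePaths = [
  "/a",
  "/a/x.md",
  "/a/sub",
  "/a/sub/y.md",
  "/ab",
  "/ab/z.md",
];
const children = listChildren(samplePaths, "/a/");
const escaped = escapeLike("/my_dir/");
```

With the trailing slash in place, /ab and /ab/z.md are excluded by the prefix test and /a/sub/y.md by the no-further-slash test, leaving /a/x.md and /a/sub.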

Every Write Reaches the Root in One Transaction

Writing a file at a deep path, say /users/u-1/agents/a-1/memory/feedback_terminology.md, has to leave the table in a state where every directory above the file exists. Without that chain, the intermediate directories would have no rows of their own: each would be missing from its parent's listing, and any existence check on /users/u-1/agents/a-1/memory/ would fail even though the prefix query could still find the file beneath it. The write path enforces the parent chain inside the same BEGIN/COMMIT as the entry itself.

1. Normalise and prefix. The incoming path is resolved to absolute POSIX form, then prefixed with the configured #root (/artifacts in this deployment). Every method in PostgresFs runs on the prefixed path; #root never appears in the API surface.
2. Open the transaction and walk up. #useTransaction checks out a client and issues BEGIN. #ensureParentExists recurses from the file's parent up to /, inserting any missing directory rows as fs_entries with type = 'directory' and mode 0755. Existing rows are left alone.
3. UPSERT the entry. A single INSERT INTO fs_entries (path, type, mode, size, mtime) VALUES (...) ON CONFLICT (path) DO UPDATE SET ... either creates the file row or refreshes its size and mtime in place. The path is the primary key, so the conflict target is unambiguous.
4. Replace the chunks. DELETE FROM fs_chunks WHERE path = $1 clears any prior payload, then the new bytes are sliced into 1 MiB chunks and inserted with chunk_index running from zero. All four steps run on the same client; COMMIT lands them as one unit.

A process crash between BEGIN and COMMIT rolls everything back. The file appears with the new content or stays at the previous content; there is no torn-write window for a single file. Multiple files written in sequence are not atomic with each other, however — there is no batching API, and the agent's tool calls are individually transactional.
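The parent walk in step 2 can be sketched in a few lines of TypeScript. This is a hedged illustration, not the actual #ensureParentExists: the names ancestorsOf and ensureParents are hypothetical, a Set stands in for the directory rows in fs_entries, and the sketch walks root-first where the text describes recursion from the parent upward (the set of rows created is the same either way).

```typescript
// Return every ancestor directory of an absolute POSIX path, root first.
function ancestorsOf(path: string): string[] {
  const parts = path.split("/").filter((p) => p.length > 0);
  const out: string[] = ["/"];
  // Stop before the last segment: that one is the file itself.
  for (let i = 1; i < parts.length; i++) {
    out.push("/" + parts.slice(0, i).join("/"));
  }
  return out;
}

// Hypothetical stand-in for the parent walk: add missing directory rows,
// leave existing ones alone, and report what was created.
function ensureParents(dirs: Set<string>, filePath: string): string[] {
  const created: string[] = [];
  for (const dir of ancestorsOf(filePath)) {
    if (!dirs.has(dir)) {
      dirs.add(dir);
      created.push(dir);
    }
  }
  return created;
}

const existingDirs = new Set<string>(["/", "/artifacts"]);
const createdDirs = ensureParents(
  existingDirs,
  "/artifacts/chat-abc/docs/notes.md",
);
```

In this run only /artifacts/chat-abc and /artifacts/chat-abc/docs are created; the two rows that already existed are untouched, which matches the "existing rows are left alone" rule above.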

The chunk-replace pattern (delete then insert) is the cheapest way to keep fs_chunks consistent with fs_entries.size. The alternative — diffing chunks against the new payload — would require extra reads on every write, and the chunk size is small enough that a full rewrite is not the bottleneck.
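The slicing itself is mechanical. The following TypeScript sketch shows one way to cut a payload into rows shaped like fs_chunks; the ChunkRow interface and sliceIntoChunks are assumed names for illustration, and only the 1 MiB size and the zero-based chunk_index come from the description above.

```typescript
const CHUNK_SIZE = 1024 * 1024; // 1 MiB, the chunk size quoted in the text

// Hypothetical row shape mirroring fs_chunks (path omitted for brevity).
interface ChunkRow {
  chunkIndex: number;
  data: Uint8Array;
}

// Slice a payload into consecutive chunks, indexed from zero.
// An empty payload yields no chunk rows at all.
function sliceIntoChunks(
  payload: Uint8Array,
  chunkSize: number = CHUNK_SIZE,
): ChunkRow[] {
  const rows: ChunkRow[] = [];
  let index = 0;
  for (let offset = 0; offset < payload.length; offset += chunkSize) {
    // subarray returns a view over the same buffer, not a copy.
    rows.push({
      chunkIndex: index,
      data: payload.subarray(offset, offset + chunkSize),
    });
    index++;
  }
  return rows;
}

// 2.5 MiB payload: two full chunks and one half chunk.
const chunkRows = sliceIntoChunks(new Uint8Array(2621440));
const chunkSizes = chunkRows.map((row) => row.data.length);
```

Because every write regenerates the whole list, the sum of the chunk lengths always equals the size recorded on the entry row within a single transaction.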

`ScopedFs` Turns String Concatenation Into Tenant Isolation

Per-chat, per-user, and per-agent isolation all reduce to the same trick: a thin proxy that prepends a fixed prefix to every path before delegating to the underlying filesystem. The wrapper is called ScopedFs, and it implements the full IFileSystem interface by routing each method through a #scope field.

A chat's filesystem is ScopedFs(prefix='/chat-abc') over the base PostgresFs. When the agent calls writeFile('/notes.md') inside that chat, the wrapper rewrites the path to /chat-abc/notes.md, and PostgresFs then prepends its own #root of /artifacts before the SQL runs. The agent never sees the chat id; the SQL row's primary key is /artifacts/chat-abc/notes.md. A different chat with prefix /chat-xyz cannot construct a path that resolves into chat A's subtree, because the prefix is concatenated client-side before any query is built.

The concatenation is pure string work, so two prefixes where one is a leading substring of the other (a chat id of abc and another of abcd) would let getAllPaths-style filtering treat /abcd/... as living under the /abc scope. In practice the chat ids are cuid/uuid values generated by Prisma, so such collisions are not reachable; the safety is operational, not enforced by code.

The same wrapper is reused for /.user (one scope per user, per agent) and /.agent (one scope per agent, shared across that agent's chats). Every isolation boundary in the system is a ScopedFs instance with a different prefix, layered over the same base.
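A stripped-down version of the wrapper shows how little machinery is involved. This is a sketch under stated assumptions: MiniFs is a two-method slice of the real IFileSystem, MemoryFs stands in for PostgresFs, and the normalise step (collapsing .. before the prefix is applied) is an assumption about how escapes are prevented, not a confirmed detail of ScopedFs. The isUnder helper illustrates a slash-boundary check that would avoid the abc/abcd confusion.

```typescript
// Minimal slice of a filesystem interface; the real one has many more methods.
interface MiniFs {
  writeFile(path: string, content: string): void;
  readFile(path: string): string;
}

// Hypothetical in-memory base standing in for PostgresFs.
class MemoryFs implements MiniFs {
  rows = new Map<string, string>();
  writeFile(path: string, content: string): void {
    this.rows.set(path, content);
  }
  readFile(path: string): string {
    const hit = this.rows.get(path);
    if (hit === undefined) throw new Error("ENOENT: " + path);
    return hit;
  }
}

// Collapse ".", "..", and duplicate slashes so "/../x" cannot climb out.
function normalise(path: string): string {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// Sketch of the scoping proxy: every path gets the same fixed prefix.
class ScopedFsSketch implements MiniFs {
  private base: MiniFs;
  private scope: string;
  constructor(base: MiniFs, scope: string) {
    this.base = base;
    this.scope = scope;
  }
  private resolve(path: string): string {
    return this.scope + normalise(path);
  }
  writeFile(path: string, content: string): void {
    this.base.writeFile(this.resolve(path), content);
  }
  readFile(path: string): string {
    return this.base.readFile(this.resolve(path));
  }
}

// Slash-boundary check: "/abcd/x" is not under "/abc".
function isUnder(path: string, scope: string): boolean {
  return path === scope || path.startsWith(scope + "/");
}

const baseFs = new MemoryFs();
const chatA = new ScopedFsSketch(baseFs, "/chat-abc");
const chatB = new ScopedFsSketch(baseFs, "/chat-xyz");
chatA.writeFile("/notes.md", "hello");
```

After the write, the base holds a single key, /chat-abc/notes.md. A read of /notes.md or /../chat-abc/notes.md through chatB resolves under /chat-xyz and fails with ENOENT.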

`AgentChatFs` Mounts `/.agent`, `/.user`, and Skill Sandboxes Into One Tree

The composite tree the LLM sees is built by AgentChatFs, which extends MountableFs from the just-bash library and routes incoming paths to the correct sub-filesystem by mount prefix. The chat root is the base scope; /.agent and /.user are sibling ScopedFs instances on different prefixes; each enabled skill becomes an OverlayFs mounted at its sandbox path.

One virtual path becomes one primary key by string concatenation

01 Agent tool call: fs.writeFile(path, …)
   /.user/memory/feedback_terminology.md

02 AgentChatFs: mount route picks .user
   /memory/feedback_terminology.md

03 ScopedFs: prefix concat
   /users/u-1/agents/a-1/memory/feedback_terminology.md

04 PostgresFs: #root prefix
   /artifacts/users/u-1/agents/a-1/memory/feedback_terminology.md

05 agent.fs_entries: primary key row
   INSERT INTO agent.fs_entries (path, type, size, mtime) VALUES ('/artifacts/users/u-1/agents/a-1/memory/feedback_terminology.md', 'file', 420, 1731000000000) ON CONFLICT (path) DO UPDATE SET size = EXCLUDED.size, mtime = EXCLUDED.mtime;

No parent_id, no inode table: the path string is the key.

The agent writes /.user/memory/feedback_terminology.md. AgentChatFs picks the .user mount; ScopedFs prepends the per-user, per-agent prefix; PostgresFs prepends /artifacts. Each layer adds only its own prefix and carries the rest of the path through unchanged.

The default mount is ScopedFs(prefix='/${chatId}'). A write to /notes.md becomes /chat-abc/notes.md at the scope layer and /artifacts/chat-abc/notes.md at the SQL layer. Files written here live and die with the chat — no other chat, user, or agent can see them.

Skill mounts are a fourth scope, but they use a different mechanism. Each enabled skill registers an OverlayFs at its sandbox path, with the host directory on disk as the read-mostly source and an in-memory copy-on-write layer for any writes the agent performs inside the sandbox. Skill mutations do not persist between runs; the COW layer is discarded with the agent context.

The end state is one POSIX tree from the LLM's perspective and one flat keyspace from the database's perspective. Mount routing strips the mount prefix, scope wrappers prepend their isolation prefix, the base prepends #root, and the SQL row's primary key is the result. Three string concatenations, one INSERT.
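The whole routing chain can be written as one pure function from virtual path to primary key. The TypeScript sketch below assumes longest-prefix matching on mount points, which is a common convention but not confirmed here for MountableFs; the Mount shape and resolveKey are hypothetical names, and skill mounts are left out because they go to an OverlayFs and never reach Postgres.

```typescript
// Hypothetical mount description: where it appears in the agent's tree and
// which scope prefix its ScopedFs would prepend.
interface Mount {
  mountPoint: string;
  scopePrefix: string;
}

const ROOT_PREFIX = "/artifacts"; // the #root value quoted in the text

// Join a prefix and an inner path without leaving a trailing slash.
function joinPrefix(prefix: string, inner: string): string {
  return inner === "/" ? prefix : prefix + inner;
}

// Resolve a virtual path to the key that would land in fs_entries.path.
function resolveKey(
  virtualPath: string,
  mounts: Mount[],
  defaultScope: string,
): string {
  // Assumption: the longest matching mount point wins.
  const sorted = mounts
    .slice()
    .sort((a, b) => b.mountPoint.length - a.mountPoint.length);
  for (const mount of sorted) {
    const exact = virtualPath === mount.mountPoint;
    if (exact || virtualPath.startsWith(mount.mountPoint + "/")) {
      // 1. Mount routing strips the mount prefix.
      const inner = virtualPath.slice(mount.mountPoint.length) || "/";
      // 2. The scope wrapper prepends its prefix; 3. the base prepends #root.
      return joinPrefix(ROOT_PREFIX + mount.scopePrefix, inner);
    }
  }
  // No mount matched: the path belongs to the chat's default scope.
  return joinPrefix(ROOT_PREFIX + defaultScope, virtualPath);
}

const mountTable: Mount[] = [
  { mountPoint: "/.user", scopePrefix: "/users/u-1/agents/a-1" },
  { mountPoint: "/.agent", scopePrefix: "/agents/a-1" },
];
const userKey = resolveKey(
  "/.user/memory/feedback_terminology.md",
  mountTable,
  "/chat-abc",
);
const chatKey = resolveKey("/notes.md", mountTable, "/chat-abc");
```

Here userKey comes out as /artifacts/users/u-1/agents/a-1/memory/feedback_terminology.md and chatKey as /artifacts/chat-abc/notes.md, the same keys the walkthrough above arrives at.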

A JSONL Validator That Lives Inside `writeFile`

One file in the tree is special: /.agent/fragments.jsonl. It carries the agent's onboarding fragments — typed instruction records the agent owner authored — and the system needs every line to parse as JSON and validate against a closed schema. A single bad line in that file would corrupt the agent's prompt on the next chat. The hook that prevents it lives directly inside AgentChatFs.writeFile, before any of the layers below.

1. Intercept by path. AgentChatFs.writeFile normalises the incoming path and checks whether it equals /.agent/fragments.jsonl. Other paths flow through to the base MountableFs.writeFile unchanged.
2. Read append context if needed. In append mode, the validator reads the file's current contents through the same composite filesystem and concatenates them with the incoming bytes. This is what stops a partial append from smuggling an invalid line past a validator that only saw the new bytes.
3. Inspect the JSONL. inspectFragmentJsonl splits the candidate content by newline, drops empty lines, parses each remaining line with JSON.parse, and runs each parsed object through fragmentSchema.safeParse. A failure on any line is a failure for the whole write.
4. Throw before the transaction. A bad line raises FragmentValidationError with a human-readable pointer to the offending entry. The error is raised before super.writeFile runs, meaning before the mount router, before ScopedFs, before PostgresFs.writeFile opens its transaction. The bytes never reach Postgres.
5. Otherwise delegate. A clean validation falls through to super.writeFile. The mount router routes /.agent to ScopedFs(/agents/${agentId}), that wrapper prepends its prefix, PostgresFs opens the transaction, and the standard write path runs.

The validator runs inside the call stack of the write but outside the database transaction. That is the right place for it. A schema-level constraint cannot reject malformed JSONL inside a BYTEA payload, and a separate validation service would have to re-read the file the agent just wrote. Hooking the hot path catches the bad input where it originated, with one synchronous check.

In append mode the validator's exists and readFile calls happen on the routed MountableFs and are not enclosed in the same Postgres transaction as the eventual append. A concurrent writer could change the file between the read and the append, leaving the validation snapshot stale. Each chat has one running agent in practice, so concurrent writers on the fragments file are unlikely, but that single-writer assumption is not enforced by code.
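The validation steps can be sketched without any schema library. The shape below is hedged heavily: the real fragmentSchema is not shown in this post, so the closed schema here (a type drawn from a fixed list plus a text string) is entirely hypothetical, and the three-argument signature of inspectFragmentJsonl is an assumption made to keep the sketch self-contained.

```typescript
class FragmentValidationError extends Error {
  lineNumber: number;
  constructor(lineNumber: number, reason: string) {
    super("fragments.jsonl line " + lineNumber + ": " + reason);
    this.name = "FragmentValidationError";
    this.lineNumber = lineNumber;
  }
}

// HYPOTHETICAL closed schema; the real fragment schema is not shown here.
const ALLOWED_TYPES = ["instruction", "example"];
const ALLOWED_KEYS = ["type", "text"];

// Return a reason string when the value is invalid, or null when it passes.
function checkFragment(value: unknown): string | null {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return "not a JSON object";
  }
  const record = value as Record<string, unknown>;
  for (const key of Object.keys(record)) {
    if (ALLOWED_KEYS.indexOf(key) === -1) return "unknown key " + key;
  }
  if (typeof record.type !== "string") return "type must be a string";
  if (ALLOWED_TYPES.indexOf(record.type) === -1) return "unknown type";
  if (typeof record.text !== "string") return "text must be a string";
  return null;
}

// Validate the content the file would hold after the write.
// Throws on the first bad line; returns normally when every line passes.
function inspectFragmentJsonl(
  existing: string,
  incoming: string,
  append: boolean,
): void {
  const candidate = append ? existing + incoming : incoming;
  candidate.split("\n").forEach((line, i) => {
    if (line.trim() === "") return; // empty lines are dropped
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      throw new FragmentValidationError(i + 1, "invalid JSON");
    }
    const problem = checkFragment(parsed);
    if (problem !== null) throw new FragmentValidationError(i + 1, problem);
  });
}

// Existing content with no trailing newline, and an append that is valid alone.
const existingContent = '{"type":"instruction","text":"be brief"}';
const incomingLine = '{"type":"example","text":"hi"}\n';
```

The two sample strings show why the append path has to read first: incomingLine validates when checked alone, but appended to existingContent it fuses with the unterminated last line into one string that JSON.parse rejects, so the combined check throws for line 1.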

Concurrent Writers, Symlink Escapes, and the 10 MiB Read Wall

Three failure windows are worth naming honestly.

Concurrent writes to the same path. PostgresFs.writeFile does an UPSERT on fs_entries followed by a DELETE FROM fs_chunks and a chunk re-insert, all inside one transaction. Postgres' default isolation level is Read Committed, and the write path takes no explicit SELECT ... FOR UPDATE on the entry row, so the only ordering between two writers racing on the same path is whatever row lock the UPSERT itself acquires. Readers are the weaker side: under Read Committed each statement sees its own snapshot, so a reader that fetches the entry row and the chunk rows in separate statements can straddle a writer's commit and see an fs_entries.size that does not match the byte total of the visible chunks. The transaction guarantees self-consistency for a single connection. It does not buy strict serializability across writers.
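One cheap way to make that mismatch visible is to check the recorded size against the bytes actually reassembled. The TypeScript sketch below is a hypothetical reader-side guard, not something the current read path is described as doing; EntryRow, StoredChunk, and assembleFile are illustrative names, and size is modelled as a plain number although the column is BIGINT.

```typescript
// Hypothetical row shapes mirroring fs_entries and fs_chunks.
interface EntryRow {
  path: string;
  size: number;
}
interface StoredChunk {
  chunkIndex: number;
  data: Uint8Array;
}

// Concatenate chunks in index order and refuse to return bytes whose total
// length disagrees with the size recorded on the entry row.
function assembleFile(entry: EntryRow, chunks: StoredChunk[]): Uint8Array {
  const ordered = chunks.slice().sort((a, b) => a.chunkIndex - b.chunkIndex);
  let total = 0;
  for (const chunk of ordered) total += chunk.data.length;
  if (total !== entry.size) {
    throw new Error(
      `torn read on ${entry.path}: size=${entry.size}, chunk bytes=${total}`,
    );
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of ordered) {
    out.set(chunk.data, offset);
    offset += chunk.data.length;
  }
  return out;
}

const entryRow: EntryRow = { path: "/artifacts/chat-abc/notes.md", size: 5 };
const assembled = assembleFile(entryRow, [
  { chunkIndex: 1, data: new Uint8Array([4, 5]) },
  { chunkIndex: 0, data: new Uint8Array([1, 2, 3]) },
]);
```

A guard like this only detects the mismatch; avoiding it would mean reading the entry and its chunks under one snapshot, for example in a single statement or a stricter isolation level.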

The 10 MiB read wall. readFile and readFileBuffer always materialise the entire file in memory by concatenating chunks back into a single Uint8Array. The upload route caps inputs at 10 MiB, but the read path does not stream. A future feature that loads a large artifact — a recorded chat transcript, a generated dataset export — will pin Node memory for the duration of the read. The chunk schema would support a streaming reader; nothing currently uses one.

What This Design Does Not Buy You

The simplification is not free, and a few of the trade-offs are worth naming so a future maintainer is not surprised.

JSONL validation is hard-coded to one path. The hook in AgentChatFs.writeFile fires only when the normalised path equals /.agent/fragments.jsonl. Other JSONL files in the tree — datasets, exports, chat transcripts — are not validated by this layer. Adding validation for a second file requires editing AgentChatFs directly; there is no extensible registry.

No per-tenant quota. The upload route caps a single upload at 10 MiB, but nothing checks the cumulative size of /.user/... or /agents/${agentId}/.... A long-lived agent that writes prolifically to its shared subtree can grow without bound. The artifact UI auto-renames on collision; it does not push back on total volume.

No background reaper for orphan rows. The cascade on fs_chunks keeps payload consistent with entries. There is no equivalent for "directory rows whose subtree has been deleted by an ad-hoc SQL statement" — and there cannot be, because the schema does not model parents. readdir is a prefix query, so an orphaned directory row simply stops appearing in listings; the inconsistency is silent.

MEMORY.md is materialised on read if missing. A read of /.user/memory/MEMORY.md on a fresh user-agent pair returns an instructional placeholder string instead of throwing ENOENT. The behaviour is intentional — the agent's memory tooling expects the file to "exist" — but tools that distinguish "no file" from "file with content" need to be aware that this one path lies.

No integration tests for the composite stack inside the repo. The vendored PostgresFs is presumed to have its own coverage upstream; the specific composition of MountableFs mounts, ScopedFs wrappers, and the JSONL hook does not. The stack is exercised end-to-end by manual playground runs and by the agent itself; nothing pins the contracts in CI.

The design buys a lot. One Postgres schema with two tables holds every agent artifact in the system. Tenant isolation is three lines of string concatenation. A bad onboarding fragment never reaches the database. The cost is the list above — and the honest answer is that for the workloads this system runs today, the cost is the cheaper side of the trade.