Your database schema can tell an agent that a column is named total_amount. (A schema is the structure that defines your tables, columns, and data types; an agent, in Limerence, is an AI assistant connected to your database that turns plain-language questions into SQL queries.) The schema tells the agent the column exists. It cannot tell the agent whether your team means gross revenue or net revenue when they say "revenue." It cannot tell the agent that "active customer" means a purchase in the last 90 days, that test orders should be excluded by default, or that sales says "pipeline" while the database calls the table opportunities.
That missing layer is not model weakness. It is company-specific knowledge that lives in analysts' heads, dashboards, and muscle memory. If you leave it out, the model has to guess.
Typed Fragments, Not Better Prompts
Limerence separates schema grounding (the process of giving the agent awareness of your database structure) from business interpretation. Grounding tells the agent what tables and columns exist. Teaching tells it what those tables and columns mean in your business.
The important design choice is that this knowledge is structured into what we call fragments: typed JSON objects that each capture one piece of business knowledge. A glossary fragment maps a term to its SQL meaning. A hint sets a soft default. A guardrail sets a hard restriction. The type decides how the system stores, deduplicates, and injects it into the agent's prompt at query time. The system is not hoping the model remembers your jargon. It is giving the model a durable interpretation layer.
◆Key Takeaway
This is not "better prompt engineering" in the loose sense. It is a structured knowledge layer that enters the agent through two paths — manual authoring and guided onboarding — and becomes live context at query time.
Schema Stops at Column Names
Schema answers structural questions. It does not answer business ones.
That gap shows up immediately in real teams. Column names are often cryptic. Metric definitions are local. The words people use in meetings rarely match the words they used when someone designed the database three years ago.
Even when the schema is clean, the meaning can still be missing. The same term can fracture across teams: "active user" means 30 days for product and 90 days for sales. No schema captures that split.
- "Revenue" may exclude refunds.
- "Active user" may mean 30 days for one team and 90 days for another.
- "Top customers" may mean count, ARR, gross spend, or net spend.
Those are not syntax problems. They are interpretation problems.
Schema only
The agent sees tables, columns, and types.
- orders.total_amount exists
- users.last_login exists
- orders.is_test exists
- the phrase "active customer" still has no stable meaning
Schema plus business language
The agent gets reusable business interpretation on top of the schema.
- revenue excludes cancelled and refunded orders
- active user means last login within 30 days
- test data is excluded unless explicitly requested
- ambiguous terms can trigger clarification instead of guessing
Without that second layer, every question turns into an improvisation exercise. Sometimes the model guesses right. Sometimes it produces a perfectly valid query for the wrong definition. When it cannot guess, the better move is to stop and ask the user directly.
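Where conflicting definitions have been taught, the lookup itself can surface the ambiguity instead of letting the model guess. A minimal sketch of that behavior; the function name and data shapes here are illustrative assumptions, not Limerence's actual API:

```python
def resolve_term(term, glossary):
    """Return the SQL meaning of a term, or a clarification request
    when teams have taught conflicting definitions."""
    definitions = glossary.get(term, [])
    if not definitions:
        return {"status": "unknown", "term": term}
    if len(definitions) == 1:
        return {"status": "resolved", "sql": definitions[0]["sql"]}
    # More than one stored meaning: stop and ask rather than guess.
    return {
        "status": "ambiguous",
        "question": f"Which definition of '{term}' do you mean?",
        "options": [d["label"] for d in definitions],
    }

glossary = {
    "active user": [
        {"label": "product (30 days)", "sql": "last_login > NOW() - INTERVAL '30 days'"},
        {"label": "sales (90 days)", "sql": "last_login > NOW() - INTERVAL '90 days'"},
    ],
    "revenue": [
        {"label": "net", "sql": "SUM(total_amount) WHERE status NOT IN ('cancelled','refunded')"},
    ],
}

print(resolve_term("revenue", glossary)["status"])      # resolved
print(resolve_term("active user", glossary)["status"])  # ambiguous
```

A single definition resolves silently; a fractured term comes back as a question with the stored options attached.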
Typed Fragments for Different Kinds of Knowledge
The system stores different kinds of knowledge differently. Domain concepts such as terms, glossaries, aliases, explanations, and analogies are separate from rules such as hints (soft defaults like "exclude test data unless asked") and guardrails (hard restrictions like "never expose PII columns"). Worked examples and clarification patterns are stored separately too. That matters because a formula, a soft default, and a safety rule should not behave the same way.
Domain Knowledge
Vocabulary, terminology, and conceptual understanding
"revenue" → SUM excluding cancelled/refunded
Rules & Behavior
Constraints, guidelines, and behavioral patterns
Exclude test orders unless explicitly asked
Examples & Patterns
Concrete examples, clarifications, and disambiguation patterns
"top customers" → SELECT by net spend DESC
Identity & Preferences
Persona, presentation preferences, and corrections
Analyst role, neutral tone, concise answers
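The non-glossary categories can be pictured as similar typed objects. A rough sketch, with the caveat that every field name below is an assumption made for illustration rather than Limerence's exact format:

```python
# One illustrative fragment per category; field names are assumptions.
hint = {            # rules & behavior: a soft default
    "type": "hint",
    "text": "Exclude rows where orders.is_test = TRUE unless test data is requested",
}
guardrail = {       # rules & behavior: a hard restriction
    "type": "guardrail",
    "text": "Never select or display PII columns such as users.email",
}
example = {         # examples & patterns: a worked question-to-SQL pair
    "type": "example",
    "question": "top customers",
    "sql": "SELECT customer_id FROM orders GROUP BY customer_id "
           "ORDER BY SUM(net_amount) DESC LIMIT 10",
}
persona = {         # identity & preferences
    "type": "persona",
    "name": "Analyst",
    "tone": "neutral, concise",
}
```

The point of the type field is behavioral: a guardrail must always apply, a hint can be overridden by the user, and a persona can be replaced but never duplicated.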
The glossary type is the clearest example:

{
  "type": "glossary",
  "entries": {
    "revenue": "SUM(orders.total_amount) WHERE orders.status NOT IN ('cancelled', 'refunded')",
    "active user": "users WHERE last_login > NOW() - INTERVAL '30 days'"
  }
}

That is the bridge between the language people use and the SQL meaning the agent needs.
These fragments reach the agent through two paths: a notebook where you write them by hand, and an onboarding flow where the agent creates them by studying your schema.
Writing Fragments by Hand
This is the explicit path. You know the agent should treat "revenue" as net of refunds, so you write a glossary entry that says exactly that. You know test orders should be excluded by default, so you write a hint. The agent does not have to infer anything — you are telling it directly.
When you save, the system normalizes fragments before writing them to the agent's shared instruction store. Duplicate fragments are skipped by exact match: the canonical key is the fragment's JSON fields sorted and serialized, and two fragments match only if they produce identical keys; semantic similarity is not checked. Persona fragments behave like an upsert, updating the existing identity rather than appending a second one.
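The exact-match rule can be sketched in a few lines: sort the fragment's JSON fields, serialize them, and compare keys, with persona handled as an upsert. Function names here are illustrative, not the real implementation:

```python
import json

def canonical_key(fragment):
    # Sorting keys before serializing means field order never affects identity.
    return json.dumps(fragment, sort_keys=True, separators=(",", ":"))

def save(store, fragment):
    """Append a fragment unless an exact duplicate exists.
    Persona fragments upsert: one identity per agent."""
    if fragment.get("type") == "persona":
        store[:] = [f for f in store if f.get("type") != "persona"]
        store.append(fragment)
        return True
    key = canonical_key(fragment)
    if any(canonical_key(f) == key for f in store):
        return False  # exact duplicate: skipped
    store.append(fragment)
    return True

store = []
a = {"type": "hint", "text": "exclude test orders"}
b = {"text": "exclude test orders", "type": "hint"}  # same fields, different order
c = {"type": "hint", "text": "exclude test orders by default"}  # same meaning, new wording

save(store, a)
print(save(store, b))  # False: identical canonical key
print(save(store, c))  # True: wording differs, so exact match misses it
```

The last line previews the limitation discussed later: a near-duplicate in meaning but not in bytes sails straight through.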
The Agent Studies the Schema First
When you connect a data source and start onboarding, the agent first asks a few questions to establish its identity — name, role, and tone — which become a persona fragment. After that, it shifts into a silent study phase. It scans your schema, identifies the central tables, samples status and type columns, inspects join tables, and looks for domain patterns. It does this without asking you anything else. From that analysis, it writes fragments it is confident about: terms for vocabulary it recognizes, glossary entries for metrics it can infer, quirks for data edge cases it finds, aliases for columns that map to common business language.
What it cannot figure out alone, it asks about. After the silent harvest, the agent presents its vocabulary discoveries for validation, then asks targeted questions about genuine gaps — metric definitions that need human judgment, ambiguous status values, team-specific terminology. Each answer becomes a fragment immediately.
The entire onboarding output lives in a working file until the user explicitly commits it. That commit merges the new fragments with any existing ones, deduplicates, and writes them to the agent's shared instruction store — the same store the notebook writes to.
Notebook path
Human writes fragments directly.
- You pick a fragment type and fill in the fields
- Save normalizes, deduplicates, and persists
- You decide what to teach and when
- Best when you know what the agent should learn
Onboarding path
The agent creates fragments from schema analysis.
- Silent harvest: reads schema, writes confident fragments
- Targeted questions: asks only about gaps it cannot resolve
- Agent proposes, you review and commit
- Best for the initial teaching pass on a new data source
Two Paths, One Prompt
Both paths converge at the same storage layer. Whether a fragment was typed into the notebook or generated during onboarding, it ends up in the same place: the agent's instruction store.
The more interesting convergence is at query time. When a user sends a question, the system builds the agent's prompt. It fetches every stored fragment, converts each one into a context fragment, and feeds them into a scheduling engine that decides which fragments appear in this turn's prompt. Some fragments fire on every turn. Others rotate on a cadence — appearing every few turns instead of every time — so that the agent's accumulated knowledge fits within context limits without dropping anything permanently.
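The cadence behavior can be sketched with simple modular arithmetic. This is an illustration of the rotation idea described above, not Limerence's actual scheduling engine, and the field names are assumptions:

```python
def schedule(fragments, turn):
    """Select which fragment ids enter this turn's prompt."""
    selected = []
    for frag in fragments:
        cadence = frag.get("cadence", 1)  # 1 = fire on every turn
        # Offsets stagger rotating fragments so they take turns in the prompt.
        if turn % cadence == frag.get("offset", 0) % cadence:
            selected.append(frag["id"])
    return selected

fragments = [
    {"id": "guardrail:pii", "cadence": 1},                        # every turn
    {"id": "glossary:revenue", "cadence": 3},                     # every 3rd turn
    {"id": "example:top-customers", "cadence": 3, "offset": 1},   # staggered
]

for turn in range(4):
    print(turn, schedule(fragments, turn))
```

Every fragment still appears on a predictable schedule, so nothing is dropped permanently, while any single turn's prompt stays within context limits.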
At that point, the prompt does not know or care how a fragment was created. A glossary entry authored by hand and a glossary entry harvested from schema analysis are identical once they reach the model. The fragment's origin is irrelevant — only its content matters.
Structure Buys Inspection, Merging, and Reuse
If this were only a prompt-writing feature, the story would stop at "we added a better editor."
But structure buys three concrete things.
You can tell the difference between a term definition, a formula, a soft default, and a guardrail. Each fragment has a type, so reviewing what the agent knows is not a matter of reading a wall of prose — it is filtering by category.
It also creates responsibility. If you teach the wrong rule once and save it, that mistake can travel farther than one answer.
Shared Knowledge
Belongs to the agent. Every team member's questions benefit from it.
Personal Memory
Belongs to one user. Stays scoped to their preferences.
Durable Knowledge Still Needs a Review Gate
The onboarding flow proves the agent can generate knowledge autonomously. It reads the schema, writes fragments, and gets most of them right. But shared knowledge still passes through a review gate before it becomes part of the agent's permanent context.
That gate exists because the blast radius of a wrong fragment is larger than one chat turn. A bad glossary entry does not produce one wrong answer — it produces wrong answers for every team member on every future question that touches that term. Review is not a limitation of the system. It is load-bearing.
Teaching also does not replace schema grounding. Grounding decides what database context the agent can inspect. Teaching decides how the agent should interpret that context in business terms. You need both.
There is another trade-off: business language drifts. Teams rename metrics. Finance changes definitions. A sales org starts using "active account" to mean something different than it did six months ago. Onboarding handles re-runs with delta awareness — it preserves existing fragments and uses correction fragments for updates — but drift detection is still manual.
Exact Match Is Not Enough
The most obvious gap is deduplication. Two glossary entries that define "revenue" in slightly different words are both accepted because the system compares serialized JSON keys, not meaning. Over time, this can inflate the agent's context with near-duplicates that a human reviewer would collapse into one.
Drift detection is manual too. If your finance team redefines "active customer" from 90 days to 60 days, no part of the system notices the existing fragment is stale. Someone has to remember to update it — through the notebook, or by re-running onboarding.
The path forward is tighter feedback: semantic similarity checks during deduplication, drift alerts when usage patterns diverge from stored definitions, and, further out, cross-agent knowledge sharing so two agents connected to the same database do not need to learn the same business language independently.
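As an illustration of what a semantic check might catch, even a crude string-similarity comparison (a real system would more likely compare embeddings) flags near-duplicates that exact keys miss:

```python
from difflib import SequenceMatcher

def near_duplicate(a, b, threshold=0.8):
    """Flag two definitions that are probably the same rule written in
    different words. Character-level similarity is the simplest possible
    stand-in for a real semantic comparison."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

existing = "SUM(orders.total_amount) WHERE orders.status NOT IN ('cancelled', 'refunded')"
incoming = "SUM(orders.total_amount) WHERE orders.status NOT IN ('refunded', 'cancelled')"

# Exact match sees two different strings; similarity sees one rule.
print(existing == incoming)               # False
print(near_duplicate(existing, incoming)) # True
```

A reviewer would then collapse the pair into one fragment rather than letting both accumulate in the agent's context.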
When knowledge fragments are not enough, the agent can pause and ask — here is how structured clarification closes the loop.