Your database schema can tell an agent that a column is named total_amount. (A schema is the structure that defines your tables, columns, and data types; an agent, in Limerence, is an AI assistant connected to your database that turns plain-language questions into SQL queries.) The schema tells the agent the column exists. It cannot tell the agent whether your team means gross revenue or net revenue when they say "revenue." It cannot tell the agent that "active customer" means a purchase in the last 90 days, that test orders should be excluded by default, or that sales says "pipeline" while the database calls the table opportunities.
That missing layer is not model weakness. It is company-specific knowledge that lives in analysts' heads, dashboards, and muscle memory. If you leave it out, the model has to guess.
Typed Fragments, Not Better Prompts
Limerence separates schema grounding (the process of giving the agent awareness of your database structure) from business interpretation. Grounding tells the agent what tables and columns exist. Teaching tells it what those tables and columns mean in your business.
The important design choice is that this knowledge is structured into what we call fragments: typed JSON objects that each capture one piece of business knowledge. A glossary fragment maps a term to its SQL meaning. A hint sets a soft default. A guardrail sets a hard restriction. The type decides how the system stores, deduplicates, and injects it into the agent's prompt at query time. The system is not hoping the model remembers your jargon. It is giving the model a durable interpretation layer.
◆Key Takeaway
This is not "better prompt engineering" in the loose sense. It is a structured knowledge layer that enters the agent through two paths — manual authoring and guided onboarding — and becomes live context at query time.
Schema Stops at Column Names
Schema answers structural questions. It does not answer business ones.
That gap shows up immediately in real teams. Column names are often cryptic. Metric definitions are local. The words people use in meetings rarely match the words they used when someone designed the database three years ago.
Even when the schema is clean, the meaning can still be missing. The same term can fracture across teams: "active user" means 30 days for product and 90 days for sales. No schema captures that split.
- "Revenue" may exclude refunds.
- "Active user" may mean 30 days for one team and 90 days for another.
- "Top customers" may mean count, ARR, gross spend, or net spend.
Those are not syntax problems. They are interpretation problems.
Schema only
The agent sees tables, columns, and types.
- orders.total_amount exists
- users.last_login exists
- orders.is_test exists
- the phrase "active customer" still has no stable meaning
Schema plus business language
The agent gets reusable business interpretation on top of the schema.
- revenue excludes cancelled and refunded orders
- active user means last login within 30 days
- test data is excluded unless explicitly requested
- ambiguous terms can trigger clarification instead of guessing
Without that second layer, every question turns into an improvisation exercise. Sometimes the model guesses right. Sometimes it produces a perfectly valid query for the wrong definition. When it cannot guess, the better move is to stop and ask the user directly.
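Where conflicting definitions have been taught, the lookup itself can surface the ambiguity instead of letting the model guess. A minimal sketch of that behavior; the function name and data shapes here are illustrative assumptions, not Limerence's actual API:

```python
def resolve_term(term, glossary):
    """Return the SQL meaning of a term, or a clarification request
    when teams have taught conflicting definitions."""
    definitions = glossary.get(term, [])
    if not definitions:
        return {"status": "unknown", "term": term}
    if len(definitions) == 1:
        return {"status": "resolved", "sql": definitions[0]["sql"]}
    # More than one stored meaning: stop and ask rather than guess.
    return {
        "status": "ambiguous",
        "question": f"Which definition of '{term}' do you mean?",
        "options": [d["label"] for d in definitions],
    }

glossary = {
    "active user": [
        {"label": "product (30 days)", "sql": "last_login > NOW() - INTERVAL '30 days'"},
        {"label": "sales (90 days)", "sql": "last_login > NOW() - INTERVAL '90 days'"},
    ],
    "revenue": [
        {"label": "net", "sql": "SUM(total_amount) WHERE status NOT IN ('cancelled','refunded')"},
    ],
}

print(resolve_term("revenue", glossary)["status"])      # resolved
print(resolve_term("active user", glossary)["status"])  # ambiguous
```

A single definition resolves silently; a fractured term comes back as a question with the stored options attached.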
Typed Fragments for Different Kinds of Knowledge
The system stores different kinds of knowledge differently. Domain concepts such as terms, glossaries, aliases, explanations, and analogies are separate from rules such as hints (soft defaults like "exclude test data unless asked") and guardrails (hard restrictions like "never expose PII columns"). Worked examples and clarification patterns are stored separately too. That matters because a formula, a soft default, and a safety rule should not behave the same way.
Domain Knowledge
Vocabulary, terminology, and conceptual understanding
"revenue" → SUM excluding cancelled/refunded
Rules & Behavior
Constraints, guidelines, and behavioral patterns
Exclude test orders unless explicitly asked
Examples & Patterns
Concrete examples, clarifications, and disambiguation patterns
"top customers" → SELECT by net spend DESC
Identity & Preferences
Persona, presentation preferences, and corrections
Analyst role, neutral tone, concise answers
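The non-glossary categories can be pictured as similar typed objects. A rough sketch, with the caveat that every field name below is an assumption made for illustration rather than Limerence's exact format:

```python
# One illustrative fragment per category; field names are assumptions.
hint = {            # rules & behavior: a soft default
    "type": "hint",
    "text": "Exclude rows where orders.is_test = TRUE unless test data is requested",
}
guardrail = {       # rules & behavior: a hard restriction
    "type": "guardrail",
    "text": "Never select or display PII columns such as users.email",
}
example = {         # examples & patterns: a worked question-to-SQL pair
    "type": "example",
    "question": "top customers",
    "sql": "SELECT customer_id FROM orders GROUP BY customer_id "
           "ORDER BY SUM(net_amount) DESC LIMIT 10",
}
persona = {         # identity & preferences
    "type": "persona",
    "name": "Analyst",
    "tone": "neutral, concise",
}
```

The point of the type field is behavioral: a guardrail must always apply, a hint can be overridden by the user, and a persona can be replaced but never duplicated.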
The glossary type is the clearest example:

{
  "type": "glossary",
  "entries": {
    "revenue": "SUM(orders.total_amount) WHERE orders.status NOT IN ('cancelled', 'refunded')",
    "active user": "users WHERE last_login > NOW() - INTERVAL '30 days'"
  }
}

That is the bridge between the language people use and the SQL meaning the agent needs.
These fragments reach the agent through two paths: a notebook where you write them by hand, and an onboarding flow where the agent creates them by studying your schema.
Writing Fragments by Hand
This is the explicit path. You know the agent should treat "revenue" as net of refunds, so you write a glossary entry that says exactly that. You know test orders should be excluded by default, so you write a hint. The agent does not have to infer anything — you are telling it directly.
When you save, the system normalizes fragments before writing them to the agent's shared instruction store. Duplicate fragments are skipped by exact match: the canonical key is the fragment's JSON fields sorted and serialized, and two fragments match only if they produce identical keys; semantic similarity is not checked. Persona fragments behave like an upsert, updating the existing identity rather than appending a second one.
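The exact-match rule can be sketched in a few lines: sort the fragment's JSON fields, serialize them, and compare keys, with persona handled as an upsert. Function names here are illustrative, not the real implementation:

```python
import json

def canonical_key(fragment):
    # Sorting keys before serializing means field order never affects identity.
    return json.dumps(fragment, sort_keys=True, separators=(",", ":"))

def save(store, fragment):
    """Append a fragment unless an exact duplicate exists.
    Persona fragments upsert: one identity per agent."""
    if fragment.get("type") == "persona":
        store[:] = [f for f in store if f.get("type") != "persona"]
        store.append(fragment)
        return True
    key = canonical_key(fragment)
    if any(canonical_key(f) == key for f in store):
        return False  # exact duplicate: skipped
    store.append(fragment)
    return True

store = []
a = {"type": "hint", "text": "exclude test orders"}
b = {"text": "exclude test orders", "type": "hint"}  # same fields, different order
c = {"type": "hint", "text": "exclude test orders by default"}  # same meaning, new wording

save(store, a)
print(save(store, b))  # False: identical canonical key
print(save(store, c))  # True: wording differs, so exact match misses it
```

The last line previews the limitation discussed later: a near-duplicate in meaning but not in bytes sails straight through.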
The Agent Studies the Schema First
When you connect a data source and start onboarding, the agent first asks a few questions to establish its identity — name, role, and tone — which become a persona fragment. After that, it shifts into a silent study phase. It scans your schema, identifies the central tables, samples status and type columns, inspects join tables, and looks for domain patterns. It does this without asking you anything else. From that analysis, it writes fragments it is confident about: terms for vocabulary it recognizes, glossary entries for metrics it can infer, quirks for data edge cases it finds, aliases for columns that map to common business language.
What it cannot figure out alone, it asks about. After the silent harvest, the agent presents its vocabulary discoveries for validation, then asks targeted questions about genuine gaps — metric definitions that need human judgment, ambiguous status values, team-specific terminology. Each answer becomes a fragment immediately.
The entire onboarding output lives in a working file until the user explicitly commits it. That commit merges the new fragments with any existing ones, deduplicates, and writes them to the agent's shared instruction store — the same store the notebook writes to.
Notebook path
Human writes fragments directly.
- You pick a fragment type and fill in the fields
- Save normalizes, deduplicates, and persists
- You decide what to teach and when
- Best when you know what the agent should learn
Onboarding path
The agent creates fragments from schema analysis.
- Silent harvest: reads schema, writes confident fragments
- Targeted questions: asks only about gaps it cannot resolve
- Agent proposes, you review and commit
- Best for the initial teaching pass on a new data source
Two Paths, One Prompt
Both paths converge at the same storage layer. Whether a fragment was typed into the notebook or generated during onboarding, it ends up in the same place: the agent's instruction store.
The more interesting convergence is at query time. When a user sends a question, the system builds the agent's prompt. It fetches every stored fragment, converts each one into a context fragment, and feeds them into a scheduling engine that decides which fragments appear in this turn's prompt. Some fragments fire on every turn. Others rotate on a cadence — appearing every few turns instead of every time — so that the agent's accumulated knowledge fits within context limits without dropping anything permanently.
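The cadence behavior can be sketched with simple modular arithmetic. This is an illustration of the rotation idea described above, not Limerence's actual scheduling engine, and the field names are assumptions:

```python
def schedule(fragments, turn):
    """Select which fragment ids enter this turn's prompt."""
    selected = []
    for frag in fragments:
        cadence = frag.get("cadence", 1)  # 1 = fire on every turn
        # Offsets stagger rotating fragments so they take turns in the prompt.
        if turn % cadence == frag.get("offset", 0) % cadence:
            selected.append(frag["id"])
    return selected

fragments = [
    {"id": "guardrail:pii", "cadence": 1},                        # every turn
    {"id": "glossary:revenue", "cadence": 3},                     # every 3rd turn
    {"id": "example:top-customers", "cadence": 3, "offset": 1},   # staggered
]

for turn in range(4):
    print(turn, schedule(fragments, turn))
```

Every fragment still appears on a predictable schedule, so nothing is dropped permanently, while any single turn's prompt stays within context limits.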
At that point, the prompt does not know or care how a fragment was created. A glossary entry authored by hand and a glossary entry harvested from schema analysis are identical once they reach the model. The fragment's origin is irrelevant — only its content matters.
Structure Buys Inspection, Merging, and Reuse
If this were only a prompt-writing feature, the story would stop at "we added a better editor."
But structure buys three concrete things.
You can tell the difference between a term definition, a formula, a soft default, and a guardrail. Each fragment has a type, so reviewing what the agent knows is not a matter of reading a wall of prose — it is filtering by category.
It also creates responsibility. If you teach the wrong rule once and save it, that mistake can travel farther than one answer.
Shared Knowledge
Belongs to the agent. Every team member's questions benefit from it.
Personal Memory
Belongs to one user. Stays scoped to their preferences.
Durable Knowledge Still Needs a Review Gate
The onboarding flow proves the agent can generate knowledge autonomously. It reads the schema, writes fragments, and gets most of them right. But shared knowledge still passes through a review gate before it becomes part of the agent's permanent context.
That gate exists because the blast radius of a wrong fragment is larger than one chat turn. A bad glossary entry does not produce one wrong answer — it produces wrong answers for every team member on every future question that touches that term. Review is not a limitation of the system. It is load-bearing.
Teaching also does not replace schema grounding. Grounding decides what database context the agent can inspect. Teaching decides how the agent should interpret that context in business terms. You need both.
There is another trade-off: business language drifts. Teams rename metrics. Finance changes definitions. A sales org starts using "active account" to mean something different than it did six months ago. Onboarding handles re-runs with delta awareness — it preserves existing fragments and uses correction fragments for updates — but drift detection is still manual.
Exact Match Is Not Enough
The most obvious gap is deduplication. Two glossary entries that define "revenue" in slightly different words are both accepted because the system compares serialized JSON keys, not meaning. Over time, this can inflate the agent's context with near-duplicates that a human reviewer would collapse into one.
Drift detection is manual too. If your finance team redefines "active customer" from 90 days to 60 days, no part of the system notices the existing fragment is stale. Someone has to remember to update it — through the notebook, or by re-running onboarding.
The path forward is tighter feedback: semantic similarity checks during deduplication, drift alerts when usage patterns diverge from stored definitions, and, further out, cross-agent knowledge sharing so two agents connected to the same database do not need to learn the same business language independently.
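As an illustration of what a semantic check might catch, even a crude string-similarity comparison (a real system would more likely compare embeddings) flags near-duplicates that exact keys miss:

```python
from difflib import SequenceMatcher

def near_duplicate(a, b, threshold=0.8):
    """Flag two definitions that are probably the same rule written in
    different words. Character-level similarity is the simplest possible
    stand-in for a real semantic comparison."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

existing = "SUM(orders.total_amount) WHERE orders.status NOT IN ('cancelled', 'refunded')"
incoming = "SUM(orders.total_amount) WHERE orders.status NOT IN ('refunded', 'cancelled')"

# Exact match sees two different strings; similarity sees one rule.
print(existing == incoming)               # False
print(near_duplicate(existing, incoming)) # True
```

A reviewer would then collapse the pair into one fragment rather than letting both accumulate in the agent's context.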
When knowledge fragments are not enough, the agent can pause and ask — here is how structured clarification closes the loop.