One Adapter Layer, Seven Data Source Types

Supporting many databases sounds like a connector problem. In practice, it is a contract problem. The hard part is not counting how many logos you can list on a pricing page. The hard part is making one SQL agent behave predictably across six direct database adapters while also handling spreadsheet uploads through a SQLite bridge.A preprocessor is a function that rewrites or rejects a query before it hits the database. Every adapter chains them in order before executing anything.

Without a shared lifecycle, every feature that touches a database starts accumulating per-engine branches. Connection management, validation, read-only enforcement, retry logic — each one gets reimplemented in seven unrelated places. The product moves fast at first and then stops moving because every change touches seven files.

Limerence solves that by centering everything on one shared adapter lifecycle. A single factory dispatches to the correct adapter, and from there the base class owns the pipeline: preprocess the query, execute or validate it, and optionally recover from known failure shapes. The subclasses fill in how connections work, how validation works, and which failures are repairable. The base class decides what the lifecycle even is.

◆Key Takeaway

The adapter pattern here is not seven independent connectors. It is one shared lifecycle contract, six direct engine adapters, and one spreadsheet-to-SQLite bridge.

One Factory Chooses the Dialect

The first important choice is having one dispatch point. The backend looks at the data source type once, chooses a concrete adapter, and hands the rest of the lifecycle to that object.

That sounds small, but it changes the architecture. Without a single dispatch point, every feature that touches a database starts accumulating if postgres, if mysql, if bigquery branches of its own. With a factory, those branches stay concentrated at the edge.

switch (dataSource.type) {
  case 'POSTGRESQL':
    return new PostgresAdapter(dataSource);
  case 'MYSQL':
    return new MysqlAdapter(dataSource);
  case 'MARIADB':
    return new MariadbAdapter(dataSource);
  case 'SQLSERVER':
    return new SqlServerAdapter(dataSource);
  case 'SQLITE':
    return new SqliteAdapter(dataSource);
  case 'BIGQUERY':
    return new BigQueryAdapter(dataSource);
  case 'SPREADSHEET':
    return new SqliteAdapter({
      ...dataSource,
      filePath: dataSource.generatedDbPath,
    });
}

switch (dataSource.type) {
  case 'POSTGRESQL':
    return new PostgresAdapter(dataSource);
  case 'MYSQL':
    return new MysqlAdapter(dataSource);
  case 'MARIADB':
    return new MariadbAdapter(dataSource);
  case 'SQLSERVER':
    return new SqlServerAdapter(dataSource);
  case 'SQLITE':
    return new SqliteAdapter(dataSource);
  case 'BIGQUERY':
    return new BigQueryAdapter(dataSource);
  case 'SPREADSHEET':
    return new SqliteAdapter({
      ...dataSource,
      filePath: dataSource.generatedDbPath,
    });
}

That last branch is the honest caveat. Spreadsheet support exists, but not as a native seventh query engine. Uploading a spreadsheet first materializes a SQLite database from the file, and runtime queries then reuse the SQLite adapter path. The conversion happens once at upload time. After that, the spreadsheet is indistinguishable from any other SQLite data source.

The Base Class Owns the Query Lifecycle

Once the factory chooses an adapter, the rest of the flow is uniform. The abstract base class owns the steps that should work the same way everywhere. Each concrete adapter supplies its own connection logic, its own validation method, and its own text-to-SQL grounding adapter. But the base class decides the order of operations and where recovery happens.

That shared lifecycle matters more than the list of subclasses. It is the reason new features do not need to relearn how every engine behaves. They enter one contract.

1
Choose the concrete adapter from the data source type.
2
Preprocess the query through the adapter's preprocessor chain before any engine call happens.
3
Validate or execute with dialect-specific logic and connection handling.
4
Recover narrowly — only when the adapter exposes a recoverer that can repair a known failure shape, and only for that specific shape.

The important part is what is not duplicated. Read-only enforcement, retry wiring, and the high-level execution contract live in the base class. The subclasses fill in the engine-specific pieces — how to open connections, how to run validation, which failure shapes are repairable at all — but the lifecycle itself is defined once.

Recovery deserves specific attention. The base class iterates through an adapter's recoverers in order. The first one that returns a recovery action wins. If the recovery action specifies retries, pRetry handles them with exponential backoff. In practice, SQLite's recoverer sets retries to zero — it rewrites the query and attempts it exactly once. If the rewrite fails, a RepairExhausted error wraps the original cause so the caller sees both what went wrong and what was attempted.

Only two adapters opt into recovery at all. SQLite installs a recoverer that detects column alias errors, malformed JSON, and syntax errors, then calls an external repair function to rewrite the query. BigQuery installs a cross-project recoverer that handles access-denied and not-found errors by remapping dataset references to the correct project. Every other adapter fails fast.

Shared Contract, Dialect-Specific Edges

The cleanest way to understand the pattern is to separate the code that stays centralized from the code that intentionally does not.

Shared in the base class

The base adapter owns the stable lifecycle.

preprocess before validate and execute
run recoverers only when the adapter opts in
retry repaired queries through the same preprocessing path
expose common helpers for table and view grounding

Implemented per adapter

Each adapter owns the engine edge.

how to open and close connections
how validation works for that backend
which text-to-SQL adapter to instantiate
which failure shapes are repairable at all

That split keeps the abstraction honest. The base class is not pretending SQL Server and BigQuery validate queries the same way. SQL Server uses SET PARSEONLY ON inside a transaction, explicitly resetting it in a finally block to avoid poisoning the connection. BigQuery uses a dry run on createQueryJob. SQLite runs EXPLAIN on a prepared statement. PostgreSQL, MySQL, and MariaDB lean on EXPLAIN through pooled connections. MariaDB is worth calling out specifically: it uses the MySQL driver and MySQL text-to-SQL adapter with no MariaDB-specific dialect handling. That works for common SQL but does not account for MariaDB syntax extensions.

Connection management is equally varied. PostgreSQL, MySQL, MariaDB, and SQL Server use connection pools capped at 10 with 30-second idle timeouts. SQL Server has its own pool manager with a race-condition guard that prevents duplicate simultaneous pool creation — a detail the other pool managers skip. SQLite opens a fresh database handle per query in immutable read-only mode. BigQuery creates a client per adapter instance.

That is the real value of the pattern. It does not erase dialect behavior. It contains it.

Read-Only Lives Inside the Adapter Contract

The strongest invariant in this layer is also the most important one: user-submitted SQL is forced through read-only validation before it reaches any database. Every concrete adapter installs validateReadOnly as the first preprocessor in its chain. The guard catches non-SELECT statements, though it applies to the first statement only — multi-statement inputs are not yet blocked, which is a gap covered in the missing guards section below.

That design is still stronger than relying on conventions in calling code. If read-only were a courtesy handled by whichever route happened to submit the query, every new feature would be another chance to forget it. By attaching the guard to the adapter lifecycle itself, the rule travels with the connection logic.

The implementation goes deeper than the preprocessor for several engines. PostgreSQL sets default_transaction_read_only = on as a connection-level hook, applied to every new pool connection. MySQL and MariaDB set transaction_read_only = ON at the session level per query execution. SQLite opens the database file with ?mode=ro&immutable=1 URL parameters, making write attempts fail at the filesystem level.PostgreSQL's connection hook is logged rather than treated as fatal if it fails. This means part of the read-only story on Postgres depends on operational discipline — the preprocessor guard remains, but the connection-level belt is advisory.

The architectural point holds regardless: the query contract begins with "reject unsafe SQL first," not with "hope the caller remembered."

BigQuery and Spreadsheets Are the Honest Exceptions

The adapter pattern would be far less credible if it claimed full symmetry. The code does not support that story, and the real engineering story is better anyway.

BigQuery is the clearest asymmetric case. It does not use pooled SQL connections. Its validation path is a dry run rather than EXPLAIN or PARSEONLY. Its repair logic only handles narrow access-denied and not-found errors for tables and datasets, remapping them to the correct project using a cross-project map derived from the data source's dataset configuration.

On the grounding side, BigQuery exposes less than the relational engines. The config interface supports columnStats and columnValues, but BigQuery never adds them to its grounding array. The agent runtime gets info, tables, and rowCount for BigQuery — not the full column-level statistics that PostgreSQL, MySQL, and the others provide.

Spreadsheets are the other exception. They appear in the product as a first-class data source type, but they become SQLite before any query executes. Uploading a spreadsheet triggers a conversion step that materializes a SQLite database, stores the file path, and marks the data source as connected. From that point on, the system reuses the SQLite adapter entirely.

That bridge creates a concrete failure window. If the generated SQLite file is missing — deleted after upload, moved, or not cleaned up properly — the adapter has no database to open. There is no pre-flight check that the file exists; the error surfaces only when DatabaseSync fails at construction time.

The abstraction works because it has room for these exceptions. A good adapter layer does not hide where the edges are rough. It gives those edges a stable place to live.

One Adapter Contract, Three Product Surfaces

The adapter layer is not only about executing SQL. The same contract is reused by the agent runtime, dataset validation, and schema introspection through createText2SqlAdapter. For each data source type, the concrete adapter wires its own text-to-SQL grounding adapter with the capabilities that engine actually supports.

That is a subtle but important design choice. The product is not promising that every backend supports identical schema introspection. It is promising that every backend enters the same agent pipeline and advertises the capabilities it actually has. BigQuery advertises table names and row counts but not column-level statistics. Spreadsheets inherit the SQLite adapter but their grounding is limited to the tables and schema information materialized at upload time.

In other words, the shared layer gives Limerence one runtime story across multiple product surfaces without forcing one fake capability matrix. The contract is "same lifecycle, explicit limits," not "same metadata everywhere."

The Shared Layer Still Has Missing Guards

A strong abstraction does not end the work. It usually makes the unfinished parts easier to see.

Several preprocessors are planned but not yet implemented. A table allow-list filter would reject queries that reference tables outside the grounding set — right now, a model can hallucinate a table name and the query will reach the database. A multi-statement blocker would prevent queries like SELECT 1; DROP TABLE x from bypassing the read-only check on the second statement. A SELECT * guard would block unbounded result sets. A cost estimator would flag expensive queries before execution, which matters most for BigQuery where cost is monetary.

Beyond preprocessors, the quoteIdentifiers step only runs on PostgreSQL. Other engines do not quote identifiers, which means reserved keywords in column or table names can cause unexpected failures on MySQL, MariaDB, SQL Server, and BigQuery.

Those are not reasons to distrust the adapter pattern. They are evidence that the pattern is doing its job. Once the lifecycle is centralized, the missing protections stop being scattered mysteries. They become concrete items in one shared layer, with a clear place to add them and a clear pipeline that will carry them across every backend.

That is the real payoff of this design. Supporting seven data source types did not produce seven separate systems that happen to live in one repository. It produced one contract, six direct adapters, one spreadsheet bridge, and a deliberately asymmetric BigQuery path — enough centralization to keep the product coherent and enough honesty to make the gaps visible instead of scattered.