First, if you want reliable answers about your business, the model has to see your business. That starts with retrieval-augmented generation (RAG) that feeds the model the right slices of data and metadata—DDL, schema diagrams, dbt models, even a few representative row samples—before it answers. For text-to-SQL specifically, include table and column descriptions, lineage notes, and known join keys. Retrieval should draw on governed sources (catalogs, metric stores, lineage graphs), not just a vector soup of PDFs. Spider 2.0's results make a simple point: when models face unfamiliar schemas, they guess. The job, then, is to reduce that unfamiliarity.
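As a concrete sketch of what that retrieval step might look like: the `embed()` helper below is a toy stand-in for a real embedding model, and the two schema snippets stand in for DDL, dbt docs, and lineage notes pulled from a governed catalog. The shape is the point, not the specifics: rank governed schema slices by similarity to the question, then prepend the top hits to the prompt.

```python
# Toy schema-aware retrieval for text-to-SQL. embed() stands in for a real
# embedding model; SCHEMA_DOCS stands in for a governed catalog.
import math

def embed(text: str) -> list[float]:
    """Stand-in embedding: hash characters into a small normalized vector."""
    vec = [0.0] * 64
    for i, ch in enumerate(text.lower()):
        vec[(i * 31 + ord(ch)) % 64] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Governed context: DDL plus column descriptions, join keys, and a sample row.
SCHEMA_DOCS = [
    "TABLE orders (id BIGINT, user_id BIGINT, total NUMERIC, created_at DATE)"
    " -- join key: orders.user_id = users.id; sample row: (1, 42, 19.99, 2024-05-01)",
    "TABLE users (id BIGINT, email TEXT, signup_date DATE)"
    " -- 'active user' = at least one order in the trailing 30 days",
]
DOC_VECS = [embed(doc) for doc in SCHEMA_DOCS]

def build_prompt(question: str, k: int = 2) -> str:
    """Rank schema slices by similarity to the question; prepend the top k."""
    qv = embed(question)
    ranked = sorted(zip(SCHEMA_DOCS, DOC_VECS), key=lambda p: -cosine(qv, p[1]))
    context = "\n".join(doc for doc, _ in ranked[:k])
    return f"Schema context:\n{context}\n\nQuestion: {question}\nSQL:"

print(build_prompt("How many active users placed orders in Q2?"))
```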
Second, most AI apps are amnesiacs: they start fresh on each request, unaware of what came before. The fix is layered memory (working, long-term, and episodic). The heart of that memory is the database. Databases, especially ones that can store embeddings, metadata, and event logs, are becoming critical to AI's "mind." Memory elevates the model from pattern-matching to context-carrying.
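Here is one way the three layers might map onto a single store. This is a hedged sketch, assuming SQLite for the episodic event log and reusing the toy `embed()` and `cosine()` helpers from the retrieval sketch above for long-term recall; a production system would use a vector-capable database for that layer.

```python
# A minimal sketch of layered memory over one database. Assumes SQLite for
# the episodic event log; reuses the toy embed()/cosine() helpers from the
# retrieval sketch for long-term recall.
import sqlite3
import time

class Memory:
    def __init__(self, db_path: str = ":memory:"):
        self.working = []                    # working memory: recent turns
        self.db = sqlite3.connect(db_path)   # episodic memory: event log
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events (ts REAL, kind TEXT, body TEXT)")
        self.long_term = []                  # long-term memory: (vector, text)

    def observe(self, kind: str, body: str) -> None:
        """Record an event in all three layers."""
        self.working.append((kind, body))
        self.working = self.working[-10:]    # keep a short rolling window
        self.db.execute("INSERT INTO events VALUES (?, ?, ?)",
                        (time.time(), kind, body))
        self.db.commit()
        self.long_term.append((embed(body), body))

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Semantic recall: nearest long-term memories to the query."""
        qv = embed(query)
        ranked = sorted(self.long_term, key=lambda p: -cosine(qv, p[0]))
        return [text for _, text in ranked[:k]]

mem = Memory()
mem.observe("user", "Our fiscal Q2 runs April through June.")
print(mem.recall("When does Q2 start?"))
```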
Third, free-form text invites ambiguity; structured interfaces reduce it. For text-to-SQL, consider emitting an abstract syntax tree (AST) or a restricted SQL dialect that your execution layer validates and expands. Snap queries to known dimensions and measures in your semantic layer. Use function/tool calling—not just prose—so the model asks for get_metric('active_users', date_range='Q2') rather than guessing table names. The more you treat the model like a planner using reliable building blocks, the less it hallucinates.
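A sketch of that validate-and-expand step, using the get_metric call from the text: the metric and date-range registries here are made-up stand-ins for a real semantic layer, not any particular product's API.

```python
# A minimal sketch of snapping a model's tool call to a semantic layer.
# get_metric is the call named in the text; KNOWN_METRICS and KNOWN_RANGES
# are illustrative registries standing in for a real metric store.
KNOWN_METRICS = {"active_users": "COUNT(DISTINCT user_id)"}
KNOWN_RANGES = {"Q2": ("2024-04-01", "2024-06-30")}

def get_metric(name: str, date_range: str) -> str:
    """Validate the call against governed definitions, then expand to SQL."""
    if name not in KNOWN_METRICS:
        raise ValueError(f"unknown metric: {name}")        # reject, don't guess
    if date_range not in KNOWN_RANGES:
        raise ValueError(f"unknown date range: {date_range}")
    start, end = KNOWN_RANGES[date_range]
    # Safe to interpolate: both values come from the registry, not the model.
    return (f"SELECT {KNOWN_METRICS[name]} FROM events "
            f"WHERE event_date BETWEEN '{start}' AND '{end}'")

print(get_metric("active_users", date_range="Q2"))
```

Because every call must resolve against governed definitions, an unknown metric or date range fails loudly instead of becoming a plausible-looking wrong query.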