Databricks fires back at Snowflake with SQL-based AI document parsing

According to analysts, Databricks’ and Snowflake’s offerings would help enterprises reduce the complexity of the workflows required to analyze unstructured data, especially documents.

Historically, enterprises have had to build complex, slow, and brittle OCR pipelines to bring data from documents, such as PDFs, into an AI workflow, said Bradley Shimmin, practice lead of data, analytics, and infrastructure at The Futurum Group. Those efforts culminated in retrieval-augmented generation (RAG), which enabled semantic search over parsed text but still struggled with nuanced document structures such as tables, he said.

To handle documents with tables, enterprises often chained additional LLM calls to extract and reconstruct the tables as JSON, an approach that was effective but risky because of hallucinations, Shimmin said. Instead of stitching together OCR, RAG, and custom extraction logic, he added, Databricks’ ai_parse collapses the entire workflow into a single declarative SQL statement.