Large language models (LLMs) have been heavily promoted as productivity tools for enterprises in a range of use cases: customer chatbots, content creation for marketing material, and enabling employees to easily access and summarise documents like product documentation or manuals. Other use cases include the automation of business processes, such as the processing of contracts or RFPs, and there are further applications within compliance and audit. This all makes sense, since LLMs are just that: language models, designed for processing text (and other media like images, audio and video). Many vendors have also been plugging LLMs for use cases that involve structured data, such as decision support and analysis, and software code generation and debugging. Leaving code generation aside for a moment, let’s consider the issues an LLM faces in accessing and integrating with existing corporate systems.
An LLM like ChatGPT or Claude is trained on a vast amount of data, mostly textual, from which it learns to understand and produce language, whether English, Spanish or any of the other languages on which it has been trained. An LLM does not, by default, have any understanding of the systems, data and policies of your particular company, so it somehow needs to be briefed on that information. One of the first approaches was a technique called “retrieval augmented generation” (RAG). In this approach, company documents like policy manuals and technical documentation are converted into a form that an LLM can search, typically embeddings stored in a vector database, and relevant passages are retrieved at query time and supplied to the model alongside the user’s question. Provided that the data is of decent quality and is accurate, the LLM can then combine its existing knowledge with the company-specific information, for example to answer a customer query about the company policy on order returns. However, this assumes that the corporate information is in a form like a text or image file, which can be readily converted to a vector format that the LLM can grasp. RAG is less suited to structured data unless it is set up to operate on descriptions or metadata rather than on rows of raw data. So what about structured data, like an accounts receivable database or a materials management system?
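The RAG pattern above can be sketched in a few lines. This is a minimal illustration, not a production design: the policy documents are invented, and a toy bag-of-words similarity stands in for the learned embeddings and vector database a real system would use. The retrieved passage is simply prepended to the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter

# Invented stand-ins for company policy documents.
documents = [
    "Returns policy: customers may return unused goods within 30 days.",
    "Shipping policy: standard delivery takes 3-5 business days.",
    "Warranty: all products carry a 12-month manufacturer warranty.",
]

def vectorise(text):
    # Toy bag-of-words vector; real RAG uses learned embeddings.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query; keep the top k.
    q = vectorise(query)
    return sorted(docs, key=lambda d: cosine(q, vectorise(d)), reverse=True)[:k]

def build_prompt(query, docs):
    # Retrieved text is injected into the prompt, not into training data.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the policy on order returns?", documents))
```

The key point the sketch shows is that retrieval supplements each individual prompt at query time; the model itself is never retrained on the company documents.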
An LLM has no inherent understanding of the relationships that govern structured data in the world of corporate databases. Databases have tables and columns, and relationships between them. For example, there may be a customer table with columns like customer name and customer address, a separate product table with columns like product name and category, and an orders table connecting the products ordered by each customer. The corporate world is full of systems like this, sometimes of considerable complexity. A multinational company may have hundreds or even thousands of separate applications, of which its ERP system is usually the largest, typically based on a package like SAP (or a competitor such as Oracle, Microsoft Dynamics, Workday or Infor).
The SAP application has a huge number of tables. The exact number varies by instance, but 50,000 tables is common, and I have seen over 120,000 tables in a single SAP instance. These tables are littered with opaque names: the customer master table is “KNA1”, for example, products are held in a table called “MARA”, and customer orders are in tables like “VBAK” and “VBAP”. SAP has a catalog where all this is documented, including the relationships between these tables and their columns, but an LLM needs somehow to be made aware of this database schema (the formal structure of the database) and the semantics that interpret the data within it. For example, consider order dates. In SAP, the AUDAT field within the VBAK table represents the date an order was placed, while the BSTDK column holds the date from the customer’s purchase order. Those are the schema representations, and they are hardly intuitive to most people. Separate semantics explain, for instance, that a field represents the date that an order is placed by a customer. In the case of SAP, such semantics are held in the data dictionary, which contains metadata about tables, fields, domain descriptions, labels, value ranges and data definitions. The dictionary, with its semantics, acts as a kind of map of the database schema.
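The schema-versus-semantics split can be sketched as a simple lookup. The table and field names (VBAK, AUDAT, BSTDK) are the ones discussed above; the dictionary structure below is purely illustrative and is not SAP's actual data dictionary format, which is far richer.

```python
# Illustrative mapping from opaque physical names to business meaning.
# Field names follow the SAP examples in the text; descriptions are
# paraphrased, and the dict format is an assumption for this sketch.
data_dictionary = {
    ("VBAK", "AUDAT"): "Date on which the customer placed the order",
    ("VBAK", "BSTDK"): "Date on the customer's purchase order document",
}

def describe(table, field):
    """Translate an opaque schema name into its business meaning."""
    return data_dictionary.get((table, field), "No semantics recorded")

print(describe("VBAK", "AUDAT"))
```

An LLM given only the raw schema sees “AUDAT”; it is the dictionary layer that tells it (or a human) what the field actually means.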
So, how well do you think your LLM is going to navigate those 120,000 tables, the opaque schema table and column names, and the entirely separate dictionary? By default, not well. It needs help in the form of some sort of semantic layer, such as a knowledge graph, which uses business terminology and hides the physical structure of obscure table and column names. Corporations can build such a semantic layer by adding database views, and in the case of SAP, the vendor itself has built some tools to help: the SAP HANA Cloud vector engine transforms organisational data into AI-processable formats that offer relevant business context for LLMs. I have gone into this level of detail not to pick on SAP, who are actually taking steps to improve the situation, but to illustrate the complexity of navigating a large corporate system. Similar issues occur with other large applications, whether packages or systems built in-house. For an LLM to make sense of them, a semantic layer needs to be built. Even with a semantic layer, though, there is a further problem: the context window.
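The database-view approach to a semantic layer can be shown in miniature. This sketch uses SQLite for portability; the physical names (VBAK, AUDAT, BSTDK, VBELN) follow the SAP examples above, while the view name, the friendly column names and the data are all invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE VBAK (VBELN TEXT, AUDAT TEXT, BSTDK TEXT);
INSERT INTO VBAK VALUES ('0000012345', '2024-03-01', '2024-02-27');

-- The semantic layer: a view exposing business-friendly names
-- while hiding the opaque physical structure underneath.
CREATE VIEW sales_orders AS
SELECT VBELN AS sales_document_number,
       AUDAT AS order_date,
       BSTDK AS customer_po_date
FROM VBAK;
""")

row = con.execute("SELECT order_date FROM sales_orders").fetchone()
print(row[0])  # 2024-03-01
```

A query against `sales_orders.order_date` is something an LLM (or an analyst) has a fighting chance of generating correctly; a query against `VBAK.AUDAT` is not.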
An LLM has no inherent memory between sessions. Its short-term memory is the “context window”, which defines how much information it can “see” when generating a response. Context windows vary by model and are finite: some are 8,000 tokens in size, others 32,000 (a token is the unit of text that LLMs process, roughly three-quarters of an English word). However, you would need an immense context window to take into account something like the full SAP dictionary, with its tens of thousands of tables, each of which may have many columns. Around 50 columns or more is normal for a typical business table; the customer master table KNA1 in SAP has 246 columns. Even the Claude Sonnet 4 context window of a million tokens would be inadequate for the whole SAP dictionary, which would take up tens of millions of tokens.
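A back-of-envelope calculation makes the mismatch concrete. The table and column counts come from the figures above; the tokens-per-column figure is an assumption for illustration, since the real cost depends on the instance and the tokenizer.

```python
# Rough sizing of a full SAP dictionary against a context window.
tables = 50_000            # a common instance size, per the text above
columns_per_table = 50     # typical for a business table
tokens_per_column = 20     # assumed: name, label, type, short description

total_tokens = tables * columns_per_table * tokens_per_column
context_window = 1_000_000  # a million-token context window

print(f"{total_tokens:,} tokens needed")        # 50,000,000 tokens needed
print(total_tokens / context_window)            # 50.0 windows' worth
```

Even with generous rounding, the dictionary alone is tens of millions of tokens, dozens of times larger than the biggest context windows currently on offer.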
In practice, there are various shortcuts that can be taken. The RAG approach means that an LLM would query a vector database holding the SAP semantic information, retrieving only what is needed to include in each prompt; however, even that vector database would hold millions of tokens of source material. You can also split the SAP information into more manageable chunks and summarise them, so that the LLM works from the summaries. You can augment session memory by storing long-term context externally and re-injecting it across user sessions. All these tricks are possible, but each requires careful setup and management. Remember that ERP is just one application amongst hundreds in an enterprise, albeit usually the largest one. Now you can see the scale of the problem if someone asks you to “just attach an LLM to the corporate databases for analytics”. Incidentally, the above concerns only navigating corporate databases; the actual quality of the data inside these systems is another matter entirely, and one that has plagued corporations for decades.
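The external-memory trick mentioned above can be sketched as follows. The storage format, file location and the example memory note are all assumptions for illustration; real systems typically have the LLM itself produce the summary that gets persisted.

```python
import json
import tempfile
from pathlib import Path

# Assumed storage location and format for this sketch only.
MEMORY_FILE = Path(tempfile.gettempdir()) / "session_memory.json"

def save_memory(notes):
    # Persist key facts at the end of a session.
    MEMORY_FILE.write_text(json.dumps(notes))

def load_memory():
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def build_prompt(user_query):
    # Re-inject stored context: the model itself remembers nothing.
    memory = "\n".join(load_memory())
    return f"Known context from earlier sessions:\n{memory}\n\nUser: {user_query}"

# Session 1 ends: persist what mattered (an invented example note).
save_memory(["User analyses accounts receivable in the EU subsidiary."])
# Session 2 begins: the stored context is prepended to the new prompt.
print(build_prompt("Show overdue invoices."))
```

Note that the "memory" never lives in the model; it is ordinary application state that the surrounding software must capture, store and re-supply on every session, which is exactly the setup and management burden the paragraph describes.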
The sheer complexity of corporate databases and the inherent textual nature of LLMs are major barriers to corporations trying to use LLMs to access structured data systems. There are plenty of other issues too, such as LLM hallucinations, security weaknesses and the lack of explainability of LLM decisions. This is quite apart from the unpredictable, probabilistic nature of LLMs in a corporate world used to predictability and reliability from their computer systems. Even if none of these issues existed, the complexity of navigating the structured data landscape is a major barrier to widespread LLM rollout. These issues may help explain why the failure rate of AI projects in corporations was 95% in 2025, according to MIT.
Over time, things will improve. Vendors will build semantic layers and knowledge graphs on top of their products that LLMs can comprehend. Context windows will continue to expand in size. However, the other issues mentioned above that come with LLMs (lack of explainability, hallucinations, security flaws) are not easily fixed. Indeed, in some cases they may never be fully addressed. Corporations need to get better at selecting use cases that are well-suited to the strengths of LLMs rather than seeing AI as the solution to every problem. In many cases, they will be better off with a machine learning model, some other strand of AI or indeed no AI at all. Integration of AI with existing corporate systems is a thorny issue that is not discussed as much as more glamorous areas of AI, but tackling some of the hard problems examined in this article will go a long way toward determining how much AI will truly be embraced by the corporate world.