It is now almost three years since ChatGPT was unleashed on an unsuspecting world. Two months later, it had 100 million users, making it the fastest-adopted consumer application in history. A torrent of money was thrown at AI, from venture capital firms backing AI start-ups (a third of all venture capital went into AI in 2024, and 53% in the first half of 2025) to corporates exploring how best to take advantage of the new technology. So, how many of those corporate AI projects actually achieved the productivity advantages that they promised?
In July 2025, a major MIT study, based on a review of over 300 AI projects and interviews with 153 executives and 300 corporate employees, explored this subject in depth. When it was released in August, its findings made headlines in major media outlets as well as the technology press: no fewer than 95% of these projects failed to deliver any return on investment at all. The day after the report was published, NVIDIA stock dropped 3.5% and the NASDAQ index fell over 1%. The report was damning, but there was interesting detail below the headline. Just 5% of custom-built enterprise tools even made it into production, compared to 40% of general-purpose LLMs. Purchased AI solutions were three times more likely to succeed than ones built in-house. Just two industries (technology and media) showed any signs of structural disruption. Large enterprises (over $100 million in annual revenue) took nine months or more to get their systems into production, compared to a 90-day average for smaller companies. Interviews with customers found that the biggest obstacles to success were model quality, legal issues, data quality and risk. Over half of the funding was allocated to sales and marketing applications. A majority of respondents said they would trust AI with simple tasks such as generating emails or summarising documents, but only 10% would trust it with complex projects.

Now, a 95% failure rate is hardly great, but a lot of IT projects of all kinds fail: a 2020 Standish Group study found a 66% failure rate for software projects, a figure roughly in line with several other studies of IT project success and failure.
The MIT AI study has been widely quoted, but it is not wildly out of line with others that have been done: a Rand study in 2024, for example, found an 80% failure rate for AI projects. What sets the MIT study apart is that it was large and thorough, and that it was carried out in mid-2025 rather than early in the adoption phase of generative AI. It is therefore hard to blame the high failure rate on the teething troubles that early-adopter projects may have encountered in 2023. Generative AI has been widely available for almost three years, so you would expect lessons to have been learnt and a reasonable penetration of AI skills, training and experience. The MIT study itself examined the causes of failure in its interviews, identifying several contributing factors. One was the difficulty of integrating AI with existing applications; another was that many projects were aimed at sales and marketing, where returns were relatively low compared to the automation of back-office operations. Survey respondents reported that models struggled to adapt to highly specific corporate needs, while larger companies in particular had business processes that made rapid adoption hard. Other studies of AI project failure have also cited overestimation of AI capabilities, low-quality corporate data, unclear objectives, and privacy and ethical concerns. A recurring concern is the rate of hallucination by large language models, which undermines confidence in the technology. Hallucinations may not matter in some use cases, but they are a major issue in any application that requires reliability and consistency. Traditional business-critical systems in the corporate world are highly reliable and entirely consistent, whereas LLMs are inherently probabilistic, leading to variation in their outputs.
Incidentally, in case you think that the answer is agentic AI, agents are at an even earlier level of maturity. One 2025 Carnegie Mellon study found that even the best-performing of the thirteen models tested managed only a 30% success rate on 175 realistic work tasks; nine of the models could not even reach a 10% success rate.
Corporations considering AI for production use cases need to learn from the issues identified by the MIT study and others. Clear objectives, good staff training, an emphasis on data quality and a focus on project areas where process automation offers clear and likely benefits will all help. Above all, a realistic assessment of which areas generative AI is suited to, and which it is not, is critical. In all the excitement about generative AI, it is easy to forget that it is just one flavour of AI. There are many others, and some of these approaches may be better suited to your particular need. Indeed, whisper it, but in some cases AI may not be the ideal tool for your project at all. The old saying is that if all you have is a hammer, everything looks like a nail. Software projects need to explore the full toolbox of options, both AI and non-AI, to ensure they use the best tool for the job at hand.