A lot of boardrooms are running on a seductive assumption right now: if AI can write code, then custom software development must be on borrowed time. That assumption starts falling apart the moment those boardrooms compare it with what other companies have actually experienced.
Stanford’s 2025 AI Index found that 78% of organizations reported using AI in 2024, up from 55% the year before, yet IBM’s 2025 CEO study found that only 25% of AI initiatives had delivered the expected ROI and just 16% had scaled enterprise-wide. That gap is the story: it is AI hype-meets-reality friction in plain numbers.
If your business is in manufacturing, healthcare, retail, logistics, finance, or another non-software industry, this matters to you more than the latest product demo. The real question is not whether AI can generate code snippets; it certainly can. The question is whether AI can replace the hard, expensive, politically messy, architecture-heavy work of building production systems that fit your data, your workflows, your compliance obligations, your legacy systems, and your customers.
In most enterprises, the answer is still no, and pretending otherwise is how budgets get burned, teams get cynical, and transformation programs stall halfway across the bridge.
The boardroom problem behind AI hype-meets-reality friction
The reason this debate feels so heated is simple: AI is genuinely impressive, but enterprise software is genuinely difficult, and those two things can be true at once.
Stanford’s latest index shows business adoption rising fast, while McKinsey reports that less than one-third of respondents say their organizations follow most of the adoption and scaling practices needed to capture value from generative AI. In plain English, companies are buying the tools faster than they are redesigning the business around them.
But here’s why that matters: a polished demo creates the impression that implementation risk has simply vanished. It has not. A chatbot can summarize a maintenance SOP in seconds, but that does not mean it can be trusted to integrate with your ERP, respect your permissions model, preserve audit trails, produce deterministic outcomes where regulators expect them, and fail safely when the underlying data is incomplete.
That is where executive confusion starts: leaders see AI collapse the cost of producing text and assume it has collapsed the cost of producing systems. Those are not the same thing.
Here’s what this looks like in practice: a manufacturing firm asks an AI assistant to generate a quality-inspection dashboard. In a week, the prototype looks good enough to impress the COO. By month three, the team realizes the plant data lives across MES logs, Excel exports, vendor APIs, and a ten-year-old on-prem database that nobody wants to touch. The dashboard was the easy part; the system behind it is the project, and that is where custom software teams earn their keep.
What most guides get wrong is that they frame the issue as “AI versus development teams or solution providers”, and that framing is too shallow. The real contest is between two kinds of work: low-context code generation on one side, and high-context system design on the other. AI is getting very good at the first; enterprise value still lives mostly in the second.
Has AI replaced custom software development?
No. AI has reduced the cost of some coding tasks, especially routine or well-scoped ones, but it has not replaced the need for custom software development in enterprises where success depends on integration, domain accuracy, governance, reliability, and change management. In fact, the more critical the system, the more human judgment and tailored engineering still matter.
That short answer deserves a longer one.
Research is remarkably consistent on one point: AI can raise productivity, but unevenly.
An NBER field study of more than 5,000 customer support agents found a 14% average productivity increase from a generative AI assistant, with a 34% boost for novice and lower-skilled workers and minimal impact on experienced, highly skilled workers. That is a powerful finding, and it tells us something many executives miss: AI often compresses routine work first, but it does not automatically replace expert judgment.
Software follows the same pattern more often than vendors admit. Google Cloud’s DORA research found that AI adoption may negatively affect software delivery performance, associating increased adoption with a 1.5% drop in delivery throughput and a 7.2% reduction in delivery stability. That surprised me when I first encountered it, but the logic is straightforward: if teams ship AI-generated code faster than they can review, test, secure, and operationalize it, they simply move the bottleneck downstream.
And the nuance gets sharper: METR reported in 2025 that experienced open-source developers working on their own repositories actually took 19% longer when using early-2025 AI tools.
That does not mean AI is useless; it means context-rich engineering work behaves differently from repetitive work. When the codebase is large, the architecture is quirky, the domain is specialized, and the consequences of being wrong are expensive, autocomplete is not the whole answer.
Here’s what this looks like in practice: a retail company decides to rebuild parts of its promotions engine with AI assistance, and junior developers move faster on boilerplate APIs and test scaffolding. The hard part turns out to be promotion precedence rules, regional tax quirks, pricing overrides, fraud flags, and syncing real-time inventory across stores and e-commerce.
AI helps produce pieces of the implementation; it does not resolve the business logic disputes that determine whether the system actually works.
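To make that concrete, here is a minimal, entirely hypothetical sketch of the kind of precedence logic a promotions engine has to get right. The rule names, priority scheme, and stacking policy are invented for illustration; the point is that the expensive work is getting the business to agree on this ordering, not typing it out.

```python
from dataclasses import dataclass

@dataclass
class Promotion:
    name: str
    discount: float   # fraction off, e.g. 0.10 = 10%
    priority: int     # lower number wins; decided by the business, not the model
    stackable: bool   # may this promotion combine with others?

def applicable_discount(promos: list[Promotion]) -> float:
    """Resolve which promotions apply to a single order line.

    Hypothetical policy: sort by priority; the highest-priority promotion
    always applies, and lower-priority ones apply only while stacking is allowed.
    """
    ordered = sorted(promos, key=lambda p: p.priority)
    total, stack_open = 0.0, True
    for p in ordered:
        if total == 0.0:
            total = p.discount
            stack_open = p.stackable
        elif stack_open and p.stackable:
            total += p.discount
    return min(total, 1.0)  # never discount more than 100%

promos = [
    Promotion("loyalty", 0.05, priority=2, stackable=True),
    Promotion("clearance", 0.30, priority=1, stackable=False),
]
# Clearance wins and is not stackable, so loyalty is ignored.
print(applicable_discount(promos))  # 0.3
```

Every branch in that tiny function is a business decision someone has to defend in a meeting, which is exactly the dispute an autocomplete tool cannot settle.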
Bottom line: yes, custom software development is changing, and the bottom end of the market is already under pressure. Simple CRUD apps, brochure portals, internal admin tools, and one-off scripts are becoming cheaper and faster to produce. But that is not the same as saying custom development is dying. It is more accurate to say the market is splitting: commodity builds are getting compressed, and high-value systems work is becoming more strategic.
That distinction matters to management because it changes what they should buy. If you hire a vendor for pure code-monkey output, AI absolutely threatens that model; if you hire a partner to solve system-level business problems, the value proposition is still very much alive.
Why code-monkey work is shrinking but systems work is not
This is where the conversation gets uncomfortable for both optimists and skeptics.
The optimists want to say AI replaces developers; the skeptics want to say nothing changes. Neither position survives contact with reality.
The more useful way to think about it is this: AI is reducing the value of isolated code-monkey tasks while increasing the value of architecture, product judgment, data modeling, test design, security review, orchestration, and domain translation. Put more bluntly, the market is paying less for typing and more for thinking.
MIT Sloan put the distinction neatly in 2025: generative AI is often a strong fit for everyday-language tasks and broad pattern recognition, but traditional approaches still make more sense when privacy is sensitive, the domain is highly specific, or accuracy depends on specialized internal context. That describes a large share of enterprise software.
Here’s what this looks like in practice: a hospital group wants to automate patient communication. AI can draft outreach messages, summarize internal policies, and help generate front-end components. But the real engineering problem sits elsewhere…consent management, protected health information, integration with EHR systems, localization, escalation logic, patient identity verification, and auditability. No serious executive wants a beautifully phrased hallucination entering that workflow.
The honest answer is: it depends. If your software need is generic, AI may let a smaller team deliver it. If it is a business-critical workflow embedded in regulation, operations, or proprietary data, AI changes the toolchain more than it changes the need for expert engineering.
This is also why the phrase “code monkey” deserves a little scrutiny, and I understand why some executives use it. They are reacting to a world in which labor-intensive coding felt expensive, slow, and sometimes over-engineered. Fair enough. But reducing software teams to code-monkey labor misses where enterprise value actually comes from. Great teams do not just produce code; they surface edge cases early, shape workable requirements, challenge risky assumptions, design fallback paths, and translate strategy into dependable systems.
That last part is not glamorous. It is, however, the part that keeps factories running, claims processing compliant, and digital channels from becoming PR incidents.
A small personal side note: I used to think executives would quickly separate “prompt-generated app” from “production-grade system”, yet in practice, many do not. A demo is emotional, systems engineering is administrative…guess which one wins the room at first glance.
Where AI hype-meets-reality friction gets expensive: The integration boundary
This is the section many companies discover too late. When leaders say, “AI already writes the code,” what they usually mean is, “AI already writes some code.” The money, delay, and risk tend to collect at the seams: between old and new systems, between departments, between structured and unstructured data, between models and business rules, and between what the demo did once and what production must do every day.
That seam is where AI hype-meets-reality friction turns into budget friction.
McKinsey’s 2025 survey is especially useful here. It found that the practices most associated with bottom-line impact include tracking well-defined KPIs and establishing a clear roadmap for adoption, while fewer than one in five respondents said their organizations were tracking KPIs for generative AI solutions. In other words, many firms are still treating AI as a tooling decision when it is really an operating-model decision.
IBM’s CEO research tells a similar story from the executive side: only a quarter of AI initiatives delivered the expected ROI, and only 16% scaled enterprise-wide. That does not sound like a world where software engineering has become irrelevant; it sounds like a world where implementation discipline has become more important.
Here’s what this looks like in practice: a logistics company deploys an AI assistant for dispatch planning. In pilot mode, it performs well on a cleaned sample dataset. Once connected to live operations, it hits inconsistent geocodes, missing driver statuses, third-party carrier delays, permission mismatches, and conflicting rules between regions. None of those failures are “the AI model failed to autocomplete”; they are systems failures at the integration boundary.
MIT researchers made a similar point in their 2025 work on autonomous software engineering: the roadblocks are not just about producing code but about the broader challenges that make real software hard in the first place, including scoping, debugging, adaptation, evaluation, and dealing with messy real environments.
This is why custom software often becomes more valuable after AI arrives. Not because humans type faster than models; they do not. It becomes more valuable because enterprises need new connective tissue: orchestration layers, guardrails, retrieval systems, workflow engines, private-model integrations, evaluation pipelines, monitoring, and rollback mechanisms. That is custom work, and sometimes deeply custom work.
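As a hedged illustration of what “connective tissue” means, here is a small sketch of a guardrail wrapper around a model call, loosely modeled on the dispatch example above. Everything here is hypothetical: `call_model` stands in for whatever model API an enterprise actually uses, and the validation rules and fallback policy are placeholders for the real, domain-specific ones a team would have to design.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dispatch-assistant")

def call_model(prompt: str) -> str:
    # Placeholder for a real model call; returns a canned plan here.
    return "ROUTE: depot -> stop-17 -> stop-22"

def validate_input(record: dict) -> list[str]:
    """Reject incomplete operational data before it ever reaches the model."""
    problems = []
    if not record.get("geocode"):
        problems.append("missing geocode")
    if record.get("driver_status") not in {"available", "en_route"}:
        problems.append("unknown driver status")
    return problems

def plan_dispatch(record: dict) -> str:
    problems = validate_input(record)
    if problems:
        # Fail safely: route to a human instead of letting the model guess.
        log.warning("falling back to manual dispatch: %s", problems)
        return "MANUAL_REVIEW"
    answer = call_model(f"Plan a route for order {record['order_id']}")
    # Guardrail on the output, not just the input.
    if not answer.startswith("ROUTE:"):
        log.warning("model output failed validation, falling back")
        return "MANUAL_REVIEW"
    return answer

print(plan_dispatch({"order_id": "A-1", "geocode": None,
                     "driver_status": "available"}))  # MANUAL_REVIEW
```

None of this wrapper is generated by the model it protects, and all of it is specific to one company’s data and risk tolerance. That is precisely the custom work the article is describing.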
One more nuance worth mentioning: METR’s research on task completion horizons suggests frontier AI systems are improving rapidly, with the length of tasks they can complete autonomously doubling roughly every seven months. That is real progress, and dismissing it would be foolish, but “rapidly improving” is not the same as “ready to replace enterprise software design today”. The future curve matters. So does the current reliability threshold.
What companies should measure instead of demo magic
Once you accept that AI is neither fake nor fully substitutive, a better executive question appears: what should we actually measure?
Not prompt cleverness. Not lines of code. Not how quickly someone produced a prototype on stage.
Measure cycle time to production, defect escape rate, integration coverage, workflow adoption, and compliance exceptions. Measure whether the tool reduced manual handoffs, shortened revenue cycles, lowered service costs, or improved quality outcomes in the business function that funded it.
McKinsey’s data is blunt here: organizations that define roadmaps and track KPIs are more likely to see bottom-line impact.
Here’s what this looks like in practice: suppose a food manufacturer wants an AI-enabled supplier portal. The old buying logic would ask, “How many development hours did AI save?” The smarter questions are: Did onboarding time drop from 12 days to 4? Did document validation accuracy improve? Did procurement exceptions decrease? Did audit preparation get easier? Those are business metrics, and they travel well from the IT steering committee to the CFO’s office.
Google’s DORA work offers another cautionary lesson: developers may feel AI improves code quality, with 67% of respondents reporting that sentiment, yet aggregate software delivery outcomes do not always move the same way. That gap matters because perception can mask operational drag, and executives should be wary of productivity stories measured only at the keyboard.
A practical framework helps. For every proposed AI-enabled software initiative, ask five questions:
• What workflow changes if this works?
• What data sources and permissions does it require?
• What is the failure mode, and can the business tolerate it?
• What measurable business outcome justifies the spend?
• What custom integration or governance work still remains after the model is selected?
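For teams that like to operationalize checklists, the five questions above can even be turned into a simple scoring aid. This is only a sketch: the question list comes from the framework itself, but the scoring thresholds, hedge-word list, and labels are invented for illustration.

```python
QUESTIONS = [
    "What workflow changes if this works?",
    "What data sources and permissions does it require?",
    "What is the failure mode, and can the business tolerate it?",
    "What measurable business outcome justifies the spend?",
    "What custom integration or governance work remains after model selection?",
]

def readiness(answers: dict[str, str]) -> str:
    """Rate an initiative by how many questions have a concrete answer.

    An answer counts as concrete if it is non-empty and not a hedge word;
    the thresholds below are arbitrary and purely illustrative.
    """
    hedges = {"", "tbd", "unknown", "n/a"}
    concrete = sum(
        1 for q in QUESTIONS
        if answers.get(q, "").strip().lower() not in hedges
    )
    if concrete == len(QUESTIONS):
        return "ready to scope"
    if concrete >= 3:
        return "needs sharpening"
    return "strategically immature"

answers = {
    QUESTIONS[0]: "Dispatchers stop re-keying orders",
    QUESTIONS[1]: "TMS and carrier APIs, read-only",
    QUESTIONS[2]: "tbd",
    QUESTIONS[3]: "Cut dispatch cycle time by 20%",
}
print(readiness(answers))  # needs sharpening
```

The tool is trivial on purpose; the hard part is forcing sponsors to write the answers down before the budget is approved.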
If the answers are fuzzy, the issue is probably not engineering capacity; it is strategic immaturity.
That is also the moment when a serious software partner earns more value than a cheap builder. A good partner should be able to tell you, gently but firmly, that your AI idea is under-scoped, over-scoped, or pointed at the wrong bottleneck.
Sometimes the highest-ROI move is not “build the AI product”. It is: clean the data, redesign the workflow, then add AI where it removes friction instead of amplifying it.
The new custom software playbook: smaller teams, stronger engineers, better outcomes
By this point, the false binary should be gone. The future is not “AI replaces software development” or “AI changes nothing”. The future is smaller, stronger teams building far more ambitious systems, far faster, but only if they combine AI with rigorous engineering and domain expertise.
That pattern is already visible in the research: AI tends to help novices more than experts in some environments, compressing routine effort and making baseline output easier to generate. At the same time, context-heavy work can still slow teams down or demand substantial oversight. Put those together and you get an important strategic implication: enterprises may need fewer pure code-monkey hours, but they need more senior judgment per project, not less.
Here’s what this looks like in practice: a mid-sized insurer in 2026 may not need as many developers to ship customer-service improvements as it did in 2022, but it may need stronger solution architects, product owners, platform engineers, security reviewers, and AI evaluation specialists. The organizational chart changes; the need for tailored software does not disappear.
MIT Sloan’s experts make a point that fits here: generative AI is often the first thing to try for everyday information tasks, while more specialized, domain-specific, or privacy-sensitive problems still call for other techniques and tighter controls. Enterprise software portfolios are full of those specialized problems.
This is where I think many non-software companies should update their mental model. The valuable custom software vendor of the next five years will not sell raw programming hours; they will sell business process redesign, integration strategy, governed AI enablement, and production accountability. Yes, there will still be coding, plenty of it, but the client should care less about how many keystrokes were human and more about whether the resulting system is secure, stable, explainable, and economically useful.
That is a healthier market, frankly. It pushes everyone up the stack.
The best way to understand AI hype-meets-reality friction is not as a problem with AI; it is a problem of category confusion.
Leaders are seeing extraordinary progress in model capability and assuming that software production has become a solved commodity. Some of it has; a lot of it has not. AI can draft code, summarize documentation, speed up prototyping, reduce some low-value engineering effort, and help smaller teams move faster.
Those gains are real. So are the limits: integration still matters, domain context still matters, security still matters, data quality still matters, and in regulated or operationally complex businesses, those are not side notes. They are the work.
That is why the companies that win will probably not be the ones that declare custom software development dead; they will be the ones that redefine it. They will use AI aggressively where it lowers toil, but they will still invest in architecture, governance, workflow design, testing discipline, and human decision-making where the business risk demands it. In other words, they will treat AI as a force multiplier, not a magic eraser.
For companies, the next step is practical: audit your current software roadmap and separate commodity builds from system-critical initiatives. The first category should get cheaper with AI, and the second should get smarter, not sloppier. That is the real response to AI hype-meets-reality friction, and it is a much better strategy than mistaking a brilliant demo for a finished business system.


