Industrializing AI in Networks: The Domain Knowledge Limit

Key Takeaways

AI-driven impact in network operations scales only when workflows are redesigned end-to-end, not simply automated.
The difference between pilots and production lies in operating discipline across data, architecture, governance, and delivery.
Closed-loop execution tied to measurable outcomes is what moves AI programs into real operational impact.
Progress toward autonomous networks is concentrated in domains where AI is embedded into core operations.
Industrializing AI requires grounding it in the operator’s network data and scaling it through incremental, production-focused delivery.

I remember how, a few years ago, we as an industry were excited about 5G monetization: the benefits it would bring customers and the revenue uplift it would create for equipment vendors and telecom service providers. You already know how that story unfolded. In most markets, the promised improvement in ARPU did not materialize as expected. On the brighter side, we learnt an important, albeit expensive, lesson about who should set the agenda for a major transformation like this one.

It is a pattern we see forming again today around AI-driven network operations. The good news is, we can avoid repeating the same loop.

Where real progress is being made

As the industry steadily moves towards fully autonomous networks, there are a few observations to note.

As part of the TM Forum’s November 2025 update, 37 Level 4 certifications were announced, while most operators still sit between Levels 2 and 3. The Level 4 progress is concentrated in specific domains, including fault management, energy optimization, and certain transport layers. Telefónica is running 12 Level 4 use cases across three countries. China Mobile reports¹ an 80% drop in major network faults across eight Level 4 domains. These are solid outcomes, but improvement across the wider operator base is not progressing fast enough.

Around 20% of operators have reached Level 4 or 5 in selected domains, according to the Bain-TM Forum 2025 survey². McKinsey’s research³ points out that AI-driven operational use cases are cutting network OpEx by 15-30%, but only when workflows are redesigned end-to-end rather than automated as they stand today.

Three traits only operators can own

When I look at the operators who’ve moved AI from proof of concept to production, three traits show up repeatedly.

The operator owns the decisions.
The operator owns scenario selection while the vendor executes. Telcos hitting Level 4 today have picked two or three scenarios where they could measure a closed loop (MTTR, truck rolls, energy, SLA breaches) and run those as legitimate production programs rather than extended pilots.
AI learns from the operator’s network memory.
In our experience, operators who redesign workflows around AI capture lasting value, whereas those who simply embed AI into existing processes mainly do the same things a bit faster. McKinsey’s 2026 research study⁴ highlights that three-quarters of surveyed telco executives cite weak change management as their biggest barrier to scaling AI.
Delivery always starts now.
Successful AI programs are built for incremental delivery, never as multi-year transformation projects. On average, European operations carry 56% tech debt, and some OSS systems have been running for more than two decades. The question is never whether modernization is overdue, but how to get it done without taking on too much risk.

Telefónica’s Autonomous Network Journey (ANJ) program shows that AI can be embedded directly into network platforms rather than run as a separate five-year program. A number of studies show that 80–95% of AI pilots in large enterprises never reach production-grade impact. The root cause is attributed to a delivery rhythm that doesn’t tie each release to a specific, measurable outcome.

What telcos must do in the next two years

In the first phase, which is the next 90 days, inventory the network knowledge that AI will depend on. This includes topology, configuration history, fault hierarchies, change records, and vendor-specific operational quirks. This foundation determines whether the AI investments made actually produce reliable results.
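As a rough illustration of what such an inventory could look like in practice, the sketch below records each knowledge source and reports which of the five categories named above are not yet machine-readable. The class, field names, and example systems are hypothetical and chosen only for illustration; a real inventory would carry far more detail.

```python
from dataclasses import dataclass

# The five knowledge categories come from the text above.
# Everything else in this sketch (names, fields, systems) is hypothetical.
REQUIRED_SOURCES = (
    "topology",
    "configuration_history",
    "fault_hierarchies",
    "change_records",
    "vendor_quirks",
)


@dataclass
class KnowledgeSource:
    category: str           # one of REQUIRED_SOURCES
    system_of_record: str   # hypothetical name of the owning system
    machine_readable: bool  # can an AI pipeline query it today?


def inventory_gaps(sources):
    """Return required categories that have no machine-readable source."""
    covered = {s.category for s in sources if s.machine_readable}
    return [c for c in REQUIRED_SOURCES if c not in covered]


# Illustrative inventory: two usable sources, one that exists only as a manual export.
example = [
    KnowledgeSource("topology", "inventory-db", True),
    KnowledgeSource("configuration_history", "config-archive", True),
    KnowledgeSource("change_records", "ticketing-export", False),
]

gaps = inventory_gaps(example)
print(gaps)
```

In this illustrative case the gap list contains the fault hierarchies, change records, and vendor quirks, which is the kind of result the 90-day phase is meant to surface.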

In the next six to 12 months, pick two high-value scenarios and run them as real production programs. Choose scenarios with a closed loop that can be measured. Among leading operators, the most common starting points are fault management and service assurance, but TM Forum’s high-value scenario list has plenty of others.
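To make "a closed loop that can be measured" concrete, here is a minimal sketch that computes mean time to repair (MTTR) from fault records and compares a baseline period with a later one. The timestamps are invented for illustration, and the function names are assumptions rather than part of any operator's tooling.

```python
from datetime import datetime
from statistics import mean


def mttr_minutes(faults):
    """Mean time to repair, in minutes, over (detected, restored) timestamp pairs."""
    durations = [
        (restored - detected).total_seconds() / 60
        for detected, restored in faults
    ]
    return mean(durations)


def reduction_pct(baseline, current):
    """Percentage reduction of `current` relative to `baseline`."""
    return 100 * (baseline - current) / baseline


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM' timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Invented fault records: (detected, restored).
baseline_faults = [
    (ts("2025-01-03 10:00"), ts("2025-01-03 12:00")),  # 120 minutes
    (ts("2025-01-09 08:00"), ts("2025-01-09 09:00")),  # 60 minutes
]
current_faults = [
    (ts("2025-04-02 14:00"), ts("2025-04-02 14:45")),  # 45 minutes
    (ts("2025-04-11 06:00"), ts("2025-04-11 06:45")),  # 45 minutes
]

baseline_mttr = mttr_minutes(baseline_faults)
current_mttr = mttr_minutes(current_faults)
improvement = reduction_pct(baseline_mttr, current_mttr)
print(f"MTTR {baseline_mttr:.0f} -> {current_mttr:.0f} min ({improvement:.0f}% reduction)")
```

The same pattern applies to the other loop metrics mentioned earlier, such as truck rolls or SLA breaches: a baseline, a current measurement, and a reduction that can be compared against the target set for the scenario.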

Rather than launch yet another monolithic OSS transformation, develop an incremental modernization roadmap where each phase is tied directly to a P&L target. At Tech Mahindra, we are working with a Tier‑1 European operator where our closed-loop automation approach is targeting a 60% reduction in field costs and a 55% reduction in MTTR. These numbers come from scoping tightly, grounding AI in the operator’s own network data, and letting go of the five‑year program model.

Pitfalls to navigate and the value unlocked from scaling AI

Moving from proof of concept to production involves a few pitfalls, which can be navigated by keeping the following three factors in mind.

First, it helps to build an underlying data fabric that connects network topology, configuration history, fault hierarchies, change records, and even vendor-specific operational quirks. Without it, even the best AI models struggle to deliver reliably. Second, a ‘North Star’ architecture for scaled deployment is essential. This cannot be a rigid, multi-year blueprint. It has to be sustainable and flexible enough to absorb new use cases, models, and operational demands over time. Third, a robust governance process around AI, including responsible AI principles, is needed to make sure automation behaves predictably when it meets the complexity of a live network.
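One way to picture what a connected data fabric enables is a fault triage query that joins topology with change records. The sketch below is a toy, in-memory illustration under stated assumptions: the node names, change IDs, six-hour window, and the `suspect_changes` helper are all hypothetical, and a production fabric would sit on real inventory and change-management systems.

```python
from datetime import datetime, timedelta

# Hypothetical topology: each node maps to its directly connected neighbours.
topology = {
    "core-1": ["agg-1", "agg-2"],
    "agg-1": ["core-1", "cell-7"],
    "agg-2": ["core-1"],
    "cell-7": ["agg-1"],
}

# Hypothetical change records drawn from a change-management system.
change_records = [
    {"id": "CHG-1", "node": "agg-1", "at": datetime(2025, 3, 1, 9, 0)},
    {"id": "CHG-2", "node": "agg-2", "at": datetime(2025, 3, 1, 9, 30)},
    {"id": "CHG-3", "node": "cell-7", "at": datetime(2025, 2, 20, 9, 0)},
]


def suspect_changes(fault_node, fault_time, window=timedelta(hours=6)):
    """Return IDs of changes made on the faulty node or a direct neighbour
    within `window` before the fault was raised."""
    scope = {fault_node, *topology.get(fault_node, [])}
    return [
        change["id"]
        for change in change_records
        if change["node"] in scope
        and timedelta(0) <= fault_time - change["at"] <= window
    ]


# A fault on cell-7 at 11:00 points at the change made on its neighbour agg-1 at 09:00.
suspects = suspect_changes("cell-7", datetime(2025, 3, 1, 11, 0))
print(suspects)
```

Without the link between topology and change records, the change on the neighbouring node would not surface for a fault raised on the cell site, which is the kind of reliability gap the first factor describes.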

When you scale AI thoughtfully across network operations, the benefits become tangible.

Workforce: A different kind of workforce becomes possible, where AI agents and human teams work together, each doing what they do best, enabling focus on higher-value decisions.
MTTR: Mean time to repair improves dramatically when you embed AI into fault management and service assurance processes. We see faster detection, better diagnosis, and in some cases, automated healing.
Agility: The ability to launch new products and services accelerates when AI is embedded in network operations. Customers benefit directly from this speed through quicker responses, more reliable experiences, and services that adapt to real demand.

The real focus area

The difference between operators scaling AI successfully and those still working through pilots isn’t fundamentally a technology gap. The platforms are widely available, and the models are increasingly commoditized. The real opportunity lies in the operating discipline that includes data fabric, architecture, governance, and incremental delivery.

TAGS: Network Operations, Artificial Intelligence, Communications

Frequently Asked Questions

AI pilots often fail because they are disconnected from end-to-end workflow redesign and measurable business metrics. Weak change management and treating AI as an overlay rather than embedding it into daily operations prevent pilots from achieving production-grade impact.

Operators should start by inventorying critical network knowledge such as topology, configuration changes, and fault patterns. This should be followed by selecting a small number of high-value, measurable scenarios and delivering them as production programs instead of long-term transformation initiatives.

When scaled effectively, AI reduces mean time to repair, lowers operational costs, and improves service agility. Embedding AI into fault management and service assurance enables faster detection, better diagnosis, and more predictable network performance, as demonstrated in large-scale operator deployments.

References

About the Author

Amol Phadke

Chief Transformation Officer, Tech Mahindra

Amol is the Chief Transformation Officer at Tech Mahindra, with a mandate covering enterprise-wide transformation strategy, with a particular focus on the communications industry vertical. His background spans three distinct vantage points on the telecom and AI infrastructure question: as a network architect and operator (BT, Alcatel-Lucent, Accenture, where he led global network services across the US, India, Singapore, and the UK), as a hyperscaler working with carriers at scale (Google Cloud, where he led the global communications service provider vertical), and as a carrier CTO responsible for deploying AI infrastructure in practice (Telenor). That combination of perspectives shapes how he approaches questions about where value is created and captured in an AI-native network architecture. At Telenor, where he served as Group CTO and EVP from 2023 to 2024, he led the operator's AI-first strategy and was directly involved in establishing one of the first sovereign AI factories launched by a European carrier, built in partnership with AWS and Nvidia.

At Tech Mahindra he has continued to work on the practical side of agentic AI in telecom, including the Large Telco Model developed with Nvidia and deployed with O2 Telefónica in Germany, and a recent collaboration with Microsoft on an ontology-driven agentic platform for network operations. He has been a regular speaker at MWC Barcelona, Nvidia GTC, DSP Leaders Forum, and TM Forum events, and contributes to the Nvidia developer blog on agentic AI in telecom. Amol sits on the TM Forum Board of Directors and its Autonomous Networks Mission Board.
