AI for Networking

Organizations across the globe are pursuing autonomous networks, intelligent systems capable of managing themselves, yet today's IT infrastructure has not reached that goal. Martin Keen, in a discussion for IBM Technology, explains that while current networks use some degree of automation, machine learning, and AI, they remain a considerable distance from true autonomy. The fundamental challenge, he says, is the massive quantity of data that IT networks produce, a volume too large for humans to analyze in real time. This data is often fragmented: it moves across different domains and gets trapped in network silos, which obstructs visibility and accessibility and hinders the evolution of network operations to meet modern demands.

Keen highlights a critical problem for network operations teams: the difficulty of distinguishing meaningful signals from a constant barrage of noise. In a typical network operations center, teams are flooded with alerts, the vast majority of which are false positives and are subsequently ignored. This is not a reflection of carelessness but a symptom of teams "drowning in noise," which makes it nearly impossible to pinpoint the genuine issues that demand urgent attention. Teams are forced into a reactive state of guessing which signals are important. The problem is exacerbated by the sheer volume, velocity, and complexity of telemetry data, which is often siloed across various vendor platforms and network domains, making comprehensive, cross-domain analysis extremely difficult.
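
To make the noise problem concrete, here is a minimal Python sketch of one common first step, suppressing repeated alerts. It is not taken from the video: the `Alert` fields, the `suppress_repeats` function, and the five-minute window are all assumptions chosen for illustration, and a real operations platform would do considerably more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; field names are illustrative only."""
    timestamp: float  # seconds since some reference point
    device: str
    kind: str


def suppress_repeats(alerts, window_seconds=300.0):
    """Keep the first alert per (device, kind) and drop repeats that
    arrive less than `window_seconds` after the last alert that was kept."""
    last_kept = {}  # (device, kind) -> timestamp of the last kept alert
    kept = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.device, alert.kind)
        previous = last_kept.get(key)
        if previous is None or alert.timestamp - previous >= window_seconds:
            kept.append(alert)
            last_kept[key] = alert.timestamp
    return kept
```

Because the window is measured from the last alert that was kept, a fault that keeps firing is surfaced again once per window rather than silenced for good. This kind of rule only trims duplicates; it does not decide which of the remaining alerts actually matter, which is the harder problem Keen describes.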

As presented by IBM Technology, the solution is not a "magic" black box but a strategic integration of AI, automation, and analytics, a combination referred to as "AI for networking." The objective, Keen clarifies, is to build networks that can understand their own status, decide on a course of action, and then execute that decision independently. This process is structured within a "day zero," "day one," and "day two" operational framework. Day zero, the planning and design phase, uses AI to analyze historical patterns to optimize network design and support smarter capital expenditure (CapEx) decisions on hardware such as routers and switches. The outcome is "right-sized" network performance and improved cost efficiency, avoiding the common pitfall of overbuilding infrastructure.
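
As a rough illustration of what "analyzing historical patterns" can mean in day zero planning, the Python sketch below fits a straight-line trend to past utilization and estimates how long until a link crosses a planning threshold. This is a deliberately simple stand-in, not the AI approach described in the video: the function names, the monthly sampling, and the 80 percent threshold are assumptions made for the example.

```python
def fit_trend(samples):
    """Ordinary least-squares line through evenly spaced samples.
    Returns (slope, intercept), with x = 0, 1, 2, ... as the sample index."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_until_threshold(monthly_peak_percent, threshold_percent=80.0):
    """Estimate the number of months after the latest sample at which the
    fitted trend reaches the threshold. Returns None if usage is flat or
    falling, and 0.0 if the trend is already at or above the threshold."""
    slope, intercept = fit_trend(monthly_peak_percent)
    if slope <= 0:
        return None
    crossing_index = (threshold_percent - intercept) / slope
    return max(0.0, crossing_index - (len(monthly_peak_percent) - 1))
```

For a hypothetical link whose monthly peaks were 40, 45, 50, 55, and 60 percent, the fitted trend rises five points a month and reaches 80 percent four months after the last sample. An estimate like this is what lets a planner defer a hardware purchase instead of overbuilding up front; a production capacity model would also account for seasonality and bursts.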

Next, day one focuses on building and deploying the network, where new services are rolled out and devices are configured. AI accelerates this stage through dynamic network optimization, which includes validating configurations before they are implemented, optimizing service paths in real time, and learning from each deployment to enhance future actions. However, according to Keen, the most substantial impact of AI is realized in day two operations, the phase where most AI work is currently concentrated. This is where a more advanced concept, agentic AI, plays a pivotal role. Unlike conventional systems that just flag alerts, agentic AI can reason about problems. It ingests data from all the siloed domains and vendors and employs domain-tuned models (AI specifically trained on network data) to accurately identify the root cause of an issue. The system provides a clear chain of reasoning for its findings, and because these agents have "agency," they can trigger remediation actions through existing automation tools to resolve the problem.
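
The idea of tracing many alerts back to one root cause can be sketched without any machine learning at all. The Python example below is a simple rule-based stand-in for the domain-tuned models Keen describes, not a description of how any IBM product works: it assumes a hypothetical, acyclic dependency map (`depends_on`) and treats an alerting device as a probable root cause only if nothing upstream of it is also alerting.

```python
def upstream_of(node, depends_on):
    """Return every node that `node` depends on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(node, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(depends_on.get(current, []))
    seen.discard(node)  # guard against a node listed as its own dependency
    return seen


def probable_root_causes(alerting, depends_on):
    """Return the alerting nodes that have no alerting node upstream.
    Assumes the dependency map has no cycles."""
    alerting = set(alerting)
    return [
        node
        for node in sorted(alerting)
        if not (upstream_of(node, depends_on) & alerting)
    ]


# Hypothetical topology: both servers sit behind one access switch.
topology = {
    "app-server": ["access-switch"],
    "db-server": ["access-switch"],
    "access-switch": ["core-router"],
    "core-router": [],
}
roots = probable_root_causes(["app-server", "db-server", "access-switch"], topology)
```

Here three alerts collapse to a single candidate, the access switch, because both servers depend on it and the core router is healthy. An agentic system would go further by explaining its reasoning and handing the result to an automation tool for remediation, steps this sketch leaves out.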

Keen notes that most organizations begin their AI journey at day two because it directly addresses the most pressing operational pains, such as service tickets, outages, and disruptive late-night calls. A key element of this model, as discussed in the IBM Technology video, is a continuous feedback loop. As day two operations become more efficient through AI, the operational data and intelligence gathered, including patterns of what typically fails, are fed back to enhance day zero planning and day one building. The AI learns the network's specific behaviors, making capacity models more intelligent and performance more accurately "right-sized" over time. The ultimate aim is network autonomy: a network that can be given a high-level objective, such as prioritizing specific traffic, and will independently determine how to achieve it. This vision does not necessarily eliminate human involvement; instead, AI can manage the repetitive "grunt work," freeing human teams to concentrate on more complex, strategic decisions. AI for networking, Keen concludes, is fundamentally about pattern recognition at a scale no human team could ever match, enabling networks that learn, adapt, and resolve issues on their own, a capability that might just be all the magic needed in a world of siloed data and false alarms.
