During COVID-19, Comcast ups its virtualization game from core to cable modems with AI and ML

Comcast's investment in the development of artificial intelligence (AI) and machine learning (ML) technologies paid off in spades during the coronavirus pandemic, and both are key elements of its virtualization efforts going forward.

Using AI, Comcast's Octave platform, which it started to develop last year, can detect network anomalies, such as external LTE noise, across the 50 million modems that the cable operator has deployed. Comcast's Smart Networking Platform uses AI and ML to examine the core network in order to enable orchestration, automation and closed-loop delivery, as well as self-healing.

Under the leadership of Comcast's Tony Werner, president of technology, product and "Xperience," Comcast has spent the past 10 years or so honing its software capabilities as networks became more complex.

Octave and the Smart Networking Platform benefited from Comcast's early work on its X1 platform, which included using machine learning for its voice-controlled remote.

While telcos such as AT&T, Verizon and Orange, to name just a few, have received a lot of virtual ink for their virtualization efforts, Comcast's work on Octave and the Smart Networking Platform is notable on several fronts, including that both were developed internally by merging the skill sets of its data scientists, network engineers and software programmers across its Philadelphia, Denver and San Jose, Calif., locations.

It's worth noting that while Octave and the Smart Networking Platform examine network data, neither looks at customers' specific data. While both also espouse automation, a Comcast spokesperson said they were "force multipliers" when asked if either would lead to job cuts going forward.

Here's a look at how Comcast utilized Octave and the Smart Networking Platform (SNP) tool, NetIQ, during the pandemic, in which Comcast saw internet traffic spike by as much as 60%.

How Octave modulates traffic on cable modems

During the coronavirus pandemic, Comcast, which has 29 million broadband subscribers, was able to increase its download capacity by 36%. Comcast started to deploy its first version of Octave for increased downstream capacity in January on a rolling basis across its footprint. In electronic terms, an octave is defined as a logarithmic unit for ratios between frequencies, with one octave corresponding to a doubling of frequency.
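For readers who want the octave definition in concrete terms, it amounts to a base-2 logarithm of a frequency ratio. The short Python sketch below illustrates only that general definition; the function name is made up for this example and has nothing to do with Comcast's software.

```python
import math


def octaves_between(f_low_hz: float, f_high_hz: float) -> float:
    """Number of octaves separating two frequencies (illustrative helper)."""
    if f_low_hz <= 0 or f_high_hz <= 0:
        raise ValueError("frequencies must be positive")
    # One octave is a doubling of frequency, so the count is log2 of the ratio.
    return math.log2(f_high_hz / f_low_hz)


# Doubling a frequency spans exactly one octave; quadrupling spans two.
one_octave = octaves_between(5e6, 10e6)   # 1.0
two_octaves = octaves_between(5e6, 20e6)  # 2.0
```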

With millions of employees working remotely from their homes, along with more video conference calls and remote learning, Comcast sped up the deployment of Octave and now has it largely deployed across its footprint.

Modulation varies across networks and cable modems based on plant conditions and factors such as outside noise. Octave was developed to detect when the DOCSIS 3.1 cable modems weren't using all of the bandwidth available to them.

Instead of relying on the traditional static taps and alarms located throughout the cable plant, Octave polls more than 4,000 network telemetry points every 20 minutes or so. Among other results, Octave lets Comcast see how many channels a cable modem is bonding.

With Octave, Comcast is able to adapt to plant conditions on the fly by changing modulation. Once the software has gathered the data, a decision engine analyzes the information down to each specific modem before making changes to improve the capacity and speeds in each household.
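Comcast has not published how its decision engine works, but the general idea of choosing a modulation profile per modem from telemetry can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the signal-to-noise thresholds are placeholder numbers rather than DOCSIS or Comcast values, and the names `ModemTelemetry`, `choose_profile` and `plan_changes` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical (minimum SNR in dB, profile name) pairs, highest order first.
# Illustrative values only; not taken from the DOCSIS spec or from Comcast.
PROFILES = [
    (41.0, "4096-QAM"),
    (36.0, "1024-QAM"),
    (30.0, "256-QAM"),
    (24.0, "64-QAM"),
]
FALLBACK = "16-QAM"


@dataclass
class ModemTelemetry:
    modem_id: str
    snr_db: float  # one polled signal-to-noise reading for this modem


def choose_profile(sample: ModemTelemetry, margin_db: float = 2.0) -> str:
    """Pick the highest-order profile whose threshold is met with some margin."""
    usable = sample.snr_db - margin_db
    for min_snr, name in PROFILES:
        if usable >= min_snr:
            return name
    return FALLBACK


def plan_changes(samples, current):
    """Return only the modems whose profile should change after this poll."""
    changes = {}
    for sample in samples:
        target = choose_profile(sample)
        if current.get(sample.modem_id) != target:
            changes[sample.modem_id] = target
    return changes
```

In this toy version, each polling cycle yields a small set of per-modem changes that would then be pushed to the CMTS. A production system would presumably weigh far more than a single SNR reading.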

"We're now going to be able to deliver more bits to your house in real time and be able to flex that as plant conditions are changing," said Comcast's Elad Nafshi, senior vice president for next generation access networks. "We can basically then turn that into a decision engine with knobs on the actual CMTS (cable modem termination system) for a modulation profile for each cable modem."

In the throes of COVID-19, 25 Comcast engineers worked seven days a week to increase the upstream capacity on the cable modems. Nafshi said what normally would have taken months was accomplished in six weeks with Octave 2.0.

Typically in cable plants, upstream capacity is enabled across the 5 megahertz to 42 megahertz range. With Octave 2.0, Nafshi said, Comcast was able to add DOCSIS 3.0 upstream channels below 5 megahertz.

"If I'm able to now modulate to the point where I could get around the interferences that are below there (5 megahertz), I could put a fifth or maybe even a sixth upstream channel out there," he said. "We're going to continue to see how we expand the use of the optimization across the board. I can't go into those type of details yet, but to suffice to say that there's a lot more to come."

NetIQ for a smarter network

Comcast's Smart Network Platform, which has been three years in the making, uses AI and ML to create a suite of software tools that automate and orchestrate a large number of Comcast's core network functions. Like Octave, SNP was developed internally by Comcast, and it leverages open source components such as Ansible and OpenConfig, according to Comcast's Noam Raffaelli, senior vice president for network and communications engineering.

One of the tools for SNP is NetIQ, which uses machine learning to continuously scan Comcast's core network. Previously, cable operators often found out about network issues, such as fiber cuts, when customers called in to complain. With NetIQ, Comcast engineers can spot outages instantaneously.
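The article does not say which models NetIQ uses. As a rough illustration of what continuously scanning a telemetry stream for trouble can look like, here is a minimal Python sketch that swaps in a simple rolling z-score check in place of machine learning. The class name, window size and threshold are all assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingAnomalyDetector:
    """Flags readings that deviate sharply from a recent window (a sketch)."""

    def __init__(self, window: int = 30, threshold: float = 4.0):
        self.history = deque(maxlen=window)  # recent normal readings
        self.threshold = threshold           # z-score needed to flag a reading

    def observe(self, value: float) -> bool:
        """Return True if the value looks anomalous versus recent history."""
        is_anomaly = False
        if len(self.history) >= 5:  # wait for a minimal baseline
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
        if not is_anomaly:
            # Keep outliers out of the baseline so they do not mask later ones.
            self.history.append(value)
        return is_anomaly


# Example: optical receive power in dBm that suddenly drops, as a fiber cut might cause.
detector = RollingAnomalyDetector()
readings = [-10.0, -10.1, -9.9, -10.0, -10.1, -9.9, -25.0]
flags = [detector.observe(r) for r in readings]
```

Only the final reading is flagged here. A perfectly flat baseline has zero variance and would never trigger this check, one of several reasons a real system would need something more robust.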

With a network full of thousands of routers, load balancers and firewalls, among other components, Comcast had to work with each vendor to enable the reporting of network issues.

"A lot of network engineering up until now was we used to basically work with a lot of command line interfaces or graphical user interfaces across monolithic stacks, or vertical stacks that we got for a router or for other network equipment," Raffaelli said. "We decided to take things into our own hands. We understood that with the scale of Comcast, we have to create a management layer that will be ours in terms of having visibility of the network, in terms of ability to automate the network, and in terms of the ability to use all of this intelligence data that we're collecting in order to be proactive and actually let the network self heal."

In order to break down the monolithic stacks, Comcast employed software to create an abstraction layer that puts the data into a data lake and then standardizes it with Comcast's data analytics engine. While Big Data and analytics have been around for some time, extracting meaningful, actionable data has been the Holy Grail.
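To make the idea of an abstraction layer concrete, the Python sketch below maps two differently shaped vendor records onto one common schema before analysis. The vendor formats, field names and the `InterfaceStat` type are all invented for illustration and do not describe Comcast's data model or any real vendor's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceStat:
    """Hypothetical common schema shared by all vendors after normalization."""
    device: str
    interface: str
    in_errors: int


def from_vendor_a(raw: dict) -> InterfaceStat:
    # Invented flat format: {"hostname": ..., "ifName": ..., "inErrs": ...}
    return InterfaceStat(raw["hostname"], raw["ifName"], int(raw["inErrs"]))


def from_vendor_b(raw: dict) -> InterfaceStat:
    # Invented nested format: {"node": ..., "port": {"name": ..., "rx_errors": ...}}
    port = raw["port"]
    return InterfaceStat(raw["node"], port["name"], int(port["rx_errors"]))


# One normalizer per vendor; supporting a new vendor means adding one entry.
NORMALIZERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def normalize(vendor: str, raw: dict) -> InterfaceStat:
    """Convert a vendor-specific record into the common schema."""
    return NORMALIZERS[vendor](raw)
```

Once every record shares one shape, the same correlation and alerting logic can run across equipment from different vendors.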

Comcast set up key performance indicators (KPIs) for NetIQ to mitigate and measure the time for repairs.

"It enables us to then run correlations of the operational state of our network, in real time," Raffaelli said. "Unlike what we used to do, which was polling intervals and stuff like that, it enables us to do incident detection and all sorts of things that give us so much more unprecedented visibility into the network.

"Coupled with that, we have the software capabilities to program this layer. Then comes the layer of the orchestration and automation."

Raffaelli said Comcast had seen double-digit, year-over-year declines in its KPIs with its Smart Network Platform.

"Our biggest problem at times was obviously change related stuff," Raffaelli said. "So it was just people touching the network. The automation enables us to limit the number of hands that are actually getting access to those routers, and everything is automated. We are immediately minimizing the number of change related outages on top of that.

"The second thing is how quickly can we get to an epicenter of an outage? Is it a router? Is it a security policy change? The ability for us to have this unprecedented visibility across our footprint enables us to very quickly detect the problem. What's the epicenter of the issue? What is the vendor or equipment? This enables us to very quickly mitigate the problem."

With NetIQ, Comcast has been able to quickly identify router problems, fiber cuts or optical issues over the past several months.
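One simple way to picture getting to the epicenter of an outage is to look for the nearest upstream element shared by every device that is raising alarms. The Python sketch below does that over a toy tree topology. The device names and the child-to-parent map are hypothetical, and this is not a description of how NetIQ correlates events.

```python
def path_to_root(device: str, parent: dict) -> list:
    """List a device and its upstream nodes, ordered from device toward the core."""
    path = [device]
    while device in parent:  # assumes the topology is a tree with no loops
        device = parent[device]
        path.append(device)
    return path


def likely_epicenter(alarming: list, parent: dict):
    """Return the closest node common to the upstream paths of all alarming devices."""
    paths = [path_to_root(d, parent) for d in alarming]
    if not paths:
        return None
    common = set(paths[0]).intersection(*(set(p) for p in paths[1:]))
    for node in paths[0]:  # the first shared node is the deepest one
        if node in common:
            return node
    return None


# Hypothetical topology: child -> upstream parent.
TOPOLOGY = {
    "cmts-1": "agg-1",
    "cmts-2": "agg-1",
    "cmts-3": "agg-2",
    "agg-1": "core-1",
    "agg-2": "core-1",
}
suspect = likely_epicenter(["cmts-1", "cmts-2"], TOPOLOGY)  # "agg-1"
```

If alarms arrive from `cmts-1` and `cmts-3` instead, the shared node moves up to `core-1`, which hints at a problem closer to the core.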

"I think COVID-19 has proven to all of us that we have taken the right measures in order to increase capacity and make sure that customers, with all the increased working from home, continued to get the service level and the speeds that they are used to," he said. "It was seamless to our customers.

"Fiber cuts are becoming the bulk of the issues and you want to make sure that you are able to have the real time telemetry to address it and we've done that."

While it's early days for SNP and Octave, there's a possibility that one or both could be syndicated, similar to X1, for use by business customers or service providers.