By Lily Hay Newman for Wired
For two hours on Monday, internet traffic that was supposed to route through Google’s Cloud Platform instead found itself in quite unexpected places, including Russia and China. But while the haphazard routing invoked claims of traffic hijacking—a real threat, given that nation states could use the technique to spy on web users or censor services—the incident turned out to be a simple mistake with outsized impacts.
Google noted that almost all traffic to its services is encrypted, and wasn’t exposed during the incident no matter what. As traffic pinballed across ISPs, though, some observers, including the monitoring firm ThousandEyes, saw signs of malicious BGP hijacking—a technique that manipulates the web’s Border Gateway Protocol, which helps ISPs automatically collaborate to route traffic seamlessly across the web.
ThousandEyes saw Google traffic rerouting over the Russian ISP TransTelecom, to China Telecom, toward the Nigerian ISP Main One. “Russia, China, and Nigeria ISPs and 150-plus [IP address] prefixes—this is obviously very suspicious,” says Alex Henthorne-Iwane, vice-president of product marketing at ThousandEyes. “It doesn’t look like a mistake.”
Malicious BGP hijacking is a serious concern, and can be exploited by criminals or nation state actors to intercept traffic or disrupt a target service—like Google. But the technique also has a dopey, well-intentioned cousin known as a prefix leak, or sometimes “accidental BGP hijacking.”
In both cases, rerouting occurs when an ISP declares that it owns blocks of IP addresses that it doesn’t actually control. This can be an intentional deception, but can also simply come down to a configuration error that, while disruptive, is not intentional. On Monday, a Google spokesperson said that the company didn’t see signs of malicious hijacking, and instead suspected that the Nigerian ISP Main One had accidentally caused the problem.
“The problem here is a failure to apply basic best current practices to these routing sessions.”
There are minimum best practices that ISPs should implement to keep BGP routes on the up and up. These are important, because they apply filters that catch errors in the event of a route leak and block problematic routes. Not all ISPs implement these protections, though, and in a prefix leak like the one that affected Google, traffic will flow chaotically across networks, not based on efficiency or established paths, but based on which networks haven’t put the BGP safeguards in place and will therefore accept the rogue routing.
Indeed, on Tuesday morning Main One said in a statement that, “This was an error during a planned network upgrade due to a misconfiguration on our BGP filters. The error was corrected within 74mins.”
In this case, it appears that the Russian and Chinese ISPs, and perhaps others as well, offered a path to the Google traffic because they hadn’t implemented protective configurations.
The protocols underlying the internet were written decades ago, in a different era of computing, and many have needed major security overhauls and additions to improve trust and reliability around the web. There was the effort to encrypt web traffic with HTTPS, and the growing movement to secure the internet’s Domain Name System address lookup process so it can’t be used to spy on users, or for malicious rerouting.
Similarly, ISPs and internet infrastructure providers are starting to implement a protection called Resource Public Key Infrastructure that can virtually eliminate BGP hijacking, by creating a mechanism to cryptographically confirm the validity of BGP routes. Like HTTPS and DNSSEC, RPKI will only start to provide true customer protection when a critical mass of internet infrastructure providers implement it.
“This incident had a non-trivial impact because Google and some other prominent network routes were accidentally leaked,” says Roland Dobbins, a principal engineer at the network analysis firm Netscout. “But the problem here, as it is in most of these cases, is a failure to apply basic best current practices to these routing sessions. The key is for network operators to participate in the global operational community, get these kinds of filters put in place, and move to implement RPKI.”
While Google’s incident wasn’t a hack and instead gets into obscure internet protocol drama, the impact for users on Monday was apparent—and shows the pressing need to resolve issues with BGP trust. The flaw has been maliciously hijacked before, and could be again.