In the face of hurricanes and natural disasters, architecture matters for business continuity

Netskope recognises that architecture matters.

Over a decade ago, when cloud-delivered security services like secure web gateway (SWG) began emerging, the focus was mainly on protecting a small segment of the workforce – remote users plus contractors, suppliers and partners. The challenge was ensuring these off-premises users had the same security protections as those on-site, especially when endpoint agents were insufficient, impractical, onerous or prone to being disabled.

At that time, many high-profile enterprises – for example, in banking and financial services – also considered cloud services as a backstop in case of dreaded “black swan” events. For example, extreme weather events like Hurricane Helene or Hurricane Milton, or unexpected crises such as civil unrest, terror attacks or war, could take an entire data centre or head office offline. In both cases, the objective was to ensure security coverage was always in place, since security had become so integral to the business and the consequences so great if there was a gap (eg, loss of valuable data or intellectual property, fines and penalties from non-compliance, or a negative impact on a company’s public brand and reputation).

These enterprises were ahead of their time, as we learned during the COVID pandemic, when the remote work paradigm was flipped. Today, workers are sometimes in the office and at other times working from home. It’s the norm, not the exception (and in some cases, the promise of hybrid work is even being used to aid in recruiting talent). This has accelerated the adoption of cloud security services, including a security service edge (or SSE) that combines firewall, web and SaaS security, as well as zero trust network access (ZTNA). Today, the application of cloud security extends far beyond remote workers. It now encompasses all enterprise traffic – from users, sites and machines – routing through provider infrastructures. For providers like Netskope, this shift brings immense responsibility, but for enterprises, it also unlocks major advantages related to infrastructure consolidation, cost savings and enhanced security. Networking, infrastructure and operations (NI&O) teams that have adopted Netskope have also benefited from a streamlined “direct-to-net” approach, removing latency and the performance hit that comes with it.

Now a new conundrum is emerging, forcing many enterprises that have standardised on a cloud provider to re-evaluate their architecture. Because the risk of a “black swan” is ever present, enterprises expect their service providers to protect them against it. In large part, service level agreements serve as the vendor’s “insurance” that it can deliver, but they are not enough. Customers are increasingly asking the hard questions and wanting to get into the details of the providers’ architecture. They want to understand, for example, how Netskope has designed and built its own infrastructure to handle the unexpected – while continuing to deliver always-on, always-available security that is fast and provides a great digital experience.

Why 'architecture matters' with Netskope NewEdge

Early on, Netskope recognised that “architecture matters” and prioritised the right architectural approach for security traffic processing at the edge, investing more than $250 million in its NewEdge infrastructure. The Netskope Platform Engineering team – composed of veterans who have built and scaled out some of the largest clouds and content delivery networks (CDNs) in the world – draws on real-world experience solving some of the internet’s toughest challenges. The team designed NewEdge with both performance and resilience in mind, ensuring it meets the stringent business continuity demands of enterprises. This means focusing not only on the end-user experience but also on customers’ internal operations. At the end of the day, NewEdge must not create a burden for network operators. Instead, the goal is for NewEdge to simplify and improve operations, compared with previous security solutions that forced trade-offs.

NewEdge platform.

To deliver on this mission, the building blocks of resiliency within the NewEdge architecture begin with the sheer number of NewEdge data centres, or “network edges”, strategically positioned across 76 regions globally, including a presence inside Mainland China. Every data centre is built with complete service parity to the others, offering full compute and the complete set of SSE services in every location. The NewEdge capacity management strategy is also key to reliability and resiliency, as scaling is done intentionally, well ahead of demand. Because NewEdge is built on replicable units of capacity, new capacity can be added with ease, while reducing the potential blast radius of any failure. The Netskope “data centre factory” approach also ensures infrastructure consistency and enables rapid deployment of new data centres, often in just a few weeks.

The aggressive interconnection strategy of NewEdge is another critical building block. Every data centre has redundant, premium transit links as well as extensive peering – in total representing more than 4 000 network adjacencies and 700 unique ASNs today. In fact, Netskope is consistently among the “top 15” organisations globally, across all industries, in terms of global IX participation. For example, Netskope aggressively peers with Microsoft, Google, Salesforce, AWS and others at every location where there is a common interconnection point. With this focus on “connectedness” built into NewEdge, the Platform Engineering team has invested heavily in developing in-house software to maximise the use of the most highly available and high-performing connectivity in real time.

How NewEdge helps in the face of natural disasters and 'black swan' events

Returning to the discussion around “black swan” events, Netskope has consistently demonstrated the architectural advantages of NewEdge in the face of natural disasters and other unexpected events (eg, internet cable cuts or backbone router failures). Recently, in the wake of Hurricane Helene and prior to Hurricane Milton making landfall in the United States, aggressive actions were taken to bolster connectivity, specifically with Starlink. The Platform Engineering team predicted that, with the likely impairment of critical internet resources in the affected areas, there would be heightened usage of the Starlink service. The connectivity changes were implemented quickly, over a 48-hour period, and the results show it: on 7 October, only 16% of outbound traffic from NewEdge data centres to the Starlink network was sent through direct connections, while after the deployment, by 9 October, that number had grown to 96%. This allowed Netskope customers to leverage the strengths of the Starlink backbone for reliable, low-latency routing to the nearest NewEdge data centre and onward to their ultimate destination (web, cloud, SaaS or private app) – all while maintaining their critical security coverage.

The story doesn’t end there, though. At Netskope, we obsess over identifying and remediating problems before our customers are impacted or have to open a support ticket. This requires that we think about more than “traditional” monitoring when it comes to understanding the health of our services and infrastructure. Accordingly, the Platform Engineering team has developed and deployed extensive synthetic (in addition to real-time) monitoring, which seeks to replicate the user experience instead of relying on devices (or our customers) to tell us whether they’re healthy or not. Additionally, we collect and analyse endpoint and network telemetry data to automatically tune and improve NewEdge performance, as well as to react when unexpected events occur. For example, underpinning the decision of which data centre a user will connect to is intelligence that factors in not only the nearest on-ramp, but also the best-performing and lowest-latency one. At the same time, the Netskope client is aware of up to 10 nearby data centres, so if there is a network glitch or unplanned maintenance, that user will be connected in seconds to the next best data centre.
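
The selection logic described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Netskope's implementation: the DataCentre type, its fields and the function names are hypothetical, and the sketch assumes the client holds a measured latency and a health flag for each candidate. Only the cap of 10 nearby data centres comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCentre:
    """Hypothetical view of one data centre as seen by a client."""
    name: str
    distance_km: float   # geographic distance from the user
    latency_ms: float    # measured round-trip latency
    healthy: bool = True


MAX_CANDIDATES = 10  # the client is aware of up to 10 nearby data centres


def nearby_candidates(data_centres, limit=MAX_CANDIDATES):
    """Keep only the `limit` geographically nearest data centres."""
    return sorted(data_centres, key=lambda dc: dc.distance_km)[:limit]


def pick_onramp(data_centres):
    """Pick the lowest-latency healthy candidate, or None if none is healthy.

    Nearest is only the starting filter; the final choice is made on
    measured performance, so a farther but faster site can win.
    """
    candidates = [dc for dc in nearby_candidates(data_centres) if dc.healthy]
    return min(candidates, key=lambda dc: dc.latency_ms, default=None)
```

In this sketch, failover is simply a second call to pick_onramp once a data centre has been marked unhealthy, which returns the next best candidate from the same list.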

Leveraging this rich client and network telemetry data, NewEdge can correlate events and take action quickly, often in a fully automated fashion (that is, without “hands on keyboards”). For example, there have been numerous regional cable cuts this year – including a major event in March affecting the West Africa Cable System, MainOne, South Atlantic 3 and ACE sea cables – in which automatic failout of a specific data centre was necessary after triggers such as high latency, packet loss or jitter indicated suboptimal performance. In this instance, the NewEdge Traffic Management 2.0 technology for data centre selection (also known as GSLB) automatically reconnected users to the next best data centre with no service disruption.
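
A trigger of the kind described above might look something like the sketch below. The three metrics (latency, packet loss, jitter) come from the text; everything else is an assumption made for illustration, including the threshold values, the requirement of three consecutive bad samples and all of the names.

```python
from dataclasses import dataclass


@dataclass
class HealthSample:
    """One hypothetical telemetry reading for a data centre."""
    latency_ms: float
    packet_loss_pct: float
    jitter_ms: float


# Illustrative thresholds only; real values are not given in the article.
THRESHOLDS = {"latency_ms": 150.0, "packet_loss_pct": 2.0, "jitter_ms": 30.0}
CONSECUTIVE_BREACHES = 3  # assumed debounce to avoid reacting to a single blip


def breached(sample):
    """Return the names of the metrics that exceed their threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if getattr(sample, name) > limit]


def should_fail_out(samples):
    """True when each of the most recent samples breaches at least one threshold."""
    recent = samples[-CONSECUTIVE_BREACHES:]
    return len(recent) == CONSECUTIVE_BREACHES and all(breached(s) for s in recent)
```

Requiring several consecutive breaches is one common way to keep a monitoring system from failing out a site on momentary noise; whether NewEdge does this is not stated in the source.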

There remains the problem of other connectivity issues in a region – well outside of Netskope control – that cannot be solved by failing out a data centre and do not warrant such bold measures. This is where a final building block for NewEdge resilience comes into play: Route Control, which is already being rolled out across NewEdge globally. This technology enables dynamic selection and prioritisation of optimised routes for end-to-end performance, for example, between two data centres as part of a ZTNA deployment or to an internet or cloud destination. The selection of premium transit and an aggressive interconnection strategy, mentioned earlier in this press release, play a key contributing role alongside Route Control. Ultimately, real-time analysis of routes combined with automation makes it possible to identify and route around congestion or connectivity issues, even micro-outages that last only a few minutes, sustaining performance and resilience at all times.
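
The general idea of dynamic route selection can be illustrated with a short sketch. It does not describe how Route Control itself works, which the source does not detail. It assumes each candidate route has a measured latency and packet loss, combines them into a single score, and only switches when the improvement exceeds a margin, so that small fluctuations do not cause flapping. The names, the scoring formula and the constants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical measurement of one path (transit or peering)."""
    name: str
    latency_ms: float
    packet_loss_pct: float


LOSS_PENALTY_MS = 50.0   # assumed: each 1% of loss is treated like 50 ms of latency
SWITCH_MARGIN_MS = 10.0  # assumed hysteresis margin before changing route


def score(route):
    """Lower is better: latency plus a penalty for packet loss."""
    return route.latency_ms + LOSS_PENALTY_MS * route.packet_loss_pct


def choose_route(current, candidates):
    """Stay on the current route unless another is better by more than the margin."""
    best = min(candidates, key=score)
    if score(current) - score(best) > SWITCH_MARGIN_MS:
        return best
    return current
```

With these assumed constants, a route that develops 2% packet loss is abandoned for a clean alternative, while a 5 ms latency difference alone is not enough to trigger a change.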

Conclusion

Going forward, building on the unique innovations that are only possible with a private cloud approach, the Platform Engineering team at Netskope will continue to actively embrace AIOps for an anomaly-based, predictive approach to how it operates the NewEdge infrastructure. This will further enrich the building blocks of resiliency that make up NewEdge and the Netskope approach to business continuity. And it will give customers, especially NI&O teams, further confidence in sending their traffic over NewEdge, viewing it as an extension of their own enterprise network, architected to handle the next hurricane or feared “black swan” event that might be on the horizon.

To learn more about Netskope, its industry-leading SSE or the NewEdge private cloud, or to get connected with a member of the Netskope Platform Engineering team, please visit: https://www.netskope.com/newedge
