This article describes the challenges that internet scale enterprises face, and gives an overview of an architecture to deal with those challenges. We will delve into the components of this architecture in subsequent articles.
At the end of 2018, three of the top five companies in the world by market capitalisation were Amazon, Alphabet (Google) and Facebook. Blockbuster, a once mighty chain of video stores, was driven by relentless competition from Netflix to close its stores until only a single store remained, in Bend, Oregon. The disruption Netflix caused to traditional brick and mortar stores is not unique; similar havoc has been wrought by Uber, a taxi company that does not own any taxis, and by Airbnb, a hotel company that does not own any hotels. What the FANGs (Facebook, Amazon, Netflix and Google) all have in common is that they are technology companies, and in particular internet companies, that have come into being only in the last decade or so, and that they operate at internet scale.
Operating at internet scale means:
■ Very high transaction volumes, usually thousands of transactions per second
■ Geographic distribution – these companies operate with a global customer base
■ Big data – exabytes of data or more
■ Having to scale rapidly on demand – Amazon Black Friday or Amazon Prime days
■ Safely and quickly bringing new products to market
■ Protecting against the threats of a global army of bad guys
■ Being “Always On” – there is no room for maintenance windows
If you already work for an internet scale enterprise (a ‘Unicorn’), you should have a good idea of how to architect for internet scale. If you are an architect who works for a traditional enterprise (a ‘Horse’), you might think that you don’t need to concern yourself with architecting for internet scale – but you’d be wrong! Traditional enterprises are embarking on digital evolution journeys that integrate currently disparate services globally across a large number of customers; they may be faced with operating 24/7, or with opening up closed internal APIs to the public.
This article references a number of open source and commercial tools. They are examples with which I gained experience in a specific project context; this is not meant to favour one tool over another.
Internet Scale Deployment Stack
For the last few years, I have been involved in the ongoing development of an internet scale architecture for a traditional enterprise. In this article, I will give an overview of the logical building blocks of an internet scale architecture based on projects that I have been involved in. In subsequent articles, I will look at particular solution choices, using solutions that I have direct experience employing. I’m assuming the use of the REST architectural style, implemented using JSON over HTTP(S). The constraints of the REST architectural style lead naturally to scalability, as shown in [1]. The layers of the logical architecture are shown in Figure 1.
A possible deployment is shown in Figure 2.
In the following sections I present the different elements of the internet at scale deployment stack.
Our architecture is based on clients consuming REST services. Broadly speaking, the clients will fall into one of the following categories:
■ Native mobile applications running under iOS or Android
■ Other native API clients, for example, server applications that consume third party services
You’re architecting for internet scale, which means a worldwide user base who expect their applications to load and respond quickly. You need to serve up content and data as close as possible to the user. Figure 2 illustrates how we distribute our core services and data over a number of geographic regions, so we require some form of global load balancing. It is useful, though, to push as much processing as we can even closer to the user than the nearest geographic region – for example, serving up static content such as images, or perhaps slowly changing reference data.
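As a concrete illustration, static content and slowly changing reference data can be marked cacheable so that an edge node can serve them without a round trip to the origin. A minimal sketch, in which the URL paths and TTL values are hypothetical examples, not values from any particular project:

```python
# Sketch: choosing Cache-Control headers so a CDN edge node can cache
# static content and slowly changing reference data close to the user.
# The paths and TTLs below are hypothetical examples.

def cache_headers(path: str) -> dict:
    if path.startswith("/static/"):
        # Images, CSS, JS: effectively immutable, cache for a year at the edge
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    if path.startswith("/reference/"):
        # Slowly changing reference data: cache for an hour, then let the
        # edge serve stale briefly while it revalidates with the origin
        return {"Cache-Control": "public, max-age=3600, stale-while-revalidate=60"}
    # Dynamic, per-user responses must not be cached at the edge
    return {"Cache-Control": "private, no-store"}
```

The point of the sketch is simply that the origin declares cacheability per resource class, and the edge infrastructure does the rest.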
Your applications are undoubtedly going to be very attractive to a large class of bad actors. I’ve seen an API go live and get hit with a DDoS attack in mere minutes. A good way to deal with these sorts of threats is to head them off as close to their source as possible.
In a subsequent article, I will look at a suite of services from Akamai that I class as edge services. These include:
■ Distributed reverse proxies
■ Content Delivery
■ Global Traffic Management / Load balancing
■ DDoS protection
■ Bot protection
These Akamai services are executed relatively close to the user, often in the same datacentre as the user’s ISP.
Enterprises I have worked with have adopted an API First approach, building product teams around the delivery of REST APIs and creating an economy of API consumers. If your product is successful, you will have many potential users who want to use your APIs. You want to be able to:
■ Provide clear and comprehensive documentation of your APIs
■ Provide a self-service capability for prospective users to obtain access to APIs by obtaining credentials and authorisation grants
■ Monetise your APIs. This will likely involve creating usage plans which limit the number and frequency of API requests depending on how much the customer will pay, usually with a free plan for a limited number of requests
■ Provide a sandbox environment, for customers to test their API clients
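Usage plans of the kind described above are typically enforced with a token bucket per consumer. A minimal sketch, assuming hypothetical plan names and limits (a real API gateway would provide this for you):

```python
import time

# Sketch of per-plan rate limiting with a token bucket.
# The plan names and limits are hypothetical examples.
PLANS = {
    "free": {"rate": 1.0,   "burst": 10},    # 1 request/s, burst of 10
    "paid": {"rate": 100.0, "burst": 1000},  # 100 requests/s, burst of 1000
}

class TokenBucket:
    def __init__(self, rate: float, burst: int):
        self.rate, self.burst = rate, burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens at the plan's rate, capped at the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # caller should respond 429 Too Many Requests
```

In practice you would run one bucket per API key, keyed off the consumer’s plan, rather than hand-rolling this in every service.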
As described in [2] and [3], scaling an application in practice involves applying a scaling strategy along one or more of these three axes:
■ X-Axis - run multiple copies of the same application
■ Y-Axis - decompose the application into functionally separate components
■ Z-Axis - partition the data
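To make the Z-Axis concrete: partitioning usually means routing each request to a data partition based on a stable hash of some key. A minimal sketch, with a hypothetical shard count and naming scheme:

```python
import hashlib

# Sketch of Z-Axis scaling: route each request to a data partition
# based on a stable hash of the customer id. The shard count and
# shard names are hypothetical.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(customer_id: str) -> str:
    # A stable hash ensures the same customer always lands on the
    # same shard, regardless of which service replica handles the call
    digest = hashlib.sha256(customer_id.encode()).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]
```

The hard part in real systems is not the routing function but rebalancing when the shard count changes, which is one reason the strategies end up so application specific.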
The most commonly used approach to Y-Axis scaling is to adopt a microservice architecture, which often incorporates a degree of Z-Axis scaling as well, since microservices often have their own distinct data stores. I’m going to assume a microservice architecture, given the desirable Y and Z Axis scaling. We’ll discuss microservices in more detail in a dedicated article, but for the purposes of this article we note that a microservice is a small, independently deployable and testable service, organised around a business capability, that interacts with other services in a loosely coupled fashion. The environment in which microservices are deployed is the microservice runtime.
Now to make things more scalable, we want to apply some X-Axis scaling to our microservices. We want to be able to do things like:
■ Scale up as demand increases
■ Scale down as demand decreases, saving run costs
If we’re running many thousands of replicas, then failure of a microservice or the hardware it runs on is basically guaranteed. Also, we cannot manually assign microservices to compute nodes or manually configure load balancers, so we need to have automation to do the following:
■ Restart failed microservice instances
■ Assign microservices to compute nodes
■ Automatically configure load balancers
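The automation listed above boils down to a reconciliation loop: compare desired state with observed state and act on the difference. This is the pattern container orchestrators run for us; a deliberately simplified sketch, with hypothetical instance records:

```python
# Deliberately simplified sketch of the reconciliation loop a container
# orchestrator runs for us: compare desired state with observed state
# and emit corrective actions. The instance records are hypothetical.

def reconcile(desired_replicas: int, running: list) -> list:
    # Drop instances that have failed their health check
    healthy = [inst for inst in running if inst["healthy"]]
    actions = []
    # Start replacements until we are back at the desired count
    for _ in range(desired_replicas - len(healthy)):
        actions.append("start-instance")
    # Scale down if we are over the desired count
    for inst in healthy[desired_replicas:]:
        actions.append(f"stop-{inst['id']}")
    return actions
```

A real orchestrator also handles node placement and load balancer registration in the same loop, which is exactly why we want a platform rather than scripts.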
We can pretty much take as a given that our microservice runtime will be some form of container, usually a Linux container such as Docker. There are essentially two approaches to achieving the capabilities listed above. The first is a Platform as a Service (PaaS) approach, in which the containers are abstracted away from us; in the second, we have a platform that we use directly to schedule and manage containers. An example of the first is Cloud Foundry, which has a number of implementations, for example Pivotal Cloud Foundry. An example of the second is Kubernetes, which is available as a service on Google Cloud Platform, Azure and IBM Cloud, as well as on premise. We will look at both Cloud Foundry and Kubernetes in subsequent articles. We will also look at service meshes such as Istio.
Persistence comes in various flavours from traditional relational databases to modern NoSQL datastores. Relational databases ruled the roost for a few decades but proved difficult to scale.
X-Axis scalability usually ends up being challenged by the necessity of some form of shared storage, especially for stateful services, whilst Z-Axis scalability, via sharding of data, is normally achieved using very application-specific strategies. If you’re operating at internet scale, you will likely want some or all of the following properties:
■ Zero downtime including when upgrading
■ Ability to cluster across geographic regions
■ Ability to grow to very large sizes
These are qualities that are either not normally associated with an RDBMS, or that are very expensive to achieve. We can achieve all of these (and more), together with X and Z axis scaling, using Cassandra, a linearly scalable, peer-to-peer, shared-nothing database. We will look at this in a subsequent article (in particular at the commercially supported DataStax Enterprise version).
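The peer-to-peer, shared-nothing property rests on a token ring: each node owns a range of hash values, a row’s partition key is hashed onto the ring, and replicas are the next nodes clockwise. A minimal sketch of the idea; the token values are illustrative only, and Cassandra itself defaults to Murmur3 rather than the MD5 used here:

```python
import bisect
import hashlib

# Sketch of ring-based data placement as used by peer-to-peer databases
# such as Cassandra. Each node owns a token range; a partition key is
# hashed onto the ring and replicas are the next nodes clockwise.
# Token values are illustrative only.
RING = sorted([(0, "node-a"), (2**126, "node-b"), (2**127, "node-c")])

def replicas(partition_key: str, replication_factor: int = 2) -> list:
    token = int.from_bytes(hashlib.md5(partition_key.encode()).digest(), "big")
    tokens = [t for t, _ in RING]
    # Find the first node at or after the token (wrapping around),
    # then walk clockwise to pick the remaining replicas
    start = bisect.bisect_left(tokens, token) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(replication_factor)]
```

Because any node can compute this placement locally, there is no master to coordinate reads and writes, which is what makes the scaling linear.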
Highly scalable applications often employ asynchronous processing, as this reduces the number of processes that sit in a waiting state while requests are processed, so we include a messaging capability in our architecture. We will look at Kafka, which is not only message-oriented middleware, but also a distributed, consistent log, which makes it an implementation option for an event-sourced architecture [4].
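The essence of the asynchronous style can be shown with an in-process queue: the request handler enqueues an event and returns immediately, and a background worker drains the queue. This is only a single-process sketch with hypothetical event names; Kafka plays the same role durably, at scale, across many processes.

```python
import queue
import threading

# Minimal in-process sketch of asynchronous processing: the request
# handler enqueues work and returns at once, and a background worker
# drains the queue. Event names are hypothetical.
events = queue.Queue()
processed = []

def handle_request(order_id: str) -> str:
    events.put({"type": "order-placed", "order_id": order_id})
    return "202 Accepted"  # respond without waiting for processing

def worker():
    while True:
        event = events.get()
        if event is None:  # sentinel: shut the worker down
            break
        processed.append(event["order_id"])
        events.task_done()

threading.Thread(target=worker, daemon=True).start()
```

Note the handler returns 202 Accepted rather than a final result; the caller learns the outcome later, which is the trade-off asynchronous processing asks you to make.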
It is unlikely that you are going to be working on an entirely green field project, so we need mechanisms to integrate with traditional enterprise applications. In a future article we will deal with building a data pipeline from enterprise applications using Oracle GoldenGate, Oracle GoldenGate for Big Data, and Apache Spark.
Security is a very important topic, and we cover some security perspectives of an internet scale architecture in other articles. We will really only scratch the surface, looking at OAuth2, OIDC, and perhaps network and container security in Kubernetes.
We’ve ended up with a very complex distributed system, which poses distinct operational challenges over and above those of a traditional monolithic system. Understanding how the system is performing and whether it is behaving as it should is critical, as is being able to troubleshoot issues after they have occurred. We will look at Prometheus, a monitoring tool particularly well suited to Kubernetes environments, and the ELK stack (Elasticsearch, Logstash, Kibana) for centralised log aggregation and analysis. If I have not died of old age, we will also look at Zipkin, for distributed tracing.
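For log aggregation to work across many replicas, each service should emit single-line structured logs carrying a correlation id, so a request can be traced from service to service. A minimal sketch using only the standard library; the service name and field names are hypothetical:

```python
import json
import logging
import sys
import uuid

# Sketch: emit logs as single-line JSON with a correlation id, so a
# centralised pipeline such as the ELK stack can index them and a
# single request can be traced across services. Field names and the
# service name are hypothetical.

class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "service": "order-service",  # hypothetical service name
            "correlation_id": getattr(record, "correlation_id", None),
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("app")
log.addHandler(handler)
log.setLevel(logging.INFO)

# Attach the same correlation id to every log line for one request
cid = str(uuid.uuid4())
log.info("order received", extra={"correlation_id": cid})
```

The same correlation id, propagated in an HTTP header between services, is also what a tracing tool such as Zipkin builds upon.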
As we have seen, internet scale architectures are needed by more than just the unicorns of the world, and the building blocks of such architectures are now within reach of architects working within more traditional enterprises. In subsequent articles we will delve more deeply into each of the areas we touched upon in this article, drawing upon hard-won personal experience. See you soon!
[1] Architectural Styles and the Design of Network-based Software Architectures, Dissertation, Roy Fielding, 2000
[2] The Scale Cube
[3] The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise, Martin Abbott and Michael Fisher, 2010
[4] CQRS Documents, Greg Young