
Architecture | Gareth Badenhorst | 28 January 2019


This article describes the challenges that internet scale enterprises face, and gives an overview of an architecture to deal with those challenges. We will delve into the components of this architecture in subsequent articles.


At the end of 2018, three of the top five companies in the world by market capitalisation were Amazon, Alphabet (Google) and Facebook. Blockbuster, a once mighty chain of video stores, was driven by relentless competition from Netflix to close its stores, until only a single store remained in Bend, Oregon. The disruption Netflix caused to traditional brick-and-mortar stores is not unique: similar havoc has been wrought by Uber, a taxi company that owns no taxis, and by Airbnb, a hotel company that owns no hotels. What the FANGs (Facebook, Amazon, Netflix and Google) all have in common is that they are technology companies, and in particular internet companies, that have come into being in the last couple of decades, and that they operate at internet scale.

Operating at internet scale means:

Very high transaction volumes, usually thousands of transactions per second
Geographic distribution – these companies operate with a global customer base
Big data – exabytes of data or more
Having to scale rapidly on demand – think Black Friday or Amazon Prime Day
Safely and quickly bringing new products to market
Protecting against the threats of a global army of bad guys
Being “Always On” – there is no room for maintenance windows

If you already work for an internet scale enterprise (a ‘Unicorn’), you should have a good idea of how to architect for internet scale. If you are an architect at a traditional enterprise (a ‘Horse’), you might think that you don’t need to concern yourself with architecting for internet scale – but you’d be wrong! Traditional enterprises are embarking on digital evolution journeys that integrate currently disparate services globally across large numbers of customers; they may be faced with operating 24/7, or with opening up closed internal APIs to the public.

This article references a number of open source and commercial tools. They are examples with which I have gained experience in specific project contexts; the selection is not meant to favour one tool over another.

Internet Scale Deployment Stack


For the last few years I have been involved in the ongoing development of an internet scale architecture for a traditional enterprise. In this article, I will give an overview of the logical building blocks of an internet scale architecture, based on projects that I have been involved in. In subsequent articles, I will look at particular solution choices, using solutions that I have direct experience of employing. I am assuming the use of the REST architectural style, implemented as JSON over HTTP(S). The constraints of the REST architectural style lead naturally to scalability, as shown in [1]. The layers of the logical architecture are shown in Figure 1.

Figure 1

A possible deployment is shown in Figure 2.

Figure 2

In the following sections I present the different elements of the internet scale deployment stack.


Clients

Our architecture is based on clients consuming REST services. Broadly speaking, the clients will fall into one of the following categories:

Browser based Single Page Applications (SPA) developed using a JavaScript framework such as Vue, React or Angular
Native mobile applications running under iOS or Android
Other native API clients, for example, server applications that consume third party services
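Whichever category a client falls into, at this scale it should tolerate the transient failures that are inevitable in a distributed system. As a hedged illustration (the function names and default values are mine, not from any particular framework), a minimal retry-with-exponential-backoff helper might look like this:

```python
import time

def backoff_delays(retries, base=0.5, factor=2.0, cap=30.0):
    """Exponential backoff schedule: base, base*factor, base*factor^2, ... capped."""
    return [min(cap, base * factor ** i) for i in range(retries)]

def call_with_retry(fn, retries=3, base=0.5, sleep=time.sleep):
    """Call fn(); on failure, retry after exponentially growing delays."""
    last_error = None
    for delay in [0.0] + backoff_delays(retries, base):
        if delay:
            sleep(delay)
        try:
            return fn()
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise last_error
```

Real clients would typically add jitter to the delays to avoid retry storms when many clients fail at once.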


Edge Services

You’re architecting for internet scale, which means a worldwide user base who expect their applications to load and respond quickly. You need to serve up content and data as close as possible to the user. Figure 2 illustrates how we distribute our core services and data over a number of geographic regions, so we require some form of global load balancing. It is useful, though, to do as much processing as we can even closer to the user than merely being in the same geographic region: for example, serving up static content such as images, or perhaps slowly changing reference data.

Your applications are undoubtedly going to be very attractive to a large class of bad hats. I’ve seen an API go live and get hit with a DDoS attack in mere minutes. A good way to deal with these sorts of threats is to head them off as close to their source as possible.

In a subsequent article, I will look at a suite of services from Akamai that I class as edge services. These include:

Distributed reverse proxies
Content Delivery
Global Traffic Management / Load balancing
DDoS protection
Bot protection

These Akamai services are executed relatively close to the user, often in the same datacentre as the user’s ISP.
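To make the global load balancing idea concrete, here is a deliberately simplified sketch of the routing decision a global traffic manager makes; it is not Akamai's actual algorithm, and the region names are invented:

```python
def pick_region(latencies_ms, healthy):
    """Route a request to the healthy region with the lowest measured latency.

    latencies_ms: mapping of region name -> measured latency in milliseconds
    healthy: set of region names that pass health checks
    """
    candidates = {r: ms for r, ms in latencies_ms.items() if r in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)
```

A real traffic manager also weighs capacity, cost and geographic policy, but latency plus health is the essence of the decision.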

API Management

Enterprises I have worked with have adopted an API First approach, building product teams around the delivery of REST APIs and creating an economy of API consumers. If your product is successful, you will have many potential users who want to use your APIs. You want to be able to:

Provide clear and comprehensive documentation of your APIs
Provide a self-service capability for prospective users to obtain access to APIs by obtaining credentials and authorisation grants
Monetise your APIs. This will likely involve creating usage plans which limit the number and frequency of API requests depending on how much the customer pays, usually with a free plan for a limited number of requests
Provide a sandbox environment for customers to test their API clients
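Usage plans of this kind are commonly enforced with a token-bucket rate limiter: each plan gets a sustained rate plus a burst allowance. The sketch below is a minimal in-process illustration (the class and parameter names are mine), not the implementation of any particular API gateway:

```python
import time

class TokenBucket:
    """Token-bucket limiter: refills 'rate' tokens/second, bursts up to 'capacity'."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start full, so bursts are allowed
        self.now = now                 # injectable clock, useful for testing
        self.last = now()

    def allow(self):
        """Consume one token if available; return whether the request may proceed."""
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a distributed gateway the bucket state would live in a shared store rather than in process memory, but the accounting is the same.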

Microservice Runtime

As described in [2] and [3], scaling an application in practice involves applying a scaling strategy along one or more of three axes:

X-Axis - run multiple copies of the same application
Y-Axis - decompose the application into functionally separate components
Z-Axis - partition the data

The most commonly used approach to Y-Axis scaling is to adopt a microservice architecture, which often incorporates a degree of Z-Axis scaling as well, since microservices often have their own distinct data stores. I am going to assume a microservice architecture, given the desirable Y and Z Axis scaling. We will discuss microservices in more detail in a dedicated article, but for the purposes of this article we note that microservices are small, independently deployable and testable services, organised around business capabilities, that interact in a loosely coupled fashion. The environment in which they are deployed is the microservice runtime.
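A minimal illustration of Z-Axis scaling is deterministic key-based sharding. The function below is a sketch (the hash choice and names are mine); note that a plain hash-mod scheme reshuffles most keys whenever the shard count changes, which is why real systems often prefer consistent hashing:

```python
import hashlib

def shard_for(key, num_shards):
    """Z-axis scaling sketch: deterministically map a key to one of N partitions."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Every service instance that uses the same function and shard count will route a given key to the same partition, with no coordination needed.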

Now to make things more scalable, we want to apply some X-Axis scaling to our microservices. We want to be able to do things like:

Scale up as demand increases
Scale down as demand decreases, so that we can save running costs

If we’re running many thousands of replicas, then failure of a microservice or the hardware it runs on is basically guaranteed. Also, we cannot manually assign microservices to compute nodes or manually configure load balancers, so we need to have automation to do the following:

Restart failed microservice instances
Assign microservices to compute nodes
Automatically configure load balancers
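The heart of that automation is a reconciliation loop: compare the desired state with the observed state and act on the difference. The toy function below sketches the idea in the spirit of what an orchestrator such as Kubernetes automates; it is not Kubernetes code, and the names are mine:

```python
def reconcile(desired, running, is_healthy):
    """One pass of an orchestrator-style reconciliation loop.

    Returns (to_start, to_stop): how many replacement instances to schedule,
    and which excess healthy instances to shut down.
    """
    healthy = [i for i in running if is_healthy(i)]
    to_start = max(0, desired - len(healthy))  # replace failed or missing replicas
    to_stop = healthy[desired:]                # scale down any surplus
    return to_start, to_stop
```

Run continuously, a loop like this delivers both self-healing (restarting failures) and elastic scaling (reacting to a changed desired count).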

We can pretty much take it as a given that our microservice runtime will be some form of container, usually a Linux container such as Docker. There are essentially two approaches to achieving the capabilities listed above. The first is a Platform as a Service (PaaS) approach, in which we are not aware of the containers; in the second, we have a platform that we use directly to schedule and manage containers. An example of the first is Cloud Foundry, which has a number of implementations, for example Pivotal Cloud Foundry. An example of the second is Kubernetes, which is available as a service on the Google Cloud Platform, Azure and IBM Cloud, as well as on premises. We will look at both Cloud Foundry and Kubernetes in subsequent articles. We will also look at service meshes such as Istio.


Persistence

Persistence comes in various flavours, from traditional relational databases to modern NoSQL datastores. Relational databases ruled the roost for a few decades but have proved difficult to scale.

X-Axis scalability usually ends up being challenged by the necessity of some form of shared storage, especially for stateful services, whilst Z-Axis scalability, via sharding of data, is normally achieved using very application-specific strategies. If you’re operating at internet scale you will likely want some or all of the following properties:

Zero downtime including when upgrading
Ability to cluster across geographic regions
Ability to grow to very large sizes

These are qualities that are either not normally associated with an RDBMS or that are very expensive to achieve. We can achieve all of these (and more), together with the X and Z axis scaling, using Cassandra, a linearly scalable, peer-to-peer, shared-nothing database. We will look at this in a subsequent article (in particular at the commercially supported DataStax Enterprise version).
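Cassandra's tunable consistency rests on a simple overlap rule: a read is guaranteed to observe the latest write when the read and write replica sets must intersect, i.e. R + W > N. A sketch of that arithmetic (function names are mine):

```python
def quorum(n_replicas):
    """Smallest majority of n replicas, e.g. QUORUM for replication factor n."""
    return n_replicas // 2 + 1

def read_sees_latest_write(n_replicas, write_nodes, read_nodes):
    """True when the read set must overlap the write set: R + W > N."""
    return read_nodes + write_nodes > n_replicas
```

Writing and reading at QUORUM gives strong consistency while tolerating minority-node failures; dropping either side to ONE trades that guarantee for lower latency.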


Messaging

Highly scalable applications often employ asynchronous processing, as this reduces the number of processes sitting in a waiting state while requests are processed, so we include a messaging capability in our architecture. We will look at Kafka, which is not only message-oriented middleware but also a distributed, replicated commit log, which makes it an implementation option for an event-sourced architecture [4].
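The essence of asynchronous processing is decoupling producers from consumers through a queue. The stdlib sketch below uses an in-process queue and worker threads as a stand-in for a broker such as Kafka; it illustrates the shape of the interaction, not Kafka's API:

```python
import queue
import threading

def process_async(messages, handler, workers=4):
    """Fan messages out to a pool of worker threads via a bounded queue."""
    q = queue.Queue(maxsize=100)  # bounded: producers block rather than overload consumers
    results, lock = [], threading.Lock()

    def worker():
        while True:
            msg = q.get()
            if msg is None:       # sentinel: shut this worker down
                break
            out = handler(msg)
            with lock:
                results.append(out)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for m in messages:
        q.put(m)
    for _ in threads:             # one sentinel per worker
        q.put(None)
    for t in threads:
        t.join()
    return results
```

With a real broker the queue is durable and distributed, and producers and consumers live in separate processes, but the producer/consumer decoupling is the same.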

Data Pipeline

It is unlikely that you are going to be on an entirely greenfield project, so we need mechanisms to integrate with traditional enterprise applications. In a future article we will deal with building a data pipeline from enterprise applications using Oracle GoldenGate, Oracle GoldenGate for Big Data, and Apache Spark.
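At its core, a change-data-capture pipeline of this kind replays an ordered stream of change events against a target store. The sketch below is illustrative only; the event shape and field names are mine, not GoldenGate's actual trail format:

```python
def apply_changes(events, target):
    """Replay an ordered stream of change events into a target key-value store.

    Each event is a dict like {"op": "upsert"|"delete", "key": ..., "value": ...};
    this shape is a hypothetical stand-in for a real CDC record format.
    """
    for ev in events:
        if ev["op"] == "upsert":
            target[ev["key"]] = ev["value"]
        elif ev["op"] == "delete":
            target.pop(ev["key"], None)  # idempotent: deleting a missing key is a no-op
    return target
```

Because the operations are idempotent, the stream can safely be replayed from the last checkpoint after a failure.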


Security

Security is a very important topic, and we cover some security perspectives of an internet scale architecture in other articles. We will really only scratch the surface, looking at OAuth2, OIDC and perhaps network and container security in Kubernetes.
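In OIDC, for example, the ID token is a JWT: three dot-separated, base64url-encoded segments, the middle one being the JSON claims. The sketch below decodes the claims segment for inspection; it deliberately skips signature verification, which production code must always perform against the issuer's keys:

```python
import base64
import json

def decode_jwt_claims(token):
    """Decode the claims segment of a JWT WITHOUT verifying the signature.

    For illustration only: production code must verify the signature and
    standard claims (issuer, audience, expiry) before trusting anything here.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))
```

Inspecting claims this way is handy for debugging, but any authorisation decision based on unverified claims is an open door.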


Observability

We’ve ended up with a very complex distributed system, which poses distinct operational challenges over and above those of a traditional monolithic system. Understanding how the system is performing and whether it is behaving as it should is critical, as is being able to troubleshoot issues after they have occurred. We will look at Prometheus, a monitoring tool particularly well suited to Kubernetes environments, and the ELK stack (Elasticsearch, Logstash and Kibana) for centralised log aggregation and analysis. If I have not died of old age, we will also look at Zipkin, for distributed tracing.
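Prometheus models request latencies with cumulative histograms: each bucket is labelled with an upper bound (`le`), and a bucket's count includes every observation at or below that bound. A stdlib sketch of that bucketing (the class name and bucket bounds are illustrative, not Prometheus client code):

```python
import bisect

class Histogram:
    """Prometheus-style histogram: raw counts per bucket, cumulative on export."""

    def __init__(self, buckets):
        self.bounds = sorted(buckets)
        self.counts = [0] * (len(self.bounds) + 1)  # final slot is the +Inf bucket
        self.n = 0

    def observe(self, value):
        # bisect_left finds the first bound >= value, matching 'le' semantics
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.n += 1

    def cumulative(self):
        """Export cumulative counts keyed by upper bound, as Prometheus does."""
        out, running = {}, 0
        for bound, c in zip(self.bounds + [float("inf")], self.counts):
            running += c
            out[bound] = running
        return out
```

Cumulative buckets are what make server-side quantile estimation and cross-instance aggregation cheap, which is why Prometheus exports histograms this way.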


Conclusion

As we have seen, internet scale architectures are needed by more than just the unicorns of the world, and the building blocks of such architectures are now within reach of architects working in more traditional enterprises. In subsequent articles we will delve more deeply into each of the areas we touched upon here, drawing upon hard-won personal experience. See you soon!


[1] Architectural Styles and the Design of Network-based Software Architectures, PhD dissertation, Roy Fielding, 2000
[2] The Scale Cube
[3] The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise, Martin Abbott and Michael Fisher, 2010
[4] CQRS Documents, Greg Young

Gareth Badenhorst

Senior Architect

Gareth Badenhorst is a Lead Solution Architect and Design Authority at Endava with more than two decades’ experience in IT. He has a background as an engineer and still enjoys getting his hands dirty whenever possible. He has travelled the journey from traditional service-oriented architectures to microservices and has recently focused on end-to-end architectural concerns. Gareth likes music, movies and microservices. He also rocks the best moustache this side of Movember.

