Do you own or manage a WordPress site? Is it complex, with a lot of traffic? I’m going to show you the ideal architecture to host a highly available and scalable WordPress site on AWS.
This blog is intended to show a distributed architecture in AWS for WordPress CMS websites. You will be able to identify the different layers of services needed to run a high-traffic WordPress site.
The Concepts of a Highly Scalable and Available Environment
Let’s start by looking at what a highly scalable and available environment means—and what it includes:
High Availability: In IT, this means a system can operate for a long time without service disruption or interruption, usually because its components are redundant. (In our case, redundant infrastructure will provide the high availability. Thanks, AWS, for making this easier!)
Distributed Services (Loose Coupling): The practice of splitting a system into separate components that communicate over the same network. We will do this to spread the load across resources and to give each service its own dedicated hosts.
Scalability: A system’s ability to monitor the user demand and automatically increase or decrease resources. This scalability will be provided by the following AWS Resources: AWS ELB, AWS EC2, AWS S3, AWS RDS.
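To put a number on the redundancy idea behind high availability, here is a minimal sketch. It assumes failures are independent (real outages often are not), and the 99% figure for a single instance is made up for illustration.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant components, assuming independent failures.

    The group is down only when every copy is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours over a 365-day year."""
    return (1.0 - availability) * 365 * 24


# Hypothetical figure: one instance that is up 99% of the time.
for copies in (1, 2, 3):
    a = combined_availability(0.99, copies)
    print(f"{copies} instance(s): {a:.6f} available, "
          f"{downtime_hours_per_year(a):.2f} h downtime/year")
```

Under these assumptions, each extra copy multiplies the chance of a full outage by the failure rate of one instance, which is why redundancy pays off so quickly.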
One essential part of innovation is always to stay up to date and implement the best technology available.
First Things First
If you are planning to follow this blog, it is because you are looking to migrate to Amazon Web Services or you already have an environment or some resources allocated there.
If you are already on AWS, we have also compiled a very concise checklist, which is available here.
AWS Services Explained
- Route53: This service will help us manage all the incoming requests (traffic) for our domain. AWS Route53 will also be our DNS manager, so we will point our domain registrar at the Route53 name servers. Once the domain has been migrated, we can add A, CNAME, TXT and many other record types.
- AWS VPC: The AWS VPC is a service that gives us a private network in which to allocate our cloud computing resources. Nothing inside it is reachable from outside unless we explicitly allow it.
- AWS Private Subnet: The private subnet is the subnet in which we are going to deploy the resources we don’t want outsiders to have access to. In this case, our database will be only accessible by the application.
- AWS Public Subnet: The public subnet is the subnet in which we are going to deploy the resources we want to make public — such as the server for our website.
- AWS S3: Amazon Simple Storage Service will be our content storage solution. Keeping the content here makes it available to every WordPress instance as the site scales out or in.
- AWS CloudFront: CloudFront combined with S3 will help us deliver content to end users faster. Our WordPress multimedia content will be cached across the CloudFront edge locations, which our application will use to reduce latency. Users will be served by the closest edge location available. So faster content = faster site.
- AWS Load Balancer: Here is the tricky part. The AWS ELB will distribute the traffic load across the instances currently in service, and how many of those there are depends on how the auto scaling is set.
- AWS EC2: The AWS service for acquiring computing power in the form of instances, also called “VMs”. AWS EC2 will host our WordPress site and the files we need.
- AWS Autoscaling: This will be the solution for our highly available and scalable WordPress site. AWS Autoscaling will take care of always having a minimum quantity of instances available for the public, and in case something goes wrong, it will replace the instance with a healthy one. This way our site will always be available. Autoscaling will also benefit our WordPress by creating instances based on the traffic demand and will help us to save costs while the traffic is low.
- AWS RDS and Multi-AZ: RDS will be the service in which we host our WordPress database. The service is fully managed by AWS, which means we won’t need to worry about database management anymore. RDS instances can also replicate to read replicas, which adds even more scalability, and enabling the Multi-Availability Zone (Multi-AZ) feature will provide high availability. Keep an eye on this, however, since Multi-AZ roughly doubles the price of the RDS instance.
- AWS CloudWatch: AWS CloudWatch will be our trusted supervisor. CloudWatch will be in charge of monitoring all the resources in our AWS account. It will keep an eye on the metrics that AWS resources report by default, such as CPU usage, disk I/O and networking. Memory usage is not among the default EC2 metrics, so collecting it requires installing the CloudWatch agent on the instance.
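To make the VPC and subnet entries in the list above more concrete, here is a small sketch of how the address space could be divided. It uses only Python’s standard `ipaddress` module; the `10.0.0.0/16` range, the /24 subnet size and the zone names are assumptions chosen for illustration, not values prescribed by AWS.

```python
import ipaddress

# Hypothetical VPC range; pick whatever private range fits your network.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC into /24 blocks and hand out the first four.
blocks = list(vpc.subnets(new_prefix=24))[:4]

# Two Availability Zones (names are placeholders), with one public and one
# private subnet in each, so every tier survives the loss of a zone.
plan = {
    ("public", "az-a"): blocks[0],   # load balancer and web servers
    ("public", "az-b"): blocks[1],
    ("private", "az-a"): blocks[2],  # database
    ("private", "az-b"): blocks[3],
}

# Note: AWS reserves five addresses in every subnet, so the usable
# host count is lower than the raw size printed here.
for (tier, zone), cidr in plan.items():
    print(f"{tier:<7} {zone}: {cidr} ({cidr.num_addresses} addresses)")
```

Spreading each tier over two zones is what lets the load balancer, Auto Scaling and Multi-AZ RDS keep working when one zone has problems.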
Did I forget about EFS? Nope. In my experience, EFS has not been a huge help when the only goal is sharing the WordPress files. However, you can try using EFS; that way, the content will be replicated faster and more easily.
If you want to learn more about the technologies and the approach that should be taken to create this kind of environment, take a look at this post.
Millions of Page Views Served
Here’s how this environment will be able to serve millions of page views/visitors:
Once visitors enter your domain in their browser, the request goes out over the internet and is routed to your DNS manager. The DNS manager (Route53) resolves the domain to the specified server, and the web server then serves the application.
What will happen when your traffic increases? Here is an example:
Let’s say you have set autoscaling to a minimum of one instance and a maximum of three. Based on different metrics, your environment will be able to scale out depending on the load it receives.
In the image above, we can see that depending on the aggregate load, the environment continues to satisfy the demand by creating more instances as it grows. The same happens the other way.
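The one-to-three example can be sketched as a simple proportional rule. This is a simplified illustration, not the algorithm AWS Auto Scaling runs internally; the 50% CPU target and the CPU readings are assumptions made up for the demo.

```python
import math


def desired_instances(current: int, metric: float, target: float,
                      minimum: int = 1, maximum: int = 3) -> int:
    """Proportional scaling rule clamped to the group's min and max size.

    If the average metric is twice the target, ask for twice the instances.
    """
    if current < 1 or target <= 0:
        raise ValueError("current must be >= 1 and target must be > 0")
    wanted = math.ceil(current * metric / target)
    return max(minimum, min(maximum, wanted))


# Hypothetical average CPU readings (%) with a 50% target.
instances = 1
for cpu in (30, 80, 90, 75, 20, 10):
    new_size = desired_instances(instances, cpu, target=50)
    print(f"avg CPU {cpu:>2}% on {instances} instance(s) -> {new_size}")
    instances = new_size
```

In this run the group grows from one instance to the maximum of three as the load climbs, then shrinks back to one when it drops. It never leaves the configured range, however high or low the metric goes.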
This means a lot when you are calculating the TCO of your cloud environment. It also means you won’t run out of memory or other resources; instead, you will have a more available system.
And you might be wondering, What is going to happen to my AWS Resources? Here’s a visual:
The answer is simple: More demand or traffic, more servers to meet it.
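Here is a rough sketch of what “more servers to meet it” means for each instance. It spreads requests in round-robin order, which is one common load-balancing strategy; the instance names and the request count are hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute(requests: int, instances: list[str]) -> Counter:
    """Hand out requests to instances in round-robin order and count them."""
    if not instances:
        raise ValueError("at least one instance is required")
    counts = Counter()
    pool = cycle(instances)
    for _ in range(requests):
        counts[next(pool)] += 1
    return counts


# Hypothetical instance names; the same 900 requests hit a pool of 1, 2, then 3.
for pool in (["web-1"], ["web-1", "web-2"], ["web-1", "web-2", "web-3"]):
    print(len(pool), "instance(s):", dict(distribute(900, pool)))
```

With three instances in service, each one handles a third of the traffic it would have faced alone.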
Server Caching/WordPress Caching
Server-side caching or WordPress-side caching is done by a plugin or is built into the host. Without it, whenever someone requests the homepage, for example, the request is passed to the database to retrieve the homepage information. Caching creates a temporary file (I say temporary because you can specify the expiration), so when a request comes in, the server checks for the generated file instead of sending the request to the database.
This way, we can serve pages and content to our users faster and reduce the load on the database.
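The idea can be sketched in a few lines. This is a minimal in-memory illustration, not how any particular WordPress caching plugin is implemented; `PageCache`, `render_from_database` and the 300-second expiration are hypothetical names and values.

```python
import time
from typing import Callable


class PageCache:
    """Keeps rendered pages in memory until they expire."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._pages: dict[str, tuple[float, str]] = {}  # path -> (expires_at, html)

    def get(self, path: str, render: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._pages.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]                    # hit: no database work
        html = render(path)                    # miss or expired: rebuild the page
        self._pages[path] = (now + self.ttl, html)
        return html


stats = {"db_renders": 0}


def render_from_database(path: str) -> str:
    """Stand-in for WordPress building a page from database queries."""
    stats["db_renders"] += 1
    return f"<html>content of {path}</html>"


cache = PageCache(ttl_seconds=300)
for _ in range(1000):
    cache.get("/", render_from_database)
print("requests: 1000, database renders:", stats["db_renders"])
```

A thousand requests for the homepage cost a single trip to the database until the entry expires, which is where the speed-up comes from.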
If you don’t yet have a caching system implemented, make sure you add one, as it speeds up your website dramatically.
Content Delivery Network
In a few words, a content delivery network (CDN) is a network of servers on which your content is hosted and from which it is delivered to your visitors. Using a CDN will help you get your content to users around the world faster than serving it all from your own server.
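As a toy model of that behavior, the sketch below sends a visitor to the edge with the lowest latency and lets that edge keep a local copy after the first request. The edge names, latencies and file path are invented for illustration, and real CDNs such as CloudFront are far more sophisticated.

```python
class Edge:
    """One edge location with its own local copy of recently requested files."""

    def __init__(self, name: str):
        self.name = name
        self.cache: dict[str, bytes] = {}

    def fetch(self, path: str, origin: dict[str, bytes]) -> tuple[bytes, str]:
        if path in self.cache:
            return self.cache[path], "hit"
        body = origin[path]          # first request travels to the origin
        self.cache[path] = body      # later requests are served locally
        return body, "miss"


def nearest_edge(latency_ms: dict[str, float]) -> str:
    """Pick the edge with the lowest latency for this visitor."""
    return min(latency_ms, key=latency_ms.get)


# Hypothetical origin content, edge names, and latencies.
origin = {"/wp-content/uploads/logo.png": b"PNG bytes"}
edges = {name: Edge(name) for name in ("frankfurt", "sao-paulo", "tokyo")}
visitor_latency = {"frankfurt": 18.0, "sao-paulo": 210.0, "tokyo": 240.0}

edge = edges[nearest_edge(visitor_latency)]
for _ in range(3):
    _, status = edge.fetch("/wp-content/uploads/logo.png", origin)
    print(edge.name, status)
```

Only the first request reaches the origin; the following ones are answered from the edge closest to the visitor.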
Based on the CPU and memory metrics, our stack will be able to scale out or scale in as needed. This way, response times stay consistent across all the servers and no instance is overloaded on CPU or memory (overload is what causes those annoying 500 errors).
On AWS we have different instance types that we can use depending on the workload we have. It’s important to choose the right instance type.
If you don’t know which to use, read this blog.
Using the correct instance type for our WordPress will definitely help, as will sizing the resources it needs to run.
And there you have it: you now know how your site can support millions of page views with a distributed system.
Backup and Disaster Recovery
It is essential to have a backup strategy as well as a disaster recovery solution in case something happens to our WordPress installation or our environment. Luckily, AWS offers a lot of redundancy between all its services.
There are different ways in which we can protect our application and users’ data:
- WordPress plugins to back up the site (S3).
- Disaster recovery for infrastructure (physical).
- Backup for user data (DB — RDS).
- Backup for application code.
Amazon Web Services already provides backup solutions for its services, so we can create RDS backups with one click and configure backups for EC2.
- To increase performance, use a plugin for database, query and page caching.
- Increase the RDS performance with a dedicated Aurora RDS.
- Use S3 and CloudFront to deliver the content faster.
- Enable alarms on AWS CloudWatch.
- Enable the billing alarm to avoid unwanted charges for usage.
- Use Reserved Instances: if you are going to stay in AWS for a long time, reserving capacity can save you 30 percent or more compared with On-Demand pricing.
- Use a version control system to track your changes and deploy to the instances.
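For the alarm tips in the list above, it helps to see how a threshold alarm decides when to fire. The sketch below is a simplified model of the “breach for N consecutive periods” idea; the state names mirror the ones CloudWatch uses, but the function, readings and threshold are hypothetical.

```python
def alarm_state(datapoints: list[float], threshold: float, periods: int) -> str:
    """Return "ALARM" when the last `periods` datapoints all exceed the threshold."""
    if periods < 1:
        raise ValueError("periods must be at least 1")
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Hypothetical 5-minute average CPU readings (%).
cpu = [42, 55, 83, 91, 88]
print(alarm_state(cpu, threshold=80, periods=3))      # last three are all above 80
print(alarm_state(cpu[:4], threshold=80, periods=3))  # the 55 reading breaks the streak
```

Requiring several consecutive breaches keeps a single spike from paging you or triggering an unnecessary scale-out.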
This article outlines what I consider an ideal architecture for a high-performance/high-traffic WordPress site. As you can see, Amazon provides us with all the tools and technologies to do it, but sometimes it can be challenging.
As always, if you think I missed anything, please post it in the comments!