Deploying Acquia Cloud Platform CDN With Confidence
High-performing websites require thought and intentionality behind their design and implementation. A single web page today is composed of many requests that happen over the network. These requests could include the markup for the page you're looking at, CSS instructions for how the page should be styled, fonts, images, interactions with analytics tools, and much more.
A common method to improve performance for all those requests is to use a Content Delivery Network (CDN), which is now out-of-the-box on Acquia Cloud! But, how do you set it up? More importantly, why do we even use a CDN? Let's explore these questions and equip you with guidelines for how to set up Acquia Cloud Platform CDN on your own project, and articulate its importance.
Let's Start With How
Before we can get to the "why" of using a CDN, it would be helpful to have some vocabulary about what a CDN is and how it works.
Let's start with the concept of HTTP caching. The HTTP protocol has instructions that tell a browser it can cache a response for a period of time. There are a lot of configurations that vary in use across browsers and servers, but let's just focus on one of those instructions called the Cache-Control header. This header can tell a browser that it’s allowed to cache an HTTP response for a period of time.
Take an About page as an example. Say the server responds with a Cache-Control header with the value max-age=60,public. This tells the browser that it can cache the response for one minute. Here's a visual of what that looks like:
You can see that the second and third requests from that browser are cache hits; the requests never hit the server. Why? Because the browser was told it can cache the response for one minute.
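If you'd like to check what a given Cache-Control value is actually telling caches, here's a minimal sketch that pulls the max-age out of a header value. The function name max_age is my own for illustration, and it only looks at the max-age directive, not the many other directives the header can carry.

```shell
# Print the max-age (in seconds) found in a Cache-Control header value.
# Prints nothing when the directive is absent.
# Directive names are case-insensitive, so the value is lowercased first.
max_age() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr ',' '\n' \
    | sed -n 's/^[[:space:]]*max-age=\([0-9][0-9]*\)[[:space:]]*$/\1/p'
}

max_age 'max-age=60,public'   # prints 60
```

For the About page example, the answer is 60, meaning one minute of caching.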
This is great for that user. But, what about all the other users coming to the site? They won't get a cache hit. Introducing...the HTTP proxy cache! An HTTP proxy sits in between your browser and the server that the HTTP request is going to. By default, an HTTP proxy cache just lets HTTP requests pass back and forth between the browser and the server. These HTTP proxies are allowed to respect the cache rules of the HTTP protocol, hence why we call them proxy caches. So, imagine many users going to the site. Each one will have their own browser cache, but the proxy will have its own cache. Here's what that would look like:
In this instance, user one goes to the site and the request goes all the way to the server because neither the user's browser nor the proxy cache has a cached copy yet. But, when user two goes to the site, even though the user doesn't have a browser cache yet, the proxy cache does. So, the request doesn't go all the way back to the server; it just goes to the proxy cache and the response is returned.
Now, imagine there are 1,000 distinct users going to the site, all within that single minute. Only the first request would go all the way back to the server. The rest of the requests would be served from the proxy cache.
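To make that arithmetic concrete, here's a small simulation sketch. It assumes a single shared proxy cache that stores one response for max-age seconds and ignores everything else a real proxy does. The function name origin_requests and the evenly spaced arrival times are made up for illustration.

```shell
# Simulate a shared proxy cache in front of an origin server.
# Reads request arrival times (in seconds, ascending, one per line) on stdin
# and prints how many of those requests had to go to the origin.
origin_requests() (
  ttl=$1
  awk -v ttl="$ttl" '
    BEGIN { expires = -1; origin = 0 }
    {
      if ($1 >= expires) {   # nothing cached yet, or the cached copy is stale
        origin++
        expires = $1 + ttl   # the proxy keeps the fresh response for ttl seconds
      }
    }
    END { print origin }
  '
)

# 1,000 users arriving evenly across one 60-second window, with max-age=60.
awk 'BEGIN { for (i = 0; i < 1000; i++) print i * 0.06 }' | origin_requests 60   # prints 1
```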
Why are we talking about proxy caches? Because, in large part, that's what CDNs do; that's how they work, and when you're thinking about how Acquia Cloud Platform works, it's good to keep this in mind.
Why Use a CDN?
Why would you use a CDN with Acquia Cloud, especially knowing that it already comes with a proxy cache called Varnish? Doesn't that seem like it's just duplicating functionality? Not exactly, especially when you think about the geo problem.
The Geo Problem
You're not sitting in the same room as the server that rendered this blog post to you. You might be miles away from the server, and network latency can have a big impact on how quickly the site responds. With Acquia Cloud, you have some flexibility over what geographic region your servers are in. Let's say that About page we talked about earlier in our example was hosted on servers on the East Coast of the United States. If you also live on the East Coast of the United States, you're in luck. But, what if you're viewing that page from Kenya?
Your browser is going to wait for each request to the server (including the Varnish proxy cache) on the East Coast of the United States and back. The network latency in this case can have a critical impact on the site's perceived performance to the user.
Well, what if we could serve that content from a server closer to the user? That is to say, what if we had a network of servers that could serve this content and the user gets the content from the server closest to them? That would certainly help with the geo problem! Introducing...the Content Delivery Network! With a CDN like Acquia Cloud Platform CDN, your users will get content from a server closest to them (after caching rules are applied).
Other Benefits of a CDN
There are other benefits to a CDN besides addressing the geo problem. It can reduce the overall requests that hit your Acquia subscription, which might help you target a lower subscription level. It can help improve your site's performance under peak load.
Consider the fact that any request served from the CDN is a request that does not consume resources like memory or Central Processing Unit (CPU) on the application or database servers. There are also security benefits for some CDNs, which are worth investigating on a case-by-case basis to see if they apply to you.
How to Set Up Acquia Cloud Platform CDN
Acquia Cloud Platform CDN comes with your Acquia Cloud Enterprise subscription. However, there are some steps to get it set up, which we'll discuss here.
On the tech side, there are a few interesting points:
- It's supported by the Acquia Purge module, which means you can do active purging of expired content.
- It doesn't support customizations: it's not compatible with a custom Varnish config (VCL) and it really only responds to the Cache-Control and X-Drupal-Cache-Tags headers from Drupal (it’s a little more complicated than that, but that’s the basics).
- It's not compatible with other HTTP proxies in front or behind it.
- It uses Fastly under the hood.
- It's still in Beta as of the time of this writing, so your setup process may vary from what's laid out here.
Here are two useful documentation links if you'd like to read more:
And now, the moment you've been waiting for: the steps for setting up Platform CDN. These are my personal notes, and since Platform CDN is in beta the process may change, but here is what I'm recommending:
1. First, talk with the Acquia Account manager to confirm Platform CDN is available on the subscription.
2. Add all domains you want supported to all environments in Acquia.
3. Add SSL certificates on each environment, and ensure those certificates cover all domains on their respective environment.
4. Create a Support ticket to enable Platform CDN. Be sure to clearly outline the Application as it is named in Acquia, and which Environments and Domains you want supported.
5. Expect some back-and-forth with Acquia support as you iron out details of the setup. For example, at this point, you may go through setup of the Purge module.
6. Once Acquia confirms it's set up on their side, verify the CDN is working (we'll talk through verification later).
7. Update your Domain Name System (DNS) records to a low Time to Live (TTL) so that if you switch over to Platform CDN and it doesn't work, you can quickly switch it back (optional).
8. Update your DNS records. This will make Platform CDN live.
9. Again, verify the CDN is working.
10. Last, update your DNS records to a higher TTL (optional).
The overall process may take some time. I would set expectations at 3-4 weeks to include time to do testing, roll out code changes, and coordinate the rollout with Acquia.
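For the TTL steps (7 and 10), it helps to confirm what TTL your records are actually being served with. Here's a minimal sketch that reads the answer section printed by dig and shows the TTL next to each record. The helper name record_ttls and the sample record are hypothetical, and it assumes dig is installed on your machine.

```shell
# Print "TTL TYPE TARGET" for each record in `dig +noall +answer` output on stdin.
# Answer lines look like: NAME  TTL  CLASS  TYPE  DATA
record_ttls() {
  awk 'NF >= 5 && $1 !~ /^;/ { print $2, $4, $5 }'
}

# A sample answer line with a made-up 300-second TTL.
printf '%s\n' 'stg.example.com.   300   IN   CNAME   acquia.map.fastly.net.' | record_ttls
# prints: 300 CNAME acquia.map.fastly.net.
```

Against a live domain, you would run something like `dig +noall +answer stg.example.com | record_ttls`. Keep in mind that a resolver may report the time remaining on its own cached copy rather than the value you configured.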
How Do You Verify It Works?
The last thing you want is to switch your DNS over to Platform CDN only to realize some configuration is wrong and your site is down. You can easily prevent this scenario by verifying it's set up correctly, and below we'll go through five things to check. You’ll want to wait to do these verification steps until after Acquia has confirmed the CDN is set up on their side.
1 - Verify SSL
First, verify SSL is set up correctly. Of the five verification checks I list here, this is the only one you can do prior to cutting over your DNS. To verify Secure Sockets Layer (SSL), you can start by verifying that the SSL certificate on the server environment itself is correct. The way I do this is a bit roundabout, but it works.
Get the public IP of one of the load balancers using a tool like nslookup. Its domain usually follows a pattern like sitenamestg.prod.acquia-sites.com, where sitename is the name of your subscription. Then, pick one of your custom domains and set it to that IP in your /etc/hosts file (this file may be located in a different place depending on your operating system). Here's an example walking through these steps:
First, get the IP of the load balancer:
    $ nslookup examplestg.prod.acquia-sites.com
    Server:   127.0.0.1
    Address:  127.0.0.1#53

    Non-authoritative answer:
    Name:     examplestg.prod.acquia-sites.com
    Address:  151.101.41.193
Now, we set your custom domain to this IP in your /etc/hosts:
    $ vim /etc/hosts
    ...
    151.101.41.193 stg.example.com
Finally, we can open up our browser and check that the SSL certificate is valid. Both Firefox and Chrome will show a padlock in the address bar.
If you're on Chrome, you can additionally check what IP address stg.example.com resolved to by looking at the headers of the request in the network tab:
Now, repeat these steps for each domain on each environment you set up. If you're planning a DNS cutover for a new site launch, you can even test the live domain with this tactic. For example, if you set up the domain "www.example.com" on your PROD environment, but you don't want DNS to point there yet, you can still set up the SSL certificate and verify it works using this method.
Last, remember to remove those entries from your /etc/hosts file!
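If you'd rather not touch /etc/hosts at all, curl can pin a hostname to an IP for a single request with its --resolve option. This is an alternative to the browser check above, not part of Acquia's instructions. The sketch below only prints the command so you can review it before running it. The helper name pinned_curl_cmd is hypothetical.

```shell
# Print a curl command that requests https://HOST/ with HOST pinned to IP.
# -I asks for headers only; -sS hides the progress meter but still shows errors.
# There is deliberately no -k, so an invalid certificate makes curl fail loudly.
pinned_curl_cmd() {
  printf 'curl -sSI --resolve %s:443:%s https://%s/\n' "$1" "$2" "$1"
}

pinned_curl_cmd stg.example.com 151.101.41.193
# prints: curl -sSI --resolve stg.example.com:443:151.101.41.193 https://stg.example.com/
```

If the printed command exits with a certificate error when you run it, the certificate on that environment doesn't cover the domain.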
2 - Verify DNS is Pointing to Fastly
This is an easy check, but it requires that you have already updated your DNS records according to the setup instructions from Acquia support. Take each domain and verify it's pointing to the correct location using a tool like nslookup.
    $ nslookup stg.example.com
    Server:   127.0.0.1
    Address:  127.0.0.1#53

    Non-authoritative answer:
    stg.example.com canonical name = acquia.map.fastly.net.
    Name:     acquia.map.fastly.net
    Address:  151.101.189.193
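With more than a handful of domains, eyeballing each lookup gets tedious. Here's a minimal sketch that checks nslookup output for a canonical name under fastly.net. The helper name points_to_fastly is my own, and it assumes your records are CNAMEs like the one above. If Acquia support gave you a different target, adjust the pattern to match.

```shell
# Succeed if nslookup output on stdin shows a canonical name under fastly.net.
points_to_fastly() {
  grep -Eq 'canonical name = ([^[:space:]]*\.)?fastly\.net\.?[[:space:]]*$'
}

# Sample line copied from the lookup output above.
sample='stg.example.com canonical name = acquia.map.fastly.net.'
if printf '%s\n' "$sample" | points_to_fastly; then
  echo "points to Fastly"
else
  echo "NOT pointing to Fastly"
fi
# prints: points to Fastly
```

Against live DNS, you would pipe the real lookup in instead, for example `nslookup stg.example.com | points_to_fastly`.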
3 - Verify HTTP and HTTPS Ports Are Open
This is also an easy check. It may seem unnecessary, but doing this can give you assurance that at least the network path from your local computer to the destination ports is working. I love doing this because I know that if it works, any issues I do run into are not firewall related. There are a variety of port-checking tools you can use; here we'll use Netcat (nc).
    $ nc -z -w 1 151.101.189.193 80
    $ echo $?
    0
You can see the exit code was 0 which means it succeeded. Now, we'll check the HTTPS port.
    $ nc -z -w 1 151.101.189.193 443
    $ echo $?
    0
4 - Verify You Get Cache Hits From the CDN
Let's say you think the CDN is set up and working correctly and the site comes up, how do you know you're getting cache hits from the CDN and not Varnish? That is, how do you know the request is being returned from the CDN instead of going all the way to the server environment and back? We can inspect the HTTP response headers from the server to tell us this. To do this, we'll use curl, though you can use any HTTP client that shows you the HTTP response headers.
    $ curl -ksD /dev/stdout -o /dev/null "https://stg.example.com"
    ...
    cache-control: max-age=60, public
    x-cache: MISS, MISS
    x-cache-hits: 0
    ...
You'll see the x-cache header had MISS, MISS. This means the request was a miss on the CDN and a miss on Varnish. More importantly, note that the x-cache-hits value is 0. This means Varnish has had no cache hits for this request. So, let's make that request again.
    $ curl -ksD /dev/stdout -o /dev/null "https://stg.example.com"
    ...
    cache-control: max-age=60, public
    x-cache: MISS, HIT
    x-cache-hits: 1
    ...
Great! We see a cache hit! But, that was a hit from Varnish. How do we know? Because the x-cache-hits header incremented by 1. The x-cache-hits header is controlled by Varnish, not the CDN. So, what we want to see is a request where that value does not increase. Let's make the request again.
    $ curl -ksD /dev/stdout -o /dev/null "https://stg.example.com"
    x-cache: MISS, HIT
    x-cache-hits: 1
Great! We see the x-cache-hits stayed at 1. This means the result came back from the CDN; it didn't go to the server environment.
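If you repeat this check across many URLs, a couple of small helpers can do the reading for you. This is a rough sketch that follows the interpretation of x-cache-hits described above (the counter goes up on a Varnish hit and stays put when the CDN answers). The names header_value and classify are hypothetical, and the header semantics may differ on your setup.

```shell
# Print the value of one response header (case-insensitive name) from headers on stdin.
header_value() {
  awk -v name="$1" '
    BEGIN { name = tolower(name) }
    {
      sub(/\r$/, "")                 # strip the CR that HTTP header lines end with
      i = index($0, ":")
      if (i > 0 && tolower(substr($0, 1, i - 1)) == name) {
        v = substr($0, i + 1)
        sub(/^[ \t]+/, "", v)
        print v
      }
    }'
}

# Compare the previous and current x-cache-hits values, per the reasoning above.
classify() {
  if [ "$2" -gt "$1" ]; then
    echo "varnish"   # counter increased: Varnish served it
  elif [ "$2" -eq "$1" ] && [ "$2" -gt 0 ]; then
    echo "cdn"       # counter unchanged: request never reached Varnish
  else
    echo "miss"      # counter still 0: went all the way to the server
  fi
}

# Headers copied from the second response above.
second='cache-control: max-age=60, public
x-cache: MISS, HIT
x-cache-hits: 1'
hits=$(printf '%s\n' "$second" | header_value x-cache-hits)
classify 0 "$hits"   # prints varnish
classify 1 "$hits"   # prints cdn
```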
5 - Verify Browser Cache Is Working
If you've passed the last four checks, you're in good shape. The CDN is working. However, you probably also want to check that your browser's cache is working. It's an easy check to do. Here is an example of how to check it in Firefox:
Here you see the network tab. The "Transferred" column will show "cached" if it was served from browser cache. Be sure to look at different asset types to make sure they are getting cached: HTML, JS, CSS, Fonts, Images.
What Are Good Cache Settings?
Now that you know how a CDN works, why you would use one, and how to set up Acquia Platform CDN, you might be wanting to dig deeper into tuning your cache settings. How do you know what good cache settings are?
First, it's important to understand that you don't simply cache a "page," you cache the resources that make up the page. A given page might comprise a variety of resources. Here's an example that I pulled that shows a breakdown of what types of resources make up the "page" by the size of each resource:
Resource: https://www.webpagetest.org/
You can see that over 95 percent of the size of the page is JS, CSS, Images, and Fonts, which are all highly cacheable. By default with Drupal, those will be cached for 14 days! That's pretty good, and depending on your site, you may consider increasing or decreasing that value, which you can find in the .htaccess file.
The HTML on the other hand is far trickier. The HTML may be highly static content like an About page that you don't expect to change very often. Maybe you're OK if it takes 24 hours for someone to see an updated version of the content; that's pretty great cacheability for HTML content. But, what if that HTML has pricing or inventory of a product? That’s not very static, so if you do let it be cached you don't want it to be cached very long. A user seeing the wrong price might result in an unhappy customer.
HTML caching in Drupal is configured under Configuration > Development > Performance. Under "Caching," you'll see a setting called "Browser and proxy cache maximum age." If you change this value, keep in mind that any HTTP cache (like a user's browser or a CDN) that already holds a response will keep it for the maximum age it received at the time, so the new value only takes effect as those cached copies expire or are purged.
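Cache lifetimes are expressed in seconds in the Cache-Control header, so it's handy to convert from the durations people actually talk about. Here's a minimal sketch. The to_seconds helper and its m/h/d suffixes are my own convention, not anything Drupal defines.

```shell
# Convert a duration such as 30m, 24h, or 14d to seconds.
# A bare number is treated as seconds already.
to_seconds() {
  case "$1" in
    *m) echo $(( ${1%m} * 60 )) ;;
    *h) echo $(( ${1%h} * 3600 )) ;;
    *d) echo $(( ${1%d} * 86400 )) ;;
    *)  echo "$1" ;;
  esac
}

to_seconds 14d   # prints 1209600 (the 14-day asset lifetime mentioned above)
to_seconds 24h   # prints 86400

# Assumption: on a Drupal 8+ site with Drush available, the same setting can be
# changed from the command line, along the lines of:
#   drush config:set system.performance cache.page.max_age "$(to_seconds 24h)"
```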
Here are some reasons for a high cache maximum age:
- You have the Purge module set up to actively purge expired content.
- Your content is highly static.
Here are some reasons for a low cache maximum age:
- You are not using the Purge module to actively purge expired content.
- Your content is highly dynamic.
There are even reasons for disabling cache in certain scenarios. For example, some content may be sensitive and you want to ensure no one has a copy of it, including the browser's cache (read this drupal.org issue for an example).
If you read around, you'll find recommendations that vary widely. Some are conservative, around 1 to 60 minutes, while others suggest 6 to 12 hours. Unfortunately, it's difficult to make general recommendations about caching policies. The truth is, it depends. And, for complex sites, you're not making one policy for all content and all users; it may vary by user role or by type of content.
For example, you may want content pages to have a high cache maximum age, but product pages to have a low one. The policies will also depend on what other caching headers you are using, a key one being the Vary header. Ultimately, you'll need to put some thought and rigor into deciding on what policies best suit your needs.
It's worth repeating that high-performing websites require thought and intentionality behind their design and implementation, and cache settings are a fundamental aspect of high performance.