Tiered Fallback Images

A few years back, I wrote about using nginx to serve fallback images from another domain, when those images were not available on the local filesystem. Today, I ran into the need to do something very similar, but with more than one level of fallback servers to try for the images on a staging site.

For some background on the setup, the images are all stored on S3 using Human Made’s S3 Uploads plugin on production as well as on the staging site. Every now and then, the production database is synced over to the staging site so that there is a complete set of production content to work with on staging. As part of this sync, all the image records come over as well, but since staging is pointed to a different S3 bucket, the images don’t work. A simple solution would be to copy the images from the production bucket to the staging bucket, but this results in a 2x cost increase for storage, which is less than ideal. Instead, I wanted a tiered image fallback approach that would serve the first image found, in this order:

  1. Local Files (on the staging server)
  2. Staging S3 Bucket
  3. Production S3 Bucket

In this way, all images ultimately fall back to the production S3 bucket, which means that any image records that come over in the production sync still work.


By default, the S3 Uploads plugin replaces all the URLs to media with the S3 bucket URL. Since I wanted more control over where the images are served from, I needed to disable this behavior. Luckily, all this requires is defining a constant in wp-config.php:


The addition of this constant prevents rewriting of the media URLs, so they now all point back to my staging site domain.

Now that all the images were pointing back to my server, I just needed to set up the fallback logic in nginx. In the original approach, I defined an @image_fallback location block that used proxy_pass to proxy images from the other server. With that approach, however, if the upstream returns a 404 error, that error is passed directly on to the client. I needed a way to detect that error and try yet another fallback. It turns out there are a couple of nginx configuration options that allow me to do just that: proxy_intercept_errors and error_page.

Here’s a modified version of the old image fallback location blocks, with a tiered fallback strategy:

location ~* ^.+\.(svg|svgz|jpg|jpeg|gif|png|ico|bmp)$ {
    try_files $uri $uri/ @stage;
}

location @stage {
    rewrite ^/wp-content/(.*) /$1; # In S3, the path starts with /uploads
    proxy_pass http://stagebucket.s3-website-us-east-1.amazonaws.com$uri;
    proxy_intercept_errors on;
    error_page 404 = @production;
}

location @production {
    rewrite ^/wp-content/(.*) /$1; # In S3, the path starts with /uploads
    proxy_pass http://prodbucket.s3-website-us-east-1.amazonaws.com$uri;
}

By enabling proxy_intercept_errors, nginx is able to detect the 404 error when the stage bucket does not have a copy of the image. The error_page declaration then instructs nginx to pass any 404 errors to the @production block, where we try the other bucket.
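
The lookup order itself can be illustrated outside of nginx with a small shell sketch. This is only a simulation under stated assumptions: the function name resolve_tier is made up for illustration, and plain directories stand in for the local filesystem and the two S3 buckets. It prints the first tier that has the file, which is the same first-match behavior the location blocks are meant to give.

```shell
# Hypothetical helper: print the first tier (directory) that contains the file.
# Directories stand in for the local disk, the staging bucket, and the
# production bucket; nginx does the real work with try_files and error_page.
resolve_tier() {
    file=$1
    shift
    for tier in "$@"; do
        if [ -f "$tier/$file" ]; then
            printf '%s\n' "$tier"
            return 0
        fi
    done
    # No tier had the file: the equivalent of the final 404
    return 1
}
```

Calling resolve_tier uploads/image.jpg ./local ./stage ./prod prints ./prod when only the last directory holds the file, and returns a non-zero status when none of them do.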

S3 Gotchas

If you’re using S3 for the fallbacks, keep the following in mind, as these caused a few snags along the way. First, you’ll need to enable static website hosting on your bucket and use that website endpoint URL in the proxy_pass declarations, or else S3 will throw 403 errors. Second, watch out for unintentional duplicate slashes in your URLs. S3 is very literal in its parsing of URLs – the path /uploads/1/image.jpg is treated differently than //uploads/1/image.jpg.
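
When checking URLs by hand, it can help to see which path a given request should end up asking the bucket for. Here is a small sketch of that; the function name s3_path is hypothetical. It mirrors the rewrite from the config (dropping the /wp-content prefix), and the slash collapsing is only there to help spot the duplicate-slash problem.

```shell
# Hypothetical helper: show the object path a media URL should map to in S3.
s3_path() {
    case $1 in
        # Same idea as: rewrite ^/wp-content/(.*) /$1;
        /wp-content/*) path=/${1#/wp-content/} ;;
        *)             path=$1 ;;
    esac
    # Collapse any run of slashes into a single slash
    printf '%s\n' "$path" | sed 's#//*#/#g'
}

s3_path /wp-content/uploads/2017/01/image.jpg   # prints /uploads/2017/01/image.jpg
```

If the path in a failing request differs from what this prints, a stray slash or a missed rewrite is a likely culprit.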

Home Server KVM Base Image

At home, I have a server that I use for all sorts of random things, and because I like to complicate things (just ask Zach, he knows), I run a bunch of VMs inside of the server, to keep unrelated things separate from each other. I’ve worked on this over time, fixing pain points and bottlenecks, and currently have it so I can spin up a new VM in about a minute (with the help of Ansible). Here’s how it all currently works.

On the host, I have all the things installed required for KVM virtualization (qemu, libvirt, virtualization-tools, probably some others – I really need to get the hypervisor config into Ansible…). I use LVM to manage all of the storage volumes for guests. I have a volume group (vg_vps) on the host, and inside of that, a bunch of logical volumes. Each logical volume gets mounted as the disk drive for each guest.

Initially, the process to create a new VM was slow and painful – I’d create the new logical volume, attach it to the VM, and also attach a CentOS installation ISO to the VM, boot, manually install CentOS… It was a pain. Over the past week, I worked on getting a base image set up that I can clone new VMs from rather than having to do the manual installation step, and the time savings are awesome.

Creating the Base Image

Creating the base image was very similar to just spinning up any other VM. First, I created a new logical volume, but this time, made it pretty small (4GB) to keep the clone time as quick as possible, and then booted the VM with the installation ISO. I named the logical volume after the CentOS version, so that I know what version I’m working with (centos7_1511). I installed CentOS, made sure the network interfaces start automatically, and configured partitions for the guest (I use LVM inside the guest as well, mostly because that’s what the installer wanted to do, and I didn’t want to fight it – It’s probably not really necessary). Once installed, I loaded up the OS, installed my public key and turned off SELinux. Then, I just shut off the VM, deleted it (but made sure NOT to delete the logical volume) and I have my base image!

Creating a VM

Once I have the base image, creating a new VM is easy.

  1. First, I create a new logical volume that is at least 4GB
    lvcreate --size=30G --name=mynewvm vg_vps
  2. Once I have that, I copy the base image to the new logical volume with the virt-resize tool
    virt-resize --expand vda2 /dev/vg_vps/centos7_1511 /dev/vg_vps/mynewvm
    In this command, vda2 is the partition *inside* the VM that you want to expand (in my case, vda1 is just /boot, and vda2 contains everything else).
  3. Then, I remove anything specific to the base VM (like network configurations, ssh-hostkeys, log files, mail spool, cron-spool, etc).
    virt-sysprep -a /dev/vg_vps/mynewvm --enable=cron-spool,dhcp-client-state,dhcp-server-state,logfiles,mail-spool,net-hwaddr,rhn-systemid,ssh-hostkeys,udev-persistent-net,utmp,yum-uuid,customize

At this point, you just associate the “mynewvm” logical volume with a new VM definition, and you have a fully working VM.
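
The three steps above could be wrapped in a single shell function. This is a sketch under assumptions, not a description of my Ansible setup: the names run and create_vm_disk and the DRY_RUN switch are made up for illustration, while the commands and options are the ones from the list above. By default it only prints the commands, so it is safe to try.

```shell
# Hypothetical helper: print a command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper around the three steps: create, clone/expand, sysprep.
create_vm_disk() {
    name=$1
    size=${2:-30G}
    base=/dev/vg_vps/centos7_1511
    target=/dev/vg_vps/$name
    ops=cron-spool,dhcp-client-state,dhcp-server-state,logfiles,mail-spool,net-hwaddr,rhn-systemid,ssh-hostkeys,udev-persistent-net,utmp,yum-uuid,customize

    if [ -z "$name" ]; then
        echo "usage: create_vm_disk NAME [SIZE]" >&2
        return 2
    fi

    # Stop at the first step that fails
    run lvcreate --size="$size" --name="$name" vg_vps &&
        run virt-resize --expand vda2 "$base" "$target" &&
        run virt-sysprep -a "$target" --enable="$ops"
}
```

Running create_vm_disk mynewvm prints the three commands in order; setting DRY_RUN=0 first would actually execute them on the host.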

LVM Gotchas

Remember above when I said I use LVM inside the guest? Turns out there is one more step you have to do to actually expand the guest filesystem because of this, that I don’t think you’d otherwise have to do. To work around this, I created a script in my base image (/root/growfs.sh). Here’s what’s in the script:

#!/bin/bash
# Expands the vda2 filesystem to fill up the available space
# Does *NOT* expand the actual partition - assuming this is done with virt tools on the host side

echo "Expanding Physical Volume"
pvresize /dev/vda2

echo "Expanding Logical Volume centos-root"
lvextend -l +100%FREE /dev/mapper/centos-root

echo "Growing filesystem"
xfs_growfs /

To make sure I don’t forget to run it, I also have Ansible set up the script to run on first boot, again using the virt-sysprep tool (this tool is seriously handy):
virt-sysprep -a /dev/vg_vps/mynewvm --firstboot-command /root/growfs.sh

Centralized Let’s Encrypt Management

Updated March 16, 2017 to reflect current webroot settings

Recently I set out to see how I could manage Let’s Encrypt certificates from one central server, even though the actual websites didn’t live on that server. My reasoning was basically “This is how I did it with SSLMate, so let’s keep doing it” but it should also be helpful in situations where you have a cluster of webservers, and probably some other situations that I can’t think of at this time.

Before I get too in depth with how this all works, I’m going to define what I mean by the two servers we have to work with:

  • Cert Manager: This is the server that actually runs Let’s Encrypt, where we run commands to issue certificates.
  • Client Server: This is the server serving the website, say… chrismarslender.com 😉

Additionally, I have a domain set up that I point to the Cert Manager. For the purposes of this article, let’s just call it certmanager.mywebsite.com.

High Level Overview

At a high level, here’s how it works with the web root verification strategy:

  1. I set up nginx on the Cert Manager to listen for requests at certmanager.mywebsite.com, and if the request is for anything under the path /.well-known/ I serve up the file the request is asking for.
  2. On the client servers, I have a common nginx include that matches the /.well-known/ location, and proxies that request over to the certmanager.mywebsite.com server.

Nginx Configuration

Here’s what the configuration files look like, for both the Cert Manager Server as well as the common include for the client servers:

Cert Manager Nginx Conf:

server {
    listen 80;
    server_name certmanager.mywebsite.com;
    access_log /var/log/nginx/cert-manager.access.log;
    error_log /var/log/nginx/cert-manager.error.log;

    root /etc/letsencrypt/webroot;

    location /.well-known {
        try_files $uri $uri/ =404;
    }

    location / {
        return 403;
    }
}

Client Server Common Nginx Include:

location ~ /\.well-known {
    proxy_pass http://certmanager.mywebsite.com;
}

Issuing a Certificate

Now let’s say I want to issue a certificate for chrismarslender.com – here is what the process would look like.
I’m assuming chrismarslender.com is already set up to serve the website on a client server by this point.

SSH to the Cert Manager server, and run the following command:

letsencrypt certonly -a webroot --webroot-path /etc/letsencrypt/webroot -d chrismarslender.com -d www.chrismarslender.com

This command generates a verification file in the /etc/letsencrypt/webroot/.well-known/acme-challenge/ directory, and then Let’s Encrypt tries to load the file at chrismarslender.com/.well-known/acme-challenge/<file> to verify domain ownership.

Since the client server hosting chrismarslender.com is set up to proxy requests under /.well-known/ to the Cert Manager server (using the common include above), the file that was just created on the Cert Manager server is transparently served to Let’s Encrypt, and ownership of the domain is verified. Now, I have some fancy new certificates sitting in /etc/letsencrypt/live/chrismarslender.com.

At this point, you just have to move the certificates to the final web server, reload nginx, and you’re in business.
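
Done by hand, the copy-and-reload step might look something like the following rough sketch (separate from the Ansible-based approach mentioned below). The function names, the DRY_RUN switch, the host name in the usage example, and the /etc/nginx/ssl destination are all assumptions for illustration. By default it only prints the commands.

```shell
# Hypothetical helper: print a command (default) or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: copy a domain's certificates to a client server,
# then reload nginx there so it picks them up.
deploy_cert() {
    domain=$1
    host=$2
    dest=${3:-/etc/nginx/ssl}   # assumed destination directory on the client server

    # -L copies the files that the symlinks in live/ point to, not the links
    run rsync -aL "/etc/letsencrypt/live/$domain/" "$host:$dest/$domain/" &&
        run ssh "$host" "nginx -s reload"
}
```

For example, deploy_cert chrismarslender.com web1.example.com prints the rsync and ssh commands it would run. Depending on how the client server is set up, the reload may need elevated privileges.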

In practice, I actually use Ansible to manage all of this – I’ll work on a follow-up post explaining how that all works as well, but generally I end up issuing SSL certificates as part of the site provisioning process on the Client Servers, in combination with `delegate_to`. Also, Ansible makes steps like moving the certificates to the final web server much less labor intensive 🙂

Things to Figure Out

I’m still trying to figure out the best strategy to keep the certificates updated. I can run the Let’s Encrypt updater on the Cert Manager server and get new certificates automatically, but since it’s not the web server that actually serves the websites, I need to figure out how I want to distribute new certificates to appropriate servers when they are updated. Feel free to comment if you have a brilliant idea 😉

URL Based Variables in Nginx

Over the past few months, I’ve set up a few fairly complex staging environments for websites I’ve been working on.

One setup creates a new subdomain based on the ticket number so we can test just that branch of code. If the ticket number is ticket-123, the testing url might look something like ticket-123.staging.example.com. I have Jenkins set up to create a directory for each site at something like /var/www/html/ticket-123.

Another setup is a staging installation for a large multisite install that utilizes domain mapping, so there are many different domains all on the same multisite install (site1.com, site2.com, and site3.com). The staging server for this clones the production database, does some magic on the urls, and I end up with staging urls like site1.staging.example.com, site2.staging.example.com, and site3.staging.example.com. To save some disk space and avoid the headache of copying a bunch of media every time we move the database from production to staging, I proxy the images from the production site.

All of this could be set up manually, but creating a new nginx config file each time a new ticket is staged, or having to set up separate rules for each site we want to proxy images for on the multisite, would be tedious work.

Here’s how I solved these issues.

Nginx allows you to use regular expressions in your server_name line. In addition, you can capture certain parts of the URL for use later by giving them a name. Here’s an example of how I match a ticket number based on a URL structure that looks like ticket-123.staging.example.com:

server {
    server_name  ~^(?P<ticket>.+)\.staging\.example\.com$;
}

The above should match any subdomain on staging.example.com and store the preceding segment of the URL in the $ticket variable. Now that I have the $ticket variable, I can use this information to point nginx to the correct site root.

server {
    server_name  ~^(?P<ticket>.+)\.staging\.example\.com$;
    root    /var/www/html/$ticket;
}

Now any request that comes in for a staged ticket will automatically serve the files from the correct location.
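
To make the mapping concrete, here is a small shell sketch of what the named capture does with a host name. The function name ticket_root is hypothetical, and it uses plain suffix matching rather than nginx’s regex engine, so treat it as an illustration of the idea only.

```shell
# Hypothetical helper: given a Host header value, print the document root
# that the server block above would use, or fail if the host does not match.
ticket_root() {
    host=$1
    suffix=.staging.example.com
    case $host in
        # At least one character, followed by the staging suffix
        ?*"$suffix")
            ticket=${host%"$suffix"}   # everything before the suffix
            printf '/var/www/html/%s\n' "$ticket"
            ;;
        *)
            return 1
            ;;
    esac
}

ticket_root ticket-123.staging.example.com   # prints /var/www/html/ticket-123
```

A host that is not under staging.example.com makes the function return a non-zero status, much like nginx would skip this server block for a non-matching name.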

Multisite Image Proxy

The multisite install uses similar techniques for a different end result. In this case, we are only ever staging one codebase at a time (not different tickets), but there are a bunch of images that we want to proxy from the production server. Here’s the catch – the production urls vary, because the main site uses domain mapping. Here’s an example of how the URLs translate from production to staging

  • www.site1.com -> site1.staging.example.com
  • www.site2.com -> site2.staging.example.com
  • www.site3.com -> site3.staging.example.com

Luckily there is a pattern to how the URLs change, so this is a problem I was able to solve again using the named variable capture in Nginx.

Here’s an example of what the server name looks like in the Nginx config (it’s nearly identical to the one above):

server {
    server_name  ~^(?P<subsite>.+)\.staging\.example\.com$;
}

Again, now that I have the $subsite variable available, I can use that to construct the URL to proxy images from (See this post for more on proxying images with Nginx).

Here’s what the nginx config looks like to accomplish the smart image proxy

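server {
    server_name  ~^(?P<subsite>.+)\.staging\.example\.com$;

    location ~* ^.+\.(svg|svgz|jpg|jpeg|gif|png|ico|bmp)$ {
        try_files $uri @image_fallback;
    }

    location @image_fallback {
        # With a variable in the hostname, nginx resolves the name at request
        # time, which requires a resolver directive to be configured
        proxy_pass http://www.$subsite.com;
    }
}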


Fallback Images

When working on a website, I always develop locally – usually using Varying Vagrant Vagrants. I’ll often times pull down a copy of the production database and do a search and replace to make sure I’m dealing with local urls, so that I have some real content to develop with. This works great, except for a bunch of annoying missing images. I could download all the images from the server, but who wants all those files on their computer? I don’t.

My solution to this was to have nginx serve the images from the original server, if they are not present locally. It seems to be working great so far, and all it took was a few extra lines in the nginx config.

location ~* ^.+\.(svg|svgz|jpg|jpeg|gif|png|ico|bmp)$ {
    try_files $uri @image_fallback;
}

location @image_fallback {
    proxy_pass http://example.com;
}

For any file with one of the listed image extensions (.jpg, .gif, .png, and so on), nginx first tries to find the file locally. If it’s not there, it passes the request along to example.com and tries to find it there. This could probably be expanded to cover other file types as well, but at the time, I only needed these image types.

WordPress Unable to Send Emails

I was recently working on setting up an instance of WordPress multisite. As a matter of fact, this website is most likely being served from that very same instance, right now. In the process of setting it up, I noticed that I was not receiving any emails, like I would normally expect from WordPress. That’s strange, I thought, so I decided to pretend like I lost my password, to trigger an email. I didn’t get it. I continued to investigate, and eventually found there was a message from Google, hidden away in the mail folder for the web server user. In response to the email that was sent for me to reset my password, Google said the following…

Our system has detected that this message is not RFC 2822 compliant. To reduce the amount of spam sent to Gmail, this message has been blocked. Please review RFC 2822 specifications for more information

After a bit of searching on Google, I found a thread that said it was most likely something to do with the ‘Date’ or ‘From’ headers. So I went back to the delivery failure notification I received, to look at the original email’s headers. A-ha! The ‘From’ header looked a bit suspicious. WordPress was trying to send the email from “wordpress@*.marslender.com”.

After investigating, this comes down to the way that wp_mail() generates the ‘From’ header when one is not explicitly provided to it. The function essentially takes the SERVER_NAME and adds “wordpress@” before it. In addition, nginx is configured to respond to “*.marslender.com” so that any subdomain of marslender.com is directed to this site, so the SERVER_NAME had a value of “*.marslender.com” in this situation.

My Solution

Luckily, WordPress provides a filter for the from email address, called ‘wp_mail_from’. I just created an mu-plugin that hooks into that filter and removes the ‘*.’ portion of the from email, if it is present.

In addition, it looks like there is a trac ticket related to this issue, so this may be resolved in a future version of WordPress.

WordCamp Portland 2013

WordCamp Portland 2013 is fast approaching. It is Saturday August 10th, at the Eliot Center in Portland, and I’ll be there. WordCamps are informal, community-organized events that are put together by WordPress users. Everyone from casual users to core developers participate, share ideas, and get to know each other. There should be a lot of interesting talks going on, including some by some of my new coworkers at 10up.

I’m really excited for a day full of WordPress oriented presentations and conversations, and hope to see some of you there!

New Job!

A bit late on this post, but I have a new job! I am now working as a Web Engineer at 10up, a premium WordPress agency. I am really excited about the new job, and the opportunity to work with some of the most talented people in the WordPress community. Not only does 10up create awesome websites that are built on WordPress, we also have a rather large handful of employees that are actively involved in developing WordPress core, and one of our directors is even a “Rockstar” leading contributor.

PHP’s strlen() function sometimes produces unexpected results

As UTF-8 encoding becomes more common in web development, issues arise every now and again that cause bugs in code, and these bugs are not always easy to figure out. I was recently working on a Swedish website, Opus Bilprovning, through Graphic Fusion here in Tucson. Since the site is in Swedish, there are some extra characters that I had to deal with that I don’t use every day, such as ä, å, and ö to name a few. Other than trying to pick up a few basic Swedish words, these new characters didn’t seem to be much of a big deal.

Then strlen() came along and broke the script

All was going fantastic until I tried to extract a word from a longer string that was UTF-8 encoded. As a simplified example, let’s take the word “tjänster”, which is Swedish for “services”, and count the characters using strlen():

$string = "tjänster";
echo "Length: " . strlen($string);

Length: 9

Looking at the word, I count 8 characters, but strlen() is telling me there are 9 characters. Why is this so? The problem comes from how strlen() determines the number of characters in the string. In many character sets, one character is represented with 1 byte, so the length of the string is the same as the number of bytes in the string. Because of this, PHP’s strlen() function simply returns the number of bytes in the string.

Why this doesn’t work with UTF-8 strings

In UTF-8, not all characters are represented with 1 byte. In fact, characters can be represented with as many as 4 bytes in UTF-8. In this example, the character “ä” is represented using 2 bytes in UTF-8, so the strlen() function returns 9 rather than the expected 8.

How this is solved

If all you are trying to do is count the length of the string, the solution is very simple; just pass the string through utf8_decode() prior to calling the strlen() function.

$string = "tjänster";
echo "Length: " . strlen( utf8_decode($string) );

Length: 8

Alternate Solution

In addition, there is another function that can be used to accomplish the same thing, and it has the added benefit that it can be used with character sets other than UTF-8. The function is mb_strlen(), and it requires two pieces of information to be passed to it, first, the string to determine the length of, and second, the encoding that is used for the string.

$string = "tjänster";
echo "Length: " . mb_strlen( $string, 'utf-8' );

Length: 8

WordPress register_activation_hook doesn’t work with symlinks

I am currently working on developing a plugin, and as such am running Apache, PHP, MySQL, etc. on my local machine. To make my life easier, I tend to remove the wp-content/plugins directory and replace it with a ‘plugins’ symlink that points to a folder with all of the plugins I generally use in WordPress. I hadn’t ever noticed any issues with this until today, when I was attempting to set up a function to be called on activation with the register_activation_hook() function. Here is how I was trying to implement this, which is the way that is suggested in the Codex.


register_activation_hook( __FILE__, 'plugin_activation' );

function plugin_activation() {
    // do the activation stuff here
}

Try this in any WordPress install that doesn’t use symlinks and it will probably work as expected, but when the plugins directory is actually a symlink, it doesn’t work, and this is why: in PHP, __FILE__ returns the path to the current file with symlinks resolved.

The following is what __FILE__ was returning


WordPress was expecting something more along the lines of this


When provided with the latter, WordPress is able to remove everything up to and including the plugin directory for the installation, so it would then be left with


Since the path with symlinks resolved doesn’t contain the plugin path as part of it, WordPress gets a bit confused.

To solve this, there are two options. The first would be to just not use symlinks, but I prefer to use them, so I personally chose to do the following.

Instead of using __FILE__ in the register_activation_hook() function, it can be replaced with this:

basename(dirname(__FILE__)) . '/' . basename(__FILE__)

basename(dirname(__FILE__)) returns the directory that the current file is within, so in this case ‘myplugin’. basename(__FILE__) returns the filename of the current file, which is ‘myplugin.php’. When all is said and done, this ends up returning ‘myplugin/myplugin.php’, which is ultimately what WordPress wants when it is calling the activation function.

For this to work properly, your plugin needs to be in its own directory (within the plugins/ directory) and this should be called from the main plugin file.