How and why Anchor runs PHP as CGI with suexec on shared hosting servers

This isn't something most people consider, but it's something that can have a profound impact on the performance and security of your server. We'll cover the usual methods for running PHP, what Anchor does differently (and more importantly, why), and what it means for you.

This is one of our more technical articles and will be your cup of tea/coffee/cola if you're interested in getting your hands dirty with server administration and deployment.

If you're a developer, you might be most interested in the final section, What all this means for developers, to learn about how it affects your workflow and practices (it's short and sweet, we promise).

How PHP is usually run (mod_php)

Whatever its shortcomings, PHP is very popular and here to stay. Some reasons:

  • Familiar, easy-to-learn syntax
  • Available pretty much anywhere
  • Well-supported
  • Extensively documented
  • Easy to deploy
  • It's fast!

Let's talk about that last point, speed. The main reason PHP is fast is that the interpreter is embedded inside the Apache process as a module, hence the "mod_php" moniker. Apache processes are persistent, which overcomes a substantial disadvantage of the CGI model: having to spawn a new process for every request.

What's wrong with using mod_php?

Using mod_php for a single site on a server is fine. The problems arise when you want to use PHP for a large number of separate sites, owned by different users who don't trust each other. This is shared hosting in a nutshell.

Breaches of trust

Just as users on a shared hosting server don't share a single FTP login to manage content, you wouldn't want them to be able to read each other's web content, but this is exactly what mod_php allows. Because mod_php embeds the PHP interpreter in the running Apache process, all PHP code is executed as a single user, typically apache (Red Hat-type systems) or www-data (Debian-type).

To clarify:

  • Any user with a website can access anything else on the system that Apache can access
  • Apache necessarily needs to access all of the websites on a system
  • So all website-owners can see the files for all the other websites on a system, including all the little secret bits like database passwords...

    <?php
            system("cp /var/www/users/victim/public_html/blog/wp-config.php /var/www/users/pwnerer/booyah.txt");
    ?>
    Oh snap, I just stole someone's database login details!

Lack of accountability

It's less of an immediate concern, but the lack of accountability of a mod_php setup can be a real headache for a system administrator, one that even lolcat pictures can't cure. Some quick examples:

  • If I were a particularly dopey or malicious PHP coder, I could fill up /tmp or /var/lib/php5 with junk. This has a decent chance of breaking PHP's sessions for everyone. The files will all be owned by the apache user, so it's not trivial to see whose fault it is
  • Once I've stolen database logins I can go and drop databases. This amounts to a denial of service for the victim, and a waste of time for the sysadmin who needs to restore from backups
  • Broken or poorly-written apps can use all the available resources on the system (CPU time and RAM). PHP's built-in limits help guard against sloppiness, but can't do much about users determined to cause trouble

As a sysadmin, I'd rather not spend my time explaining to a customer why someone else's PHP code deleted all their data (though it'd be a welcome change from telling them how their own code managed to nuke everything). Here's a short list of awesome things you could be doing instead of worrying about PHP security on your server/s:

  • Snowboarding
  • Rock climbing
  • Getting your pilot's license
  • Extreme-cold survival training
  • Honing your ninja assassin techniques

General security paranoia

As a final point against it, having PHP embedded inside the webserver just makes security-conscious sysadmins itchy.

Apache is a somewhat-trusted piece of code. We rely on it to behave correctly and enforce a certain amount of security policy for us. However, all code has bugs, and some bugs will create security vulnerabilities.

Here are two really good ways to reduce the number of bugs in a piece of software, and thus reduce the number of vulnerabilities that you're exposed to:

  1. Find bugs and fix them. Apache is a very mature piece of software, so lots of bugs have been fixed.
  2. Simply have fewer lines of code in the software. Take away code, and you take away places for bugs to exist.

Using mod_php is a bit like bolting the entire PHP codebase into apache. That's not entirely accurate or fair, but it does mean a much larger amount of code to worry about, and PHP certainly has its fair share of bugs.

How we do things at Anchor

The key here is to recognise that users can't be trusted. We can't trust them to be nice to each other, and we can't trust them to be nice to the operating environment - the goal is to separate users and keep them on a tight leash.

Separation

Separating users is a simple idea, but it needs a bit of work to achieve effectively. Hopefully the following subsections make for easier reading; you're welcome to skip the implementation steps if you don't need them right away. There are a lot of smaller problems that need to be solved, and they all build up to a very nice solution.

Introducing suexec

Unix-type operating systems already have a robust and powerful security model that works well for multi-user systems, so we extend that to the web-serving sphere. Code owned by a user should be run by that user; that way the code can only affect that user's files. To that end, apache gives us a great tool called suEXEC.

Suexec's own documentation describes it best, but it's enough to know that suexec gives you a secure mechanism to run untrusted code as the owner (instead of the apache user) to serve web content.

To do this we necessarily need to trust that suexec isn't a security risk. The great thing is that, because it's open source, you can inspect it for yourself. It's also very short and simple C code - it's much harder to introduce bugs in simple code, which makes suexec more secure and more reliable.

This model of using suexec changes things a bit for how you set up users on your server, which we'll cover now. Fundamentally, you're now using a CGI model for running PHP, and apache needs to be configured to call PHP accordingly.

This is a reasonably simple change from a normal mod_php setup. We'll start by setting up things for CGI, then add suexec to the mix once they're working nicely.

Implementation basics

I'll assume you're already familiar with apache's usual set of configuration directives, but I'll cover apache's execution mechanism here, as it's not something you normally need to think about.

  • When apache serves something, it does so through a "handler". The default handler is aptly called default-handler, which is used for static content

  • You can assign a particular handler to a file, which can do arbitrary magic

  • The AddHandler directive maps filename extensions to the specified handler

  • Your handler can be an arbitrary executable

Now for a config snippet:

AddHandler foobar-php .php
Action     foobar-php /cgi-bin/php-cgi

I've used AddHandler to direct .php files to the "foobar-php" handler, then the Action directive tells apache what to do for the foobar-php handler. You'd normally use a more standard handler name like "application/x-httpd-php", but I've used something abstract here to make a point.

There's a slight catch there, however. The path we pass to Action isn't a fully-qualified filesystem path, it's more like the path component of a URL. That's indeed how it's treated, which we'll deal with next.

(In case you were wondering, we can't use apache's built-in cgi-script handler because every one of your scripts would need a shebang line and would also need the execute bit set. This is an option for some apps like vBulletin, but it's terribly inelegant.)
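
For comparison, here's a minimal sketch of what the built-in cgi-script route would look like. The directory path is a placeholder, and this only illustrates the approach being ruled out; it isn't something we use:

```apache
# Hypothetical alternative: treat the .php files themselves as CGI programs.
# Every script then needs its own shebang line and the execute bit set.
<Directory "/var/www/users/username/public_html">
    Options +ExecCGI
    AddHandler cgi-script .php
</Directory>
```

With the Action approach, individual scripts need neither, because apache runs one designated program and passes it the requested file.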

Every user has their own handler

We're making a small assumption here that every user will have a functional cgi-bin/ directory under their public_html/ directory. cgi-bin is a bit of an anachronism nowadays, but we still use it for its original role - a place to keep your CGI executables that are designated as being explicitly executable, and can't be downloaded to the browser.

Many apache configurations will already have something like this. ScriptAlias is used to map the /cgi-bin/ path to a specific directory, and mark the contents as executable.

ScriptAlias  /cgi-bin/  /var/www/users/username/cgi-bin/

The easiest way to use this now is to copy the system-wide php-cgi binary to each user's cgi-bin directory, but this is very tedious, and also no good if we update our version of PHP installed on the system.

We've chosen to use a very simple wrapper written in C to act as a shim for the real php-cgi binary. It also gives us the very nice feature of per-user php.ini files by taking advantage of the PHPRC environment variable. If a user wants their own custom configuration they're free to use one, and they have the power to do it themselves. If they're really adventurous, they can compile their own bleeding-edge version of PHP and use that directly...

#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* This is the vendor-packaged PHP on the system  */
#define PHP_INTERPRETER "/usr/bin/php-cgi"

/* A custom php.ini can reside in ~/.etc/  */
#define PHP_INI_ETC_SUFFIX "/.etc/"

int main(int argc, char *argv[])
{
        struct passwd *p;
        char php_ini_dir[256];
        int len;

        /*  Determine and set the PHPRC location  */
        /*  CGI does not provide the current username, so we figure it out ourselves  */
        p = getpwuid(getuid());
        if (p != NULL && p->pw_dir != NULL)
        {
                /*  snprintf never overflows the buffer. If the home directory is
                    too long to fit, skip PHPRC rather than set a truncated path  */
                len = snprintf(php_ini_dir, sizeof(php_ini_dir), "%s%s",
                               p->pw_dir, PHP_INI_ETC_SUFFIX);
                if (len > 0 && (size_t)len < sizeof(php_ini_dir))
                        (void)setenv("PHPRC", php_ini_dir, 1);
        }

        /*  execv only returns if it failed to start the interpreter  */
        execv(PHP_INTERPRETER, argv);
        perror("execv " PHP_INTERPRETER);
        return EXIT_FAILURE;
}

Compiling this is trivial:

gcc -Wall -O2 -o php-handler wrapper.c

We put this compiled wrapper in every user's account at creation time via /etc/skel. Now every user has a copy of the wrapper for themselves, and uses the system-wide PHP installation by default.

Just add suexec

Now we're ready to enable suexec. There are a few small things that can trip you up here, depending on your system.

The first is that suexec is very strict about where scripts may live. Scripts must live under the compiled-in AP_DOC_ROOT path, which is traditionally /var/www (the alternative is to serve sites from http://server1.my-fantastic-hosting-company.com.au/~username/foo.php, but that seems somewhat unlikely). This is fine on some systems, but could be a complete show-stopper on others.

If the server is built from the outset with shared hosting in mind, then you can take this into account, and put all home directories under /var/www. It's a somewhat extreme solution, but it works.
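
As a rough sketch of how the pieces fit together under stock suexec, assuming home directories live under /var/www as just described. The hostname, username and paths are placeholders. SuexecUserGroup is the mod_suexec directive that tells apache which user and group to run CGI programs as for a given vhost:

```apache
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/users/username/public_html

    # Run CGI programs for this site as the site owner, not the apache user
    SuexecUserGroup username username

    # The per-user wrapper lives in the user's cgi-bin directory
    ScriptAlias /cgi-bin/ /var/www/users/username/cgi-bin/

    # Route .php requests to the wrapper, as set up earlier
    AddHandler foobar-php .php
    Action     foobar-php /cgi-bin/php-cgi
</VirtualHost>
```

suexec will refuse to run the wrapper unless both it and its directory are owned by that user and are not writable by anyone else, so check the suexec log if requests start failing.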

Debian-type systems have a distro-specific enhanced version of the package called "apache2-suexec-custom". It lets you adjust the AP_DOC_ROOT parameter to something more convenient (e.g. /home) without recompiling suexec, which is handy, but also a minor security tradeoff.

For Anchor's usage, we've taken an "interesting" approach that will either make you say "wow, that's clever", or make you squirm a bit. I had to do both a few times before my mind settled down.

suexec has a feature that detects if the script is being accessed via the UserDir feature, and relaxes the AP_DOC_ROOT restriction, instead checking that the script is under ~/public_html/. We made a one-liner change to the source and recompiled so that the userdir flag is now permanently set. This works very nicely for us with no real security tradeoff to speak of.

Keeping users in check

This is most of our work done now. We'd still like to lock users down a bit more so they don't use all our resources though. Now that each user is running code as themselves, we can do that.

RLimit

Apache has a few little-known config directives for applying resource limits to processes spawned by apache workers, including CGI.

  • RLimitCPU - Limit the CPU consumption of a process to N seconds

  • RLimitMEM - Limit the memory consumption of a process to N bytes

  • RLimitNPROC - Limit the number of processes that can be launched to N

We typically set hard limits of 60 seconds of CPU time, 512 MB of memory, and 100 processes. These can be overridden in each vhost, but they represent broadly sane values that allow the vast majority of code to run without any problems.
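
Here's a sketch of how those values might look in the server or vhost config. Each directive takes a soft limit followed by an optional maximum (hard) limit. Setting both to the same figure is an assumption made for illustration. RLimitMEM is given in bytes, so 512 MB becomes 536870912:

```apache
# CPU time in seconds: soft limit, then hard limit
RLimitCPU   60 60

# Memory in bytes: 512 * 1024 * 1024 = 536870912
RLimitMEM   536870912 536870912

# Number of processes per user
RLimitNPROC 100 100
```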

File ownership

Now that your code is run as the owner-user, any files that get created will be owned by that user. This means you can rely on filesystem quotas to enforce local policy, and practically eliminate the chances of a rogue user filling up a filesystem causing problems for other users.

For quota-less filesystems, like tmpfs for /tmp, you can at least see who's using a lot of space and deal with them appropriately. You'll know when this happens because your monitoring systems will alert you. You *do* monitor your disk space, don't you? :)

Downsides of running PHP as CGI

The biggest downer for running things as CGI is performance. CGI is old-fashioned, and doesn't handle lots of traffic well because it has to fork a process and start the interpreter every time. In contrast, mod_php keeps the interpreter running all the time and is always ready to go.

At Anchor we've decided that the performance hit is worth it. Our shared hosting servers are quite beefy, and the only real drawback is that we can't throw thousands of shared hosting customers on some crappy hardware. Really, when one server gets full you just buy another one. In the grand scheme of things a few extra servers for shared hosting is no big deal. A typical server has a pair of quad-core Xeon CPUs, 16 GB of RAM and fast SCSI drives in RAID-1 or RAID-10, and runs quite happily with maybe 500-600 accounts on it.

PHP with CGI+suexec - Because You're Worth It

Alternatives to CGI

What if you don't like CGI? There's a small handful of alternatives, though it's hard to say whether they're contenders.

  • FastCGI is, generally speaking, *the* alternative, and also what you use with other webservers like lighttpd that don't have a mod_php. There are a lot of fiddly bits you can tweak to get it working just right, but performance is comparable to mod_php once it's running. FastCGI uses a persistent daemon model that gives you the benefits of suexec (separation) while keeping the interpreter running (unlike CGI).

  • suPHP is really very similar to suexec, just streamlined specifically for PHP

  • Separate apache instances for each user: the ultimate in user-separation. I don't think anyone in their right mind would do this due to the overhead and difficulties of getting them to coexist peacefully, but you could run a separate instance of apache for each user. Each user has full control over their apache and PHP config, and is basically guaranteed isolation from other users.
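
To give a feel for the FastCGI option in the list above, here's a minimal sketch using mod_fcgid, one of the FastCGI modules available for apache. It isn't what we run. The paths are placeholders, and it assumes mod_fcgid is installed and that the vhost already has SuexecUserGroup set so the persistent PHP processes run as the site owner:

```apache
<Directory "/var/www/users/username/public_html">
    Options +ExecCGI

    # Hand .php files to mod_fcgid instead of spawning a fresh CGI process
    AddHandler fcgid-script .php

    # Program that mod_fcgid keeps running to serve .php requests
    FcgidWrapper /var/www/users/username/cgi-bin/php-handler .php
</Directory>
```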

What all this means for developers

This is all for you, baby. We wouldn't be able to go snowboarding or jumping out of helicopters or assassining if there weren't people making websites.

  • You will never need to chmod 777 your files (make them world-writable) to make your blog/CMS/mini4chan work properly. Ever.
  • Naturally, your files don't need to be readable by anyone else either
  • You'll never be stuck with files owned by apache that you can't edit
  • Immune to gradual memory leakage (not an entirely obvious benefit for a developer, you have to trust us on this one)
  • Potential headroom to run more apache processes, because each one uses less memory. The PHP interpreter is only loaded into memory when needed, rather than hanging around for every HTTP request

The last couple are kinda lame - they pale in comparison to the first advantage.

How about some more-specific examples?

  • Uploading files to Drupal (and any other CMS/blog/Enterprise e-solution) will Just Work
  • WordPress's self-update functionality will Just Work, no need to give it FTP login details
  • Creating a wp-config.php file for a new WordPress installation? That will Just Work too
  • Joomla, and whatever extensions you're using with it, will Just Work
  • Magento will Just Work
  • ExpressionEngine Just Works

Got any other apps with permissions problems? We reckon they'll just work.

