Storing Confidential Data in WordPress

When using a content management system like WordPress, the content that site owners and collaborators create and manage obviously needs to be stored persistently somewhere. In WordPress, this storage is typically a MySQL database. For most WordPress sites, every single request to the site results in several queries to the database so that the stored content can be displayed.

When extending the capabilities of WordPress through plugins, such plugins usually leverage that same database to store their own data. As a plugin developer you are probably already familiar with the many APIs that WordPress provides to integrate with database storage; for example the Options API to store and retrieve options, or the Meta API to store and retrieve metadata. However, do you ask yourself what the consequences of storing data in a WordPress database are?

Not all data is equal. Certain types of data that plugins (or WordPress core itself) need to store are more sensitive than others. Think about personal data from all the customers of your WooCommerce shop, the figures of revenue you are making from affiliate links, or API credentials to access personal information from your Google account. For any data you deal with in WordPress, you should ask yourself:

How sensitive or potentially confidential is the information I would like to store?
What can I do to store the information safely?

In this post, we will look more closely at how we can deal with more sensitive information in WordPress from a security perspective.

Identifying Sensitive Data

There are various types of sensitive data that you may need to deal with in WordPress. First of all, you could differentiate between actual sensitive information and indirect means to get to such information. For example, the amount of money you have in your bank account is a piece of information that you might not want to give to anybody else. The username and password to your online banking account are not the same but, if someone else knew them, they could get access to the same sensitive information. Worst case, they might even be able to perform certain actions using these credentials.

Generally, credentials should always be considered sensitive data, since they are basically a key that can unlock access to a ton of sensitive information. There may be some exceptions, such as an API key that you only need so that the service it connects you to can identify your application against their quotas (in other words, the service would not provide any sensitive information). But even for this kind of key, you probably wouldn’t want someone else to know it because otherwise they could misuse it to make their own requests, reducing your available quota.

Whether a piece of information is considered sensitive depends on its type. There are several types of sensitive information, such as personally identifiable information, like your name or your address. There is also personal information that, connected to any personally identifiable information, would be considered confidential. For example, you certainly want to be careful about who knows about your health, potential injuries, or psychological issues you have or have had. There is also some “almost-credentials-like” personal information that, if disclosed, could lead to crimes like fraud, for example your credit card numbers or your social security number. And then there is a lot of non-personal information that is sensitive too: disclosing confidential information about a new product or service that a company is going to launch may be harmful to its business, and the same goes for certain financial data. Last but not least, there is classified information, that is, information that is subject to strict security regulations, for example those imposed by a government.

With any of the above, you should be really careful about storage. Could you possibly achieve the desired functionality and user experience without storing the data? What would be the trade-off? And then, if you do need to persistently store this piece of data: What can you do to store it as safely as possible? Let’s look at this in WordPress now.

Storing Sensitive Data

Sensitive Data in WordPress

You may already be laughing out loud, or shaking your head at what you’re reading here: “Storing Confidential Data in WordPress? Yeah, sure.” And it is true. The security level around WordPress sites varies greatly, and it can be assumed that the average is quite low, which is influenced by various factors:

General security measures available on the server stack, for example as set up by the hosting provider
Quality of the WordPress plugins and themes installed on the site which could include security flaws or even perform malicious actions themselves
Security awareness of the administrative users of the WordPress site, for example the strength of their passwords or whether they use two-factor authentication

A major issue that is omnipresent in WordPress and particularly affects the second point above is the lack of sandboxed access in WordPress. As a plugin or theme, you can basically do anything you want: A malicious plugin for example can easily send data from the database to third parties or create new administrator accounts for people that should not have access. While the flexibility of WordPress is one of its virtues, from a security standpoint it is an absolute no-go. Given the size of the WordPress ecosystem, the number of available plugins and themes, and the lack of standardized and required quality control, WordPress sites make attractive targets for hackers, with likely the biggest problem being malicious plugins they create with seemingly helpful features, just to get into the system. And then, again, these plugins can do anything they want. Essentially, you should avoid storing confidential data in WordPress. But please, read on.

This is not a rant against WordPress, and my enthusiasm for the platform is not vanishing or anything like that. We just have to face the fact that the baseline WordPress site is likely insecure in some capacity, of course strongly depending on how much security expertise is involved.

Sensitive Data Elsewhere

So what if you still want to access sensitive data in WordPress? Depending on the type of data you need to store or access, there are a myriad of web services out there that provide much higher security standards than the average WordPress site. We are not going to look into specifics here, and of course it is of paramount importance that such a service is chosen consciously, after extensive research. What we are going to look at now though is the following: How could you access such data, which is stored on a third-party service, in WordPress? That was one of the things which we needed to implement when developing the Google Site Kit plugin for WordPress. In case you haven’t heard about it, Site Kit makes data from Google services such as Search Console, Analytics, AdSense, and others available in the WordPress backend to help publishers gather contextual insights and actionable feedback about their audience. The Google user data stored on these services is confidential, so it is not publicly accessible and needs to be protected. We will use this plugin as an example going forward.

A third-party service that can expose data stored in it probably does so via some form of external API. WordPress sites for example have a REST API, and many services provide one as well. In order for a site to connect to such an API, a service that stores sensitive data should require credentials: this could be an API key, a username and password, or an access token. Hence, such credentials need to be persistently stored in WordPress so that they are available to the site.

Note that, depending on how the respective service allows you to obtain these credentials, it may be more or less secure. A widely established way of securely authenticating a site on behalf of a user, for example to access sensitive information from that user, is the OAuth 2.0 framework. It is relied on by many large web services, including most external-facing Google APIs, which Site Kit needs to access data from. Setting up a WordPress plugin like Site Kit to connect to an OAuth 2.0 provider in a secure way is its own challenge. In fact, it was one of the biggest challenges we had to solve while developing the plugin, so it will not be covered in this post, but it is something I plan to share more about in the future. What we are going to focus on now is what we did to store obtained access tokens for Google APIs in the WordPress database.

Storing Credentials in WordPress

So what do we need to consider when storing credentials in WordPress? There are four emergency cases we can consider here:

1. An unauthorized party got access to a copy of our database.
2. An unauthorized party got access to a copy of our database and our site’s source code.
3. An unauthorized party got access to a copy of our database, our site’s source, and configuration files containing secret keys etc.
4. An unauthorized party got (and still has) live access to our database, our site’s source and configuration files containing secret keys etc., for example through a malicious plugin.

The bad news first: In cases 3 and 4, there is not much we can do to prevent abuse of the credentials we store in the database. If a party has access to both the database and all files that are part of the site, we’re basically screwed. If they didn’t just steal the database and files (e.g. through some leak), but actually have ongoing full access to them (case 4), it’s even worse. The good news is: At least for cases 1 and 2 we can do something to prevent the unauthorized party from abusing our stored credentials.

You might already know that credentials, of whatever kind, should never be stored in plaintext. Just as you should not write your password on a post-it attached to your screen, you should not write it into your WordPress database, at least not in plaintext. In WordPress, passwords are stored in a hashed version. You may have heard someone say that passwords in WordPress are encrypted. While the two terms are sometimes used interchangeably, that is technically incorrect. Encrypting a value and hashing a value are two different things:

Encrypting uses an algorithm to cipher a value with a key so that it can only be deciphered (decrypted) using the same key. Because you can decrypt the result to get back the original value, encrypting is a two-way function. Encrypting is useful when you need to be able to access the original value again, but you don’t want to store it in plaintext.
Hashing uses an algorithm to map a value to a fixed length. If a value is hashed with a proper algorithm, you cannot “un-hash” it – that is why hashing is a one-way function. Hashing is useful when all you need to know is whether a certain plaintext value matches the original value of what was hashed. Hashing is not useful when you want to be able to access the original plaintext value that is stored again – since you can’t “un-hash” it.

This is why WordPress passwords are hashed: The only thing the stored value is needed for is to check whether the password that the user enters on their next login matches the password that was originally stored. WordPress never displays the password anywhere, so it never needs to be able to decrypt it. And because there is no need for that, it’s safer to use a hashing function, which makes the stored value practically irreversible.
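To make the difference concrete, here is a minimal sketch that uses PHP's built-in password_hash() and password_verify() for hashing, and openssl_encrypt() and openssl_decrypt() for encryption. The sample password, token, and demo key are made up for illustration, and WordPress itself goes through its own wp_hash_password() and wp_check_password() functions rather than code like this.

```php
<?php
// Hashing is one-way: all we can do later is check whether a candidate matches.
$password_hash = password_hash( 'correct horse battery staple', PASSWORD_DEFAULT );

$matches     = password_verify( 'correct horse battery staple', $password_hash );
$wrong_guess = password_verify( 'not-the-password', $password_hash );

// Encryption is two-way: with the same key and IV we get the original value back.
// Assumption: a demo key derived from a literal string. Never hard-code a real key.
$key    = hash( 'sha256', 'demo-key-for-illustration-only', true );
$method = 'aes-256-ctr';
$iv     = openssl_random_pseudo_bytes( openssl_cipher_iv_length( $method ) );

$encrypted = openssl_encrypt( 'my-access-token', $method, $key, 0, $iv );
$decrypted = openssl_decrypt( $encrypted, $method, $key, 0, $iv );

var_dump( $matches, $wrong_guess, $decrypted );
```

The hash can only ever answer "does this candidate match?", while the encrypted value can be turned back into the original token by anyone who holds the key.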

So which of the two methods do we need to use for storing credentials for external APIs in WordPress? Let’s think about it: We need the credentials for every request to the external service – for example, in the Site Kit plugin, we need to attach the access token that we obtained through Google’s OAuth implementation to every API request we issue to Google APIs. We could ask the user to remember the access token and enter it for every API request, but that would result in a terrible user experience, because of how regularly API requests need to be made and because of how long an access token typically is. What is much nicer for the user is of course when it just works without them doing anything extra. In other words: The plugin needs to store the access token in a way that it can use it later without asking the user for it again – hence, encryption is the way to go here.

As mentioned in the brief explainer about encryption above, encrypting is a two-way function. So when implementing this in a plugin, we will need one function that encrypts a value and another function that decrypts a value. We will also need to define some stable key that the plugin can leverage to encrypt and decrypt. It is important that this key does not change – if it did, it would be impossible to decrypt values that were encrypted using another key.

So let’s scaffold our implementation. What we did for Site Kit, and I feel that is generally a good idea, is defining a class, for example called Data_Encryption. The class should expose two public methods, based on the above:

encrypt( string $value ): string|bool → Returns the encrypted version of the given $value, or false if encryption failed.
decrypt( string $raw_value ): string|bool → Returns the decrypted version of the given $raw_value, or false if decryption failed.

Here is our skeleton for the Data_Encryption class:

<?php

class Data_Encryption {

	public function encrypt( $value ) {
		// TODO: Encrypt $value.
		return $value;
	}

	public function decrypt( $raw_value ) {
		// TODO: Decrypt $raw_value.
		return $raw_value;
	}
}

That’s already it for the public methods the class will need. Now, in order to encrypt and decrypt values, the class will need awareness of a key to use. And in addition, we will also have it require a salt for improved security. Oh, a salt? Yes, salting is yet another security term we need to look at:

Salting is a concept primarily used in conjunction with hashing, but also useful when encrypting: Before hashing/encrypting a value, a so-called “salt”, which is another unique secret value, is appended to it. It is an extra security measure that makes it harder to, for example, guess passwords from their stored hashes (hashing) or distill the original value from an encrypted value (encryption).

In other words, salting is not strictly necessary, but it is a good practice to use. It is in fact more beneficial in combination with hashing than with encrypting, but even with encryption, why not? It surely does not make things less secure.

For our Data_Encryption class, there are three alternatives for handling the key and salt:

Either the constructor of the class requires them as parameters.
Or the constructor runs logic to determine them internally.
Or the constructor supports receiving them through optional parameters and, if they are not provided, it runs logic to determine them internally (basically a combination of the above two).

This is really a matter of preference and your surrounding setup. For Site Kit, we wanted to keep the class’s public interface as simple as possible and handle determining the key and salt internally, which also clarifies that these values have to be persistent. What we did is introduce private methods get_default_key() and get_default_salt() that return the default key and salt respectively. These two methods are called in the constructor to set the key and salt as properties. So we went for the second alternative from above.
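For comparison, the third alternative could look roughly like the following sketch. The class name Configurable_Data_Encryption, the stubbed default values, and the uses_key() helper are all hypothetical and exist only so the example runs on its own; Site Kit does not work this way.

```php
<?php
class Configurable_Data_Encryption {

	private $key;
	private $salt;

	// Both arguments are optional. Anything missing or empty falls back to a default.
	public function __construct( $key = null, $salt = null ) {
		$this->key  = ( is_string( $key ) && '' !== $key ) ? $key : $this->get_default_key();
		$this->salt = ( is_string( $salt ) && '' !== $salt ) ? $salt : $this->get_default_salt();
	}

	// Stub for illustration: a real plugin would look up a persistent secret here.
	private function get_default_key() {
		return 'stub-default-key';
	}

	// Stub for illustration, same caveat as above.
	private function get_default_salt() {
		return 'stub-default-salt';
	}

	// Hypothetical helper, only here so the sketch can show which key is active.
	public function uses_key( $key ) {
		return hash_equals( $this->key, $key );
	}
}

$with_defaults = new Configurable_Data_Encryption();
$with_custom   = new Configurable_Data_Encryption( 'my-own-key', 'my-own-salt' );

var_dump( $with_defaults->uses_key( 'stub-default-key' ) );
var_dump( $with_custom->uses_key( 'my-own-key' ) );
```

The trade-off is a slightly larger public interface in exchange for being able to inject a key and salt, for example in tests.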

So where do we get these values for key and salt from in WordPress? There are certainly multiple alternatives, and here’s what we chose to do (for both methods):

<?php

class Data_Encryption {

	private $key;
	private $salt;

	public function __construct() {
		$this->key  = $this->get_default_key();
		$this->salt = $this->get_default_salt();
	}

	// encrypt and decrypt methods omitted for readability.

	private function get_default_key() {
		if ( defined( 'GOOGLESITEKIT_ENCRYPTION_KEY' ) && '' !== GOOGLESITEKIT_ENCRYPTION_KEY ) {
			return GOOGLESITEKIT_ENCRYPTION_KEY;
		}

		if ( defined( 'LOGGED_IN_KEY' ) && '' !== LOGGED_IN_KEY ) {
			return LOGGED_IN_KEY;
		}

		// If this is reached, you're either not on a live site or have a serious security issue.
		return 'das-ist-kein-geheimer-schluessel';
	}

	private function get_default_salt() {
		if ( defined( 'GOOGLESITEKIT_ENCRYPTION_SALT' ) && '' !== GOOGLESITEKIT_ENCRYPTION_SALT ) {
			return GOOGLESITEKIT_ENCRYPTION_SALT;
		}

		if ( defined( 'LOGGED_IN_SALT' ) && '' !== LOGGED_IN_SALT ) {
			return LOGGED_IN_SALT;
		}

		// If this is reached, you're either not on a live site or have a serious security issue.
		return 'das-ist-kein-geheimes-salz';
	}
}

So what do we do in this code exactly?

We check whether a special constant for the key/salt exists, something which we define for the plugin. In our case, we named these GOOGLESITEKIT_ENCRYPTION_KEY and GOOGLESITEKIT_ENCRYPTION_SALT respectively. It is recommended to define these constants, e.g. in your wp-config.php, and set them to a secret value, for example values like the ones generated by https://api.wordpress.org/secret-key/1.1/salt/. Once defined, these values should never be changed.
Unfortunately, we cannot expect every WordPress site owner to follow this recommendation and update their configuration with these constants. Therefore we need to have a fallback. Fortunately, WordPress has some constants already for security purposes. It uses them for hashing and not encrypting, but it is still a somewhat okay alternative for when no specific key and salt for our own plugin have been provided. We opted to go for WordPress’s LOGGED_IN_KEY and LOGGED_IN_SALT constants, which are typically filled via https://api.wordpress.org/secret-key/1.1/salt/ as well.
There is a small chance that even these constants are not defined or empty. If that is the case for a site, that site is terribly insecure already with WordPress core only, and we are basically screwed. What we can do here in our plugin is nothing, so we just went ahead and put an easter egg in there. If your site is missing any of the constants from https://api.wordpress.org/secret-key/1.1/salt/, please change that ASAP.
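If you would rather generate the values locally than copy them from the URL above, a small script along these lines could print the two define() lines for your wp-config.php. This is only a sketch: the helper name sketch_generate_secret() is made up, and hex-encoded output from random_bytes() is just one reasonable format for such a secret.

```php
<?php
// Hypothetical helper: returns a random secret as a hex string (64 characters for 32 bytes).
function sketch_generate_secret( $bytes = 32 ) {
	return bin2hex( random_bytes( $bytes ) );
}

// Build the lines to paste into wp-config.php. Run this once and keep the values stable.
$lines = array(
	sprintf( "define( 'GOOGLESITEKIT_ENCRYPTION_KEY', '%s' );", sketch_generate_secret() ),
	sprintf( "define( 'GOOGLESITEKIT_ENCRYPTION_SALT', '%s' );", sketch_generate_secret() ),
);

echo implode( PHP_EOL, $lines ) . PHP_EOL;
```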

The WordPress constants for keys and salts, as well as our Site Kit-specific ones, should never be changed, as already mentioned. If they ever are changed (or if you define the Site Kit ones after previously relying on the WordPress ones), your encrypted data can no longer be decrypted. This is something to keep in mind overall when deciding whether to store a sensitive piece of data in WordPress or not. There certainly are WordPress site owners out there who don’t know that these values should remain constant, and if they change them, they will lose their encrypted data. For Site Kit this is not much of an issue, since it only stores credentials, and if the site loses these credentials, they can be re-obtained through the OAuth flow. If you are considering storing data in WordPress for which that storage is the canonical or only source, though, think twice about doing this. In this case, losing the encryption key would effectively mean losing all encrypted data forever.

Another highly important thing for managing encryption keys in WordPress: Do not store an encryption key in the WordPress database. The database already houses the data that you would like to encrypt. If someone got unauthorized access to your database and it included the key to the encrypted data, the encryption would be pointless. They could then just use the same open-source code you are using in your plugin, or even if your plugin wasn’t open-source, they could try several popular encryption algorithms and would likely be successful. Storing encrypted data and the relevant encryption keys in separate places is an absolute must. Now, if somebody had access to your site’s full source code and the database, they would still be able to decrypt the values, unless you keep the actual constant values outside of your source code. While this is not the way WordPress is set up by default, I highly recommend following this pattern:

At a minimum, your WordPress configuration must never be included in any repository. For example, you could use .gitignore to exclude your wp-config.php file.
A good way to keep wp-config.php and its security-irrelevant logic part of the source code is to outsource the actual security-relevant values (database credentials, secret keys, etc.) to a separate file. For example, this could be constants in a wp-config-local.php file or, even better, environment variables in a .env file (this is what e.g. Bedrock does, which you should check out if you haven’t yet). The respective file must then be excluded from version control as well.
To go one more step, a great solution would be to rely on environment variables that are set from outside your project directory on the server level. While this is likely not possible on many hosts and requires some extra fiddling with the server, it is an even more secure approach. For example, this is how Pantheon sets the credentials to your WordPress database.
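As a rough sketch of that last approach, wp-config.php could read the secret from the server environment and only then define the constant. The helper name sketch_env_secret() is hypothetical, and the putenv() call merely simulates what the server would do (for example through web server or PHP-FPM configuration) so that the example runs on its own.

```php
<?php
// Hypothetical helper: returns the environment variable's value, or null if unset or empty.
function sketch_env_secret( $name ) {
	$value = getenv( $name );

	return ( false === $value || '' === $value ) ? null : $value;
}

// Demonstration only: in a real setup the server sets this variable, not PHP code.
putenv( 'GOOGLESITEKIT_ENCRYPTION_KEY=value-set-by-the-server' );

// In wp-config.php: define the constant from the environment if it is available.
$key = sketch_env_secret( 'GOOGLESITEKIT_ENCRYPTION_KEY' );
if ( null !== $key && ! defined( 'GOOGLESITEKIT_ENCRYPTION_KEY' ) ) {
	define( 'GOOGLESITEKIT_ENCRYPTION_KEY', $key );
}
```

With this pattern the secret lives neither in the database nor in any file inside the project directory.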

Now that we have dealt with the encryption key and salt, let’s implement the actual encryption and decryption methods. As pointed out before, doing this will require us to use an algorithm. There are countless encryption algorithms out there, but it is important to choose one that is considered secure. AES is a specification that was established as a standard in the early 2000s and is still considered secure today. Because it is a standard, support for it is built into many programming languages’ default libraries, including PHP, so we will use it for our Data_Encryption class. In order to leverage it for encryption in PHP, the openssl extension has to be installed on the server, which should be the case for most environments.

Let’s look at the implementation of our encrypt method, and we’ll go through it in detail afterwards:

public function encrypt( $value ) {
	if ( ! extension_loaded( 'openssl' ) ) {
		return $value;
	}

	$method = 'aes-256-ctr';
	$ivlen  = openssl_cipher_iv_length( $method );
	$iv     = openssl_random_pseudo_bytes( $ivlen );

	$raw_value = openssl_encrypt( $value . $this->salt, $method, $this->key, 0, $iv );
	if ( ! $raw_value ) {
		return false;
	}

	return base64_encode( $iv . $raw_value );
}

The main function here is openssl_encrypt, to which we’re passing several parameters:

The value to encrypt, with the salt appended to it.
The cipher method to use for encryption, essentially the algorithm, for which we pass “aes-256-ctr”. You can use PHP’s openssl_get_cipher_methods function to get a list of all supported values.
The key to use for encryption.
We don’t need to pass any option flags here, so we specify 0. The only reason we pass this argument at all is that we want to pass the fifth parameter.
The initialization vector, which is a random string with a different length based on the cipher method. This string can be generated by calling openssl_random_pseudo_bytes with the length received from calling openssl_cipher_iv_length with our chosen cipher method “aes-256-ctr”.

After calling the function, if it returns a successful result, we prepend the initialization vector to it that we previously passed to the function. That is only so that we can determine it again when decrypting the value later. At the end, we simply base64_encode the combined value before returning it. This is not technically necessary, but a good idea here because the prepended initialization vector is a binary string that likely includes characters that are human-unreadable and may be problematic for being stored in the database (with the option flags set to 0, openssl_encrypt itself already returns base64-encoded text).
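The role of the initialization vector can be seen in a small standalone sketch: because a fresh random IV is generated on every call, encrypting the same value twice yields two different stored strings. The demo key and the closure are assumptions made for illustration, and the salt is left out to keep the example short.

```php
<?php
$method = 'aes-256-ctr';
// Assumption: a demo key derived from a literal string, for illustration only.
$key   = hash( 'sha256', 'demo-key-for-illustration-only', true );
$ivlen = openssl_cipher_iv_length( $method );

// Same steps as in the encrypt method above, minus the salt.
$encrypt = function ( $value ) use ( $method, $key, $ivlen ) {
	$iv = openssl_random_pseudo_bytes( $ivlen );

	return base64_encode( $iv . openssl_encrypt( $value, $method, $key, 0, $iv ) );
};

$first  = $encrypt( 'my-access-token' );
$second = $encrypt( 'my-access-token' );

// Is the cipher method supported, how long is its IV, and do the two results differ?
var_dump( in_array( $method, openssl_get_cipher_methods(), true ) );
var_dump( $ivlen );
var_dump( $first !== $second );
```

This is also why the IV has to be stored alongside the encrypted value: without it, the value cannot be decrypted again.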

Our decrypt method works almost the same way, just in the opposite direction. Here it is:

public function decrypt( $raw_value ) {
	if ( ! extension_loaded( 'openssl' ) ) {
		return $raw_value;
	}

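	$raw_value = base64_decode( $raw_value, true );
	if ( false === $raw_value ) {
		// Not valid base64, so it cannot be a value produced by encrypt().
		return false;
	}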

	$method = 'aes-256-ctr';
	$ivlen  = openssl_cipher_iv_length( $method );
	$iv     = substr( $raw_value, 0, $ivlen );

	$raw_value = substr( $raw_value, $ivlen );

	$value = openssl_decrypt( $raw_value, $method, $this->key, 0, $iv );
	if ( ! $value || substr( $value, - strlen( $this->salt ) ) !== $this->salt ) {
		return false;
	}

	return substr( $value, 0, - strlen( $this->salt ) );
}

The main function here is openssl_decrypt, and we can call it with the same parameters as we previously called openssl_encrypt, except of course the first parameter, for which we need to pass the value to decrypt here. In order to get that value, we first need to run the $raw_value parameter through base64_decode to get the binary version again, and then we need to split off the initialization vector from the beginning of the result. The initialization vector is what we pass as the fifth parameter so that it matches what was passed when encrypting (remember, we prepended it to the encryption result before). Finally, we check that the decrypted value ends with our salt and strip the salt off before returning the original value.

And that is it! We now have implemented our encryption and decryption logic (which is basically WordPress-agnostic), plus a way to determine an encryption key and salt to use (WordPress-specific). You can find the full Data_Encryption class implementation from Site Kit here.
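To see the pieces working together, here is a compact standalone sketch of a round trip. It restates the logic from above in a hypothetical Sketch_Data_Encryption class that takes the key and salt as constructor arguments, purely so that it runs outside of WordPress; the demo key and salt are made up.

```php
<?php
class Sketch_Data_Encryption {

	const METHOD = 'aes-256-ctr';

	private $key;
	private $salt;

	public function __construct( $key, $salt ) {
		$this->key  = $key;
		$this->salt = $salt;
	}

	public function encrypt( $value ) {
		$iv        = openssl_random_pseudo_bytes( openssl_cipher_iv_length( self::METHOD ) );
		$raw_value = openssl_encrypt( $value . $this->salt, self::METHOD, $this->key, 0, $iv );

		return $raw_value ? base64_encode( $iv . $raw_value ) : false;
	}

	public function decrypt( $raw_value ) {
		$decoded = base64_decode( $raw_value, true );
		if ( false === $decoded ) {
			return false;
		}

		$ivlen = openssl_cipher_iv_length( self::METHOD );
		$iv    = substr( $decoded, 0, $ivlen );
		$value = openssl_decrypt( substr( $decoded, $ivlen ), self::METHOD, $this->key, 0, $iv );

		// The salt suffix doubles as a sanity check that the right key was used.
		if ( ! $value || substr( $value, - strlen( $this->salt ) ) !== $this->salt ) {
			return false;
		}

		return substr( $value, 0, - strlen( $this->salt ) );
	}
}

// Assumption: demo key and salt, derived from literal strings for illustration only.
$key        = hash( 'sha256', 'demo-key', true );
$encryption = new Sketch_Data_Encryption( $key, 'demo-salt' );

$stored   = $encryption->encrypt( 'my-access-token' );
$restored = $encryption->decrypt( $stored );

// A different key cannot recover the value; the salt check makes decrypt() return false.
$other_key      = hash( 'sha256', 'another-key', true );
$other          = new Sketch_Data_Encryption( $other_key, 'demo-salt' );
$with_wrong_key = $other->decrypt( $stored );

var_dump( $restored, $with_wrong_key );
```

The second result illustrates what happens when the key constants change after data was encrypted: the stored value is still there, but it can no longer be decrypted.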

Summary

Hopefully this post gives you some food for thought on dealing with sensitive data in WordPress, as it is something many developers tend to overlook. And I get why – this may not exactly be fun, and security itself is a topic where everything is always a trade-off. The above approach is certainly not perfect, and there are probably people that would advise you against using it. And again, especially in WordPress you should think twice before you decide to store sensitive data in its database, which is a playground for any kind of combination of plugins and themes that largely differ in quality and may even contain malicious code – external platforms may offer a much more secure storage for such data.

If you need to store sensitive data in WordPress, keep it to a minimum. Do what you can to make it less trivial for a third party to get unauthorized access to it. Maybe an implementation similar to the one we went through in this article is suitable for your case. In either case, make sure you think about the implications of storing data in WordPress, and research what you can do to make it more secure as a plugin developer.

Comments

2 responses to “Storing Confidential Data in WordPress”

Timothy Jacobs

January 23, 2020

If you can get away with only supporting WP 5.2+, a great choice is to use libsodium since the paragonie/sodium_compat library is included for sites on less than PHP 7.2.

Jamie

October 19, 2021

So would storing the aes encryption password in the environment variable on the server mean 1) the admin would type this in on the server – outside of the web application (e.g. wp) 2) if the server was rebooted it would need to be keyed manually again?

I am trying to attain aes for certain fields of data in a custom wp table, and I do not want to store it on the filesystem or DB.