You’re Doing Environment Variables All Wrong - A Node.js Perspective

Josh Cole·09/05/2022·11 min·2,188 words
JavaScript
TypeScript

TL;DR

Environment variables aren’t always what you expect and it’s painful to check each one. Instead, use a library such as safe-env-vars to do the hard work and be safe in the knowledge your environment variables won’t cause you any headaches.

Oh, what?

Environment variables are easy, you say, we’ve been working with environment variables for our entire careers... how could we possibly be “doing them wrong”?! Well, as American computer scientist Jim Horning said, “Nothing is as simple as we hope it will be”. And in this case, a risk is introduced every time you ‘set and forget’ a variable. Let’s explore the problem, or rather, problems.

Let’s start at the top

So what are environment variables and why do we use them? Put simply, environment variables are pieces of state (read: string values) that we store in the ‘environment’ that our application is running in. This state is usually set via one of the mechanisms provided by the operating system, shell, or container orchestrator, which is responsible for our application process.

Environment variables are a simple mechanism, and this is a good thing because a lot of engineering is not so simple.

“Simplicity is prerequisite for reliability.” — Edsger Dijkstra.

Often in engineering we need to iteratively refactor and rework our solutions until we reach a good balance between readability and functionality. Here, simplicity is our friend because it makes it easier to understand what our code is doing and why. We’re far less likely to end up with misbehaving, buggy software if it’s simple.

See, it’s mostly upside!

Well yes, there is an awful lot of upside. As we shall see, storing state in the environment allows us to do several very useful things that would otherwise be risky or time consuming.

1. Change configuration at will

We can change the behaviour of our application whilst avoiding risky activities such as changing source code, and time consuming chores such as re-compiling, re-deploying, testing, and so on. If we need to rotate API keys, turn feature flags on or off, or adjust some other behaviour, we can do all this from the comfort of our chairs simply by deploying the new values and restarting our applications.

2. Keep secrets hidden

We can store secrets separately to our source code. This helps us mitigate the risk of exposing sensitive values such as API keys and credentials, which would put our users at risk if they were to leak. This way, if a nefarious actor gains access to our source code, they won’t get their hands on the secrets at the same time. It makes it harder for them to do us damage.

3. Stay on the right side of regulation

In regulated industries it’s often necessary to restrict access to sensitive systems to a small number of specific people. By storing the secrets separately to the source code, the engineers can still do their jobs effectively without the keys to the kingdom sitting within their reach.

4. Set different values per engineer or environment

Whilst working locally we often need to use different values for API keys, feature flags, and behaviour flags that make sense whilst developing but not in deployed environments. The same can be said of automated testing where tests may need to change the application’s behaviour and inputs to test particular aspects.

Each deployed environment can be given a different set of environment variables, for instance to keep production secrets isolated and separate from staging secrets. As with local development, we can also change the values in our staging/testing environments independently of the other environments as needed. Flexibility is great!

5. Use dot env files

In the expansive JavaScript universe a common pattern is to use the dotenv package to read in environment variables from a local .env file that is not committed to the repository. This is a much quicker (and importantly more visible) alternative to setting environment variables in the actual environment. Engineers can change the values quickly and easily whilst developing as the need arises.

So what’s the problem?

There are a few. These are all risks that we need to mitigate, vulnerabilities that can leave us open to attack, and mistakes that can cause unexpected behaviour at the worst times. Even in the best case scenario, badly behaving environment variables can waste a significant amount of time, especially in dynamically typed languages such as JavaScript.

“Seek simplicity but distrust it.” — Alfred North Whitehead.

We need to be careful not to fall into one of the myriad traps. In each case, it’s hard if not impossible to predict how our application will behave. Sometimes issues are immediately obvious, but in many instances we won’t know about an issue until it randomly rears its head at the most inconvenient time.

1. Missing values

The most obvious risk here is that a value could be missing. This is more likely to be the case on our local machines where one developer makes a change that requires an environment variable we haven’t got set in our local environment. It’s less likely to happen in deployed code which has gone through several layers of reviews and testing, but it can still happen with complex systems. We’re only human after all!

LOG_LEVEL="TRACE"
#API_KEY="..."
DATABASE_URL="..."

Oops, we disabled the API_KEY value and forgot about it. Or perhaps our colleague added ACCESS_TOKEN_TTL in their latest commit and you haven’t noticed you need to add it to your local .env file.
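To make the failure mode concrete, here is a minimal sketch of what a missing value looks like at runtime and how a presence check could surface it early. The `Env` type and the `requireEnv` helper are hypothetical names used for illustration, and the environment is passed in as a plain object so the example runs on its own (in a real application you would pass `process.env`).

```typescript
// Hypothetical shape of the environment: every value is a string or missing.
type Env = Record<string, string | undefined>;

// Simulates the .env file above, where API_KEY has been commented out.
const exampleEnv: Env = { LOG_LEVEL: 'TRACE', DATABASE_URL: '...' };

// Without a check, the missing value silently becomes the text "undefined".
const header = `Bearer ${exampleEnv['API_KEY']}`; // "Bearer undefined"

// Hypothetical helper: throw as soon as a required variable is not defined.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

const logLevel = requireEnv(exampleEnv, 'LOG_LEVEL'); // "TRACE"
```

Calling `requireEnv(exampleEnv, 'API_KEY')` throws straight away, which is a far better place to find out than a rejected request somewhere downstream.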

2. Empty values

Similar to missing values, it’s possible for the value of an environment variable to end up as an empty string. Perhaps that was intentional (though it probably shouldn’t be), but how would we know?

LOG_LEVEL=""

What exactly does the above mean to you? Does it mean we want to turn logging off entirely? Does it mean we want to use the default log level and we don’t care what it is? Or (more likely) has something broken that we need to fix? Ask your friends, you might find they have diverging expectations to you.
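The ambiguity shows up in code too. As a small sketch, the two most common ways of supplying a default disagree about what an empty string means. The operators are standard JavaScript; the `requireNonEmpty` helper and the `INFO` default are assumptions made for the example.

```typescript
type Env = Record<string, string | undefined>;

// Simulates LOG_LEVEL="" from the example above.
const exampleEnv: Env = { LOG_LEVEL: '' };

// `??` only falls back for undefined/null, so the empty string survives.
const withNullish = exampleEnv['LOG_LEVEL'] ?? 'INFO'; // ""

// `||` treats the empty string as falsy, so the default silently wins.
const withOr = exampleEnv['LOG_LEVEL'] || 'INFO'; // "INFO"

// Hypothetical helper: reject empty (or whitespace-only) values outright,
// so nobody has to guess what an empty value was meant to express.
function requireNonEmpty(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Environment variable ${name} is missing or empty`);
  }
  return value;
}
```

Two engineers picking different operators will ship two different behaviours for the same .env file, which is exactly the kind of divergence an explicit check removes.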

3. Arbitrary values

Environment variables are often used for boolean values such as feature flags. Booleans have some big downsides which I won’t go into here, but suffice it to say those boolean values are arbitrary and different engineers will use different values.

For example:

FEATURE_FLAG_AAA="true"
FEATURE_FLAG_B="TRUE"
FEATURE_FLAG_c="yes"
FEATURE_FLAG_c="Y"
FEATURE_FLAG_c="1"

As humans, we instantly know that all these values represent the same thing, that a particular feature flag has been toggled on. We rely on conventions and consistency to ensure we don’t fall into the trap of using different values in different places, but good intentions won’t always help when herding cats 🐈 (engineers).

The same can be said if you use enum values, such as with log levels (INFO, DEBUG, TRACE, etc). Obviously you could end up with an invalid value that may throw a spanner in the works unless you validate the value you read from the variable... but how many of us really do that? 🌚
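For those of us who do want to validate, a minimal sketch might look like the following. It assumes the team has agreed that only "true" and "false" are acceptable spellings (that choice is a convention, not a standard), and it borrows the log levels named above. `parseBooleanFlag`, `parseLogLevel`, and `LOG_LEVELS` are hypothetical names.

```typescript
// Assumption: this project accepts only "true" and "false" (case-insensitive).
function parseBooleanFlag(name: string, raw: string): boolean {
  const normalised = raw.trim().toLowerCase();
  if (normalised === 'true') return true;
  if (normalised === 'false') return false;
  throw new Error(`${name} must be "true" or "false", received "${raw}"`);
}

// Hypothetical set of log levels, taken from the examples in the text.
const LOG_LEVELS = ['INFO', 'DEBUG', 'TRACE'] as const;
type LogLevel = (typeof LOG_LEVELS)[number];

// Returns the matching level, or throws if the value is not in the list.
function parseLogLevel(raw: string): LogLevel {
  for (const level of LOG_LEVELS) {
    if (level === raw) return level;
  }
  throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(', ')}, received "${raw}"`);
}
```

With this in place "TRUE" is accepted, while "yes", "Y", and "1" fail loudly instead of quietly meaning different things to different readers.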

4. Incorrect types

We covered the problem with boolean values above, and it’s a similar story if you need to use a value as a number. Environment variables are always read in as strings regardless of what value you’ve stored in them:

FEATURE_FLAG_AAA="true"
SOME_NUMBER="3"

Maybe you need the SOME_NUMBER value to be a number so TypeScript will allow you to pass it to the nice library you want to use. Do you parse the value to an integer like this?

const value = Number.parseInt(process.env.SOME_NUMBER);
someNiceLibrary(value);

And what if that value gets changed to a float in one environment but not another?

SOME_NUMBER="3.14"

Suddenly your application is freaking out and you don’t know why. You’re seeing some weird behaviour or, perhaps worse, an error message and stack trace that is a red herring and points you in totally the wrong direction for an hour whilst your customer is yelling at you.

You might argue that this issue is more likely to occur in JavaScript than other languages, but unexpected behaviour is always a risk when dealing with side effects like environment variables.
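Part of the surprise comes from how forgiving the built-in conversions are. The sketch below shows two standard behaviours, followed by a hypothetical `parseInteger` helper that assumes the value is supposed to be a whole number and rejects everything else.

```typescript
// Number.parseInt stops at the first character it cannot use,
// so a float is truncated rather than rejected.
const truncated = Number.parseInt('3.14', 10); // 3

// Number() rejects trailing junk, but quietly turns an empty string into 0.
const emptyAsZero = Number(''); // 0

// Hypothetical helper: accept an optional minus sign followed by digits only.
function parseInteger(name: string, raw: string): number {
  if (!/^-?\d+$/.test(raw)) {
    throw new Error(`${name} must be an integer, received "${raw}"`);
  }
  return Number(raw);
}
```

Here `parseInteger('SOME_NUMBER', '3')` returns 3, whereas "3.14" and "" both throw with a message that names the offending variable.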

5. Optional values

Another consideration is that sometimes we really do want values to be optional, where things like the following may be totally valid given our context:

#FEATURE_FLAG_AAA="true" # 1. comment out a value we don't need at the moment.
FEATURE_FLAG_AAA="" # 2. or set it to an empty value (not so good!)

If we’re manually checking environment variables to ensure they exist we need to leave this one variable unchecked as it may be optional. This introduces the human element whereby future engineers may not add in presence checks where needed because they see they aren’t consistently applied to all variables. The variable is implicitly optional and this leaves it open to interpretation by the reader. Better to be explicit when variables are optional as the majority (i.e. the default) will be required.

6. Hidden environment variables

It’s a poor (but sadly common) practice for engineers to read in an environment variable at the point they want to use it, for instance:

function calculateCommission(amount: number): number {
  return amount * Number.parseInt(process.env.COMMISSION_RATE);
}

What’s the problem here? Well our nice calculateCommission function can exhibit odd behaviour if our COMMISSION_RATE environment variable is missing or set to some weird value. Perhaps the engineer that wrote this forgot to update the documentation to indicate that the commission rate needs to be configured in the environment and you didn’t realise you needed to do it. Whoops.
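To see what "odd behaviour" could mean in practice, here is a sketch of the same calculation with the environment passed in as an argument so that it can be exercised in isolation. The `calculateCommissionUnsafe` name and the sample values are made up for the example; `String(...)` mirrors the coercion that `Number.parseInt` performs on a missing value at runtime.

```typescript
type Env = Record<string, string | undefined>;

// Same logic as above, but reading from an injected environment object.
function calculateCommissionUnsafe(env: Env, amount: number): number {
  return amount * Number.parseInt(String(env['COMMISSION_RATE']), 10);
}

// Variable missing: parseInt("undefined") is NaN, and NaN spreads everywhere.
const whenMissing = calculateCommissionUnsafe({}, 200); // NaN

// Variable set to a fraction: parseInt truncates "0.15" to 0.
const whenFraction = calculateCommissionUnsafe({ COMMISSION_RATE: '0.15' }, 200); // 0

// Variable set to a whole number: the only case that behaves as written.
const whenWhole = calculateCommissionUnsafe({ COMMISSION_RATE: '2' }, 200); // 400
```

None of these cases throws, so the first sign of trouble is likely to be a wrong figure on somebody’s invoice.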

7. Behaviour and security

Environment variables are side effects. You might say they add impurities to our code. Our application can’t control the values it’s reading from the environment and must accept what it’s given. This means environment variables are akin to user input and carry the same risks. ☠️

The value of an environment variable could be unexpected, or worse, malicious. Best case, the value triggers a visible error that leads you down the garden path for an hour or two before you figure out what’s actually causing the issue. Worst case, you have exposed your application to input you can’t trust (and you have trusted it absolutely) without verifying its authenticity or correctness, and now you have been storing sensitive data in the attacker’s message queue for the last 2 weeks rather than your own. 😬
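One way to narrow that exposure is to treat the value like any other untrusted input and check it against what we expect before using it. The sketch below is only an illustration: the host name in `ALLOWED_QUEUE_HOSTS` is a placeholder, `parseQueueUrl` is a hypothetical helper, and the HTTPS-only rule is an assumption about what a project might want.

```typescript
// Hypothetical allowlist; the host name is a placeholder for illustration.
const ALLOWED_QUEUE_HOSTS = ['queue.internal.example.com'];

// Parses the raw value and rejects anything pointing somewhere unexpected.
function parseQueueUrl(raw: string): URL {
  const parsed = new URL(raw); // throws a TypeError if raw is not a valid URL
  if (parsed.protocol !== 'https:') {
    throw new Error(`Queue URL must use https, received "${parsed.protocol}"`);
  }
  if (ALLOWED_QUEUE_HOSTS.indexOf(parsed.hostname) === -1) {
    throw new Error(`Queue host "${parsed.hostname}" is not on the allowlist`);
  }
  return parsed;
}
```

An allowlist will not stop every attack, but it does mean a tampered value fails at startup rather than quietly redirecting data for two weeks.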

Right, how do we sidestep these issues?

Simplicity is fantastically splendiferous, except when it’s not.

“Simple can be harder than complex: You have to work hard to get your thinking clean to make it simple. But it's worth it in the end because once you get there, you can move mountains.” — Steve Jobs.

The trick, as with all ‘user’ input outside our sphere of control, is to trust but verify, or in our case, trust but validate. There are a few things you want to do for every value you read in from the environment:

  1. Presence checks - ensure expected environment variables are defined.
  2. Empty checks - ensure expected values are not empty strings.
  3. Value checks - ensure only expected values can be set.
  4. Typecasting - ensure values are cast to the expected type at the point you read them in.
  5. Single entry point - ensure all variables are pulled in at the same place, and not smeared around your codebase for people to stumble upon later.
  6. Dot env - read values from both a .env file and the environment.
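Done by hand, the first five checks might be sketched roughly as below. This is an illustration under stated assumptions rather than how any particular library works: `read`, `asNumber`, `oneOf`, and `loadConfig` are hypothetical names, the .env step is left out, and the environment is passed in as a plain object (in practice `process.env`).

```typescript
type Env = Record<string, string | undefined>;

// Presence check, empty check, then typecast via the supplied parser.
function read<T>(env: Env, name: string, parse: (raw: string) => T): T {
  const raw = env[name];
  if (raw === undefined) throw new Error(`${name} is not defined`);
  if (raw.trim() === '') throw new Error(`${name} is empty`);
  return parse(raw);
}

// Typecasting: only values that convert to a finite number are accepted.
function asNumber(raw: string): number {
  const value = Number(raw);
  if (!Number.isFinite(value)) throw new Error(`"${raw}" is not a number`);
  return value;
}

// Value check: builds a parser that only accepts the listed values.
function oneOf<T extends string>(allowed: readonly T[]) {
  return (raw: string): T => {
    for (const candidate of allowed) {
      if (candidate === raw) return candidate;
    }
    throw new Error(`"${raw}" is not one of: ${allowed.join(', ')}`);
  };
}

// Single entry point: every variable the application needs is read here.
function loadConfig(env: Env) {
  return {
    commissionRate: read(env, 'COMMISSION_RATE', asNumber),
    logLevel: read(env, 'LOG_LEVEL', oneOf(['INFO', 'DEBUG', 'TRACE'] as const)),
  };
}
```

Even this small version runs to several dozen lines before optional values or .env support are considered, which is the chore the rest of this article tries to take off your hands.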

Writing the code to do this for every project would be a pain, but the good news is, I’ve already done that for you.

Package: safe-env-vars

safe-env-vars will read environment variables from the environment as well as a .env file in a safe way with full TypeScript support. By default, it will throw an error if the environment variable you're trying to read is undefined or empty.

It’s very quick to get started with basic usage if all you’re doing is reading in string values that are always required:

import EnvironmentReader from 'safe-env-vars';

const env = new EnvironmentReader();

export const MY_VALUE = env.get(`MY_VALUE`); // string

You can explicitly mark variables as optional:

export const MY_VALUE = env.optional.get(`MY_VALUE`); // string | undefined

Or you can allow the variables to be an empty value, though I would discourage this for the reasons stated in the discussion above:

export const MY_VALUE = env.get(`MY_VALUE`, { allowEmpty: true }); // string

You can even cast the type of the value as you’d expect:

// Required
export const MY_BOOLEAN = env.boolean.get(`MY_BOOLEAN`); // boolean
export const MY_NUMBER = env.number.get(`MY_NUMBER`); // number

// Optional
export const MY_BOOLEAN = env.optional.boolean.get(`MY_BOOLEAN`); // boolean | undefined
export const MY_NUMBER = env.optional.number.get(`MY_NUMBER`); // number | undefined

And finally, you might want to check whether the variable is one of the allowed values. This check always occurs after the presence/empty checks and typecasting the value.

export const MY_NUMBER = env.number.get(`MY_NUMBER`, { allowedValues: [1200, 1202, 1378] }); // number

See the docs for more usage information and examples.

Recommended pattern

I would recommend you have a single point of entry for the environment variables in your application. One place where you read in all the values needed by the different modules and functions. This ensures that there’s only one place to look and one place to change when making modifications.

I like to structure my single point of entry in JavaScript/TypeScript projects like this:

/src
	/main.ts
	/config
		/env.ts
		/constants.ts
		/index.ts

./config/env.ts

import EnvironmentReader from 'safe-env-vars';

const env = new EnvironmentReader();

export const COMMISSION_RATE = env.number.get(`COMMISSION_RATE`); // number

./config/constants.ts

export const SOME_CONSTANT_VALUE = 123;
export const ANOTHER_CONSTANT_VALUE = `Hello, World`;

./config/index.ts

export * as env from './env';
export * as constants from './constants';

...and the usage?

import * as config from './config';

const { COMMISSION_RATE } = config.env;
const { SOME_CONSTANT_VALUE } = config.constants;

export function calculateCommission(amount: number): number {
  return amount * COMMISSION_RATE;
}

This results in a very clean way of working with configurable environment variables as well as constant values. The benefits of this approach are that there is a single point of entry for the environment variables in your application, and every usage of these values directs the reader back to that entry point.

Conclusion

Don’t fall into the trap of believing that, because you’ve been using environment variables for years, they’re safe and can’t surprise you. It’s better to trust but verify the values you’re reading using a robust and time-saving library such as safe-env-vars* which does the hard work for you.

*Alternative options may exist. 🙃