.env files are increasingly popular as a way to configure an application. This loose standard offers a number of benefits, most of which are fully realized on Platform.sh. However,
.env files also need to be used correctly; misuse, as with anything, can be worse than not using them at all.
To understand the best way to use
.env files, we first need to understand what is meant by “environment.”
Whatever language it’s written in, your application is ultimately a pile of code. That pile of code is static and unchanging between the different times and places it runs.
When it runs, though, it’s going to care about other “stuff.” That stuff could be a database, a cache server, a file system, a remote authentication gateway, or a thousand other things. That other “stuff” is the context in which the application runs. Or, alternatively, its environment.
That environment could be variable. Your application may require a PostgreSQL server, but which PostgreSQL server it connects to, and the data in it at any given point in time, is all part of the environment and could change independently of your code. You want your application to talk to a payment gateway, but which credentials it uses may vary depending on the context you’re running in. (Building production payment API keys into your application and then sending it for testing tends to end badly. Trust me on this…)
When building an application, it’s important to keep a clean separation between “stuff that is the same in all environments” and “stuff that changes in each environment.” The former is your code. The latter is your environment configuration.
Unix-like operating systems have had a mechanism for handling that environment-specific configuration for decades: environment variables. Environment variables are system-global string values that any application can read at any time and use that to make decisions about what code to run.
The important distinction is that some of that configuration should vary with the install of the application, while others vary with the environment. Remember, any given application may be run in multiple instances for testing.
For example, if you’re building an application that users can install and configure themselves, the “company name” that the application is for is going to vary depending on which user is running it. But the database it connects to is going to vary for every instance of the application that user runs. Think production, testing, local laptop, etc. Those should all have the same company name, but different database credentials.
That is the first important step: separating your application’s per-install configuration (company name) from per-environment configuration (database). Putting those in the same file makes it substantially harder to vary per environment without also making the per-install configuration vary.
An executable configuration file sometimes has a backdoor option to allow the environment to vary. Depending on how the file is written, it may be possible to modify it manually to read certain values from somewhere else: an extra include file, environment variables, etc. That’s still a hack, though. You want to be able to put the consistent configuration into Git, so its changes are updated across all instances, but not the environment-specific configuration.
And that’s where environment variables come in.
The ideal place to store environment-specific configuration is in environment variables. The mechanisms for setting them are well-established. The APIs for reading them are universal. While two different applications may not use the same variable name, there are many simple ways to set one environment variable based on another.
For production and testing, therefore, the best place to manage environment-specific configuration is environment variables. Either design your application to read from them directly, or design it to have a user-modifiable executable configuration file that can be modified to read values from the environment rather than hard code them directly. That way, when you move the application from production to staging to your other staging to a branch-specific environment, you need only update the environment variables for those new environments and the application will keep on chugging.
The caveat with that approach, however, is local development. Odds are you don’t want to set a global variable across your entire computer just to tell your application’s dev copy what fake credentials to use or where your test database is. Even if you’re using a containerized environment locally, you may not want to fiddle with environment variables directly.
This is where the
.env file comes into play.
.env is a de facto standard for an ini-like file that contains fake environment variables. An application that supports
.env files will, on boot, run through each line in that file and read
key=value pairs. For each, it will run “IF an environment variable with this name doesn’t already exist, set it based on this file.” That will set the variable only within the scope of your application’s process, without impacting any other processes on the computer. Then the rest of your application can proceed and read from the environment as it would anywhere else, entirely ignorant of that switcheroo. (Don’t write that code yourself. There are
.env support libraries in every language that all do exactly the same thing. Use one of those.)
That, and only that, is the purpose of
.env files: values that change per-environment, and thus are not part of your code base. Which brings up the most important thing to remember about
.env files: they do not belong in Git. Anything in Git is going to be the same on every environment, by design, which is exactly the opposite of what environment variables and
.env file are for.
Values that do not change between environments also do not belong in the
.env file. The site name, admin email address, and so on should either be in a read-only config file that is committed to Git or in the database, depending on if you want those configuration values to be end-user modifiable. (Either way is valid, as long as you do it deliberately.) But those values do not belong in the
.env file, because they are not environment-specific.
That presents a problem on Platform.sh, as by design the build step is run independently of any environment-specific values. That’s because the build output can be reused in a different environment, such as production. Specifically, in the case of a fast-forward merge to production, we don’t need to rebuild your application. The built version from a branch is reused, which means what you were testing in a branch is exactly the same bits on disk as what gets deployed to production. That is the only way to minimize “works on my branch” type errors and is part and parcel of the whole idea of containerized deployment.
But what if you want to “develop” on a branch? This is where naming becomes important. Don’t think of the non-production environments as development. Development is where you write code, which is your local computer. On your local computer you can set whatever environment variables (or
.env files) you want. Instead of thinking of every branch as a development environment, think of every branch as a staging environment. Staging should be as close to production as possible, including its build modes.
When viewed that way, it makes no sense for the build mode to vary depending on the environment, because that variation is a place for heisenbugs to appear. You don’t want heisenbugs. (Trust me on this.) “Debug mode” should happen in the location you’re debugging and writing code, that is, your local environment.
Which is the only place you can, and should, be using a
So to recap:
.envfiles are a surrogate environment variable for local development only, but should never be committed to Git.