PHP 8.0 feature focus: Language tightenings

Larry Garfield
Director of Developer Experience
22 Oct 2020

In our last chapter we went over several smaller feature improvements to PHP. Today, we’ll cover several minor language changes that make PHP safer, but may trip up some older code.

There’s been a very steady trend in PHP over the last several years toward making the language tighter. That means more edge cases that are “undefined behavior that kinda silently works most of the time” turn into explicit warnings or errors, behavior that was documented but totally illogical gets adjusted to be more logical, and so on. Usually the impact is slight, and well-behaved code won’t notice a difference, but as we all know, not all code is well-behaved.

PHP 8.0 continues the trend of tightening. So today, let’s cover some of the upcoming tidying up that may affect your existing code.

Stable sorting

What happens if you have an array and you sort it, but two of the elements are equal? Does their order change or not?

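When the things being sorted are strings or integers, it doesn’t really matter, since equal values are indistinguishable. If they’re objects, however, two elements can compare as equal and still be different objects, so the order they end up in does matter.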

Suppose you’re sorting an array of Person objects by age. Many people can have the same age, of course, so in what order do equally aged Persons end up?

In PHP 7, the answer was ¯\_(ツ)_/¯; the resulting order was unpredictable. In PHP 8, the order is now “the same as it was before.” That means if in the original array Jorge, age 40, appears before Melissa, age 40, and they’re sorted by age, then Jorge will still appear before Melissa. Or, in code:

<?php
$people[] = new Person('Jorge', 40);
$people[] = new Person('Melissa', 40);

usort($people, fn($a, $b) => $a->age <=> $b->age);
// Jorge is guaranteed to still be before Melissa.

As a side effect, it used to be nominally possible to return a boolean from the comparison function rather than the integer that is expected, because in PHP booleans can “weakly cast” to the integers 1 and 0 in many circumstances. Doing that in a sort comparison function now triggers a deprecation warning. (It was always a bug, now it’s just explicit.)
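To make that concrete, here is a minimal, self-contained sketch. The Person class and the extra person, Aiko, are made up for illustration, and the constructor syntax assumes PHP 8.0 or later.

```php
<?php
// Hypothetical Person class, for illustration only.
final class Person
{
    public function __construct(
        public string $name,
        public int $age,
    ) {}
}

$people = [
    new Person('Jorge', 40),
    new Person('Aiko', 25),
    new Person('Melissa', 40),
];

// The spaceship operator returns an integer (-1, 0, or 1), which is what
// usort() expects. A callback that returns a boolean, such as
// fn($a, $b) => $a->age > $b->age, is the pattern to avoid.
usort($people, fn(Person $a, Person $b): int => $a->age <=> $b->age);

$names = array_map(fn(Person $p): string => $p->name, $people);
echo implode(', ', $names), PHP_EOL; // Aiko, Jorge, Melissa
```

Jorge and Melissa compare as equal, so on PHP 8 they keep the relative order they had going in.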

The stable sorting RFC comes to us courtesy of inconsistency slayer Nikita Popov.

More logical numeric string handling

PHP, like most popular interpreted languages, makes liberal use of type coercion. That is, a variable can change type depending on where it’s used if it makes sense in context to do so. Most of these conversions involve playing fast and loose between strings and number types (int and float). For example, the integer 42 and the string “42” are generally “close enough” to the same thing that they can be considered equal (==), but not identical (===).

That “it’s probably good enough” approach has its advantages, but also introduces a lot of bizarre edge cases. For that reason, PHP’s scalar types support for function signatures (introduced in 7.0) has both a weak mode (that allows that kind of silent conversion) and a strict mode (which doesn’t), with the strict mode being generally recommended for most use cases.

A number of other gotchas lurk in that silent conversion, though. In PHP 7.4, the following comparisons, mind-bendingly enough, evaluate as shown:

<?php
0 == "wait, what?";           // true
0 == "";                      // true
99 == '99 bottles of beer';   // true
42 == '     42';              // true
'42' == '42     ';            // false (string vs. string)
in_array(0, ['a', 'b', 'c']); // true???

Moreover, in some contexts a string such as " 42" is considered “close enough” to int(42), and in others it is not.

The technical term for this situation is “totally bonkers.” Fortunately, a pair of RFCs clean up this silliness in PHP 8.0.

There’s some subtlety to them, but the condensed version is that:

  • A “numeric string” is now any string that consists entirely of a number, optionally surrounded by leading or trailing whitespace, which is safely ignored. Previously only leading whitespace was allowed. (Note that a “number” can include an exponent and a dot, such as "42.5e4", not just numerals.)
  • Numeric strings are treated and defined consistently in all cases.
  • When a numeric string is compared to an integer, with == or greater-than/less-than, the string is converted to an integer and then they’re compared as integers.
  • When a non-numeric string is compared to an integer, the integer is converted to a string and then they are compared as strings. No more 0 == "seriously, what?"
  • When comparing a string to a float, the same happens, but the float value may be mangled by floating point precision first, so it’s not always guaranteed to behave as expected. (Such is life with floating point numbers and binary computers.)
  • Trying to pass a non-numeric string to a function that expects a number will now throw a TypeError.
  • Explicitly casting a numeric-leading string (like “99 bottles of beer”) to an integer will still result in int(99), but that’s the only circumstance in which that happens anymore.
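As a rough illustration of the rules above, here are the earlier comparisons again, this time with the results PHP 8.0 and later should give. The array keys and variable names are only labels chosen for this sketch.

```php
<?php
// Assumes PHP 8.0 or later; PHP 7 gives different answers for several of these.
$results = [
    'zero vs text'        => 0 == "wait, what?",           // false
    'zero vs empty'       => 0 == "",                      // false
    'leading-numeric'     => 99 == '99 bottles of beer',   // false
    'leading whitespace'  => 42 == '     42',              // true
    'trailing whitespace' => 42 == '42     ',              // true
    'exponent notation'   => 425000 == '42.5e4',           // true
    'in_array'            => in_array(0, ['a', 'b', 'c']), // false
    'explicit cast'       => (int) '99 bottles of beer',   // 99
];
var_dump($results);

// Arithmetic on a non-numeric string now throws instead of quietly using 0.
$caught = false;
try {
    $text = 'abc';
    $product = $text * 2;
} catch (TypeError $e) {
    $caught = true;
}
var_dump($caught); // true
```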

All of these changes help make the language more predictable, logical, and consistent, but they are changes. If your code already plays fast and loose with types and string-to-number conversions, you may see some subtle changes in behavior. Fortunately, most code these days does not play that fast and loose with types (precisely to avoid this weirdness).

We have one RFC from Nikita Popov and one RFC from George Banyard to thank for this cleanup. Now go enable strict types in your codebase anyway.

<?php
declare(strict_types=1);
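
If you have not used strict mode before, the sketch below shows roughly what it changes. The function doubleIt() is a hypothetical helper invented for this example. The declare line has to be the first statement in the file, and it only affects calls made from that file.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, used only to show the effect of strict mode.
function doubleIt(int $n): int
{
    return $n * 2;
}

echo doubleIt(21), PHP_EOL; // 42

try {
    // In weak mode the string "21" would be silently converted to int(21).
    // In strict mode the call is rejected with a TypeError.
    doubleIt("21");
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($rejected); // true
```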

Non-numeric Arithmetic

Another small “wait, what?” fix. Historically, PHP allowed nearly all arithmetic operations to be applied to non-numeric types. Not because it makes sense, but because in ye olden days the idea was that the code should not crash and should try to do something kinda-sorta reasonable, even if there was nothing logical to do.

That leads to weirdness like [] % [5] == 0. That makes absolutely no logical sense, but is just what falls out of the engine by accident.

In PHP 8.0, those nonsensical combinations now throw a TypeError rather than silently doing something that may or may not make sense. A few do make sense, such as addition of two arrays, which produces their union, and those are unchanged. The behavior on primitives (strings, bool, float, etc.) is also unchanged.
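
A small sketch of the difference, assuming PHP 8.0 or later (the variable names are just for this example):

```php
<?php
$left  = [];
$right = [5];

try {
    $result = $left % $right; // quietly evaluated to 0 in PHP 7
    $threw = false;
} catch (TypeError $e) {
    $threw = true;
}
var_dump($threw); // true

// Array addition is still allowed. It is a union that keeps the left-hand
// value whenever both arrays have the same key.
$union = [1, 2] + [5, 6, 7];
var_dump($union); // [1, 2, 7]
```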

This tidying RFC comes once again from the king of consistency, Nikita Popov.

Stricter magic

Objects in PHP support a number of “magic methods”: methods with a special name that have special behavior in the engine. We already saw __toString() back in part 1 of our series as an example. All magic methods begin with __ to indicate that there’s something special about them. (And if you have a method of your own that begins with a double underscore but doesn’t tie into special engine behavior, you are officially Doing It Wrong(tm).) Most of those methods were added well over a decade ago, however, which means they predate PHP adding widespread type support in method signatures.

That isn’t a huge problem, as long as the code is well-behaved. However, the whole point of typed function signatures is that the language itself will slap your hand if your code isn’t well-behaved so you know to fix it before it causes subtle data-losing bugs. Unfortunately, the language allowed developers to add type declarations to their magic methods … even if those declarations were contrary to what the method was supposed to do. Oops.

PHP 8.0 now optionally allows you to declare the right types in your method signatures and will slap your hand (the technical phrase is “throw a Fatal error”) if you specify the wrong one. For the vast majority of users nothing happens, but it allows those who prefer the language to do their work for them to do so safely.

In particular, the following magic methods now support, and enforce, the following typed signatures:

<?php
Foo::__call(string $name, array $arguments): mixed;
Foo::__callStatic(string $name, array $arguments): mixed;
Foo::__clone(): void;
Foo::__debugInfo(): ?array;
Foo::__get(string $name): mixed;
Foo::__invoke(mixed $arguments): mixed;
Foo::__isset(string $name): bool;
Foo::__serialize(): array;
Foo::__set(string $name, mixed $value): void;
Foo::__set_state(array $properties): object;
Foo::__sleep(): array;
Foo::__unserialize(array $data): void;
Foo::__unset(string $name): void;
Foo::__wakeup(): void;
?>

Although optional, I would recommend including the types in all cases to make code more self-documenting and nitpicky about finding errors for you early on.
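
In practice that might look something like the following. The Settings class is a hypothetical example rather than anything from the RFC, and it needs PHP 8.0 or later for the mixed type.

```php
<?php
// Hypothetical class for illustration only.
final class Settings
{
    private array $values = [];

    public function __get(string $name): mixed
    {
        return $this->values[$name] ?? null;
    }

    public function __set(string $name, mixed $value): void
    {
        $this->values[$name] = $value;
    }

    public function __isset(string $name): bool
    {
        return isset($this->values[$name]);
    }

    public function __unset(string $name): void
    {
        unset($this->values[$name]);
    }

    // Declaring, say, "__isset(string $name): int" instead would be
    // rejected with a fatal error when the class is compiled.
}

$settings = new Settings();
$settings->theme = 'dark';
var_dump($settings->theme);        // string(4) "dark"
var_dump(isset($settings->theme)); // bool(true)
```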

This added type safety is thanks to Gabriel Caruso.

Stricter warnings and errors

PHP has a variety of error levels (Notice, Warning, Error) that it can trigger when something goes wrong, as well as the ability to throw Exceptions or engine Errors. Determining the appropriate severity of a problem is always tricky, especially when some of those options didn’t exist when a given error was first defined.

In PHP 8.0, several errors became stricter. The full list is included in the RFC, but the most notable ones are:

  • Many notices around missing values or using invalid types in weird places are now warnings. That includes reading undefined variables, properties, and array keys.
  • Several warnings around misuse of arrays and traversables are now TypeError exceptions. They are a type problem, and anything the code tries to do after that is guaranteed to be wrong, so that makes sense.
  • Division by zero (who does that???) now throws a DivisionByZeroError instead of a warning.
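
Two of those changes are easy to see for yourself. The sketch below assumes PHP 8.0 or later, and uses a throwaway error handler only to record which severity PHP reports.

```php
<?php
// Record the severity of anything PHP reports so we can inspect it.
$levels = [];
set_error_handler(function (int $errno, string $message) use (&$levels): bool {
    $levels[] = $errno;
    return true; // handled; skip PHP's default reporting
});

$data  = [];
$value = $data['missing']; // E_NOTICE in PHP 7, E_WARNING in PHP 8
restore_error_handler();

var_dump($levels === [E_WARNING]); // true

try {
    $divisor  = 0;
    $quotient = 10 / $divisor; // only a warning in PHP 7
    $divisionThrew = false;
} catch (DivisionByZeroError $e) {
    $divisionThrew = true;
}

var_dump($divisionThrew); // true
```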

You get three guesses who we can thank for this RFC, and the first two guesses of Nikita Popov don’t count.

Many resources changed to objects

Finally, although it wasn’t an RFC, a lot of work has been done behind the scenes to convert resources to objects.

“Resources” in PHP speak are a special data type that dates from before PHP even had objects, so we’re talking about the Bill Clinton presidency here. Resources are kind of like objects, only worse. They can only be implemented by extensions and don’t support most of what objects support. It’s generally acknowledged that they were a bad idea and real objects are a better solution in just about every case, but there are still many very common and very old extensions that expose resources instead of objects to user code. That’s especially true of things like database connections, the file system, and other “external thingies.”

There has been an ongoing effort to convert those resources to be objects, a process that is 99% transparent to user code, with the end goal of removing resources from the language entirely. Much of that conversion was completed in PHP 8.0, albeit mostly for the less used extensions.

The odds of that affecting your code are very small. Likely the only reason you’d be affected is if you are checking if a variable is_resource() and are using the CURL, OpenSSL, Sockets, XML-RPC, ZIP, or ZLIB extensions. If so, in PHP 8.0 that function will now return false instead of true. Otherwise you likely won’t notice.
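
If you do have such a check, one way to make it work on both sides of the upgrade is sketched below. The isCurlHandle() helper is hypothetical, and the demonstration only runs if the curl extension is installed.

```php
<?php
// Hypothetical helper: accepts both the PHP 7 resource and the PHP 8 object.
function isCurlHandle($handle): bool
{
    return (is_resource($handle) && get_resource_type($handle) === 'curl')
        || $handle instanceof CurlHandle;
}

// Assumes the curl extension is available; skipped otherwise.
if (extension_loaded('curl')) {
    $ch = curl_init();
    var_dump(is_resource($ch));  // false on PHP 8.0+, true on PHP 7
    var_dump(isCurlHandle($ch)); // true on both
}
```

In many cases a simpler test is enough: curl_init() returns false on failure, so comparing the return value against false works regardless of version.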

If you have no idea what this is all about, congratulations, you most likely don’t need to care.

Want to understand how PHP works under the hood, and how PHP 8 can make your code faster in just the nick of time? Stay tuned for the next installment.

You can try out pre-release copies of PHP 8.0 today on Platform.sh, with just a one-line change. Give it a whirl, and let us know what your favorite features are.