In our last episode, we discussed PHP 8's new match() expression. Today we look at an edge case feature that will save your edge case.
PHP 7.4 introduced the concept of Weak References, which allow an object to be referenced without incrementing its reference counter. That's a bit obscure and in practice not all that useful in most cases. What we were really waiting for is Weak Maps, which have landed in PHP 8.0.
Weak Maps are a little nerdy to explain, so bear with me. Normally, when you create an object and assign it to a variable, what happens is the object is created in memory and then the variable is created as a reference to it. Think of it as the variable just having the ID of the object, not the object itself. If you assign the same object to another variable, there's still only one object, but now there are two variables holding the object's ID.
<?php
$a = new Foo();
$b = $a;
// $b and $a are now separate variables that
// both point to a Foo object in memory somewhere.
Every time a variable is removed, PHP checks to see if there are any other variables still referencing that object. If there are none, it knows it's safe to delete that object for you. This process is called "garbage collection," and I've greatly over-simplified it here because this is enough for our purposes.
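To watch that cleanup happen, here is a minimal sketch. The Foo class and its static log are hypothetical, added only so the example can record the moment PHP frees the object.

```php
<?php
// Hypothetical class for illustration: it records when PHP destroys it.
class Foo
{
    public static array $log = [];

    public function __destruct()
    {
        self::$log[] = 'destroyed';
    }
}

$a = new Foo();
$b = $a;      // A second variable holding the same object.

unset($a);    // One variable still holds the object, so it stays alive.
$aliveAfterFirstUnset = (Foo::$log === []);

unset($b);    // The last variable is gone, so PHP frees the object.
$destroyedAfterSecondUnset = (Foo::$log === ['destroyed']);

var_dump($aliveAfterFirstUnset, $destroyedAfterSecondUnset);
```

Both values come out true: the destructor only runs once the second variable is removed.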
A Weak Reference, or a Weak Map, is a way of pointing at an object that works like any other, except that when PHP checks to see if any variables still point to the object, those "weak" references don't count. So if there are still three weak references pointing at an object, but no normal variables, PHP will happily delete the object. After that, a Weak Reference hands back null when asked for its object, and a Weak Map simply drops the entry for it.
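A small sketch of that behavior using the WeakReference class from PHP 7.4. The stdClass object is just a placeholder for whatever object you care about.

```php
<?php
$obj = new stdClass();                 // Placeholder object.
$weak = WeakReference::create($obj);   // Does not count as a normal reference.

// While a normal variable still holds the object, get() returns it.
$sameObject = ($weak->get() === $obj);

unset($obj);                           // Remove the only normal reference.

// The object is gone, so the weak reference now yields null.
$afterUnset = $weak->get();

var_dump($sameObject, $afterUnset);
```

Here $sameObject is true and $afterUnset is null.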
It will make more sense to see it in action. Suppose you have a series of Product objects, written by someone else in a library of some kind. You can't modify them, but you're going to use them.
For each Product, you want to track additional information that isn't on the original object, say, a list of Review objects on that product. Subclassing the Product to include reviews is possible but messy, and inheritance often runs into problems when trying to combine multiple modules together.
Instead, we'll make a separate ReviewList object that contains a Weak Map of Review objects, lazy-loading them as needed and tracking them by Product. Once a Product is removed from memory, which happens when all of the variables that reference it have gone out of scope, we don't need to keep those Review objects around. A WeakMap acts as a self-cleaning cache in this case, and it works similarly to ArrayObject.
<?php
class ReviewList
{
    // Maps each Product object to its list of reviews.
    private WeakMap $cache;

    public function __construct()
    {
        $this->cache = new WeakMap();
    }

    public function getReviews(Product $prod): array
    {
        // Return the cached list, or load and cache it on first use.
        return $this->cache[$prod] ??= $this->findReviews($prod->id());
    }

    protected function findReviews(int $prodId): array
    {
        // ...
    }
}
$reviewList = new ReviewList();
$prod1 = getProduct(1);
$prod2 = getProduct(2);
$reviews_p1 = $reviewList->getReviews($prod1);
$reviews_p2 = $reviewList->getReviews($prod2);
// ...
$reviews_p1_again = $reviewList->getReviews($prod1);
unset($prod1);
In this example, ReviewList has an internal "weak cache" keyed off of Product objects. When getReviews() is called, if the desired value is already in the cache it will get returned. If not, it will be loaded into memory, saved in the WeakMap cache, and then returned. (The ??= bit there is null-coalesce-assign, introduced in PHP 7.4, and is the bee's knees for exactly this sort of case.) Later on, when we create $reviews_p1_again, the value will be looked up in the cache instead.
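If you want to convince yourself the second call really hits the cache, here is a runnable sketch. The Product stand-in, the CountingReviewList name, the $lookups counter, and the canned review string are all assumptions made up for this demonstration; a real findReviews() would load from storage.

```php
<?php
// Hypothetical stand-in for the third-party Product class.
final class Product
{
    public function __construct(private int $id) {}

    public function id(): int
    {
        return $this->id;
    }
}

// Same shape as ReviewList, plus a counter so we can see cache hits.
class CountingReviewList
{
    private WeakMap $cache;
    public int $lookups = 0;

    public function __construct()
    {
        $this->cache = new WeakMap();
    }

    public function getReviews(Product $prod): array
    {
        return $this->cache[$prod] ??= $this->findReviews($prod->id());
    }

    protected function findReviews(int $prodId): array
    {
        // Assumption: canned data instead of a real lookup.
        $this->lookups++;
        return ["Review for product $prodId"];
    }
}

$list = new CountingReviewList();
$prod1 = new Product(1);

$first = $list->getReviews($prod1);   // Cache miss: findReviews() runs.
$again = $list->getReviews($prod1);   // Cache hit: findReviews() is skipped.

var_dump($list->lookups, $first === $again);
```

The counter stays at 1 after two calls, and both calls return the same list.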
However, at some point in the future we unset $prod1. Generally that won't be done manually; instead, the variable will go out of scope and get garbage collected. Since there are no more normal references to the product 1 object, the reference to that object in the $cache Weak Map will get removed automatically. That will also cause the corresponding list of Review objects to get cleared automatically, too. The memory is saved and no further work is needed. It "Just Works(tm)."
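The pruning is easy to observe with a bare WeakMap. In this sketch the stdClass objects and the review strings are placeholders standing in for real Product and Review objects.

```php
<?php
$cache = new WeakMap();

$prod1 = new stdClass();   // Placeholder for product 1.
$prod2 = new stdClass();   // Placeholder for product 2.

$cache[$prod1] = ['review A', 'review B'];
$cache[$prod2] = ['review C'];

$before = count($cache);   // Both entries are present.

unset($prod1);             // Last normal reference to product 1 is gone.

$after = count($cache);    // The entry for product 1 was dropped for us.
$stillCached = isset($cache[$prod2]);

var_dump($before, $after, $stillCached);
```

The count goes from 2 to 1 without any explicit cleanup, and the entry for the product that is still in use stays put.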
If you tried to do the same with a normal array, there would be two problems:
- Arrays can't use objects as keys, so it would need to be keyed off of the product ID or similar.
- That means the cache wouldn't know to prune itself when the object it's a cache for gets garbage collected. You might be able to implement some complex logic using destructors, global variables, and other black magic, but ... please don't. The odds of getting it wrong are high, and the level of complexity it introduces isn't worth it.
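Both problems can be seen side by side in a short sketch. The stdClass object with a public id property is an assumption standing in for a Product, and the review strings are made up.

```php
<?php
$prod = new stdClass();   // Placeholder for a Product.
$prod->id = 1;

// Problem 1: PHP 8 rejects an object used as an array key.
$keyRejected = false;
try {
    $byObject = [];
    $byObject[$prod] = ['review A'];
} catch (TypeError $e) {
    $keyRejected = true;
}

// Problem 2: keying by ID works, but the entry outlives the object.
$arrayCache = [];
$arrayCache[$prod->id] = ['review A'];

// The Weak Map equivalent, for comparison.
$weakCache = new WeakMap();
$weakCache[$prod] = ['review A'];

unset($prod);

$arrayEntries = count($arrayCache);   // Stale entry is still there.
$weakEntries = count($weakCache);     // Entry was pruned automatically.

var_dump($keyRejected, $arrayEntries, $weakEntries);
```

The array keeps its one stale entry after the product is gone, while the Weak Map ends up empty.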
Caching scenarios like that are really the only strong use case for WeakMap, but when you need it, it's a big memory and code saver.
Once again, we have Nikita Popov to thank for the WeakMap RFC.
Tune in next week as we look at a new feature in PHP 8.0 that tries to address "the billion dollar mistake" of computing.
You can try out pre-release copies of PHP 8.0 today on Platform.sh, with just a one-line change. Give it a whirl, and let us know what your favorite features are.