I gave a talk at WordCamp Los Angeles 2016 on PHP array functions. This is the companion article that I wrote for it. If you’re just looking for the slides, click here.
As a WordPress or PHP developer, you use arrays all the time. They’re an essential (if not necessary) part of your developer toolbox. But that doesn’t mean that you’re using them to their full potential.
That’s because, when we work with arrays, we also tend to work with loops. And loops are seductive. They let you traverse an array and perform any operation that you wish on each array element.
That said, it’s easy to overuse loops. When that happens, your code becomes hard to read and to test. That’s because loops, while easy to use, can also make your code much more complex.
But, lucky for us, PHP has a wealth of array functions. They’re used by PHP experts to make their life easier and replace a lot of instances where you’d use a loop. This, in turn, makes their code simpler, easier to read and more testable.
Relevant concepts
Before we begin, let’s take a moment to go over a few concepts. These are foundations on which the article rests. They’ll help you understand what we’re doing when we’re replacing loops by array functions.
The array data type
Let’s start with the array data type. Now, we don’t want to look at the concept of arrays just in PHP. Instead, we want to look at them in the larger context of computer science and mathematics.
In that context, arrays are still a data type used to store a collection of values. The array stores each value at a specific location using either an index or a key-value pair. You can then use the index value or the key to retrieve the value back.
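To make the index and key-value distinction concrete, here’s a small sketch in plain PHP. The variable names and values are made up for illustration.

```php
<?php
// An indexed array: PHP assigns integer positions starting at 0.
$post_ids = array(12, 15, 42);

// An associative array: each value is stored under a key that we chose.
$post_titles = array(
    'hello-world' => 'Hello world!',
    'about'       => 'About us',
);

// Retrieve a value back using its index or its key.
$first_post_id = $post_ids[0];
$about_title   = $post_titles['about'];
```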
But arrays aren’t just a thing in computer science. They’re also something that you use in mathematics especially with linear algebra. They just don’t go by the same name.
In fact, they have two names: vector and matrix. A vector is a one-dimensional array and a matrix is a two-dimensional array. You can use both with mathematical functions.
Functional programming
This is where functional programming comes in. A lot of what we’ll see in this article revolves around it. Like object-oriented programming, this isn’t an easy topic to cover. That’s why this will only be a small overview of the topic.
As the name suggests, functional programming revolves around functions. But these aren’t the same functions that you use every day with PHP. Instead, these are functions that are closer to the mathematical ones mentioned earlier.
So what’s so special about these functions anyways!? Well, for us, there are two things that are important. First, these functions always take an input and always generate an output. You can’t have a function without any parameters or that doesn’t return a value.
The second thing is that these functions come in two categories: higher-order and anonymous. Higher-order functions are functions that can accept other functions as arguments. These can be either other higher-order functions or anonymous functions. The array functions that we’ll see today are all higher-order functions.
Meanwhile, anonymous functions only exist to serve these higher-order functions. Like we mentioned earlier, higher-order functions can take them as arguments. But they can also return them as a result. In functional programming, they’re not any different from strings or any other data types.
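If that feels abstract, here’s a rough sketch of both ideas in PHP. The apply_twice and make_multiplier functions are hypothetical helpers written for this illustration. They aren’t part of PHP or WordPress.

```php
<?php
// A higher-order function: it accepts another function as an argument.
function apply_twice($callback, $value)
{
    return call_user_func($callback, call_user_func($callback, $value));
}

// A higher-order function can also return an anonymous function as its result.
function make_multiplier($factor)
{
    return function ($number) use ($factor) {
        return $number * $factor;
    };
}

// $double holds an anonymous function, just like a variable can hold a string.
$double = make_multiplier(2);

// 5 doubled twice: (5 * 2) * 2
$result = apply_twice($double, 5);
```

Here $result ends up as 20. The anonymous function gets passed around like any other value, which is the point that matters for the rest of the article.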
PHP callables
And this brings us to the PHP concept of “callables”. Callables are a PHP data type used to define a function or method that PHP can call. They also go by the name “callback”.
With WordPress, callables tend to take one of two forms: a string (like 'function_name') or an array (like array($object, 'method_name')). That’s because WordPress still has PHP 5.2 as a requirement. That said, there’s a third type of callable that became available with PHP 5.3.
It’s the anonymous functions that we saw earlier. They don’t look any different from regular PHP functions with one exception. They don’t have a name. (That’s why they’re “anonymous”.)
But this is why we did an overview of functional programming. With anonymous functions, it’s now possible to use principles of functional programming with PHP. They allow us to do things that were hard or impossible to do before. (That’s why I can’t wait for WordPress to support them!) But, for this article, we’re only interested in how to use them with array functions.
With all this said, this doesn’t mean that you can’t use these other types of callables in your own code! You should if you have to support PHP 5.2 or if they make more sense to you. It’s just not what we’ll use in the examples ahead. Instead, we’ll use the array function in combination with an anonymous function.
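Here’s a quick sketch of the three callable forms side by side, each one passed to array_map. The Title_Formatter class and its shout method are hypothetical. They only exist to show the array form.

```php
<?php
// Hypothetical class, only here to illustrate the array form of a callable.
class Title_Formatter
{
    public function shout($title)
    {
        return strtoupper($title) . '!';
    }
}

$titles    = array('hello', 'world');
$formatter = new Title_Formatter();

// 1. A string containing a function name.
$from_string = array_map('strtoupper', $titles);

// 2. An array containing an object and a method name.
$from_array = array_map(array($formatter, 'shout'), $titles);

// 3. An anonymous function (PHP 5.3 and later).
$from_closure = array_map(function ($title) {
    return ucfirst($title);
}, $titles);
```

All three calls work the same way from the point of view of array_map. It just receives something that it can call.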
Replacing loops with array functions
Alright! So let’s go back to the problem that we brought up at the beginning of the article. We went over the fact that we use loops a lot with arrays. (After all, what are we going to loop through otherwise!?)
But it doesn’t have to be that way! We’re going to look at common situations where you use a loop. And then we’ll rework them using PHP array functions.
Now, this isn’t the definitive list of scenarios where you can use a PHP array function. You’d be doing yourself a disservice by thinking that. Instead, these scenarios are a starting point. Once you understand how they work, you can explore ways to use them in your own projects.
Validating array values
Have you ever needed to loop through an array to either generate a smaller one or to remove array values? This is pretty common when you’re trying to validate the values inside an array. You could do it by looping through your array like this:
    $posts_with_meta_key = get_posts(array(
        'meta_key' => 'post_meta_key'
    ));
    $posts_with_valid_meta_key = array();

    foreach ($posts_with_meta_key as $post) {
        if (my_plugin_is_meta_key_valid($post->post_meta_key)) {
            $posts_with_valid_meta_key[] = $post;
        }
    }
posts_with_meta_key contains an array of WP_Post objects that we queried using get_posts. The query itself asked for the recent posts that had a post meta key of post_meta_key. That said, we have no idea whether the value stored in post_meta_key is valid or not.
That’s why we have to loop through all the WP_Post objects in posts_with_meta_key. We validate the post_meta_key value of each post by passing it to my_plugin_is_meta_key_valid. If it’s valid, we add it to the posts_with_valid_meta_key array. Once we’re done with our loop, we know that posts_with_valid_meta_key contains all the posts with a valid post_meta_key value.
The array_filter function
So what array function could we use instead of a loop? Well, you could use the array_filter function! It’s an array function that lets you filter (thus the name!) the values inside an array.
array_filter accepts two parameters (three starting with PHP 5.6): an array to filter and an optional callable. It works by iterating through each value in the given array. It then passes each of those values to the given callable.
The role of the callable is to determine whether array_filter should keep the array value or not. If the callable returns true, array_filter keeps the value. Otherwise, it omits the value from the filtered array that it returns at the end.
But isn’t the callable parameter optional? So what happens if you don’t give array_filter a callable argument? Well, it’ll still filter the values in the given array. It’ll omit all the values that evaluate to false in PHP (such as null, 0, an empty string or an empty array). This is a great way to clean up an array of unnecessary values.
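Here’s a small sketch of array_filter without a callable. The values in the array are made up to show which ones PHP treats as false.

```php
<?php
$values = array('apple', '', null, 0, '0', false, array(), 'banana', 42);

// Without a callable, array_filter drops every value that PHP treats as false.
$cleaned = array_filter($values);

// Only 'apple', 'banana' and 42 remain, still stored under their original keys.
```

Keep in mind that 0 and the string '0' get removed as well. If those are legitimate values in your array, you’re better off passing an explicit callable.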
Replacing our loop with array_filter
Let’s go back to our example that validates an array with a loop. What does it look like when we rewrite it to use array_filter instead of a loop? It would look like this:
    $posts_with_meta_key = get_posts(array('meta_key' => 'post_meta_key'));

    $posts_with_valid_meta_key = array_filter($posts_with_meta_key, function(WP_Post $post) {
        return my_plugin_is_meta_key_valid($post->post_meta_key);
    });
posts_with_valid_meta_key now stores the array returned by array_filter. We pass two arguments to array_filter. There’s posts_with_meta_key as the array argument and an anonymous function as the callable argument.
The anonymous function accepts a WP_Post object as a parameter. That’s because posts_with_meta_key is an array of WP_Post objects. It uses that WP_Post object to get the post_meta_key value. It then passes the value to the my_plugin_is_meta_key_valid function like earlier.
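The anonymous function returns the result of the function call to array_filter. It’s how array_filter determines whether to keep or omit the array value. The resulting posts_with_valid_meta_key array contains the same posts as the one generated using a loop. The one difference is that array_filter preserves the original array keys instead of reindexing them.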
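One detail is worth a quick sketch when you swap a loop for array_filter: the values that survive keep the keys they had in the original array. The example below uses a made-up array of word counts instead of WP_Post objects so that it can run outside of WordPress.

```php
<?php
$word_counts = array(120, 950, 40, 1300);

// Keep only the counts above 500. The surviving values keep their original keys.
$long_counts = array_filter($word_counts, function ($count) {
    return $count > 500;
});
// $long_counts is array(1 => 950, 3 => 1300)

// array_values reindexes the result from 0, like the foreach version would.
$reindexed = array_values($long_counts);
// $reindexed is array(950, 1300)
```

This only matters if the rest of your code relies on sequential indexes. If it does, wrapping the call in array_values is one way to get them back.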
Generating a new array using an existing array
Let’s move on to the next scenario involving a loop and an array. Imagine that you have an array. And that, using the values inside this array, you want to create a second array. Using a loop, it could look something like this:
    $recent_posts = get_posts();
    $recent_post_permalinks = array();

    foreach ($recent_posts as $post) {
        $recent_post_permalinks[] = get_post_permalink($post->ID);
    }
This is another example that uses an array of WP_Post objects. recent_posts contains an array of WP_Post objects fetched by get_posts. For this example, get_posts returns the most recent posts from the WordPress database.
The goal of the loop is to populate the second array named recent_post_permalinks. As its name suggests, it contains the permalinks of all the posts in recent_posts. We get each permalink by calling the get_post_permalink function with the post ID.
The array_map function
You can also replace this loop with an array function. This time, we’re going to replace our loop with a call to the array_map function. It’s a function that lets you transform the values of an array using a callable.
array_map accepts at least two parameters: a callable and an array. It iterates through each array value and passes it to the given callable. The callable applies changes to the array value and returns it back to array_map.
    $recent_post_permalinks = array_map('urlencode', $recent_post_permalinks);
Above is a quick example using our existing recent_post_permalinks array. array_map calls urlencode on all the permalinks in recent_post_permalinks. This turns recent_post_permalinks from an array of permalinks into an array of urlencoded permalinks.
Replacing our loop with array_map
There’s more to array_map than what we’ve seen. But it’s enough to let us replace our loop with a call to array_map. This is what our converted loop would look like:
    $recent_posts = get_posts();

    $recent_post_permalinks = array_map(function(WP_Post $post) {
        return get_post_permalink($post->ID);
    }, $recent_posts);
This looks pretty much like the converted loop that we created using the array_filter function. The first argument is an anonymous function which also accepts a WP_Post object as a parameter. The anonymous function itself just makes a call to get_post_permalink using the post ID. It then returns the generated permalink back to array_map.
This might feel different from the urlencode example from earlier. But this is still an array transformation. We’re transforming an array of WP_Post objects into an array of permalinks.
This concept of transforming an array into another array is super powerful. It’s also a great way to determine when to use the array_map function instead of a loop. It’s amazing how often you can use it.
Searching an array
The next scenario that we’ll look at is searching an array. Now, we’re not talking about searching an array for a specific value. You can do this using the array_search function. We’re talking about more complex scenarios like this:
    $recent_posts = get_posts();
    $longest_recent_post = null;

    foreach ($recent_posts as $post) {
        if (!$longest_recent_post instanceof WP_Post
            || str_word_count(strip_tags($post->post_content)) > str_word_count(strip_tags($longest_recent_post->post_content))
        ) {
            $longest_recent_post = $post;
        }
    }
This example starts off pretty much like the one for the array_map function. The way that we populate the recent_posts array is the same. We use the get_posts function to get the most recent posts from the WordPress database. But the similarities end there.
The purpose of the loop in this example is to find the longest post in the array of WP_Post objects. To do that, we need to track which post is the longest. That’s what we’ll use the longest_recent_post variable for. It starts off as null since we have no longest post yet.
To find the longest post, we need to go through each WP_Post object and compare their length. We use two PHP functions to do this: strip_tags and str_word_count. strip_tags removes the HTML and PHP tags from a string. We then pass the resulting string to str_word_count which returns the number of words in it.
Inside the loop, we use an if statement to check for two things. First, we use the instanceof operator to check if longest_recent_post is storing a WP_Post object. If it isn’t, we don’t need to compare the length of the two WP_Post objects. We can just assign the post variable inside the loop to the longest_recent_post variable.
But, if longest_recent_post is storing a WP_Post object, we have to go to our second condition. We need to compare the post_content word count of the two WP_Post objects. This is where we use the strip_tags and str_word_count functions from earlier.
If longest_recent_post has a higher word count, we do nothing. But, if post has a higher word count than longest_recent_post, then it’s our new longest_recent_post. And we reflect this by assigning post to it.
The array_reduce function
As with our other loop scenarios, we can also replace this loop with an array function. The function that we can use here is array_reduce. This is a function that lets you shrink an array to a single value using a callable.
array_reduce accepts three parameters: an array, a callable and an optional initial value. The callable function is a bit more complex with this function. It needs two parameters to work.
The first parameter is what we call the carry. It contains the current single value that array_reduce would return at this point. This is where the optional initial value comes into play. It controls the initial value of the carry.
The second parameter is the current array value. You want to compare it to the carry. You return the one that you want to keep and it becomes the new carry.
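Before moving on, here’s a minimal sketch of how the carry behaves. It uses a made-up array of word counts so that it runs outside of WordPress.

```php
<?php
$word_counts = array(120, 950, 40);

// Sum the values. The third argument (0) is the initial value of the carry.
$total = array_reduce($word_counts, function ($carry, $count) {
    return $carry + $count;
}, 0);

// Without an initial value, the carry starts off as null.
$largest = array_reduce($word_counts, function ($carry, $count) {
    return ($carry === null || $count > $carry) ? $count : $carry;
});

// With an empty array, array_reduce just returns the initial value.
$empty_total = array_reduce(array(), function ($carry, $count) {
    return $carry + $count;
}, 0);
```

The second call shows why a callable often has to handle a null carry. That’s the case whenever you don’t pass an initial value.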
Replacing our loop with array_reduce
It’s ok if the callable doesn’t make too much sense yet. It’ll (I hope!) make more sense once we convert our loop to use array_reduce. Let’s look at that now.
    $recent_posts = get_posts();

    $longest_recent_post = array_reduce($recent_posts, function($longest_post, WP_Post $post) {
        if (!$longest_post instanceof WP_Post
            || str_word_count(strip_tags($post->post_content)) > str_word_count(strip_tags($longest_post->post_content))
        ) {
            return $post;
        }

        return $longest_post;
    });
Here’s our anonymous function for array_reduce. The carry is the longest_post parameter. And the post parameter is the current array value.
Now, we can enforce the type of the post parameter. That’s because we know that recent_posts is an array of WP_Post objects. But we can’t do it for longest_post because the initial carry value is null when we don’t specify one.
This is why our if statement is the same as the one from the loop. We still need to check that longest_post is a WP_Post object. That check only fails for the first array element, but it’s still necessary.
Earlier, the if statement decided whether to assign post to longest_recent_post or not. In our anonymous function, it controls which of the two parameters becomes the carry. If longest_post isn’t a WP_Post object or post has a higher word count than longest_post, we return post. Otherwise, we return longest_post.
Search for a valid array value
Alright, so this might not be a common scenario! We’ll look at a loop that combines two of our previous loop scenarios. It’ll combine the one that validates the values of an array with the one that finds the longest post.
    $posts_with_meta_key = get_posts(array(
        'meta_key' => 'post_meta_key'
    ));
    $longest_post_with_valid_meta_key = null;

    foreach ($posts_with_meta_key as $post) {
        if (my_plugin_is_meta_key_valid($post->post_meta_key)) {
            if (!$longest_post_with_valid_meta_key instanceof WP_Post
                || str_word_count(strip_tags($post->post_content)) > str_word_count(strip_tags($longest_post_with_valid_meta_key->post_content))
            ) {
                $longest_post_with_valid_meta_key = $post;
            }
        }
    }
(Small note: this isn’t the only nor the best way to build this loop. It’s just the one that illustrates the combined conditions the best. Please don’t use it in practice!)
The loop above has a lot in common with the loops from the two scenarios that we saw earlier. We use get_posts to get the recent posts with the post_meta_key post meta key. We also initialize longest_post_with_valid_meta_key with null like we did with longest_recent_post.
The loop itself contains two nested if statements. The first one validates that post has a valid post_meta_key. The second one compares the length of its post_content to that of longest_post_with_valid_meta_key. If it satisfies both conditions, post becomes the new longest_post_with_valid_meta_key.
Combining array functions
The goal with this scenario isn’t to introduce you to a new array function. (Surprise!) It’s to show how we can combine array functions together. This is possible thanks to a concept from mathematics (and by extension functional programming!) called “function composition”.
This concept relies on a functional programming principle. It’s that, in functional programming, a function must accept an input parameter and return a value. Here’s a diagram to illustrate how this concept works:
Let’s imagine that you have a Function A. If you pass it an input value, it’ll process it and output a value. Behind the scenes, Function A is quite large and does quite a few things to generate that output.
What you could do is break down Function A into two functions: Function B and Function C. Function B would take the original Function A input. It would process it and output a value.
Function C would take this intermediary output and use it as its input. The output from Function C would then contain the changes of both functions. It’ll also be the same output as the one from Function A. That’s function composition in a nutshell.
Now, this seems obvious explained like this! But it has powerful implications. It means that you don’t need to create a large function to make multiple changes to a value. You can just compose multiple functions together instead.
This is why, in functional programming, you want to make functions as small as possible. Each function only does one little thing. And, using those small functions, you compose a larger one that applies the effects of each.
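To make this a bit more concrete, here’s a rough sketch of composing two small functions in PHP. The compose function is a hypothetical helper written for this illustration (PHP doesn’t ship with one). It reuses strip_tags and str_word_count, the two functions that we used to measure the length of a post.

```php
<?php
// Hypothetical helper: returns a new function that runs $first, then $second.
function compose($first, $second)
{
    return function ($input) use ($first, $second) {
        // The output of $first becomes the input of $second.
        return call_user_func($second, call_user_func($first, $input));
    };
}

// "Function B" is strip_tags and "Function C" is str_word_count.
$count_words_in_html = compose('strip_tags', 'str_word_count');

$word_count = $count_words_in_html('<p>Hello <strong>functional</strong> world</p>');
```

Here $word_count ends up as 3. Neither of the two small functions knows about the other, yet the composed function applies the effect of both.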
Replacing our loop with function composition
This is what we’ll do here. We’re going to compose a function to replace our loop. It’ll use two array functions that we saw in earlier examples: array_filter and array_reduce.
    $posts_with_meta_key = get_posts(array(
        'meta_key' => 'post_meta_key'
    ));

    $longest_post_with_valid_meta_key = array_reduce(
        array_filter($posts_with_meta_key, function(WP_Post $post) {
            return my_plugin_is_meta_key_valid($post->post_meta_key);
        }),
        function($longest_post, WP_Post $post) {
            if (!$longest_post instanceof WP_Post
                || str_word_count(strip_tags($post->post_content)) > str_word_count(strip_tags($longest_post->post_content))
            ) {
                return $post;
            }

            return $longest_post;
        }
    );
This example follows the same idea as our diagram from earlier. We have two small functions that we combined together into a larger function. This larger composed function reads from the inside out.
That means that the first function is array_filter. array_filter does the same thing as earlier. We pass it the posts_with_meta_key array. And it returns an array with all the WP_Post objects where my_plugin_is_meta_key_valid returned true.
This array of WP_Post objects serves as the input value for the array_reduce function. This is the second function in our larger composed function. It’s also the same as the array_reduce function from earlier. It returns the WP_Post object with the highest post_content word count.
The return value from array_reduce is the longest_post_with_valid_meta_key. This is the same value that we would have gotten using our loop. But instead, we combined array functions together to achieve the same outcome.
Why do this?
The question that you might still have at this point is “Why should I stop using loops for this?” That’s an understandable feeling to have! Loops are familiar to you. This makes them easier to use.
Meanwhile, using PHP array functions (and, by extension, functional programming) feels harder. It’s not as easy to read and understand this type of code. But this isn’t something that’s permanent either. You get better at it with time.
We can’t say the same thing about loops when we overuse them. Doing that brings with it a certain amount of complexity that won’t disappear over time. This is something that matters when you’re getting better as a developer.
You want to write code that’s simple and easy to test. This, in turn, improves your code’s quality and helps prevent technical debt. These are some of the reasons to use PHP array functions in your code whenever you can.