How to use PHP array functions instead of loops

I gave a talk at WordCamp Los Angeles 2016 on PHP array functions. This is the companion article that I wrote for it. If you’re just looking for the slides, click here.

As a WordPress or PHP developer, you use arrays all the time. They’re an essential part of your developer toolbox. But that doesn’t mean that you’re using them to their full potential.

That’s because, when we work with arrays, we also tend to work with loops as well. And loops are seductive. They let you traverse an array and perform any operation that you wish on each array element.

That said, it’s easy to overuse loops. When that happens, your code becomes hard to read and to test. That’s because loops, while easy to use, can also make your code much more complex.

But, lucky for us, PHP has a wealth of array functions. They’re used by PHP experts to make their life easier and replace a lot of instances where you’d use a loop. This, in turn, makes their code simpler, easier to read and more testable.

Relevant concepts

Before we begin, let’s take a moment to go over a few concepts. These are the foundations on which the article rests. They’ll help you understand what we’re doing when we’re replacing loops with array functions.

The array data type

Let’s start with the array data type. Now, we don’t want to look at the concept of arrays just in PHP. Instead, we want to look at them in the larger context of computer science and mathematics.

In that context, arrays are still a data type used to store a collection of values. The array stores each value at a specific location using either an index or a key-value pair. You can then use the index value or the key to retrieve the value back.

But arrays aren’t just a thing in computer science. They’re also something that you use in mathematics, especially in linear algebra. They just don’t go by the same name.

In fact, they have two names: matrix and vector. A vector is a one-dimensional array and a matrix is a two-dimensional array. You can use both with mathematical functions.

Functional programming

This is where functional programming comes in. A lot of what we’ll see in this article revolves around it. Like object-oriented programming, this isn’t an easy topic to cover. That’s why this will only be a small overview of the topic.

As the name suggests, functional programming revolves around functions. But these aren’t the same functions that you use every day with PHP. Instead, these are functions that are closer to the mathematical ones mentioned earlier.

So what’s so special about these functions anyways!? Well, for us, there are two things that are important. First, these functions always take an input and always generate an output. You can’t have a function without any parameters or that doesn’t return a value.

The second thing is that these functions come in two categories: higher-order and anonymous. Higher-order functions are functions that can accept other functions as arguments. These can be either other higher-order functions or anonymous functions. The array functions that we’ll see today are all higher-order functions.

Meanwhile, anonymous functions only exist to serve these higher-order functions. Like we mentioned earlier, higher-order functions can take them as arguments. But they can also return them as a result. In functional programming, they’re not any different from strings or any other data types.

PHP callables

And this brings us to the PHP concept of “callables”. Callables are a PHP data type used to define a function or method that PHP can call. They also go by the name “callback”.

With WordPress, callables tend to take one of two forms: a string (like 'function_name') or an array (like array($object, 'method_name')). That’s because WordPress still has PHP 5.2 as a requirement. That said, there’s a third type of callable that became available with PHP 5.3.

It’s the anonymous functions that we saw earlier. They don’t look any different from regular PHP functions with one exception. They don’t have a name. (That’s why they’re “anonymous”.)
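To make the three forms concrete, here’s a minimal sketch that calls each type of callable with call_user_func. The names my_plugin_double and My_Plugin_Math are hypothetical and only exist for this illustration.

```php
<?php
// Hypothetical function and class, made up for this illustration.
function my_plugin_double($number)
{
    return $number * 2;
}

class My_Plugin_Math
{
    public function triple($number)
    {
        return $number * 3;
    }
}

// 1. A string that holds a function name.
$from_string = call_user_func('my_plugin_double', 4);

// 2. An array that holds an object and a method name.
$math       = new My_Plugin_Math();
$from_array = call_user_func(array($math, 'triple'), 4);

// 3. An anonymous function (PHP 5.3 and later).
$square       = function ($number) {
    return $number * $number;
};
$from_closure = call_user_func($square, 4);
```

All three are interchangeable wherever PHP expects a callable, which is why the array functions ahead accept any of them.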

But this is why we did an overview of functional programming. With anonymous functions, it’s now possible to use principles of functional programming with PHP. They allow us to do things that were hard or impossible to do before. (That’s why I can’t wait for WordPress to support them!) But, for this article, we’re only interested in how to use them with array functions.

With all this said, this doesn’t mean that you can’t use these other types of callables in your own code! You should if you have to support PHP 5.2 or if they make more sense to you. It’s just not what we’ll use in the examples ahead. Instead, we’ll use the array function in combination with an anonymous function.

Replacing loops with array functions

Alright! So let’s go back to the problem that we brought up at the beginning of the article. We went over the fact that we use loops a lot with arrays. (After all, what are we going to loop through otherwise!?)

But it doesn’t have to be that way! We’re going to look at common situations where you use a loop. And then we’ll rework them using PHP array functions.

Now, this isn’t the definitive list of scenarios where you can use a PHP array function. You’d be doing yourself a disservice by thinking that. Instead, these scenarios are a starting point. Once you understand how they work, you can explore ways to use them in your own projects.

Validating array values

Have you ever needed to loop through an array to either generate a smaller one or to remove array values? This is pretty common when you’re trying to validate the values inside an array. You could do it by looping through your array like this:
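Here’s a minimal sketch of such a loop. So that it runs outside WordPress, it makes two assumptions: plain objects stand in for the WP_Post objects that get_posts would return, and the meta value sits on the object itself instead of being read with get_post_meta. The body of my_plugin_is_meta_key_valid is also an assumption, since only its name is known.

```php
<?php
// Assumed validation rule; the real check depends on your plugin.
function my_plugin_is_meta_key_valid($value)
{
    return is_string($value) && $value !== '';
}

// Stand-in for the array of WP_Post objects returned by get_posts().
$posts_with_meta_key = array(
    (object) array('ID' => 1, 'post_meta_key' => 'abc'),
    (object) array('ID' => 2, 'post_meta_key' => ''),
    (object) array('ID' => 3, 'post_meta_key' => 'xyz'),
);

$posts_with_valid_meta_key = array();

foreach ($posts_with_meta_key as $post) {
    // Keep the post only if its meta value passes validation.
    if (my_plugin_is_meta_key_valid($post->post_meta_key)) {
        $posts_with_valid_meta_key[] = $post;
    }
}
```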

posts_with_meta_key contains an array of WP_Post objects that we queried using get_posts. The query itself asked for the recent posts that had a post meta key of post_meta_key. That said, we have no idea whether the value stored in post_meta_key is valid or not.

That’s why we have to loop through all the WP_Post objects in posts_with_meta_key. We validate the post_meta_key value of each post by passing it to my_plugin_is_meta_key_valid. If it’s valid, we add it to the posts_with_valid_meta_key array. Once we’re done with our loop, we know that posts_with_valid_meta_key contains all the posts with a valid post_meta_key value.

The array_filter function

So what array function could we use instead of a loop? Well, you could use the array_filter function! It’s an array function that lets you filter (thus the name!) the values inside an array.

array_filter accepts two parameters (three starting with PHP 5.6): an array to filter and an optional callable. It works by iterating through each value in the given array. It then passes each of those values to the given callable.

The role of the callable is to determine whether array_filter should keep the array value or not. If the callable returns true, array_filter keeps the value. Otherwise, it omits the value from the filtered array that it returns at the end.

But isn’t the callable parameter optional? So what happens if you don’t give array_filter a callable argument? Well, it’ll still filter the values in the given array. It’ll omit all the values that equate to false in PHP. This is a great way to clean up an array of unnecessary values.
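As a quick sketch of that behaviour, here’s array_filter called without a callable on a made-up array of mixed values. Note that the string '0' also counts as false in PHP, and that array_filter keeps the original keys.

```php
<?php
// Sample values made up for this illustration.
$values = array('apple', '', null, 0, 'banana', false, '0', 42);

// Without a callable, array_filter drops every value that equates to false.
$cleaned = array_filter($values);

// The surviving values keep their original keys (0, 4 and 7 here).
// array_values reindexes them if you need consecutive indexes.
$reindexed = array_values($cleaned);
```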

Replacing our loop with array_filter

Let’s go back to our array validation with a loop example. What does it look like when we rewrite it to use array_filter instead of a loop? It would look like this:
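What follows is a minimal sketch rather than production code. As before, it assumes plain objects in place of WP_Post objects (hence the stdClass type hint where you’d write WP_Post in WordPress), a meta value stored on the object instead of a get_post_meta call, and a made-up body for my_plugin_is_meta_key_valid.

```php
<?php
// Assumed validation rule; the real check depends on your plugin.
function my_plugin_is_meta_key_valid($value)
{
    return is_string($value) && $value !== '';
}

// Stand-in for the array of WP_Post objects returned by get_posts().
$posts_with_meta_key = array(
    (object) array('ID' => 1, 'post_meta_key' => 'abc'),
    (object) array('ID' => 2, 'post_meta_key' => ''),
    (object) array('ID' => 3, 'post_meta_key' => 'xyz'),
);

// The anonymous function returns true for the posts that we want to keep.
$posts_with_valid_meta_key = array_filter(
    $posts_with_meta_key,
    function (stdClass $post) {
        return my_plugin_is_meta_key_valid($post->post_meta_key);
    }
);
```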

posts_with_valid_meta_key now stores the array returned by array_filter. We pass two arguments to array_filter. There’s posts_with_meta_key as the array argument and an anonymous function as the callable argument.

The anonymous function accepts a WP_Post object as a parameter. That’s because posts_with_meta_key is an array of WP_Post objects. It uses that WP_Post object to get the post_meta_key value. It then passes the value to the my_plugin_is_meta_key_valid function like earlier.

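The anonymous function returns the result of the function call to array_filter. It’s how it determines whether to keep or omit the array value. The resulting posts_with_valid_meta_key array contains the same posts as the one generated using a loop. The one difference is that array_filter preserves the original array keys, so you can pass the result to array_values if you need consecutive indexes.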

Generating a new array using an existing array

Let’s move on to the next scenario involving a loop and an array. Imagine that you have an array. And that, using the values inside this array, you want to create a second array. Using a loop, it could look something like this:
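Here’s a minimal sketch of that kind of loop. To keep it runnable outside WordPress, it assumes plain objects in place of WP_Post objects and a hypothetical my_plugin_get_permalink function standing in for get_post_permalink.

```php
<?php
// Hypothetical stand-in for get_post_permalink().
function my_plugin_get_permalink($post_id)
{
    return 'https://example.com/?p=' . $post_id;
}

// Stand-in for the array of WP_Post objects returned by get_posts().
$recent_posts = array(
    (object) array('ID' => 10),
    (object) array('ID' => 11),
);

$recent_post_permalinks = array();

foreach ($recent_posts as $post) {
    // Build the second array one permalink at a time.
    $recent_post_permalinks[] = my_plugin_get_permalink($post->ID);
}
```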

This is another example that uses an array of WP_Post objects. recent_posts contains an array of WP_Post objects fetched by get_posts. For this example, get_posts returns the most recent posts from the WordPress database.

The goal of the loop is to populate the second array named recent_post_permalinks. As its name suggests, it contains the permalinks of all the posts in recent_posts. We get each permalink by calling the get_post_permalink function with the post ID.

The array_map function

You can also replace this loop with an array function. This time, we’re going to replace our loop with a call to the array_map function. It’s a function that lets you transform the values of an array using a callable.

array_map accepts at least two parameters: a callable and an array. It iterates through each array value and passes it to the given callable. The callable applies changes to the array value and returns it back to array_map.
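Here’s a small sketch of array_map in action. The permalinks are made-up example.com URLs, and the callable is the built-in urlencode function passed as a string.

```php
<?php
// Made-up permalinks for this illustration.
$recent_post_permalinks = array(
    'https://example.com/?p=10',
    'https://example.com/?p=11',
);

// array_map passes each permalink to urlencode and collects the results.
$recent_post_permalinks = array_map('urlencode', $recent_post_permalinks);
```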

Above is a quick example using our existing recent_post_permalinks array. array_map calls urlencode on all the permalinks in recent_post_permalinks. This turns recent_post_permalinks from an array of permalinks into an array of urlencoded permalinks.

Replacing our loop with array_map

There’s more to array_map than what we’ve seen. But it’s enough to let us replace our loop with a call to array_map. This is what our converted loop would look like:
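This is a minimal sketch under the same assumptions as before: plain objects instead of WP_Post objects (so the type hint is stdClass, not WP_Post) and a hypothetical my_plugin_get_permalink function in place of get_post_permalink.

```php
<?php
// Hypothetical stand-in for get_post_permalink().
function my_plugin_get_permalink($post_id)
{
    return 'https://example.com/?p=' . $post_id;
}

// Stand-in for the array of WP_Post objects returned by get_posts().
$recent_posts = array(
    (object) array('ID' => 10),
    (object) array('ID' => 11),
);

// Transform the array of posts into an array of permalinks.
$recent_post_permalinks = array_map(function (stdClass $post) {
    return my_plugin_get_permalink($post->ID);
}, $recent_posts);
```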

This looks pretty much like the converted loop that we created using the array_filter function. The first argument is an anonymous function which also accepts a WP_Post object as a parameter. The anonymous function itself just makes a call to get_post_permalink using the post ID. It then returns the generated permalink back to array_map.

This might feel different from the urlencode example from earlier. But this is still an array transformation. We’re transforming an array of WP_Post objects into an array of permalinks.

This concept of transforming an array into another array is super powerful. It’s also a great way to determine when to use the array_map function instead of a loop. It’s amazing how often you can use it.

Searching an array

The next scenario that we’ll look at is searching an array. Now, we’re not talking about searching an array for a specific value. You can do this using the array_search function. We’re talking about more complex scenarios like this:
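Here’s a minimal sketch of such a search. It assumes plain objects with made-up post_content values in place of the WP_Post objects that get_posts would return, which is why the instanceof check uses stdClass instead of WP_Post.

```php
<?php
// Stand-in for the array of WP_Post objects returned by get_posts().
$recent_posts = array(
    (object) array('ID' => 1, 'post_content' => '<p>One two three</p>'),
    (object) array('ID' => 2, 'post_content' => '<p>One two three four five</p>'),
    (object) array('ID' => 3, 'post_content' => '<p>One two</p>'),
);

// We have no longest post yet.
$longest_recent_post = null;

foreach ($recent_posts as $post) {
    // Take the post if we have none yet or if it has a higher word count.
    if (!$longest_recent_post instanceof stdClass
        || str_word_count(strip_tags($post->post_content))
            > str_word_count(strip_tags($longest_recent_post->post_content))
    ) {
        $longest_recent_post = $post;
    }
}
```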

This example starts off pretty much like the one for the array_map function. The way that we populate the recent_posts array is the same. We use the get_posts function to get the most recent posts from the WordPress database. But the similarities end there.

The purpose of the loop in this example is to find the longest post in the array of WP_Post objects. To do that, we need to track which post is the longest. That’s what we’ll use the longest_recent_post variable for. It starts off as null since we have no longest post yet.

To find the longest post, we need to go through each WP_Post object and compare their length. We use two PHP functions to do this: strip_tags and str_word_count. strip_tags removes the HTML and PHP tags from a string. We then pass the resulting string to str_word_count which returns the number of words in it.

Inside the loop, we use an if statement to check for two things. First, we use the instanceof operator to check if longest_recent_post is storing a WP_Post object. If it isn’t, we don’t need to compare the length of the two WP_Post objects. We can just assign the post variable inside the loop to the longest_recent_post variable.

But, if longest_recent_post is storing a WP_Post object, we have to go to our second condition. We need to compare the post_content word count of the two WP_Post objects. This is where we use the strip_tags and str_word_count functions from earlier.

If longest_recent_post has a higher word count, we do nothing. But, if post has a higher word count than longest_recent_post, then it’s our new longest_recent_post. And we reflect this by assigning post to it.

The array_reduce function

As with our other loop scenarios, we can also replace this loop with an array function. The function that we can use here is array_reduce. This is a function that lets you shrink an array to a single value using a callable.

array_reduce accepts three parameters: an array, a callable and an optional initial value. The callable function is a bit more complex with this function. It needs two parameters to work.

The first parameter is what we call the carry. It contains the current single value that array_reduce would return at this point. This is where the optional initial value comes into play. It controls the initial value of the carry.

The second parameter is the current array value. You want to compare it to the carry. You return the one that you want to keep and it becomes the new carry.
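Before we get to posts, here’s a tiny sketch of the carry at work on a made-up array of numbers. No initial value is given, so the carry starts as null and the callable has to handle that case.

```php
<?php
// Made-up numbers for this illustration.
$numbers = array(3, 9, 4);

$largest = array_reduce($numbers, function ($carry, $number) {
    // On the first call the carry is null, so the current value wins.
    if ($carry === null || $number > $carry) {
        return $number;
    }

    // Otherwise the existing carry stays the value to keep.
    return $carry;
});
```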

Replacing our loop with array_reduce

It’s ok if the callable doesn’t make too much sense yet. It’ll (I hope!) make more sense once we convert our loop to use array_reduce. Let’s look at that now.
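The following is a minimal sketch, not a definitive implementation. It assumes plain objects with made-up post_content values in place of WP_Post objects, so stdClass appears wherever you’d write WP_Post in WordPress.

```php
<?php
// Stand-in for the array of WP_Post objects returned by get_posts().
$recent_posts = array(
    (object) array('ID' => 1, 'post_content' => '<p>One two three</p>'),
    (object) array('ID' => 2, 'post_content' => '<p>One two three four five</p>'),
    (object) array('ID' => 3, 'post_content' => '<p>One two</p>'),
);

// No initial value is passed, so $longest_post is null on the first call.
$longest_recent_post = array_reduce(
    $recent_posts,
    function ($longest_post, stdClass $post) {
        if (!$longest_post instanceof stdClass
            || str_word_count(strip_tags($post->post_content))
                > str_word_count(strip_tags($longest_post->post_content))
        ) {
            return $post;
        }

        return $longest_post;
    }
);
```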

Here’s our anonymous function for array_reduce. The carry is the longest_post parameter. And the post parameter is the current array value.

Now, we can enforce the type of the post parameter. That’s because we know that recent_posts is an array of WP_Post objects. But we can’t do it for longest_post because the initial carry value is null when we don’t specify one.

This is why our if statement is the same as the one from the loop. We still need to check that longest_post is a WP_Post object. It’ll only apply to the first array element, but it’s still necessary.

Earlier, the if statement decided whether to assign post to longest_recent_post or not. In our anonymous function, it controls which of the two parameters becomes the carry. If longest_post isn’t a WP_Post object or post has a higher word count than longest_post, we return post. Otherwise, we return longest_post.

Search for a valid array value

Alright, so this might not be a common scenario! We’ll look at a loop that combines two of our previous loop scenarios. It’ll combine the one that validates the values of an array with the one that finds the longest post.

(Small note: this is neither the only nor the best way to build this loop. It’s just the one that illustrates the combined conditions the best. Please don’t use it in practice!)
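Here’s a minimal sketch of the combined loop. It assumes plain objects in place of WP_Post objects, a meta value stored on the object instead of a get_post_meta call, and a made-up body for my_plugin_is_meta_key_valid.

```php
<?php
// Assumed validation rule; the real check depends on your plugin.
function my_plugin_is_meta_key_valid($value)
{
    return is_string($value) && $value !== '';
}

// Stand-in for the array of WP_Post objects returned by get_posts().
$posts_with_meta_key = array(
    (object) array('ID' => 1, 'post_meta_key' => 'abc', 'post_content' => '<p>One two</p>'),
    (object) array('ID' => 2, 'post_meta_key' => '', 'post_content' => '<p>One two three four five</p>'),
    (object) array('ID' => 3, 'post_meta_key' => 'xyz', 'post_content' => '<p>One two three</p>'),
);

$longest_post_with_valid_meta_key = null;

foreach ($posts_with_meta_key as $post) {
    // First condition: the meta value has to be valid.
    if (my_plugin_is_meta_key_valid($post->post_meta_key)) {
        // Second condition: no longest post yet, or a higher word count.
        if (!$longest_post_with_valid_meta_key instanceof stdClass
            || str_word_count(strip_tags($post->post_content))
                > str_word_count(strip_tags($longest_post_with_valid_meta_key->post_content))
        ) {
            $longest_post_with_valid_meta_key = $post;
        }
    }
}
```

With this sample data, post 2 has the most words but fails validation, so the loop ends with post 3.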

The loop above has a lot in common with the loops from the two scenarios that we saw earlier. We use get_posts to get the recent posts with the post_meta_key post meta key. We also initialize longest_post_with_valid_meta_key with null like we did with longest_recent_post.

The loop itself contains two nested if statements. The first one validates that post has a valid post_meta_key. The second one compares the length of its post_content to that of longest_post_with_valid_meta_key. If it satisfies both conditions, post becomes the new longest_post_with_valid_meta_key.

Combining array functions

The goal with this scenario isn’t to introduce you to a new array function. (Surprise!) It’s to show how we can combine array functions together. This is possible thanks to a concept from mathematics (and by extension functional programming!) called “function composition”.

This concept relies on a functional programming principle. It’s that, in functional programming, a function must accept an input parameter and return a value. Here’s a diagram to illustrate how this concept works:

[Diagram: function composition]

Let’s imagine that you have a Function A. If you pass it an input value, it’ll process it and output a value. Behind the scenes, Function A is quite large and does quite a few things to generate that output.

What you could do is break down Function A into two functions: Function B and Function C. Function B would take the original Function A input. It would process it and output a value.

Function C would take this intermediary output and use it as its input. The output from Function C would then contain the changes of both functions. It’ll also be the same output as the one from Function A. That’s function composition in a nutshell.

Now, this seems obvious explained like this! But it has powerful implications. It means that you don’t need to create a large function to make multiple changes to a value. You can just compose multiple functions together instead.

This is why, in functional programming, you want to make functions as small as possible. Each function only does one little thing. And, using those small functions, you compose a larger one that applies the effects of each.
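To make this concrete, here’s a minimal sketch of composing two small functions into a larger one. The my_plugin_compose helper is hypothetical (PHP has no built-in compose function). Function B is strip_tags, Function C is str_word_count, and the composed result plays the role of Function A.

```php
<?php
// Hypothetical helper: returns a function that runs $function_b first,
// then feeds its output to $function_c.
function my_plugin_compose($function_b, $function_c)
{
    return function ($input) use ($function_b, $function_c) {
        return call_user_func($function_c, call_user_func($function_b, $input));
    };
}

// "Function A": count the words in a string that contains HTML.
$count_words = my_plugin_compose('strip_tags', 'str_word_count');

$word_count = $count_words('<p>Hello <strong>functional</strong> world</p>');
```

Each piece stays small and does one thing, yet the composed function applies the effects of both.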

Replacing our loop with function composition

This is what we’ll do here. We’re going to compose a function to replace our loop. It’ll use two array functions that we saw in earlier examples: array_filter and array_reduce.
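Here’s a minimal sketch of that composition. As with the loop version, it assumes plain objects in place of WP_Post objects (so the type hints are stdClass, not WP_Post), a meta value stored on the object instead of a get_post_meta call, and a made-up body for my_plugin_is_meta_key_valid.

```php
<?php
// Assumed validation rule; the real check depends on your plugin.
function my_plugin_is_meta_key_valid($value)
{
    return is_string($value) && $value !== '';
}

// Stand-in for the array of WP_Post objects returned by get_posts().
$posts_with_meta_key = array(
    (object) array('ID' => 1, 'post_meta_key' => 'abc', 'post_content' => '<p>One two</p>'),
    (object) array('ID' => 2, 'post_meta_key' => '', 'post_content' => '<p>One two three four five</p>'),
    (object) array('ID' => 3, 'post_meta_key' => 'xyz', 'post_content' => '<p>One two three</p>'),
);

$longest_post_with_valid_meta_key = array_reduce(
    // Inner function: keep only the posts with a valid meta value.
    array_filter($posts_with_meta_key, function (stdClass $post) {
        return my_plugin_is_meta_key_valid($post->post_meta_key);
    }),
    // Outer function: reduce the filtered posts to the longest one.
    function ($longest_post, stdClass $post) {
        if (!$longest_post instanceof stdClass
            || str_word_count(strip_tags($post->post_content))
                > str_word_count(strip_tags($longest_post->post_content))
        ) {
            return $post;
        }

        return $longest_post;
    }
);
```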

This example follows the same idea as our diagram from earlier. We have two small functions that we combined together into a larger function. This larger composed function reads from the inside out.

That means that the first function is array_filter. array_filter does the same thing as earlier. We pass it the posts_with_meta_key array. And it returns an array with all the WP_Post objects where my_plugin_is_meta_key_valid returned true.

This array of WP_Post objects serves as the input value for the array_reduce function. This is the second function in our larger composed function. It’s also the same as the array_reduce function from earlier. It returns the WP_Post object with the highest post_content word count.

The return value from array_reduce is the longest_post_with_valid_meta_key. This is the same value that we would have gotten using our loop. But instead, we combined array functions together to achieve the same outcome.

Why do this?

The question that you might still have at this point is “Why should I stop using loops for this?” That’s an understandable feeling to have! Loops are familiar to you. This makes them easier to use.

Meanwhile, using PHP array functions (and, by extension, functional programming) feels harder. It’s not as easy to read and understand this type of code. But this isn’t something that’s permanent either. You get better at it with time.

We can’t say the same thing about loops when we overuse them. Doing that brings with it a certain amount of complexity that won’t disappear over time. This is something that matters when you’re getting better as a developer.

You want to write code that’s simple and easy to test. This, in turn, improves your code’s quality and helps prevent technical debt. These are some of the reasons to use PHP array functions in your code whenever you can.

Slides

  • Hey Carl,

    Love your slides! 🙂

    Part of the reason why people have trouble understanding the point of using these array_* functions is that the syntax is a) counter-intuitive and b) horrible when stacked.

    Please consider adding a follow-up article showing how to use these same concepts with collections and a chained, fluent syntax.

    So, instead of having:

    you’ll have something like this:

    The more steps you add, the more obvious the advantage of such a syntax becomes. With the existing array_* PHP functions, every step is split around the beginning and the end of your code, like middleware. With a fluent interface, you can practically read it in the exact way it will be processed.

    • Yep, I love that idea Alain. I’m sure you saw this book go by:
      https://adamwathan.me/refactoring-to-collections/

      It’s on my to read list. It talks about a collection class like you describe. I agree it would be fantastic and also easier to use than this.

      • I already own that one and went through most of it. I assume that you personally will not find that much new stuff in the book, but it is presented in a very nice way and is a recommended read for anyone interested in the subject.

        I just find the title somewhat misleading. Although it uses the Laravel Collection for most of the hard work, it is first and foremost about using higher-order functions with a fluent interface. The actual benefits of using a Collection abstraction are not really put forward; it is just the way the map() and filter() methods are accessed.

  • Web factory Ltd

    Nice article!
    For those who will (I assume) ask – the complexity of both the loop approach and the array functions is the same. O(n) in most cases. Hence the speed is, in theory, the same as well. In practice it depends how PHP implements some language specifics. I haven’t tested that but would definitely like to see some results. Especially on real-world smaller arrays that we use in WP.

    • I’m curious too! That said, I’m not sure if it matters too much in practice. Most of the time, these arrays don’t contain that many values. It’s the logic that’s complicated.

        Performance is not really the main reason for using array functions over loops, and in most simple scenarios, the loops will be a bit faster as they don’t incur two additional function calls per element.

        That being said, on more complex examples, the array functions allow for memory optimizations that cannot easily be done with the normal loops. You’ll be able to never have the entire set of elements in memory at once, which is only possible for normal loops using a detour over custom classes and iterators, with these iterators being an additional way of “looping” over your code, in a pure OOP fashion.

        In most normal cases, you’ll want to prefer self-documenting and manageable code over code that is 5% more performant.