It’s not too far-fetched to see WordPress as a library. You write a post and publish it. Meanwhile, WordPress classifies it and puts it away on its shelves.
But how do you find your post again after WordPress shelved it away? You need someone to help you navigate this huge library and find what you’re looking for. You need a librarian!
That’s the job of the WP_Query
class in a nutshell. It’s the librarian of the WordPress database. You talk to it when you want to search through the WordPress database. It’ll help you find the information that you need!
It does this without requiring that you know how WordPress stores that information. That means that you don’t have to know how to write MySQL queries. WP_Query
takes care of all that for you. (Yay!) It’ll transform your search request into a safe MySQL query and process it for you.
But how does the WP_Query
class do this? That’s what we’re going to look at today. We’re going to explore how it works and what it can do for you. This will help you leverage it to the maximum in the future.
What problem does WP_Query solve?
MySQL queries are a bit of a tricky beast. On one hand, they let you do pretty much anything that you want. On the other hand, in the hands of an inexperienced developer, they’re a serious security risk.
The truth is that a lot of us don’t write code that touches MySQL that often. Even with the tools that WordPress gives us, it’s always a bit of a risk. That’s why WordPress put the WP_Query
class at our disposal.
Gone is the need to create MySQL queries! You can just fill an array with query arguments instead. WordPress takes care of all the MySQL messiness for you and just returns the posts that you want. That’s pretty sweet.
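Here's a minimal sketch of what that array can look like. The helper name and the 'news' category slug are made up for illustration. The array keys are standard WP_Query arguments.

```php
<?php
// Hypothetical helper: returns query arguments for the five most recent
// published posts in a given category slug.
function example_recent_posts_args(string $category_slug): array
{
    return [
        'post_type'      => 'post',
        'post_status'    => 'publish',
        'category_name'  => $category_slug,
        'posts_per_page' => 5,
        'orderby'        => 'date',
        'order'          => 'DESC',
    ];
}

// WP_Query only exists once WordPress has loaded, hence the guard.
if (class_exists('WP_Query')) {
    $query = new WP_Query(example_recent_posts_args('news'));
    // $query->posts now holds the matching WP_Post objects.
}
```

There isn't a single line of SQL in there. Everything that you want goes into the array.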
The life cycle of WP_Query
So what happens when WP_Query
needs to fetch posts from the
database? How does it turn an array of query arguments into an array of WP_Post
objects? This isn’t a documented process so we’re left to ourselves to figure out what’s going on.
To help us with that, we’re going to break down the process into distinct steps (shown above). These attempt to explain what’s going on inside WP_Query
during that process. Let’s take a look at them.
Initializing WP_Query
Before WP_Query can do anything, it first needs to initialize itself. It does this using two methods: init and init_query_flags. init is the main method that WP_Query calls when it wants to initialize itself. It resets all its internal variables to their initial values.
init then calls init_query_flags. This is a secondary initialization method that focuses only on resetting the query flags. This just means setting them all to false.
So why does WP_Query
do things this way? Why doesn’t it just do it in its constructor? It’s because WordPress lets you reuse the same WP_Query
object to do as many queries as you want. Because of this, it needs these methods to live outside the constructor.
Extracting the WordPress query arguments
Once initialized, WP_Query needs a query to parse. To WP_Query, a query is an array of query arguments. It’s what you use instead of MySQL. It needs this array before it can do anything.
That said, it also accepts the query in the form of a string. The string needs to follow the same format as URL query strings (e.g. year=2012&monthnum=12&day=12). If this happens, WP_Query will just convert the string to an array of query arguments using wp_parse_args.
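To get a feel for that conversion, here's a small sketch that uses PHP's native parse_str as a stand-in. It assumes that a string query ends up being parsed the same way a URL query string would be, so treat it as an approximation of what wp_parse_args gives you.

```php
<?php
// Stand-in for what happens to a string query: PHP's parse_str() turns a
// URL-style query string into an associative array.
$query_string = 'year=2012&monthnum=12&day=12';
parse_str($query_string, $query_args);

// $query_args now has the keys "year", "monthnum" and "day".
print_r($query_args);
```

Note that the values come out as strings. Validating them is the job of the next step.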
These query arguments then get assigned to two internal variables: query and query_vars. There’s no difference between the two nowadays. In the past, query would store the string version of the WordPress query. Meanwhile, query_vars would contain the array version.
Converting the query arguments into query flags
Once WP_Query
has an array of query arguments, it needs to convert them into something it can use. That’s the query flags that we discussed earlier. This conversion to query flags happens in the parse_query
method.
The name of the method is a bit misleading. There’s no parsing that happens in this method anymore. The method name is just another artifact from when WP_Query
used a string instead of an array.
So what happens in parse_query? Well first, it’ll validate all the inputs in the query_vars array. For example, if you set a year, it’ll ensure that it’s a non-negative integer and so on.
Once parse_query has validated the query arguments array, it can start the conversion process. That conversion process is nothing more than dozens of if, elseif and else statements. It uses those conditional statements to inspect all the validated query arguments. It’s during that inspection that it sets the appropriate query flags.
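To picture that conversion, here's a heavily simplified imitation of it. The function is hypothetical and only looks at four query arguments and five flags. The real parse_query method deals with far more of both.

```php
<?php
// Hypothetical, heavily simplified imitation of the conversion step.
// The real parse_query handles far more arguments and flags.
function example_flags_from_args(array $args): array
{
    $flags = [
        'is_single' => false,
        'is_page'   => false,
        'is_search' => false,
        'is_year'   => false,
        'is_date'   => false,
    ];

    // Validation: force the year into a non-negative integer.
    $year = isset($args['year']) ? abs((int) $args['year']) : 0;

    // Conversion: inspect the arguments and set the matching flags.
    if (!empty($args['p'])) {
        $flags['is_single'] = true;
    } elseif (!empty($args['page_id'])) {
        $flags['is_page'] = true;
    } else {
        if (isset($args['s'])) {
            $flags['is_search'] = true;
        }
        if ($year > 0) {
            $flags['is_year'] = true;
            $flags['is_date'] = true;
        }
    }

    return $flags;
}

$flags = example_flags_from_args(['year' => '2012']);
```

With only a year in the query arguments, the sketch flags the query as a yearly date archive and leaves every other flag set to false.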
Turning query arguments into a MySQL query
Alright, so parse_query
has converted query arguments into query flags. That said, these query flags won’t get posts for us. We still need a MySQL query to fetch them.
Creating this MySQL query is the main job of the get_posts
method. The method is a lot like parse_query
. It’s just a huge set of conditional statements.
SELECT $found_rows $distinct $fields FROM $wpdb->posts $join WHERE 1=1 $where $groupby $orderby $limits
These conditional statements have a single purpose. That’s to populate all the variables that make up the MySQL query (shown above). To achieve that goal, get_posts
needs over 1,200 lines of code. (Wowzers!)
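Here's a rough sketch of how those variables come together. The function, the wp_posts table name and the clause values are assumptions made for illustration. It isn't the code that get_posts uses.

```php
<?php
// Hypothetical sketch: assembles a request string from clause variables,
// following the query template from the article. Not the core implementation.
function example_build_request(string $table, array $clauses): string
{
    $defaults = [
        'found_rows' => '',
        'distinct'   => '',
        'fields'     => "{$table}.*",
        'join'       => '',
        'where'      => '',
        'groupby'    => '',
        'orderby'    => '',
        'limits'     => '',
    ];
    $c = array_merge($defaults, $clauses);

    $sql = "SELECT {$c['found_rows']} {$c['distinct']} {$c['fields']} FROM {$table} {$c['join']} WHERE 1=1 {$c['where']} {$c['groupby']} {$c['orderby']} {$c['limits']}";

    // Collapse the gaps left behind by empty clauses.
    return trim(preg_replace('/\s+/', ' ', $sql));
}

$request = example_build_request('wp_posts', [
    'where'   => "AND wp_posts.post_type = 'post' AND wp_posts.post_status = 'publish'",
    'orderby' => 'ORDER BY wp_posts.post_date DESC',
    'limits'  => 'LIMIT 0, 5',
]);

echo $request;
```

Inside WordPress, you can inspect the finished SQL of a query through the request property of the WP_Query object.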
Because of its size, it doesn’t make a lot of sense to go over get_posts
in great detail. (It wouldn’t be that useful for us anyways.) That said, there are a few things going on that are worth highlighting.
Subqueries
Right now, our focus has been on the WP_Query
class itself. That said, it’s worth mentioning that there are other WP_*_Query
classes. get_posts
uses a few of them to generate parts of the MySQL query.
The first one is WP_Meta_Query. This is the query that get_posts uses to handle custom field parameters. It uses it to generate part of the join, where and orderby variables.
The next one is WP_Date_Query. get_posts relies on it whenever you use date parameters in your query arguments. Using it, get_posts creates some of the SQL in the where variable.
The last one is WP_Tax_Query. get_posts uses it whenever you use the tax_query query parameter. It relies on it to generate some of the SQL for the join and where variables. It’s also worth noting that all category and tag query parameters get merged into the tax_query argument. So they use WP_Tax_Query as well.
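Here's a sketch of query arguments that would involve all three classes. The array keys are WP_Query arguments. The values ('featured', 'genre', 'jazz' and the date) and the helper name are made up for illustration.

```php
<?php
// Hypothetical helper: the keys are WP_Query arguments, while the values
// ('featured', 'genre', 'jazz', the date) are made up for illustration.
function example_subquery_args(): array
{
    return [
        'post_type'  => 'post',
        // Handled through WP_Meta_Query (custom fields).
        'meta_query' => [
            [
                'key'     => 'featured',
                'value'   => '1',
                'compare' => '=',
            ],
        ],
        // Handled through WP_Date_Query.
        'date_query' => [
            [
                'after'     => '2012-01-01',
                'inclusive' => true,
            ],
        ],
        // Handled through WP_Tax_Query.
        'tax_query'  => [
            [
                'taxonomy' => 'genre',
                'field'    => 'slug',
                'terms'    => ['jazz'],
            ],
        ],
    ];
}

// WP_Query only exists once WordPress has loaded, hence the guard.
if (class_exists('WP_Query')) {
    $query = new WP_Query(example_subquery_args());
}
```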
Actions and Filters
get_posts has two action hooks that you can use: pre_get_posts and posts_selection. It also has countless filters that you can use. (Seriously, there are a lot!) These filters let you make changes to the MySQL query that get_posts is generating. And their large number means that you can make those changes with surgical precision.
But let’s not forget about those two actions that we mentioned at first. posts_selection
doesn’t seem to get any use at all. The documentation says that it’s for caching plugins, but they don’t seem to use it now. (Maybe they did at some point?)
Meanwhile, pre_get_posts
is the complete opposite. It’s a super useful action hook. So much so that we’re going to take a small break to look at it!
The pre_get_posts hook
pre_get_posts is one of the most powerful hooks in all of WordPress. It lets you change the WP_Query object before the get_posts method starts generating the MySQL query. It does that by passing you a reference to the WP_Query object.
This means that all the changes that you make are permanent. There’s no need to use a global variable or return value. You’re always modifying the WP_Query
object that get_posts
is going to use. (Scary!)
It also means that you can do some serious damage if you’re not careful. WordPress isn’t going to do any validation to see if the changes that you made are safe. It’ll just ensure that all the necessary query arguments are there using fill_query_vars.
A small pre_get_posts example
On the flip side, you can do some pretty cool stuff with this hook. A common example is hiding a post category. You can do that by adding the category ID to the category__not_in
query argument.
function remove_uncategorized_category(WP_Query $query)
{
    $query->query_vars['category__not_in'][] = 1;
}
add_action('pre_get_posts', 'remove_uncategorized_category');
This hides the default Uncategorized category which always has the ID of 1. It’s worth noting that WP_Query always initializes the category__not_in query argument as an array. That’s why we don’t have to do any validation before adding the category ID to the array.
But there’s a problem with our function. Can you spot it? Our posts are also hidden in the WordPress administration panel. (Oops!) Let’s make a small change to fix that.
function remove_uncategorized_category(WP_Query $query)
{
    if (!is_admin()) {
        $query->query_vars['category__not_in'][] = 1;
    }
}
add_action('pre_get_posts', 'remove_uncategorized_category');
We just added an is_admin check. That way you can still see the posts when you’re in the WordPress administration panel. And that’s why you need to be careful when you use the pre_get_posts hook!
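You can take that caution one step further. The sketch below is a variant of the example above. It assumes that you only want to change the main query on the front end, so it also checks is_main_query and goes through the get and set methods of WP_Query. The function name is made up.

```php
<?php
// Variant of the example above. Assumption: we only want to touch the main
// query on the front end, so secondary queries keep the category.
function example_hide_uncategorized_from_main_query(WP_Query $query): void
{
    if (is_admin() || !$query->is_main_query()) {
        return;
    }

    $excluded = $query->get('category__not_in');
    if (!is_array($excluded)) {
        $excluded = [];
    }
    $excluded[] = 1; // ID of the default "Uncategorized" category.

    $query->set('category__not_in', $excluded);
}

// add_action only exists once WordPress has loaded, hence the guard.
if (function_exists('add_action')) {
    add_action('pre_get_posts', 'example_hide_uncategorized_from_main_query');
}
```

The early return keeps widgets, menus and other secondary queries untouched, which is often what you want when you hide a category from the blog listing.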
Generating WP_Post objects
The last step of the process is converting our database results into WP_Post
objects. This happens twice at the tail end of the get_posts
method. In both cases, the conversion uses this block of code:
$this->posts = array_map( 'get_post', $this->posts );
posts
is a WP_Query
internal variable. It contains the result of the MySQL query that get_posts
generated. This result is an array containing either more arrays or stdClass
objects. Neither of these are WP_Post
objects.
That’s where array_map comes in. It’ll take the array inside posts and pass each element to get_post. get_post will then convert each of these array elements to a WP_Post object.
Once array_map finishes going through the array, it’ll only contain WP_Post objects. It’s a simple trick, but it highlights the power of PHP’s array functions. One small line of code to convert all your posts to WP_Post objects!
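You can try the same trick outside of WordPress. In this sketch, ExamplePost and example_get_post are hypothetical stand-ins for WP_Post and get_post, and the rows are made-up data.

```php
<?php
// Hypothetical stand-ins: ExamplePost plays the role of WP_Post and
// example_get_post() the role of get_post().
final class ExamplePost
{
    public int $ID;
    public string $post_title;

    public function __construct(object $row)
    {
        $this->ID         = (int) $row->ID;
        $this->post_title = (string) $row->post_title;
    }
}

function example_get_post(object $row): ExamplePost
{
    return new ExamplePost($row);
}

// Raw database rows, as plain stdClass objects.
$posts = [
    (object) ['ID' => '1', 'post_title' => 'Hello world!'],
    (object) ['ID' => '2', 'post_title' => 'Second post'],
];

// One pass converts every row into a post object.
$posts = array_map('example_get_post', $posts);
```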
WP_Query and “The Loop”
Now that you understand how WP_Query fetches posts from the database, we have to look at another important aspect of the WP_Query class. And that’s its relationship with “The Loop”.
In fact, for a lot of people (and maybe you too!), “The Loop” is what they associate with WP_Query. They just can’t think of one without the other. After all, “The Loop” is pretty much the foundation of WordPress. It’s how most of us interact with posts. So how does WP_Query manage the loop?
The history of “The Loop”
Before we get into the inner workings of “The Loop”, let’s go over some of its history. The idea of a loop that goes through every WordPress post isn’t new. It’s been around since the time of b2.
Before the introduction of the WP_Query
class, “The Loop” looked like this:
<?php if ($posts) : foreach ($posts as $post) : start_wp(); ?>
It was just a foreach loop that would go through the posts variable. That was a global variable that WordPress would use to store all the posts that it fetched from the database. It wasn’t any different from the posts variable that we saw earlier inside WP_Query.
But where did WordPress fetch posts before WP_Query? Well, WordPress would create and execute a MySQL query in wp-blog-header.php. It would then store the result from that query in posts. This would happen only once per page load. There was no way to run the WordPress query again afterwards.
Now, let’s go back to our foreach
loop. As it loops through the posts
variable, it creates a post
variable in the global scope. start_wp
would then use that global variable to display the current post.
There was a serious drawback to how “The Loop” worked before WP_Query. You couldn’t use “The Loop” more than once. That’s because WordPress stored everything in global variables.
The arrival of the WP_Query
class changed that. Instead of relying on global variables, WordPress would store the query results in it. The query generation code also moved from wp-blog-header.php
to WP_Query
. These changes made it possible to use WP_Query
to create more than one loop in your code.
How does WP_Query manage “The Loop”?
So that was the origin of the relationship between WP_Query
and “The Loop”. We’ve also seen how WP_Query
generates a MySQL query to fetch posts from the database. There’s only one piece of the puzzle left. It’s to look at what’s going on in WP_Query
when it’s going through “The Loop”.
The internal variables
From WP_Query’s perspective, “The Loop” is just a bunch of internal variables. These variables are: current_post, in_the_loop, post and post_count. All that WP_Query does is manage them as it goes through “The Loop”.
Out of those four variables, three are important to the inner workings of “The Loop”. current_post
stores the index value of the current post in the posts
array. post
is the WP_Post
object at the current_post
index. post_count
tracks the total number of posts in the posts
array.
Meanwhile, in_the_loop
is just a flag that tracks whether WP_Query
is in “The Loop”. WP_Query
doesn’t even use it. It’s there to give you an easy way to find out what the status of “The Loop” is.
Checking if we have posts in “The Loop”
Before looping through the posts that WP_Query fetched, you need to know if it even fetched any. That’s part of the job of the have_posts method. It returns true or false depending on whether there are still posts in “The Loop” or not.
It replaces the if ($posts) from the old loop. Instead of that if statement, have_posts compares the current_post variable to the post_count variable. It’s looking to see what would happen if it incremented current_post. Would it still be smaller than post_count?
As we saw in the previous section, current_post is the index of the current post in the posts array. We don’t want current_post to be greater than or equal to post_count. That would mean that current_post points to a non-existent array element.
Resetting “The Loop”
When that happens, have_posts
does the other part of its job. It resets “The Loop” using the rewind_posts
method. It’s a small method that changes some of the internal variables that manage “The Loop”.
The method resets the current_post index to -1. This tells WP_Query that the loop hasn’t started yet. It also changes the post variable so that it contains the post at the beginning of the posts array. That’s because the post stored in post needs to match the one that current_post points to.
Once rewind_posts finishes resetting “The Loop”, have_posts sets the in_the_loop flag to false. This is the last step in the reset process of “The Loop”.
Looping through the posts
In the old loop, you’d cycle through all the posts using foreach ($posts as $post). The new loop replaces that foreach loop with the the_post method. This is the method that does the actual looping part of “The Loop”.
Whenever you call the_post, it’ll always start by setting the in_the_loop flag to true. It’ll also check if current_post is set to -1. If it is, it’ll call the loop_start hook.
Once that’s done, the_post will call the next_post method. This is a small method that increments the current_post index by one. It then fetches the post at that index in the posts array and assigns it to the internal post variable. It finishes up by returning the post to the the_post method.
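To tie have_posts, rewind_posts, the_post and next_post together, here's a stripped-down imitation of the loop state. ExampleLoop is a hypothetical class that only mirrors the variables and methods discussed so far. It leaves out the hooks and the global variables on purpose.

```php
<?php
// Hypothetical, stripped-down imitation of the loop state that WP_Query
// keeps. Hooks and global variables are left out on purpose.
final class ExampleLoop
{
    public array $posts;
    public int $post_count;
    public int $current_post = -1;
    public bool $in_the_loop = false;
    public $post = null;

    public function __construct(array $posts)
    {
        $this->posts      = array_values($posts);
        $this->post_count = count($this->posts);
    }

    public function have_posts(): bool
    {
        // Would the next index still point to an existing element?
        if ($this->current_post + 1 < $this->post_count) {
            return true;
        }

        // Out of posts: rewind so that the loop can run again.
        $this->rewind_posts();
        $this->in_the_loop = false;

        return false;
    }

    public function rewind_posts(): void
    {
        $this->current_post = -1;
        if ($this->post_count > 0) {
            $this->post = $this->posts[0];
        }
    }

    public function the_post(): void
    {
        $this->in_the_loop = true;
        $this->next_post();
    }

    public function next_post()
    {
        $this->current_post++;
        $this->post = $this->posts[$this->current_post];

        return $this->post;
    }
}

$loop   = new ExampleLoop(['first', 'second', 'third']);
$titles = [];

while ($loop->have_posts()) {
    $loop->the_post();
    $titles[] = $loop->post;
}
```

Once the while loop ends, the index is back at -1 and the in_the_loop flag is false again. That's why you can run the same loop a second time without any extra work.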
Setting up the post data
Once it has a post, the_post
has one last thing to do. It needs to set up all the data from the WP_Post
object into global variables that WordPress will use. This is the job that the start_wp
function handled in the old loop. Now, it’s the_post
that handles it by calling the setup_postdata
method.
The setup_postdata method does almost the same thing as the old start_wp function. The big difference is that you have to pass it a post to set up. start_wp would only use the post global variable from the foreach ($posts as $post).
So what does setup_postdata
do? Well first, it needs to ensure that you passed it a WP_Post
object. If you didn’t, setup_postdata
will try to convert it into one. If that doesn’t work, the method doesn’t do anything. (Bummer)
Once it has a WP_Post object, setup_postdata starts extracting global variables from it. These are:
- id
- authordata
- currentday
- currentmonth
- page
- pages
- multipage
- more
- numpages
These global variables are remnants of some of the oldest code in core. Some, like id, are pretty easy to figure out, but others are not. That’s why we’re going to take a moment to go over them.
Pagination global variables
The primary use of these global variables is pagination. WordPress needs a lot of them to handle it. So which ones are they?
To begin, you have page which stores the current page number. Unlike the rest of the global variables, page comes from the query parameter with the same name. It doesn’t come from the WP_Post object. That’s logical since there’s no reason to store the current page number in a post.
Next, you have pages which is an array that contains the content of the post, but split by page. setup_postdata creates the array by exploding the content using <!--nextpage--> as the delimiter. It then counts how many elements are in pages and stores that in numpages.
The last pagination global variable is multipage. It’s a flag that can be either true or false. multipage is set to true whenever numpages is greater than one. This alerts you that the post needs to use pagination.
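Those three variables are easy to reproduce. This sketch uses made-up post content and plain local variables instead of globals.

```php
<?php
// Sketch of the pagination variables, using made-up post content.
$content = "Page one.<!--nextpage-->Page two.<!--nextpage-->Page three.";

// Split the content by page, then derive the other two variables from it.
$pages     = explode('<!--nextpage-->', $content);
$numpages  = count($pages);
$multipage = $numpages > 1;
```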
Controlling the more tag
The more
global variable is the one that’s least documented and hardest to understand. It’s a flag that tells WordPress whether to respect the more tag or not. That’s the <!--more-->
that you add to your post when you want to only have a teaser on the homepage.
By default, more
has a value of 0
. This tells WordPress to respect the more tag in your post. setup_postdata
will change that value to 1
in specific scenarios. These are when:
- is_page is true. (You’re viewing a page.)
- is_single is true. (You’re viewing a post.)
- is_feed is true. (You’re on a feed.)
- numpages is greater than 1 and page is greater than 1. (You’re not on page 1 of a multipage post.)
If you think about it, those scenarios make sense. You only want to truncate the content for a teaser on the home page. You don’t want that when you’re viewing a post, a page or a feed.
What about the other global variables?
The rest of the global variables are pretty straightforward. You have id which is the ID of the post that setup_postdata is setting up. currentday and currentmonth are the day and month the author published that post. setup_postdata formats the two using mysql2date.
You also have authordata
. It stores the result of the call to get_userdata
. Unless there’s an error, this will always be the WP_User
object of the post author.
Managing post comments
WP_Query
doesn’t just manage posts. It also manages the comments of a post. WP_Query
stores them in the comments
internal variable as an array. And, in most cases, this array will stay empty. WP_Query
doesn’t fetch comments by default.
It needs a comment feed
WP_Query itself will only fetch comments in a specific situation. That’s when the is_comment_feed flag is true. When that happens, get_posts will run a separate query to fetch the comments of a post. It’ll then store the result of that query in the comments variable.
There’s also the comment template
There’s one other situation where WordPress will fetch comments for WP_Query
. That’s inside the comments_template
function. This is the function that a theme calls when it wants to load the comments template for a post.
Like we mentioned earlier, WordPress doesn’t fetch the comments of a post by default. That means that, when you call comments_template
, the comments
array is still empty. That’s a bit of an issue when your job is to load the template to display these comments.
But don’t you worry, comments_template
is on the case! It fetches all comments using a comment query. It then stores the result of the query inside WP_Query
. This lets the comments template use the comment loop.
The comment loop
“Comment loop?”, you say. Why yes, there’s also a comment loop! WP_Query is the class in charge of managing it.
The code for the comment loop is like a leaner version of the code for “The Loop”. It only uses three internal variables: comment, comment_count and current_comment. WP_Query uses them the same way it does with their post counterparts.
The methods are also the same as “The Loop”. You just replace “post” with “comment”. The result is that the comment loop looks the same as “The Loop”.
That said, the comment loop doesn’t see that much use. That’s because theme designers have the option to use another more convenient function. That’s wp_list_comments.
How does WP_Query handle nested loops?
As we’ve seen throughout this article, WP_Query uses a LOT of global variables. But what happens when you want to nest a loop inside another? How does it manage all these global variables? The trick is an object-oriented feature called “encapsulation”.
Encapsulation to the rescue!
That’s because, if you dig down, every WordPress query is an instance of WP_Query
. That’s true even for the main WordPress query. WordPress stores it in the wp_the_query
global variable.
With encapsulation, WordPress can ensure that every query result stays safe. Each WP_Query object will always store the result of its own query inside itself. And you can access it as long as that instance of WP_Query still exists.
And it gets better! Encapsulation also ensures each WP_Query
object has its own loop. Like the query result, each WP_Query
object stores the state of its own loop. You can go back to it at any time as long as PHP didn’t destroy that instance of the WP_Query
object.
What’s really happening
With this in mind, let’s go back to our initial question, “How does WP_Query handle nested loops?” Well, the reality is that there isn’t any nesting happening per se. It was all a clever trick by encapsulation! (The rascal!)
The fact is that each WP_Query
instance encapsulates its own loop. It never contains anything other than its own loop. So what does the WP_Query
class do when you’re “nesting” loops?
Well, it still needs to manage all the global variables for the post and query. Those are the variables that WordPress functions use. If we don’t replace them, they won’t refer to the correct post or query.
For example, let’s take the have_posts
function. It calls the WP_Query
method from the instance stored in the wp_query
global variable. That means that we need wp_query
to always store the current query that we’re using.
So that’s what nesting loops comes down to. WordPress needs to replace global variables whenever you swap from one query to another. This isn’t as complicated as it sounds.
Managing global variables
The main global variable replacement scenario is for post global variables. So let’s say that you switch from one WP_Query
instance to another. You need the_title
to output the title of the current post in the query that you just swapped to and not the old one.
The reset_postdata
method in the WP_Query
class handles this scenario. It takes the post stored inside the post
variable and restores its global variables. It does that by setting the post
variable as the new post
global variable. It then calls setup_postdata
so that it can restore the rest of the global variables.
WordPress also offers the wp_reset_postdata
function. This function also calls the reset_postdata
method. It calls it on the WP_Query
instance stored in the wp_query
global variable.
wp_query is an important global variable for WordPress. This is where it stores what it considers to be the current query. All WordPress loop functions refer to it when they need access to a WP_Query instance. That’s why WordPress also needs a function to reset the wp_query global variable.
That’s the job of the wp_reset_query
function. It replaces the current query with the main WordPress query. As we saw earlier, WordPress stores that query in the wp_the_query
global variable.
The function itself just replaces the wp_query instance with the wp_the_query one. Once it does that, it calls wp_reset_postdata. This resets all the global variables so that they point to the current post in the main WordPress query.
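To make this concrete, here's a sketch of a secondary loop that you could run from inside the main loop. The function name and the idea of listing three posts from a category are made up for illustration. The WordPress calls themselves (WP_Query, get_the_title, esc_html and wp_reset_postdata) are the standard ones.

```php
<?php
// Hypothetical helper: prints the titles of three posts from a category,
// then hands the post global variables back to the current query.
function example_render_related_titles(int $category_id): void
{
    $related = new WP_Query([
        'cat'            => $category_id,
        'posts_per_page' => 3,
    ]);

    if ($related->have_posts()) {
        echo '<ul>';
        while ($related->have_posts()) {
            // Overwrites the post global variables with this query's post.
            $related->the_post();
            echo '<li>' . esc_html(get_the_title()) . '</li>';
        }
        echo '</ul>';
    }

    // Restores the global variables for the post of the current query.
    wp_reset_postdata();
}
```

The secondary query never touches the wp_query global variable. That's why wp_reset_postdata is enough here and there's no need for wp_reset_query.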
Your personal WordPress librarian
The WP_Query
class has now been around for over a decade. That’s a long time! And even after all this time, it’s still the preferred way to access posts stored in WordPress.
That’s because creating MySQL queries isn’t for everyone. So it’s handy to have a personal librarian to help you find posts. But the WP_Query
class isn’t without its complexities.
There are a lot of global variables at play. And things can get messy when you try to nest queries. That’s why it’s a good idea to know how it works, and how it ties back to the inner workings of WordPress and “The Loop”.