April 28 2023

Why you shouldn't use WP Rocket due to the last-modified bug

We have discovered a serious bug in WP Rocket that could significantly harm your site's SEO.

WP Rocket is one of the most widely used WordPress plugins for improving website performance. According to the figures published on its website, the plugin is currently used by over 3 million WordPress installations.

However, considering its wide circulation, its multiple license plans, and the fact that WP Rocket is one of the most coveted and most pirated WordPress plugins ever, we estimate that it is actually running on at least 12 million WordPress sites worldwide.

The plugin offers many useful features for optimizing site caching and improving page load times. It stores cached pages as files on disk and exposes many configuration options for tuning site performance.

Although WP Rocket is a very popular caching solution compared with its competitors, it has a serious bug that could affect the ranking and indexing of your site. The bug affects the "Last-Modified" header, which search engines use to determine whether a page has been modified recently. If this header is not set correctly, search engines may not index your site correctly, which could lead to a decrease in organic traffic.

It is therefore important to take this bug into account when evaluating WP Rocket for your online business. In this article, we take a detailed look at the bug and the consequences it could have for your website. If you want to get the most out of your online business and make sure your site ranks well in search engines, you should avoid using WP Rocket until this bug is permanently fixed.

An overview of the Last-Modified HTTP header.

The "Last-Modified" HTTP header is a header field used in client-server communication to indicate the date and time a web resource was last modified. The server sends it to the client as part of the HTTP response.

Its main purpose is to help clients determine if a web resource has changed since it was last accessed, so that they can avoid downloading the same resource again if it hasn't changed.

In this way, the “Last-Modified” field helps to reduce network traffic and improve website performance. When a client requests a resource again, it can include an “If-Modified-Since” header field in its request, containing the Last-Modified date of the copy it already holds. If the resource has not been modified since that date, the server responds with a “304 Not Modified” status and does not transfer the resource again.
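The exchange described above can be sketched in a few lines. This is a minimal Python model of the server-side decision, not the code of any particular web server; the function name `respond` and the example dates are assumptions chosen for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET on a resource last changed at `last_modified`.

    `last_modified` must be a timezone-aware datetime in UTC.
    """
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # an unparseable date is ignored, as if the header were absent
        if since is not None:
            if since.tzinfo is None:
                # Treat a date without zone information as UTC.
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                # The client's copy is still current: no body needs to be sent.
                return 304, headers

    # No usable validator, or the resource changed after the client's copy.
    return 200, headers


# Hypothetical post last edited on 24 December 2022.
modified = datetime(2022, 12, 24, 10, 0, 0, tzinfo=timezone.utc)
print(respond(modified))
print(respond(modified, "Sat, 24 Dec 2022 10:00:00 GMT"))
```

The whole mechanism depends on `last_modified` reflecting the real content change. If the server reports a fresher date than the one the client holds, the comparison fails and the full resource is sent again with a 200 status.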

Last Modified and Google crawlers.

As described above, the "Last-Modified" header tells clients when a web resource was last modified. If it is configured incorrectly and always returns the current date, regardless of when the resource actually changed, there can be unintended consequences for Google's crawlers.

In particular, if the "Last-Modified" header always returns the current date and time, Google crawlers could misinterpret the resource as a new page, even if it hasn't actually been modified. This could lead to a decrease in website performance in terms of indexing and positioning in search results.

Also, if the "Last-Modified" field is set incorrectly, the server may send a "200 OK" response even though the resource hasn't been modified since it was last accessed. This could lead to an increase in network traffic and a decrease in website performance, as Google's crawlers could download the same resource again, even though it has not actually been modified.

Incorrect Last-Modified and Google Bot Crawl Budget.

Google's crawl budget represents the amount of time and resources that the search engine is willing to devote to crawling and indexing a website. This budget depends on many factors, including the quality of the website, how often its content is updated, and how fast it responds.

A Last-Modified HTTP header that always reports a fresh date can be detrimental to your crawl budget, because it can cause the search engine to spend resources crawling pages that haven't actually changed. In particular, if the Last-Modified header is updated every time the cache is regenerated, even if the content hasn't changed, Google could interpret the resource as a new page and spend unnecessary resources crawling it.

This would lead to ineffective use of Google's crawl budget, which could be better spent on crawling other, more relevant and up-to-date resources. It could also slow down the indexing of the website's most important resources.

To avoid this problem, it is important to ensure that the Last-Modified HTTP header is only updated when the resource has actually changed. In this way, it is possible to guarantee correct management of the Google crawl budget and improve the overall indexing of the website.

WP Rocket and the Last-Modified bug in detail.

When WP Rocket caches WordPress posts and pages to disk, it replaces the Last-Modified date of the article with the date on which the cache was generated or regenerated. This means that every time the cache is regenerated, WP Rocket updates the Last-Modified field, even if the post or page has not been modified or updated in years.

Specifically, WP Rocket reads the modification time of the cache file it produced on disk and serves that date as the Last-Modified HTTP header. The offending code is the following:

[Image: WP Rocket code that sets the Last-Modified header]

As we can see from the above code, WP Rocket builds the Last-Modified header by formatting the modification time of the cache file on disk, obtained via the PHP filemtime() function.

PHP's filemtime() is a native function that returns the time a file was last modified, as a Unix timestamp. It is very useful for checking whether a file has changed since a previous version and for deciding when a file cache needs refreshing.

This bug may cause some problems for search engines and other crawlers that use the Last-Modified field to determine if a resource has been modified since it was last accessed. If the Last-Modified field is updated every time the cache is regenerated, even if the post or page hasn't been modified, then crawlers could misinterpret the resource as a new page, even if it hasn't actually been modified.
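To make the effect concrete, here is a small Python simulation of the behavior described above. It is not WP Rocket's code: `os.path.getmtime()` stands in for PHP's filemtime(), and the file name, the dates, and the helper names are assumptions chosen for illustration.

```python
import os
import tempfile
from datetime import datetime, timezone
from email.utils import format_datetime


def header_from_cache_file(path):
    """Build a Last-Modified value from the cache file's modification time."""
    mtime = datetime.fromtimestamp(int(os.path.getmtime(path)), tz=timezone.utc)
    return format_datetime(mtime, usegmt=True)


def demo():
    # Hypothetical post, last edited on 24 December 2022 ...
    post_modified = datetime(2022, 12, 24, 10, 0, 0, tzinfo=timezone.utc)
    # ... whose cache file is regenerated months later without any content change.
    regenerated = datetime(2023, 4, 28, 8, 0, 0, tzinfo=timezone.utc)

    with tempfile.TemporaryDirectory() as cache_dir:
        path = os.path.join(cache_dir, "index.html")
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("<p>Merry Christmas!</p>")
        # Simulate the regeneration: same content, newer file timestamp.
        stamp = regenerated.timestamp()
        os.utime(path, (stamp, stamp))
        return format_datetime(post_modified, usegmt=True), header_from_cache_file(path)


expected, served = demo()
print("Post last modified:", expected)
print("Header served:     ", served)
```

The two values differ even though the post was never edited, because the file timestamp records when the cache was written, not when the content changed.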

To avoid the Last-Modified bug, WP Rocket should have set the HTTP header for its cache files from the last modified date of the post or page, retrieved directly from the WordPress database. This way, the Last-Modified field would change only when the resource was actually modified, ensuring proper indexing by search engines.

To implement this solution, WP Rocket should have used the WordPress “get_post_modified_time()” function to retrieve the last modified date of the post or page. This function returns the date and time the post or page was last modified, which can be used to properly update the Last-Modified field in the HTTP header.
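The idea can be modeled as follows. This is a hedged Python sketch of the logic only, not a patch for WP Rocket: the `Post` class and the `build_cache_entry` helper are hypothetical, and `modified_gmt` stands in for the value that get_post_modified_time() would return in WordPress.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.utils import format_datetime


@dataclass
class Post:
    """Hypothetical stand-in for a WordPress post record."""
    post_id: int
    modified_gmt: datetime  # when the content was last edited, in UTC


def build_cache_entry(post, html, generated_at):
    """Create a cache entry whose Last-Modified comes from the post, not the cache."""
    return {
        "body": html,
        # Kept for housekeeping only; it never reaches the HTTP headers.
        "generated_at": generated_at,
        "headers": {
            "Last-Modified": format_datetime(
                post.modified_gmt.replace(microsecond=0), usegmt=True
            )
        },
    }


post = Post(post_id=1, modified_gmt=datetime(2022, 12, 24, 10, 0, 0, tzinfo=timezone.utc))
html = "<p>Merry Christmas!</p>"

# Two cache generations, months apart, for the same unedited post.
first = build_cache_entry(post, html, datetime(2022, 12, 24, 10, 5, tzinfo=timezone.utc))
second = build_cache_entry(post, html, datetime(2023, 4, 28, 8, 0, tzinfo=timezone.utc))
print(first["headers"] == second["headers"])
```

With this approach the header stays the same across cache regenerations and changes only when the post's own modification date changes.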

We have officially reported the bug to WP Rocket.

The error we discovered is very serious, especially because it does not look like a simple oversight by the developers; rather, it reveals a lack of understanding of the fundamentals of what the Last-Modified HTTP header is and what it is for. Developing a plugin intended for millions, even tens of millions, of websites without a firm grasp of the basic logic of header management is a very serious matter.

We therefore reported the matter directly to WP Rocket, including in our report a hypothetical example: a post published on December 24th to wish readers a merry Christmas would be served with a Last-Modified header later than December 24th itself, even though the post has never been edited.

They have ignored us for now and instead of thanking us …

It may seem absurd, but in the face of such a serious problem reported to technical support, they merely informed us that they could not respond to our request because our WP Rocket license had expired.

This is not a joke; it is exactly what they replied in the email we reproduce below.

[Image: WP Rocket's reply email to the Last-Modified bug report]

One wonders whether a company showing such carelessness should continue to enjoy the full trust of its customers when, unaware of the seriousness of its own behavior, it is causing real damage to their sites without knowing it, or at least without wanting to know.

For now we have followed up on our request, calling things by their proper names and specifying that we are not the ones asking them for support; we are the ones offering it to them, free of charge, given their lack of knowledge on the subject.

We are hoping for a fix as soon as possible. In the meantime, if you are already using WP Rocket, we advise you to monitor Google's crawl statistics for your site and measure any negative effects.
