Table of contents of the article:
In the world of database management, MySQL Query Cache has played a crucial role for years. This feature was intended as a way to improve database performance by storing query results for reuse later. However, with the arrival of MySQL 8, the Query Cache was finally abandoned. But why was a seemingly useful feature removed? In this article, we will explore the reasons for this decision in detail.
What is MySQL Query Cache?
The MySQL Query Cache, First introduced with MySQL version 4.0 in 2002, it was intended as a dedicated portion of memory within the MySQL daemon to store query results. Its operation was based on a relatively simple but effective principle: if the same query had been executed repeatedly, the Query Cache would have made it possible to avoid reprocessing that query by directly providing the previously stored result. This feature was particularly useful during the years when hardware resources, such as CPUs and disks, were considered high-value assets and were not as easily accessible as they are today. At that time, optimizing the use of available resources was fundamental, and the Query Cache represented an effective solution to reduce the workload on systems.
Over the years, with the architectural evolution of CPUs, programming languages, servers, the advent of the Cloud and new technologies and paradigms, the meaning of the original MySQL Query Cache has increasingly lost its original intent, coming to be very often a con and not a pro, both in the DBMS design phase by the companies that have developed MySQL over time, from MySQL AB to Sun Microsystem to the current Oracle, to the point of generating load and speed towards developers and systems engineers who placed their trust in this now obsolete and old feature.
Starting from MySQL 5.7 launched in 2015, the Query Cache was declared officially deprecated and was officially removed as a feature starting from MySQL 8 launched in April 2018.
We can therefore say that for at least 10 years the use of MySQL's integrated Query Cache has begun to be criticized and considered not very useful, and that since at least 2018, Oracle has decided to remove this feature.
Reasons to Disable the MySQL Query Cache
1. Exact Query Matching
The MySQL Query Cache stores query results using a hashed version of the queries. This approach means that even small differences between two queries result in a different hash, compromising the effectiveness of caching for slightly different queries. For example, a query like
SELECT id, field1, field2 FROM table; is considered different from
SELECT * FROM table; even if the selected fields are the same and in the correct order.
Adding comments in queries also generates a different hashed version. With the use of modern dynamic ORMs and APMs, comments in queries are often employed to trace the process and include hints such as timestamps and hostnames. Adding a timestamp and hostname for process tracking provides a specific benefit, but makes the query unique, effectively rendering the Query Cache useless.
Interestingly, MariaDB 5.5, MariaDB 10.x, and MySQL 5.x versions of Percona include a feature that removes comments from queries when they are stored in the Query Cache. This functionality can be enabled via the named system variable
2. Non-deterministic features
One of the most significant limitations of the MySQL Query Cache lies in its inability to handle non-deterministic elements. The concept of “non-deterministic” refers to those elements in a query that can produce different results each time they are executed. Common examples include the use of temporary tables, user variables, and functions such as
In the case of non-deterministic functions like
RAND(), which generates a random number, or
NOW(), which returns the current instant, the Query Cache turns out to be ineffective. The reason is simple: the result of a query containing such functions changes with each execution, making it unnecessary to memorize the previous result. This aspect is particularly relevant in many modern applications, where the use of functions based on time or randomly generated data is frequent.
For example, imagine a query that uses
NOW() to select records up to the current time. If this query is run at different times, the results will vary based on the current time, making a previously stored result in the Query Cache useless. Furthermore, many modern applications and frameworks rely heavily on these functions to determine data visibility or to generate dynamic content, which makes Query Cache impractical and often unused.
The Query Cache's inability to adequately handle these non-deterministic elements severely limits its use in modern database architectures, where flexibility and the ability to handle dynamic, ever-changing data are essential.
3. Cache Invalidation for Insert, Update, Delete
One of the most significant problems with the MySQL Query Cache is related to its management of INSERT, UPDATE and DELETE operations. Whenever one of these events occurs, the Query Cache automatically invalidates the stored results that are related to the tables involved. This mechanism is designed to prevent the provision of stale data, ensuring that the database always returns the most up-to-date data.
However, this strategy brings with it a number of inefficiencies. In a dynamic database environment, where insert, update, and delete operations are frequent, the Query Cache may find itself invalidating and recreating the cache at a high frequency. This process not only consumes valuable system resources but also drastically reduces the benefits that caching should bring.
For example, let's consider a table that is used frequently for transactions or data updates. Any modification to this table – even the most minimal – will result in the invalidation of all previously cached query results that include this table. In scenarios where tables are subject to frequent updates, this behavior makes the Query Cache almost useless, since the cached data is continuously invalidated and rarely used.
Furthermore, this invalidation mechanism is not selective; it does not discriminate between small insignificant changes and large data updates. As a result, even a single insert or small change can cause a large portion of the cache to be invalidated, increasing the workload on the server and decreasing overall performance.
In summary, the approach taken by the MySQL Query Cache to manage data updates turns out to be a double-edged sword: While it ensures data freshness, it simultaneously undermines the efficiency and effectiveness of caching, especially in environments with high-frequency database operations.
4. Table-Based Caching
MySQL Query Cache is based on a table-oriented caching principle. This means that whenever a table undergoes a modification, all cache entries associated with that table are invalidated. While this strategy may seem reasonable to maintain data integrity, it has significant drawbacks in terms of caching efficiency, especially in environments with dynamic and frequently updated databases.
In a real-world context, database tables are often subject to continuous changes, both small and large. These may include routine updates, new data entries, or deletions. With table-based caching, even the smallest change, such as updating a single field, can trigger the invalidation of a large amount of cached data that is related to that table.
This mechanism becomes problematic, in particular, for complex queries involving multiple tables. For example, a join query that joins multiple tables will be affected by cache invalidation if even one of the tables involved undergoes a change. As a result, the most expensive queries that would most require the benefits of caching are often those most penalized by this approach.
Furthermore, in scenarios where some tables are subject to frequent changes, the Query Cache loses much of its usefulness. Continuous invalidation and regeneration of the cache not only consumes server resources but also slows down the overall response time of the database, since queries must be re-executed rather than retrieving results from the cache.
Finally, this approach lacks granularity. It does not distinguish between tables that rarely change and those that are constantly changing. Therefore, table-based caching is not optimized for use in complex and dynamic environments, resulting in a negative impact on overall database performance.
5. No ability to specify which queries to cache.
One of the most obvious weaknesses of the MySQL Query Cache is its inflexible nature, described as an “all or nothing” approach. This means that database administrators and developers do not have the ability to customize or influence which queries are cached and for how long. This rigidity can lead to significant inefficiencies and complications in cache management, especially in complex and differentiated databases.
In an ideal database environment, it would be useful to be able to specify which tables or schemas should be cached, or even which specific types of queries would benefit most from caching. However, with MySQL Query Cache, this granularity of control is not possible. The system is designed to cache queries indiscriminately, without distinguishing between queries that benefit from caching and those for which caching might even be counterproductive.
For example, in a database that contains both rarely changing tables and frequently updated tables, it would be logical to want to cache only queries related to the former. However, MySQL's Query Cache does not offer this flexibility. As a result, valuable resources are wasted caching queries from dynamic tables, whose cache entries will almost immediately be invalidated, while static tables, which could benefit more from caching, do not receive preferential treatment.
Additionally, the lack of control over cache lifetime means that data may remain in the cache for less than optimal periods. In some cases, this could lead to stale data being served, while in others, queries could be reloaded too frequently, wasting computing resources.
The rigidity of the Query Cache therefore proves to be an obstacle in situations where a more refined and targeted management of the cache could significantly improve the overall performance of the database. This limitation represents a clear disadvantage in an era where customization and resource optimization are critical to operational efficiency.
6. Locking operations on the Query Cache
A critical aspect of MySQL's Query Cache is its locking mechanism. Whenever a query is cached or the cache needs to be invalidated (for example, following an insert, update, or delete operation), the system imposes an exclusive lock on the entire cache. This means that while these operations are being processed, no other process can access the Query Cache.
In environments with a high volume of concurrent operations, this can become a serious bottleneck. When multiple threads attempt to access the cache at the same time, significant slowdowns can occur because each thread must wait its turn to access the cache. This problem is exacerbated in systems with a large cache volume, where invalidation and update operations can take longer.
7. Single Processor Oriented Design
The MySQL Query Cache is a product of an era when computer architecture was dominated by single-processor servers. This historical background is crucial to understanding its current limitations. At the time of its design, concerns about multithreading and managing concurrency between multiple cores were virtually non-existent. Single-processor servers handled operations sequentially, and the Query Cache was optimized for this type of linear processing.
However, as technology has evolved and the introduction of multicore processors, the dynamics of server computing have changed dramatically. Modern multicore processors can run multiple threads in parallel, significantly increasing efficiency and the ability to handle high workloads. This evolution has led to a fundamental shift in performance expectations for database systems.
Despite these developments, MySQL's Query Cache has not been adequately adapted to take advantage of the capabilities of multicore processors. Its original architecture, based on a sequential processing model, is not optimized to handle concurrent operations. This becomes especially problematic in high-concurrency environments, where multiple threads try to access or modify the cache simultaneously. Instead of benefiting from the parallelism offered by multicore processors, the Query Cache becomes a bottleneck, since its locking and synchronization mechanisms are not adequate to effectively handle concurrency.
The direct consequence of this limitation is an underutilization of the computing resources available in modern servers. Rather than distributing the workload efficiently across cores, Query Cache tends to limit operations to a single thread, slowing down overall processing and decreasing the responsiveness of the database system. This not only negatively impacts overall performance but also limits the scalability of database systems, a critical factor in an age where processing speed and the ability to handle large volumes of data are paramount.
8. Single Server Oriented Design
MySQL Query Cache, in its original design, was optimized for environments where the database resided on a single server. This scenario had all reading, writing, and caching happening on-premise, which was ideal for the time. In such an environment, centralization of data and cache operations ensured rapid query response and efficient data management. Local caching was particularly effective because all data was immediately accessible and manageable within a single system.
However, as IT infrastructures evolve and transition towards distributed architectures and cloud solutions, the traditional single server model has been overtaken. Modern IT architectures tend to distribute the workload across multiple servers, often geographically dispersed. These environments leverage data replication, load balancing, and other strategies to optimize system availability, resilience, and scalability.
In these distributed architectures, the centralized caching model of MySQL Query Cache begins to show its limitations. Cache management in a distributed environment requires careful synchronization between multiple servers to ensure that data is consistent and up-to-date across all nodes. This synchronization process can be complex and require a significant amount of network and computing resources, counteracting the benefits of caching itself.
Furthermore, the centralized nature of the Query Cache does not fully exploit the advantages of distributed architectures, such as redundancy and resilience. In a distributed system, the ability to continue to function effectively even when there are failures or interruptions in some parts of the system is critical. However, a cache system that depends on a single source of truth is vulnerable to single points of failure and cannot provide the same resilience.
9. Difficulty Disabling Query Cache Completely
One of the most problematic aspects of MySQL Query Cache is the complexity and sometimes inability to completely disable it without rebooting the system. This problem arises from the way MySQL handles Query Cache configuration parameters.
In MySQL, Query Cache configuration is based on two main parameters:
query_cache_type. The parameter
query_cache_size determines the amount of memory allocated to the Query Cache, while
query_cache_type establishes caching behavior (enabled, disabled, or on-demand).
query_cache_size a 0 should disable the Query Cache since there would be no memory allocated for this purpose. However, the reality is more complex. Even setting
query_cache_size at 0, some internal structures of the Query Cache remain active and continue to consume resources, albeit to a lesser extent. Furthermore, the parameter
query_cache_type continues to influence system behavior: if set to a value that enables caching, MySQL will keep some Query Cache features active, such as checking for cached results.
Additionally, disabling Query Cache is not instantaneous for all database connections. Connections that existed when the parameters were changed may continue to use the previous caching behavior until they are restarted or recreated. This means that the effect of any changes to the Query Cache configuration parameters is not uniformly applied in real time.
These peculiarities in Query Cache management can lead to confusion and difficulties in managing system performance, especially in complex environments or in situations where rapid and complete disabling of caching is required. The result is that, in many cases, the only solution to completely disable Query Cache and ensure that all connections respect this setting is to restart the MySQL server. This restart requirement, of course, can lead to downtime and interruptions in service, which are problematic in production environments where high availability is essential.
Alternatives to MySql Query Cache
Fortunately, there are many effective alternatives to MySQL Query Cache. A first option involves the use of modern Object-Relational Mappings (ORMs), many of which already include caching implementations. While the quality of these caching mechanisms can vary significantly, in many cases they offer superior performance to MySQL's Query Cache. Even less efficient ORMs may be more suitable in certain contexts. Additionally, for those using an ORM without built-in caching functionality, there is always the option to develop your own custom caching mechanism.
Another thing to consider is the lack of control over which queries are cached and how long they last in MySQL's caching system. Creating a custom caching mechanism can provide a solution to this problem, allowing for greater flexibility and specificity. However, developing a caching system from scratch can be complex and require significant effort.
A practical alternative is to use ProxySQL, a solution that can be installed alongside the application (for example, in a sidecar container if using Docker). ProxySQL allows detailed caching configuration, including the ability to specify which queries to cache or which to exclude. The regular expression support feature in ProxySQL is particularly useful: it allows you to cache only the heaviest queries, those with multiple joins, while keeping queries involving single tables fresh.
For example, in a previous work experience, we relied heavily on ProxySQL caching. The company was using PrestaShop, which was not particularly efficient at database management with its category indexing query, involving up to 8 different tables and making caching in MySQL impossible. To further complicate the situation, the PrestaShop database contained a catalog of over 5 million products, which were included in the category query to avoid empty categories. By adding this query to ProxySQL with a Time To Live (TTL) of 4 hours (categories rarely change!) we were able to significantly reduce the load on the database, going from 64 cores to 12.
Delving into these 10 reasons why Query Cache is commonly disabled by MySQL experts illustrates a trend towards innovation and adaptation in database management. This disabling is often the result of thoughtful analysis, aimed at optimizing performance and exploiting the potential of MySQL in modern and dynamic scenarios. As highlighted in the presentation I prepared for a colleague and of which this post represents the written substance, technology evolves and our strategies must evolve with it.
True, there are cases where Query Cache can still prove useful and improve performance, but these are specific and well-identified situations. For every scenario where Query Cache might seem like the right solution, there are multiple more flexible and performant alternatives ready to replace it. These alternatives not only better meet current needs but are also designed to adapt to the future evolution of IT environments.
If you are looking to maximize the effective use of MySQL and need guidance in navigating the available options, our technical and systems support team is here to help. With a specialization in Linux hosting and systems engineering, we are experts in the optimization of CMS such as WordPress, Joomla, Drupal and e-commerce platforms such as WooCommerce, Magento and PrestaShop. Whether you're looking to improve the performance of your MySQL database or adopt a custom caching strategy, we can offer the advice and technical support to help you achieve your goals. Contact us to find out how we can support the efficiency and growth of your MySQL environment.