The battle between FORCE INDEX and the QUERY OPTIMIZER
Query Optimizer is a part of the server that takes a parsed SQL query and produces a query execution plan.
MySQL Query Optimizer uses (as one of its parameters) the stored key distribution (Cardinality) mechanism to determine the order in which tables should be joined, when you perform a join on something other than a constant and also for determining which indexes to use for a specific table within a query.
However, BEWARE! The query analyzer is not perfect. One of the many reasons is that the key distributions can be far from accurate. If the table has been heavily modified (INSERT or DELETE for example), over time the key distributions (Cardinality) diverge from the true state. (To solve this issue one must run ANALYZE TABLE once in a while).
An Optimizer that picks the wrong plan can cause severe performance issues. This means that your server can hit a performance problem at the most unexpected time.
Lucky for us, we can force the Optimizer to pick a particular plan (index) with the “Force Index” hint.
Does this mean that we should always use a Force Index hint?
NO!
Using FORCE INDEX() indiscriminately can also easily backfire. The distribution of the data or the values of query parameters can change, and the key that you thought was best for the query can, over time, become a less preferred plan. FORCE KEY should be used when you understand exactly why you are using it, and are 100% sure that there is no way the key that is being forced could become bad.
Note, that most often the need for FORCE KEY arises when the query is not very straightforward and there is no “awesome” winner key that the optimizer can easily spot. Such needs can often be addressed by re-writing the query and/or adding better keys.
Notes:
1. By default, ANALYZE TABLE statements are written to the binary log so that they will be replicated to replication slaves.
2. Another contributing factor to the MySQL Optimizers calculation of optimal query execution plan is the number of predicted disk IO/Seeks.
3. Use the ‘SHOW INDEX FROM table‘ statement in order to see the key distribution (Cardinality) for each index.
4. MyISAM data and indexes suffer from fragmentation. Running OPTIMIZE TABLE under MyISAM will sort and recreate the index and data files.
Thanks to Sasha from Percona for opening my eyes to this issue.
Hi,
Thank you for your post, I shall be experimenting with this shortly.
One interesting problem, when I type ‘show index from ‘ it states my cardinality for a boolean field as 17, instead of my expected 2 (on / off). The underlying type is ‘tinyint(1) not null’ – I’ve confirmed only values 1 and 0 are present ?
Additionally it also has the value 17 for a few other fields, for example ‘currency’ I know I have 96 distinct rows, but that’s also reporting a cardinality of 17.
Have you run into this at all?
Cheers,
Rob.
Please run ANALYZE TABLE and lets see how this affects the cardinality.