Last Updated on
Before reading this article, I suggest to read Indexes bad practices, if you missed it. In that article I mentioned that useless indexes should be deleted, and in the comments Vipul asked how to find such indexes. Here I explain how to do it.
Finding useless indexes
First of all, let’s define what a useless index is. There are two types of them:
- Duplicate indexes;
- Unused indexes.
If you don’t know what duplicate indexes are, you should really check the previous article. We’re not discussing the matter again here.
There is at least one tool to find such indexes: pt-duplicate-key-checker.
This great tool finds duplicate indexes, and outputs a list of them, as well as the
ALTER TABLEs to drop them. It also outputs the index definitions, so you will easily find out why a certain index is a duplicate.
The tool is safe with respect to constraints.
UNIQUE indexes are not considered duplicates, unless they duplicate other UNIQUE indexes. Foreign keys are considered duplicates only if they are identical.
pt-duplicate-key-checker is usually called without any arguments, and that is what I do. For completeness, I will mention here some additional checks that could be done – but probably no one does.
--all-structs: Ignored the index types. An index can duplicate another even if they are of different types – for example, a
FULLTEXTindex and a regular BTREE. This is normally wrong, but there is an exception: a BTREE could duplicate a HASH index. If your database has improper HASH indexes, you may want to try this parameter. But then, you will have to check manually the output.
--ignore-order: Normally, an index on (a, b) should not be considered as a duplicate of (b, a). The reason is that some queries can only be server by one of these indexes, as explained in the previous article. This parameter causes these indexes to be considered as duplicates. This is only true if we compare both these columns with
=and don’t use
ORDER BYon them. If you have these non-common case you may use
--ignore-orderto spot duplicates, but you will have to check manually the output.
If you use MariaDB or Percona Server, you have a plugin called user_statistics. It can also be installed in MySQL.
The user_statistics method
If you enable it, you will have a table called
INDEX_STATISTICS, in the
information_schema database. All indexes not present in that table were not used since the last restart of MySQL. To spot them you can use this query:
SELECT st.TABLE_SCHEMA, st.TABLE_NAME, st.INDEX_NAME FROM information_schema.STATISTICS st LEFT JOIN information_schema.INDEX_STATISTICS idx ON idx.INDEX_NAME = st.INDEX_NAME AND idx.TABLE_NAME = st.TABLE_NAME AND idx.TABLE_SCHEMA = st.TABLE_SCHEMA WHERE (idx.INDEX_NAME IS NULL OR idx.ROWS_READ = 0) AND st.NON_UNIQUE = 1 ORDER BY 1, 2, 3 ;
The performance_schema method
There is a less safe alternative. For that, you need to have the
performance_schema enabled. If it is disabled, you need to restart your server to enable it.
performance_schemais disabled by default in MariaDB since version 10.0. Let’s hope that they’ll change their mind, for a lot of reasons.
performance_schema enabled, you can run this query to find unused indexes:
SELECT object_schema, object_name, index_name FROM performance_schema.table_io_waits_summary_by_index_usage WHERE index_name IS NOT NULL AND count_star = 0 ORDER BY object_schema, object_name, index_name;
Why is it less safe? Because it assumes that an index is only unused if it never caused an IO wait.
Duplicate indexes are not necessarily unused. This is not a problem if another index can be used in place of the duplicate. But there is an exception, that I mentioned in The perils of ALTER TABLE in MySQL/MariaDB. MySQL supports the
IGNORE/USE/FORCE INDEX syntax, which mentions an index by name. If the index doesn’t exist, the query fails – even in the care of
Dropping an unused index is safe. As mentioned, indexes found with the user_statistics method are surely unused. But indexes found with the
performance_schema method could theoretically be used.
To reduce these risks, MySQL 8 has a useful feature: invisible indexes. You can make an index invisible in this way:
ALTER TABLE t ALTER INDEX idx_a INVISIBLE;
If an index is invisible, MySQL will pretend that it doesn’t exist. Therefore, if it is mentioned by a query with
IGNORE/USE/FORCE INDEX, the query will return an error. And if it is used by a query, the query will become slow – or it will use another index. The advantage is that making an index invisible or visible is an instantaneous operation. Therefore, even if the table is very big, if a problem appears, we can make the index visible again in a moment.
It is important to drop invisible indexes once we’re sure they are not needed. An invisible index still needs to be kept up to date, which has an overhead. We don’t want to have an overhead for something that we don’t need.
- Indexes bad practices
- Primary Key in InnoDB: how they work and best practices
- Dropping database objects in a safe way
We saw how to find duplicate indexes, including some edge cases that are not reported by default by pt-duplicate-key-checker. Then we saw how to find unused indexes: the
performance_schema database should always be enabled, but for this particular task the user_statistics plugin is better. Finally, we discussed how to use invisible indexes to reduce risks.
Listing useless indexes is part of my MySQL Health Checks. It’s in the part about MySQL schema problems and optimisation of the most impacting queries. Compared to these topics eliminating dropping indexes is a low hanging fruit, but it brings benefits, especially for write-heavy workloads.
As usual my conclusion is: what do you think? If you have questions, disagree with something or have more ideas, please comment.