acts_as_taggable_on_steroids was REALLY slow

I got a text message about some timeouts on a client website over the weekend. I looked into it and couldn't find anything particularly wrong, but the load times for this one page were definitely much longer than they used to be. Digging deeper and talking with another dev on the team, we thought we might have to cache this section or rewrite some of our tagging code. We use acts_as_taggable_on_steroids on this project, and I've seen some suspect queries come out of it before when dealing with large datasets. Well, it turns out it was a MySQL issue, and it was easily fixed.

The problem:

mysqlslow.log is reporting this:

# Query_time: 58  Lock_time: 0  Rows_sent: 1  Rows_examined: 18461772
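
For context, the query in question was the standard tag lookup that acts_as_taggable_on_steroids generates: a join across the tagged model's table, taggings, and tags. I'm not reproducing the exact SQL here, so treat the version below as a representative sketch that assumes the plugin's conventional column names:

-- representative tag lookup; table and column names assumed, not copied from the app
SELECT widgets.*
FROM widgets
INNER JOIN taggings ON taggings.taggable_id = widgets.id
                   AND taggings.taggable_type = 'Widget'
INNER JOIN tags ON tags.id = taggings.tag_id
WHERE tags.name = 'some-tag';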

Well, that's a problem. Let's run an EXPLAIN on the query (only the rows column is shown below):

| rows |
+------+
| 3957 |
|    2 |
|    1 |
|    1 |
|    1 |
+------+

Hmm, that’s strange. The EXPLAIN is saying that it should look at about 4000 records, but the slow query log is saying that it actually examined 18 million.
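
If you want to see what the optimizer is basing its estimates on, SHOW INDEX reports the statistics (including the Cardinality column) for each index on a table:

mysql> SHOW INDEX FROM widgets;
mysql> SHOW INDEX FROM tags;
mysql> SHOW INDEX FROM taggings;

Cardinality is the optimizer's estimate of the number of distinct values in the index. If those numbers look nothing like the real data, the statistics are stale, and that's exactly the kind of thing that produces a mismatch like the one above.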

The solution:

WARNING: Depending on the storage engine, ANALYZE TABLE will take a READ or WRITE LOCK on the table.

mysql> ANALYZE TABLE widgets;
+-------------------------------+---------+----------+----------+
| Table                         | Op      | Msg_type | Msg_text |
+-------------------------------+---------+----------+----------+
| widgets                       | analyze | status   | OK       | 
+-------------------------------+---------+----------+----------+
1 row in set (0.00 sec)

mysql> ANALYZE TABLE tags;
+------------------------------+---------+----------+----------+
| Table                        | Op      | Msg_type | Msg_text |
+------------------------------+---------+----------+----------+
| tags                         | analyze | status   | OK       | 
+------------------------------+---------+----------+----------+
1 row in set (0.10 sec)

mysql> ANALYZE TABLE taggings;
+----------------------------------+---------+----------+----------+
| Table                            | Op      | Msg_type | Msg_text |
+----------------------------------+---------+----------+----------+
| taggings                         | analyze | status   | OK       | 
+----------------------------------+---------+----------+----------+
1 row in set (0.06 sec)

What’s going on here?

The MySQL optimizer uses table and index statistics to determine join order and index selection. In this case, the difference between Rows_examined in the slow log and rows in the EXPLAIN was the first clue that something was off. The second clue was that the index selection in the EXPLAIN was off as well. Running ANALYZE TABLE on the involved tables updated the table statistics so they properly reflect key cardinality.

I put a warning about table locks before those statements because running this will lock the table for writes and (for InnoDB) reads. I was only dealing with a total of 100K records here, and in my experience even tables with millions of rows don't take more than a few seconds, but there are certainly scenarios where that won't hold. The OPTIMIZE TABLE command, by contrast, takes significantly longer on large tables, since it completely rebuilds the table and its indexes.
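
For reference, the heavier-weight command mentioned above is just:

mysql> OPTIMIZE TABLE taggings;

I didn't need it here. On InnoDB, OPTIMIZE TABLE is implemented as a full table rebuild followed by an analyze, which is why it takes so much longer on big tables than ANALYZE TABLE on its own.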

The results:

The new rows column from the EXPLAIN:

| rows |
+------+
| 1034 |
|    1 |
|    1 |
|    1 |
|    1 |
+------+

Now the query runs in less than 0.05 seconds consistently.

Feedback welcome!