Friday, February 8, 2013

Optimization of Phpfox: MySQL

Today  Yesterday I landed in a bug report that mentioned full table scans, in my younger years (ok, 5 years ago) I was very interested in databases and even took a workshop in Mexico by MySQL to become a DBA so I dutifully began testing.

The first thing I did was to set up a local copy of Phpfox 5.3.0 RC1 (not yet released) with content (users, blogs, friends,...), then enabled the slow_query_log and set the long_query_time to 2, because this is a local server with no traffic I assumed a safe bet that no query would take longer than 2 seconds. Then I enabled log-queries-not-using-indexes to capture those in the slow query log. Then using the script I was able to find queries that could be optimized (theoretically at least).

There were indeed some queries using union and joins that could be optimized (large sites will likely see a performance increase in 5.3.0 RC1).

I do have to point out a couple of things:
First, some queries are being logged but they are meant to scan the full table, for example when getting all the user groups we want all the records in that table, using indexes would not improve performance because we want everything in that table, intentionally this table is typically very small, it defaults to 5 records and in normal circumstances shouldn't grow beyond 10 rows.
Second, In some cases, MySQL seemed to not want to use a specific index, for example with this query:

SELECT COUNT(*)
FROM phpfox_user AS u
JOIN phpfox_user_field AS ufield
ON (ufield.user_id = u.user_id)
WHERE u.status_id = 0 AND u.view_id = 0
If you run that query with explain you will see that MySQL had many indexes to use but chose none, and in my test it did run a full table scan. So to help in this situation we implemented the function forceIndex() in the DBA library, this allowed us to tell MySQL which index to use and after rewriting the query we saved on fetching rows, (in the one that I have just rewritten in the feed there is a filtering improvement of 45.31%)  which translates in a performance improvement.

Third, we made sure that whenever the developer queries for getSlaveField or getField the database library adds a LIMIT 1 if it was not added by the developer, we saw a tiny tiny performance after this small change.

Fourth, I also found some strange behavior in mysql where it would log a query from a 'derived' table and not use any indexes for that sub-query, but taking the sub-query out and testing it by itself did use the index, in this case force index did not help, for what is worth, this only happened in the main Blog section when logged in as an administrator.

Fifth, in one occasion after optimizing the query (meaning it did not get logged in the slow query log) it took longer for mysql to fetch the results, we opted for 'un-optimizing' the query since the tangible benefits outweighed the theoretical ones.

The improvements added affect sections frequently reached like the Browse Members and Home (after logging in, where the news feed is).

For developers: here is sample code using the forceIndex function:

$iCnt = $this->database()
        ->from($this->_sTable, 'u')
        ->forceIndex('status_id')
        ->join(Phpfox::getT('user_field'), 'ufield', 'ufield.user_id = u.user_id')
        ->where($this->_aConditions)
        ->execute('getSlaveField');
There surely are queries still to fix, if you find them please let us know, we will continue looking for them but must also attend to other bug reports.