If you copy a MyISAM table with PRIMARY key, don’t forget to order the rows first
In a case that you copy (INSERT INTO .. SELECT ..) a MyISAM table that have a PRIMARY key, it is much faster to insert the new rows in the primary key order:
- The insertion will be faster: because the primary key cannot disabled, every insertion will also updates the primary key. In a case the rows are inserted in the primary key order, there will be no redundant disk scans.
- The queries that uses the primary key will perform better: it make sorting easier for MySQL if the table is in order by the column that you want to order it by later.
Let’s see some examples for the theory:
mysql> show create table test_tableG *************************** 1. row *************************** Table: test_table Create Table: CREATE TABLE ` test_table ` ( `user_id` int(10) unsigned NOT NULL, `auto_id` int(10) unsigned NOT NULL auto_increment, `col_time` timestamp NOT NULL default CURRENT_TIMESTAMP, `col1` int(10) unsigned NOT NULL default '0', `col2` float NOT NULL, `col3` float NOT NULL, `col4` int(10) unsigned default NULL, `col5` int(10) unsigned default NULL, `col6` varchar(255) default NULL, PRIMARY KEY (`user_id`,`auto_id`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> CREATE TABLE test_table_ordered LIKE test_table; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO test_table_ordered SELECT * FROM test_table ORDER BY user_id,auto_id; Query OK, 22391869 rows affected (10 min 37.33 sec) Records: 22391869 Duplicates: 0 Warnings: 0 mysql> mysql> mysql> mysql> CREATE TABLE test_table_NOTordered LIKE test_table; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO test_table_NOTordered SELECT * FROM test_table; Query OK, 22391869 rows affected (1 hour 18 min 38.50 sec) Records: 22391869 Duplicates: 0 Warnings: 0 mysql> mysql> mysql> CREATE TABLE test_table_orderedCOPY LIKE test_table_ordered; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO test_table_orderedCOPY SELECT * FROM test_table_ordered; Query OK, 22391869 rows affected (3 min 19.96 sec) Records: 22391869 Duplicates: 0 Warnings: 0 mysql> mysql> mysql> CREATE TABLE test_table_NOTorderedCOPY LIKE test_table_NOTordered; Query OK, 0 rows affected (0.01 sec) mysql> INSERT INTO test_table_NOTorderedCOPY SELECT * FROM test_table_NOTordered; Query OK, 22391869 rows affected (1 hour 19 min 19.63 sec) Records: 22391869 Duplicates: 0 Warnings: 0 |
The non-ordered insertion into table test_table_NOTordered took ~1 hour 18 min. However, the ordered insertion into the table test_table_ordered was much faster and took only ~10 min.
The non-ordered insertion into test_table_NOTorderedCOPY took the same time as the non-ordered insertion into table test_table_NOTordered.
The most faster insertion was the ordered insertion into table test_table_orderedCOPY. It took only ~3 min. This is because the table that we selects from was already ordered, hence no time spent on ordering the table before the insertion.
A note: ORDER BY does not make sense for InnoDB tables that contain a user-defined clustered index (PRIMARY KEY or NOT NULL UNIQUE index). InnoDB always orders table rows according to such an index if one is present. The same is true for BDB tables that contain a user-defined PRIMARY KEY. (see http://dev.mysql.com/doc/refman/5.0/en/alter-table.html)
I will be happy to receive comments.
Thank you I will try this next time I am copying, I have some large tables we use for testing and this will be tested.
Interesting measurements.
“The queries that uses the primary key will perform better: it make sorting easier for MySQL if the table is in order by the column that you want to order it by later.” – of course, queries that use other keys will benefit more from another kind of order…
Also note on MyISAM tables you can do “ALTER TABLE some_table ORDER BY some_col”, a one-time sorting of the table.