Selecting records from one table that dont have specific relationship in a second table

Question: How can you select records from one table that doesn’t have a specific ID in a second, many-to-many table?  

In other words, imagine you have a questions table (A), categories table (B), and a many-to-many relationship link table (L).  This setup allows a question to have several categories.  How can you find questions that don’t have a specific category (say X)? 

What about the following query? 

               SELECT A.* FROM A JOIN L USING (question_id) WHERE L.category_id <> X;  

This query will not meet the requirements because it will still return questions that have several categories in addition to the X category, since it will only filter the X value.  

My way to handle this is to use the MySQL ‘NOT IN’ clause. 

                SELECT A.* FROM A WHERE A.question_id NOT IN (SELECT question_id FROM L WHERE category_id IN (X));  

This query will retrieve all questions and related categories that don’t belong to category X.  

Comments and/or different solutions are welcome.
Ilan.

The fading of Query Cache limitations

There is no question about the MySQL Query Cache greatness. 

The MySQL Query Cache feature started his way with some limitations.  However, in time, there were a lot of improvements (fixing) for those limitations.

One should test his MySQL version with the Query Cache limitations before using it.

Here are some improved limitations I found (at MySQL 5.0 version and above):

  1. The limitation:  Query comments are used as part of the query cache hash key. This means that different comments will (probably) produce different  query cache hash keys.
    The improvement: The existence of the comment in the query is important but not their contents. This means that even empty comment affects the query cache hash key, but two same queries that different with their comment content will have the same query cache hash key.
  2. The limitation:  Queries that begin with spaces or comments are ignored by the Query Cache mechanism.
    The improvement: The Query Cache mechanism ignores spaces and comments at the beginning and at the end of the query.
  3. The limitation:  Query Cache does not support prepared statements.
    The improvement: MySQL Query Cache does work well with prepared statements.

However, MySQL Query Cache still have the following “textual changes” limitations:

  1. It is case sensitive – changing even one letter case will produce different hash key.
  2. Extra spaces within the query (not at the beginning) will produce different hash key.  

 I would love to hear from you about more improvements to the MySQL Query Cache mechanism.
Ilan Hazan.

How to retain comments in MySQL statements

There is a lot of buzz lately around retaining comments in MySQL’s logs and MySQL Stored procedures (Routines).
When analyzing a MySQL query, it’s nice to have a comment inside the query (or procedure) that can provide us with more info about the query.
For example, When analyzing MySQL’s logs (like Slow Query Log or General Log), seeing the calling function’s name that issued this slow query can help a lot. This way it’s much easier to find the query in the code or if it’s called from a script.
One solution is to write an inline comment (C multi column comment style) in the query’s body.
For example: “SELECT /* calling function name */ column1, column2 … FROM table …”.
However, when I tried to use it within a bash script the comment was stripped out.
I found that the MySQL client program is stripping out in-query (or in-procedure) comments.
Fortunately, there is a flag that controls this behavior whose default is to discard comments (-skip-comments).
 
To retain (preserve) comments in MySQL statements sent to the server (i.e. queries or MySQL Stored procedures) use the -–comments flag (or -c).
For example $ bin/mysql  -comments  -uxxxx  -pyyyy
 
Important notes:
1.     This flag is active from MySQL version 5.0.52. (see http://mirror.leaseweb.com/mysql/doc/refman/5.0/en/mysql-command-options.html)
2.     The comments are retained within the scripting language with the embedded SQL libraries (such as PHP).

How to produce random rows from a table

A while ago, I were searching for a way to produce random rows from a table in MySQL. I found several solutions but none of them satisfied me. Of course, I could use  a combined logic of MySQL and a programming language. For example by producing random numbers in PHP and using them in the MySQL query in a IN clouse.  However, I was looking for a pure query solution. The easy way that MySQL provides is by using ORDER BY RAND(). This trick works fine for very small tables. However, if you have a big table, this solution is a bad choice.  It has a big performance  issue as it orders all the table rows prior to selecting a few of them.

My solution to this problem is using User Defined Variables. Let’s say we want X random rows from a table myTable.  

SELECT myTable.* FROM (SELECT FLOOR (RAND() * (SELECTcount(*) FROM myTable)) num ,@num:=@num+1 from (SELECT @num:=0) a , myTable LIMIT X) b ,  myTable WHERE b.num=myTable.id;

I will try to explain some of the parts:

  1. FLOOR (RAND() * (SELECT count(*) FROM myTable)) num  – produce an integer value between 0 to the max rows this table has
  2. (SELECT @num:=0) a – initialize the User Defined Variable a to zero
  3. (SELECT FLOOR (RAND() * (SELECTcount(*) FROM myTable)) num ,@num:=@num+1 from (SELECT @num:=0) a , myTable LIMIT X) b – produces X random integer numbers

If you need it with range, say X random rows from a table myTable with a range from S to T you can write it like this:

SELECT myTable.* FROM (SELECT FLOOR (S + RAND() * (T – S)) num ,@num:=@num+1 FROM (SELECT @num:=0) a , myTable LIMIT X) b ,  myTable WHERE b.num=myTable.id;

Some caution: because this query uses User Defined Variable (@num) which are connection specific, you can NOT use this MySQL query with a system that uses connection pooling or persistent connections.

Comments are welcomed,
Hazan Ilan.