Natural Language Search in MySQL
Summary: In this tutorial, you will learn how to perform natural language search in MySQL by using MATCH and AGAINST functions.
Basically in natural language search, MySQL looks for rows or documents which are relevant to the free-text natural human language query, for example “How to use MySQL full-text search”. Relevance is a positive floating-point number. When relevance is zero, it means no similarity. MySQL computes relevance based on the number of words in the row, the number of unique words in that row, the total number of words in the collection, and the number of documents (rows) that contain a particular word.
In order to perform searches, you use two functions MATCH() and AGAINST(). The MATCH() function specifies the column to be searched and the AGAINST() function specifies the search expression to be used.
We will use the table products in our sample database to make up examples of the tutorial. First we need to add full-text search to product line column by performing the following query:
ALTER TABLE products
ADD FULLTEXT(productline)
Now to search for products which product line has word “Classic “, you can use the below query:
SELECT productName, productline
FROM products
WHERE MATCH(productline) AGAINST('Classic')
+-------------------------------------+--------------+
| productName | productline |
+-------------------------------------+--------------+
| 1952 Alpine Renault 1300 | Classic Cars |
| 1948 Porsche Type 356 Roadster | Classic Cars |
| 1970 Triumph Spitfire | Classic Cars |
| 1957 Corvette Convertible | Classic Cars |
| 1957 Ford Thunderbird | Classic Cars |
| 1970 Chevy Chevelle SS 454 | Classic Cars |
| 1970 Dodge Coronet | Classic Cars |
| 1966 Shelby Cobra 427 S/C | Classic Cars |
| 1949 Jaguar XK 120 | Classic Cars |
| 1958 Chevy Corvette Limited Edition | Classic Cars |
| 1952 Citroen-15CV | Classic Cars |
| 1982 Lamborghini Diablo | Classic Cars |
| 1969 Chevrolet Camaro Z28 | Classic Cars |
| 1971 Alpine Renault 1600s | Classic Cars |
| 2002 Chevy Corvette | Classic Cars |
| 1956 Porsche 356A Coupe | Classic Cars |
| 1992 Porsche Cayenne Turbo Silver | Classic Cars |
| 1961 Chevrolet Impala | Classic Cars |
| 1976 Ford Gran Torino | Classic Cars |
| 1969 Dodge Super Bee | Classic Cars |
| 1972 Alfa Romeo GTA | Classic Cars |
| 1962 LanciaA Delta 16V | Classic Cars |
| 1968 Ford Mustang | Classic Cars |
| 2001 Ferrari Enzo | Classic Cars |
| 1969 Corvair Monza | Classic Cars |
| 1968 Dodge Charger | Classic Cars |
| 1969 Ford Falcon | Classic Cars |
| 1970 Plymouth Hemi Cuda | Classic Cars |
| 1969 Dodge Charger | Classic Cars |
| 1993 Mazda RX-7 | Classic Cars |
| 1965 Aston Martin DB5 | Classic Cars |
| 1948 Porsche 356-A Roadster | Classic Cars |
| 1995 Honda Civic | Classic Cars |
| 1998 Chrysler Plymouth Prowler | Classic Cars |
| 1999 Indy 500 Monte Carlo SS | Classic Cars |
| 1992 Ferrari 360 Spider red | Classic Cars |
| 1985 Toyota Supra | Classic Cars |
| 1982 Camaro Z28 | Classic Cars |
+-------------------------------------+--------------+
38 rows in set (0.01 sec)
To search for products which product line has word “Classic” or “Vintage” you can perform the following query:
SELECT productName, productline
FROM products
WHERE MATCH(productline) AGAINST('Classic,Vintage')
+-------------------------------------------+--------------+
| productName | productline |
+-------------------------------------------+--------------+
| 1917 Grand Touring Sedan | Vintage Cars |
| 1911 Ford Town Car | Vintage Cars |
| 1932 Model A Ford J-Coupe | Vintage Cars |
| 1928 Mercedes-Benz SSK | Vintage Cars |
| 1913 Ford Model T Speedster | Vintage Cars |
| 1934 Ford V8 Coupe | Vintage Cars |
| 18th Century Vintage Horse Carriage | Vintage Cars |
| 1903 Ford Model A | Vintage Cars |
| 1938 Cadillac V-16 Presidential Limousine | Vintage Cars |
| 1917 Maxwell Touring Car | Vintage Cars |
| 1939 Chevrolet Deluxe Coupe | Vintage Cars |
| 1941 Chevrolet Special Deluxe Cabriolet | Vintage Cars |
| 1932 Alfa Romeo 8C2300 Spider Sport | Vintage Cars |
| 1904 Buick Runabout | Vintage Cars |
| 1939 Cadillac Limousine | Vintage Cars |
| 1912 Ford Model T Delivery Wagon | Vintage Cars |
| 1936 Mercedes Benz 500k Roadster | Vintage Cars |
| 1936 Chrysler Airflow | Vintage Cars |
| 1940 Ford Delivery Sedan | Vintage Cars |
| 1937 Lincoln Berline | Vintage Cars |
| 1928 Ford Phaeton Deluxe | Vintage Cars |
| 1936 Mercedes-Benz 500K Special Roadster | Vintage Cars |
| 1937 Horch 930V Limousine | Vintage Cars |
| 1930 Buick Marquette Phaeton | Vintage Cars |
| 1970 Dodge Coronet | Classic Cars |
| 1961 Chevrolet Impala | Classic Cars |
| 1966 Shelby Cobra 427 S/C | Classic Cars |
| 1982 Camaro Z28 | Classic Cars |
| 1969 Chevrolet Camaro Z28 | Classic Cars |
| 1992 Porsche Cayenne Turbo Silver | Classic Cars |
| 1949 Jaguar XK 120 | Classic Cars |
| 1958 Chevy Corvette Limited Edition | Classic Cars |
| 1956 Porsche 356A Coupe | Classic Cars |
| 2002 Chevy Corvette | Classic Cars |
| 1952 Citroen-15CV | Classic Cars |
| 1971 Alpine Renault 1600s | Classic Cars |
| 1982 Lamborghini Diablo | Classic Cars |
| 1970 Chevy Chevelle SS 454 | Classic Cars |
| 1957 Ford Thunderbird | Classic Cars |
| 1957 Corvette Convertible | Classic Cars |
| 1993 Mazda RX-7 | Classic Cars |
| 1969 Dodge Charger | Classic Cars |
| 1970 Plymouth Hemi Cuda | Classic Cars |
| 1969 Ford Falcon | Classic Cars |
| 1968 Dodge Charger | Classic Cars |
| 1969 Corvair Monza | Classic Cars |
| 2001 Ferrari Enzo | Classic Cars |
| 1968 Ford Mustang | Classic Cars |
| 1962 LanciaA Delta 16V | Classic Cars |
| 1965 Aston Martin DB5 | Classic Cars |
| 1952 Alpine Renault 1300 | Classic Cars |
| 1948 Porsche 356-A Roadster | Classic Cars |
| 1970 Triumph Spitfire | Classic Cars |
| 1948 Porsche Type 356 Roadster | Classic Cars |
| 1976 Ford Gran Torino | Classic Cars |
| 1969 Dodge Super Bee | Classic Cars |
| 1985 Toyota Supra | Classic Cars |
| 1992 Ferrari 360 Spider red | Classic Cars |
| 1999 Indy 500 Monte Carlo SS | Classic Cars |
| 1998 Chrysler Plymouth Prowler | Classic Cars |
| 1995 Honda Civic | Classic Cars |
| 1972 Alfa Romeo GTA | Classic Cars |
+-------------------------------------------+--------------+
62 rows in set (0.00 sec)
We can rewrite the same query above again by using IN NATURAL LANGUAGE MODE keyword. Because IN NATURAL LANGUAGE MODE is default so you can omit it in the query.
SELECT productName, productline
FROM products
WHERE MATCH(productline)
AGAINST('Classic,Vintage' IN NATURAL LANGUAGE MODE)
By default, MySQL performs search in the case-insensitive fashion. However you still can force MySQL to perform search in case-sensitive fashion by using binary collation for indexed columns.
A very important feature of full-text search is the ranking of results. When the MATCH() is used in WHERE clause, rows with higher rank or more relevant are returned first. To demonstrate the relevance sorting in full-text search, we will use productName column of the table products in our sample database to perform search.
ALTER TABLE products
ADD FULLTEXT(productName)
Now we can search for products which product name has word ‘Ford’ produced in ‘1932’. Here is the query to do so:
SELECT productName, productline
FROM products
WHERE MATCH(productName) AGAINST('1932,Ford')
Let’s take a look at the output:
+-------------------------------------+------------------+
| productName | productline |
+-------------------------------------+------------------+
| 1932 Model A Ford J-Coupe | Vintage Cars |
| 1932 Alfa Romeo 8C2300 Spider Sport | Vintage Cars |
| 1968 Ford Mustang | Classic Cars |
| 1957 Ford Thunderbird | Classic Cars |
| 1940s Ford truck | Trucks and Buses |
| 1903 Ford Model A | Vintage Cars |
| 1934 Ford V8 Coupe | Vintage Cars |
| 1911 Ford Town Car | Vintage Cars |
| 1969 Ford Falcon | Classic Cars |
| 1913 Ford Model T Speedster | Vintage Cars |
| 1926 Ford Fire Engine | Trucks and Buses |
| 1976 Ford Gran Torino | Classic Cars |
| 1940 Ford Pickup Truck | Trucks and Buses |
| 1940 Ford Delivery Sedan | Vintage Cars |
| 1928 Ford Phaeton Deluxe | Vintage Cars |
| 1912 Ford Model T Delivery Wagon | Vintage Cars |
+-------------------------------------+------------------+
16 rows in set (0.01 sec)
As you see, all products which product name has two words ‘1932’ and ‘Ford’ return first and then the products which product name has word ‘Ford’ returned later in the output.
There are some important points you should remember when using full-text search:
- The minimum length of word to be found by MySQL full-text search engine is four. It means if you search for word ‘Car’, you will not get any returned rows.
- Stopword is ignored. MySQL defines a list of stopword which to be ignored when performing search. Those stopwords are defined in the MySQL source code distribution storage/myisam/ft_static.c