MySQL PERCENT_RANK Function

Summary: in this tutorial, you will learn how to use the MySQL PERCENT_RANK() function to calculate the percentile ranking of a row within a partition or result set.

The PERCENT_RANK() is a window function that calculates the percentile rank of a row within a partition or result set.

The following shows the syntax of the PERCENT_RANK() function:

PERCENT_RANK() OVER (
  PARTITION BY partition_expression 
  ORDER BY 
    sort_expression [ASC | DESC]
)Code language: SQL (Structured Query Language) (sql)

The PERCENT_RANK() function returns a number that ranges from zero to one.

For a specified row, PERCENT_RANK() calculates the rank of that row minus one, divided by 1 less than the number of rows in the evaluated partition or query result set:

(rank - 1) / (total_rows - 1)Code language: SQL (Structured Query Language) (sql)

In this formula, rank is the rank of a specified row and total_rows is the number of rows being evaluated.

The PERCENT_RANK() function always returns zero for the first row in a partition or result set. The repeated column values will receive the same PERCENT_RANK() value.

Similar to other window functions, the PARTITION BY clause distributes the rows into partitions and the ORDER BY clause specifies the logical order of rows in each partition. The PERCENT_RANK() function is calculated for each ordered partition independently.

Both PARTITION BY and ORDER BY clauses are optional. However, the PERCENT_RANK() is an order-sensitive function, therefore, you should always use the ORDER BY clause.

MySQL PERCENT_RANK() function examples

Let’s create a new table named productLineSales based on the orders, orderDetails, and products tables from the sample database:

CREATE TABLE productLineSales
SELECT
    productLine,
    YEAR(orderDate) orderYear,
    quantityOrdered * priceEach orderValue
FROM
    orderDetails
        INNER JOIN
    orders USING (orderNumber)
        INNER JOIN
    products USING (productCode)
GROUP BY
    productLine ,
    YEAR(orderDate), 
    orderValue;Code language: SQL (Structured Query Language) (sql)

The productLineSales table stores the summary of the sales data including product line, order year, and order value.

1) Using MySQL PERCENT_RANK() over the query result set

The following query finds the percentile rank of every product line by order values:

WITH t AS (
    SELECT
        productLine,
        SUM(orderValue) orderValue
    FROM
        productLineSales
    GROUP BY
        productLine
)
SELECT
    productLine,
    orderValue,
    ROUND(
       PERCENT_RANK() OVER (
          ORDER BY orderValue
       )
    ,2) percentile_rank
FROM
    t;Code language: SQL (Structured Query Language) (sql)

In this example:

  • First, we used a common table expression to summarize the order values by product lines.
  • Second, we used the PERCENT_RANK() to calculate the percentile rank of the order value of each product. In addition, we used the ROUND() function to round the values to 2 decimals for a better representation.

Here is the output:

+------------------+------------+-----------------+
| productLine      | orderValue | percentile_rank |
+------------------+------------+-----------------+
| Trains           |  188532.92 |               0 |
| Ships            |  661622.34 |            0.17 |
| Planes           |  944295.37 |            0.33 |
| Trucks and Buses | 1014787.52 |             0.5 |
| Motorcycles      | 1114192.13 |            0.67 |
| Vintage Cars     | 1782234.23 |            0.83 |
| Classic Cars     | 3782065.50 |               1 |
+------------------+------------+-----------------+
7 rows in set (0.00 sec)Code language: JavaScript (javascript)

Here are some analyses from the output:

  • The order values of Trains were not better than any other product lines, which was represented with a zero.
  •  Vintage Cars performed better than 50% of other products.
  •  Classic Cars performed better than any other product line so its percent rank is 1 or 100%

2) Using MySQL PERCENT_RANK() over the partition

The following statement returns the percentile ranking of product lines by order values in each year:

SELECT
    productLine,
    orderYear,
    orderValue,
    ROUND(
    PERCENT_RANK()
    OVER (
        PARTITION BY orderYear
        ORDER BY orderValue
    ),2) percentile_rank
FROM
    productLineSales;Code language: SQL (Structured Query Language) (sql)

Here is the output:

MySQL PERCENT_RANK function over partition example

In this example, we divided the order values of the product lines by order year. The PERCENT_RANK() then applied to each partition.

For example, in 2003 Vintage Cars performed better than 50% of other product lines while in 2004 Ships performed better than 50% of other products.

Summary

  • Use the MySQL PERCENT_RANK() function to calculate the percentile rank of a row within a partition or result set.
Was this tutorial helpful?