Skip to content
Software connecting all databases
  • Home
  • Software
    • Download Nexus
    • Purchase Nexus License
    • Activate Nexus License
    • SmartCompress
    • 60-Second Feature Videos
  • Training
    • Corporate Education
  • Data Migration
    • Data Migration Software
    • Data Migration Services
  • Books
  • Partners
  • Press
  • Blog
  • About Us
    • What We Do
    • Who We Are
  • Contact
Coffing Data Warehousing - Federated Query and ETL Software, Training, and Books
FacebookTwitterGoogleLinkedinYoutubeEmail
Software connecting all databases
  • Home
  • Software
    • Download Nexus
    • Purchase Nexus License
    • Activate Nexus License
    • SmartCompress
    • 60-Second Feature Videos
  • Training
    • Corporate Education
  • Data Migration
    • Data Migration Software
    • Data Migration Services
  • Books
  • Partners
  • Press
  • Blog
  • About Us
    • What We Do
    • Who We Are
  • Contact

Azure Synapse Analytics – RANK, DENSE_RANK, and PERCENT_RANK

May 10, 2024

I can teach you analytics!  Azure Synapse is excellent at analytics. This week we are working on the RANK, DENSE_RANK, and PERCENT_RANK commands. You will also learn about PARTITION BY and how to use a derived table to filter your rank commands.

All of these examples have come from my books and training classes. Please do me a favor and tell your training coordinator that you know the best technical trainer in the World. Ask them to hire me to train at your company, either on-site or with a virtual class. They can see our classes, outlines, and a sample of my teaching at this link on our website. I hope to meet you and say thanks in my next class at your company:)

https://coffingdw.com/education/

You will see an ORDER BY statement in each example, but it will not come at the end of the query. The ORDER BY keywords are always included in the RANK calculation. It is the ORDER BY statement that determines what we are ranking. In the example below, we clearly state ORDER BY daily_sales asc, which means we are ranking the daily_sales column and giving the lowest daily_sales value a rank of one. We don’t need the keyword asc because that is the default for an ORDER BY statement, but I put it there for clarity.

In its simplest explanation, a RANK will sort the data first via the ORDER BY statement, then give the first-row rank of 1. It will rank the second row with two unless the values of rows one and two are equal. Check out the example below.

In the picture above, notice that an open and close parenthesis immediately follows the keyword RANK.  The open and close parenthesis insinuates a function.  There is never anything inside the parenthesis, but it is required.  The keyword OVER follows, representing that this analytic is an ordered analytic, which is interchangeable with the term window function. The term ordered analytic means the data set will be put in a specific order before the calculation begins.  We sort the data using the ORDER BY statement, and in this example, we are sorting by the daily_sales column.  Since the default for an ORDER BY statement is ascending, we rank the data by daily_sales ASC.

When you first see a RANK command with nothing in the parenthesis, you might initially have trouble thinking, “What are we ranking?”  Check out the column in the ORDER BY statement; that is what you rank.  We have an ORDER BY daily_sales, so we are ranking by daily_sales.

The first two rows have a value of 32,800.50, so they both get a rank of 1.  Rows one and two are tied but notice that row three gets a rank of 3.  There are no more ties after that, so each row gets the next sequential ranking.

A user often wants to give the highest value to the number one rank.  Check out the example below because we will use the ORDER BY to rank the daily_sales column in DESC order.

The phrase below sounds like a protest chant:

What are we ranking? daily_sales!

How are we ranking it? Descending order!

Each RANK example will have an ORDER BY statement, but sometimes, you will also have a PARTITION statement. In the example below, you see the keywords PARTITION BY, which means the RANK function will reset and start over. Our ORDER BY statement is ordering the data by the column daily_sales DESC, so daily_sales is what we are ranking, but we will reset the rank calculation and start over with each product_id break because the column product_id is in the PARTITION BY statement. Check out the next example below.

The only systems I have ever seen that use the QUALIFY statements are Snowflake, Teradata, BigQuery, and Databricks. If you are working with a system that does not support QUALIFY, you can place your SQL statement in a derived table and then use a WHERE clause.

In the example below, all of the SQL using blue is part of the derived table, as well as the additional WHERE clause. The results are the same as the previous example but are done with a derived table and not a QUALIFY statement.

You are about to see something not often seen: the DENSE_RANK. It works exactly like the RANK command, but RANK and DENSE_RANK handle ties differently. In our example below, we use the RANK and DENSE_RANK functions. Since both have the same ORDER BY daily_sales statement, the data comes out the same, the only difference being how they handle the ties. The first two rows get one rank since they both tie with a value of 32,800.50. The RANK gives the third row a ranking of three, and the DENSE_RANK gives the third row a ranking of two.

PERCENT_RANK is a window function used to calculate the relative rank of a row within a partition of a result set, expressed as a percentage. It evaluates the position of a specific row compared to the other rows in the partition based on the specified order.

The example below shows the PERCENT_RANK function. The formula to get PERCENT_RANK is

(RANK-1 / (Total Rows -1).

Using PERCENT_RANK can help analyze data distribution and identify outliers or extreme values within partitions of a dataset. Below, we use a PARTITION BY statement to analyze data distribution within the product_id.

Did you know that Coffing Data Warehousing is the first company to create software that joins data across all systems? Download your free Nexus trial at www.CoffingDW.com.

Watch how the Nexus performs federated queries to Join an Azure Synapse and SQL Server table to tables from twenty systems. Watch the video.

To find out more about Nexus you can watch our 60-second videos here:

https://coffingdw.com/sixty-second-feature-videos/

Posted in Blog
← Databricks Analytics – RANK, DENSE_RANK, and PERCENT_RANKOracle Analytics – RANK, DENSE_RANK, and PERCENT_RANK →

Latest Posts

  • On-site and Virtual Database Training and SQL February 25, 2025
  • Query. Answer. Instant Dashboard. February 17, 2025
  • Compress Teradata or Leave Money on the Table February 11, 2025
  • Instantly Query, Migrate, Federate, & Integrate Data Across All Systems February 4, 2025
  • The AI Foundation Every Company Needs December 18, 2024
  • Federated Queries – Comparing Presto to Nexus December 7, 2024
  • 20-table Join Spanning 20 Systems Proves the Data Rosetta Stone Theory November 21, 2024
  • Your Company has 10,000 Laptops. Make them all Data Warehouses. November 7, 2024

From The Blog

  • On-site and Virtual Database Training and SQL
  • Query. Answer. Instant Dashboard.
  • Compress Teradata or Leave Money on the Table
  • Instantly Query, Migrate, Federate, & Integrate Data Across All Systems
  • The AI Foundation Every Company Needs
Copyright © 2024 · Coffing DW
All Logos & Trademarks Belong To Their Respective Owners
Facebook Twitter Google Linkedin Youtube Email