Skip to content
Software connecting all databases
  • Home
  • Software
    • Download Nexus
    • Purchase Nexus License
    • Activate Nexus License
    • SmartCompress
    • 60-Second Feature Videos
  • Training
    • Corporate Education
  • Data Migration
    • Data Migration Software
    • Data Migration Services
  • Books
  • Partners
  • Press
  • Blog
  • About Us
    • What We Do
    • Who We Are
  • Contact
Coffing Data Warehousing - Federated Query and ETL Software, Training, and Books
FacebookTwitterGoogleLinkedinYoutubeEmail
Software connecting all databases
  • Home
  • Software
    • Download Nexus
    • Purchase Nexus License
    • Activate Nexus License
    • SmartCompress
    • 60-Second Feature Videos
  • Training
    • Corporate Education
  • Data Migration
    • Data Migration Software
    • Data Migration Services
  • Books
  • Partners
  • Press
  • Blog
  • About Us
    • What We Do
    • Who We Are
  • Contact

Yellowbrick Analytics – ROW_NUMBER

May 29, 2024

I can teach you analytics! Yellowbrick is built for analytics, and this week, I will show you row_number, a not-so-distant cousin to rank. I used to think that the row_number analytic was harmless and somewhat useless, and boy, was I wrong. The row_number is one of my favorite analytics, and I will show you many ways to pinpoint specific rows in a dataset, such as the first or last sale per product or the highest or lowest grade point. Get ready to add to your analytic arsenal.

All of these examples have come from my books and training classes. Please do me a favor and tell your training coordinator that you know the best technical trainer in the World. Ask them to hire me to train at your company, either on-site or with a virtual class. They can see our classes, outlines, and a sample of my teaching at this link on our website. I hope to meet you and say thanks in my next class at your company:)

https://coffingdw.com/education/

In the picture below, you can see the analytic row_number followed by parenthesis. There is never anything inside the parenthesis, but it signifies that it is a function. You will always have the keyword OVER because before the row_number performs, the dataset will order the data as the ORDER BY statement instructs.

Our example below first orders the data by a major sort of product_id and then a minor sort of sale_date. Notice we see in the result set the product_id of 1000 and within 1000 sequential dates in order. Only after the data is ordered does the row_number place a number starting from one on the first row and incrementing by one until the last. These types of analytics are often referred to as ordered analytics because they order the data before the analytics runs.

Our next example uses a PARTITION BY statement, which means performing the analytic within each product_id separately. I like to think of it as resetting the row_number with the beginning of each product_id. Notice that we are ordering the data by sale_date, but it appears we are ordering by product_id and then sale_date because of the combination of partitioning and ordering within the row_number analytic.

Our example returns the last three sales per product_id, demonstrating the power of a derived table. A WHERE clause filters before calculations or analytics run, so we place our query inside parenthesis to create a derived table. I have placed everything you need to create a derived table and query it with a WHERE clause in blue.

We use a WITH statement to create the derived table materialized by the row_number query in parenthesis.

Notice that we partition by product_id and use the ORDER BY sale_date DESC to guarantee that the last sale_date for each product_id receives a row_number of one.

Always remember that a row_number query allows you to control when the data gets partitioned and, based on the ORDER BY statement, which row will receive the row_number of one.

The derived table is created first and then queried by the SELECT statement at the bottom. We select the rows with a row_number of less than four using a WHERE clause.

I have also placed an ORDER BY statement at the end of the query to ensure the data is presented as I want.

The examples above and below do the same thing, but we are using a brilliant technique I want you to know about. The example below does not return the row_number because when we query our derived table, we list the columns from the derived table we want for the final report. We use the row_number to filter the rows but don’t place it in the final SELECT list.

Now that you know what to look for, the example below should be easy on your brain and your eyes. We partition by the class_code and order by the grade_pt in descending order. We also use a WHERE clause to filter out grade_pt null values. We then place our row_number query inside a derived table, which I have colored in blue, often referred to as a common table expression (CTE). I have also done a final ORDER BY using a CASE statement to order the answer set logically instead of alphabetically.

We have made a few changes in our example below to return the lowest grade averages per class_code. The most important detail is that we are now ordering the data by grade_pt in ascending mode. We place our row_number query inside a derived table I have colored in blue. We can then use a WHERE clause to filter out rows that do not receive a row_number of one. We only select columns we want on the final report in our top SELECT statement without selecting the row_number on the report. I have also done a final ORDER BY using a CASE statement to order the answer set logically instead of alphabetically.

The example below uses clever techniques to find the first and last claim filed per individual. We have created two derived tables using the WITH technique to accomplish the request. Notice the keyword WITH is only used once, but I have the names of both derived tables colored in pink. I have also colored the first derived table blue and the second green.

Both derived table queries perform a join between the subscribers and claims tables. Our row_number analytic in both partitions the data by subscriber_no and member_no because those column combinations make up an individual subscriber filing a claim.

The only difference between the two derived table queries is how the data is sorted using the ORDER BY statement, ASC vs. DESC mode.

The next thing to focus on is the UNION set operator at the bottom. A set operator will have two select statements, each with the same amount of columns and similar data types, often referred to as being from the same domain. Both SELECT statements run simultaneously to combine the contents of both derived tables, but we filter with a WHERE clause to deliver the first and last claims only.

We now see the first and last claims per individual in our report.

Did you know that Coffing Data Warehousing was the first to create software that joins data across all systems? Download your free Nexus trial at www.CoffingDW.com.

Watch how Nexus performs federated queries to join a Yellowbrick table to tables from twenty systems in a single query. Watch the video.

To find out more about Nexus, you can watch our 60-second videos here:

https://coffingdw.com/sixty-second-feature-videos/

Posted in Blog
← SQL Server Analytics – ROW_NUMBERVertica Analytics – ROW_NUMBER →

Latest Posts

  • Mastering Derived Tables in Greenplum September 19, 2025
  • Mastering Derived Tables in Vertica September 19, 2025
  • Mastering Derived Tables in Yellowbrick September 19, 2025
  • Mastering Derived Tables in Oracle September 19, 2025
  • Mastering Derived Tables in MySQL July 16, 2025
  • Mastering Derived Tables in SQL Server July 15, 2025
  • Mastering Derived Tables in DB2 July 15, 2025
  • Mastering Derived Tables in Postgres July 15, 2025

From The Blog

  • Mastering Derived Tables in Greenplum
  • Mastering Derived Tables in Vertica
  • Mastering Derived Tables in Yellowbrick
  • Mastering Derived Tables in Oracle
  • Mastering Derived Tables in MySQL
Copyright © 2024 · Coffing DW
All Logos & Trademarks Belong To Their Respective Owners
Facebook Twitter Google Linkedin Youtube Email