Snowflake Analytics – ROW_NUMBER

Yes, the Snowflake Architecture and SQL book is on its way to your favorite book stores, such as Skillsoft, where you can read it online, or Amazon, where you can buy it. It is another masterpiece of information from Tom Coffing and David Cook, two of the most famous authors in the history of computers. You can purchase your own PDF version of the book, which is in color, has hundreds of wonderful examples, covers every single SQL command, and will be your forever guide to mastering Snowflake. The book is only $199.00 and comes with your name watermarked on each page.
Contact Tom.Coffing@CoffingDW.com if you want to purchase the book. Check out the table of contents here. https://www.nexusdataserver.com/snowflakeclass.
I can teach you analytics! I have never seen a database do analytics better than Snowflake. These analytic lessons are the perfect opportunity for you to learn analytics like the back of your hand. If you want your career stock to soar, invest in yourself and learn these analytic lessons. You are about to be introduced to “promotion heaven.” I will teach you analytics each week until you become the next Einstein, or even better, the next Tera-Tom.
The best place to start is with the ROW_NUMBER command because it is simple enough to teach you the analytic fundamentals. Still, it is so incredibly powerful that you will be able to build powerful reports instantly.
In each example, you will see an ORDER BY statement, but it will not come at the end of the query. The ORDER BY keywords is always within the analytic calculation. These analytics will take the data set and sort it with an ORDER BY statement. Once the data sort happens in step 1 of the process, the calculations begin.
In its most simple explanation, a ROW_NUMBER will sort the data first with an ORDER BY statement and then place the number 1 on the first row, a number 2 on the second row, and so on. Check out the example below.
In the picture above, notice that the keyword ROW_NUMBER is immediately followed by an open and close parenthesis. The open and close parenthesis insinuates a function. There is never anything inside the parenthesis, but it is required. The keyword OVER follows, representing that this analytic is an ordered analytic, which is interchangeable with the term window function. The term ordered analytic means the data set will be put in a specific order before the calculation begins. We sort the data using the ORDER BY statement, and in this example, we are sorting by the column Product_ID and within Product_ID by the column Sale_Date. Product_ID is the major sort, and any ties further sort by Sale_Date, which is the minor sort. Once the data set sorts, the first row gets a one, and the number increments sequentially.
Each analytic example will have an ORDER BY statement, but sometimes you will also have a PARTITION statement. If you see the keyword PARTITION, it means the analytic calculation will reset and start over. Check out the next example below.
Get ready for something you have never seen before, which is the QUALIFY statement. QUALIFY acts like a filter but differs from a WHERE clause filter. The QUALIFY statement waits until all of the analytic calculations finish. When the report finishes and is ready to return, the QUALIFY steps in and filters the rows further.
In the example below, we are attempting to find the smarted two students in each class code. We want to find the top two students with the highest-grade points for the freshman, sophomores, juniors, and seniors.
The brilliance behind this query is that you control the sort through the ORDER BY statement. You also control when you want the calculation to reset through the PARTITION statement. Once the sort, reset, and calculations finish, the QUALIFY comes in and states, “I only want the top two rows per class code.”
Snowflake and Teradata are the only two systems I have ever seen that use the QUALIFY statements. If you are working with a system that does not support QUALIFY, you can place your SQL statement in a derived table and then use a WHERE clause. In the example below, all the SQL using blue is part of the derived table. The results are exactly the same as the previous example, but it is done with a derived table and not a QUALIFY statement.
The real brilliance of the Row_Number function comes from when someone needs to know the last claim filed by everyone. Then, you control data sorting and when you want to reset the calculation with the PARTITION BY statement. Therefore, you can control who gets a ROW_NUMBER of one and then use the QUALIFY statement.
However, pay attention if you want to see a world-class Tera-Tom trick on analytics. Below, we are joining two tables (CLAIMS and SUBSCRIBERS). We also cast the CLAIM_DATE column as a date instead of a timestamp.
The brilliance comes from the last three lines in the SQL where we place the ROW_NUMBER within the QUALIFY statement. Of course, the ROW_NUMBER will not appear on the final report, but we return only the last claim per person.
The example below comes from the Nexus Query Chameleon. Is the Nexus the best query tool in the world? Yes, but it did take Tom Coffing’s development team 18 years to perfect. Not only does Nexus query all systems, but it also migrates automatically between all systems, joins data across all systems, and has a Super Join Builder that builds the SQL automatically.
Download your free Nexus trial at www.CoffingDW.com.

Check out just some of the amazing features of Nexus here.




