Azure Synapse Analytics – Moving Difference

I can teach you analytics! Azure Synapse is built for analytics, and this week, I will show you the amazing moving difference commands. The moving difference is one of the most incredible aspects of analytics, but it can seem confusing unless it is explained right. With power comes incredible responsibility, so I won’t let you down.
All of these examples have come from my books and training classes. Please do me a favor and tell your training coordinator that you know the best technical trainer in the World. Ask them to hire me to train at your company, either on-site or with a virtual class. They can see our classes, outlines, and a sample of my teaching at this link on our website. I hope to meet you and say thanks in my next class at your company:)
https://coffingdw.com/education/
Moving differences are commonly used in data analysis to understand the rate of change in a series of values over time, which allows us to look for positive or negative trends. They provide insights into the direction and magnitude of change, which can be useful for decision-making and forecasting. Check out the example below.
Let me caution you so you are not confused. A moving difference only compares one row to another row.
The first thing to remember is that these analytics are often referred to as ORDERED ANALYTICS because we have an ORDER BY statement inside the calculation that sorts the data first. Once the data is ordered correctly, the analytic calculates from the first row sorted to the last. In our example below, we first order from PRODUCT_ID ASC and then SALE_DATE ASC. We don’t need the ASC because ascending mode is the default, but I put it there for clarity.
If I have a moving window of four, as in the example below, then the DAILY_SALES of row one will compare to the DAILY_SALES value of row five (four rows apart) and show a positive or negative difference between those two rows.
Notice that the first four rows have null values in the moving difference. Why? Because each of them has something in common: they don’t have a row four rows ahead of them to compare.
Notice that the fifth row has a value of -16409.90. We are only comparing the DAILY_SALES value in row one with the DAILY_SALES value in row five because we have a moving window of four because of the keywords ROWS BETWEEN 4 PRECEDING AND 4 PRECEDING. I have colored those values in blue. Notice that we compared row two with row six (four rows apart), and we did better by 9,799.78 dollars, colored in red.

The example below performs the same moving difference calculation every four rows but uses the LAG command. Remember, both examples only compare the difference between the current row’s DAILY_SALES and the DAILY_SALES value four rows ahead.

The example below has two differences from the previous examples. The first difference is the PARTITION BY PRODUCT_ID statement within the moving difference calculation. PARTITION BY PRODUCT_ID means performing the moving difference calculation within each product ID. In other words, with each product ID break, we will begin the calculation again.
The second change I have made is we have a moving window of two rows because of the ROWS BETWEEN 2 PRECEDING and 2 PRECEDING. Now, we are comparing the current row’s DAILY_SALES value with the DAILY_SALES value two rows ahead.
Notice the null values at the beginning of each product ID because there are not two rows ahead.

Now that you have a general understanding of the moving difference, I will go super deep to show you how to compare what you made on Monday with the previous Monday and compare Tuesday to the previous Tuesday, etc. However, I will have to do it in three parts because it is complicated until it is easy.
The first part will not involve the moving difference; instead, merely add the day of the week using the FORMAT command on the SALE_DATE. Notice that I am using an ORDER BY PROD, DOW, SALE_DATE at the end of the query. I am showing you the ordering of the data I will use in my moving difference calculation. I am ordering the data so each product_id has the same days of the week in chronological order. The example below shows all of the Fridays from the first sale to the last sale. And then all the Mondays from the first to the last sale.
Since I will be partitioning by the day of the week, each moving difference calculation will only be performed within each weekday. So, step one is to retrieve the day of the week in the answer set.

Azure Synapse requires me to put my previous query in a derived table because I want to use the day of the week aliased by DOW in the moving difference calculation. I have colored the previous query in blue, put parenthesis around it, and named it my_derived_table. Everything in blue runs first and then saves the data in a derived table.
The data sorts like the example below because of how I perform the PARTITION BY PROD, DOW, and using the ORDER BY SALE_DATE ASC within the moving difference.
Notice that all Mondays are chronologically bunched together, each one row apart. The moving difference will first be the Mondays comparing each row with the previous row; in other words, it will compare Monday’s DAILY_SALES with the previous Monday for product ID 1000. I guarantee that only Mondays will compare to Mondays within their partition, and then Tuesdays will compare with Tuesdays within their partition, etc.
Please don’t look at the final ORDER BY statement at the end of the query because you will see that result in part three of the example. I want you to focus on and understand how the moving difference orders the data within the calculation displayed in the picture below.

Here is where it gets tricky but brilliant. Look above you at the query again and notice that we have two ORDER BY statements. The first ORDER BY is within the moving difference calculation, and that is how the data is ordered for the calculation. I am reordering the data for the final display below to show you chronologically by SALE_DATE for product ID 1000.
Still, the moving difference shows what we did on Monday’s DAILY_SALES compared to Monday of the previous week. I have the numbers for the second and third Monday in red. On the first Monday, we made 97.00; on the second Monday, we made 62.00 with a moving difference of -35.
To summarize, we added the day of the week into a query and then put that query in a derived table. We then used the derived table to retrieve each column but added a moving difference. The moving difference is partitioned chronologically on each product ID and within each Product ID on the day of the week.
Finally, we added an ORDER BY statement at the end to reshuffle the data to see each sale for each product ID chronologically with a moving difference comparing Monday to the previous Monday and Tuesday to the previous Tuesday, etc.

Did you know that Coffing Data Warehousing is the first company to create software that joins data across all systems? Download your free Nexus trial at www.CoffingDW.com.
Watch how Nexus performs federated queries to join an Azure Synapse table to tables from twenty systems in a single query. Watch the video.
To find out more about Nexus, you can watch our 60-second videos here: