Snowflake Analytics – Part 5 – MDIFF
I can teach you analytics! I have never seen a database do analytics better than Snowflake. This week we are working on the Moving Difference, abbreviated MDIFF. You use a Moving Difference (MDIFF) to calculate the difference between the values in two different rows. For example, you can take your Daily_Sales from this Monday and see how it differs from the Daily_Sales from last Monday.
The most difficult part about an MDIFF is remembering that you are only ever comparing two rows: the current row and one other row a fixed distance away.
In each example, you will see an ORDER BY clause, but it will not come at the end of the query. The ORDER BY keywords are always within the MDIFF calculation. The ORDER BY runs first, and once the system orders the data, the moving difference between the current row’s value and another row’s value is calculated. It is the initial ordering of the data set that gives these analytics the name “Ordered Analytics.” The other name is “Window Functions,” because they calculate within a certain window of rows.
Let me summarize the MDIFF in the picture below. Order the data first by the columns Product_ID and Sale_Date. After the data sorts, begin with row one. Is there a row four rows back? No, so the result is Null. Move to row two. Can we compare row two’s Daily_Sales with the value four rows back? No, so Null. Move to row three. Can we compare its Daily_Sales with the value four rows back? No, so Null. Move to row four. Can we compare its Daily_Sales with the value four rows back? No, so Null. Move to row five. Can we compare row five’s Daily_Sales with the value four rows back? Yes. We made 32800.50 in row five, and we made 48850.40 in row one. The moving difference is 32800.50 minus 48850.40, or -16049.90, so row five came in 16049.90 lower than row one. This continues until the end of the rows. Each row compares with the row four rows back.
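The walk-through above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Snowflake syntax. The function name `moving_difference`, the Product_ID, the dates, and the Daily_Sales figures for rows two through four are made up for the sketch; only the row one and row five values come from the text.

```python
from decimal import Decimal


def moving_difference(values, width):
    """For each row, return value - (value `width` rows earlier),
    or None when there is no row that far back."""
    return [
        None if i < width else values[i] - values[i - width]
        for i in range(len(values))
    ]


# (Product_ID, Sale_Date, Daily_Sales), deliberately out of order.
# Rows one and five (after sorting) use the figures from the text;
# the other three amounts, the ID, and the dates are hypothetical filler.
rows = [
    (1000, "2000-10-02", Decimal("32800.50")),  # becomes row five
    (1000, "2000-09-30", Decimal("61000.00")),  # filler
    (1000, "2000-09-28", Decimal("48850.40")),  # becomes row one
    (1000, "2000-10-01", Decimal("40200.00")),  # filler
    (1000, "2000-09-29", Decimal("54500.00")),  # filler
]

# Step 1: the ORDER BY runs first (Product_ID, then Sale_Date).
rows.sort(key=lambda r: (r[0], r[1]))

# Step 2: compare each row with the row four positions earlier.
daily_sales = [r[2] for r in rows]
mdiff = moving_difference(daily_sales, 4)

for (product_id, sale_date, sales), diff in zip(rows, mdiff):
    print(product_id, sale_date, sales, diff)
```

The first four rows print None because no row exists four positions earlier, and row five prints -16049.90. In SQL, the same subtraction is commonly written with the LAG window function, for example Daily_Sales minus LAG(Daily_Sales, 4) over the same ORDER BY.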
Each MDIFF example will have an ORDER BY clause, but sometimes you will also have a PARTITION BY clause. In the example below, you see the keywords PARTITION BY Product_ID, and that means the MDIFF function calculates within each Product_ID only and starts over at each new Product_ID. Snowflake does not place the Product_ID partitions in ascending or descending order.
In the picture above, we have a moving window of 2, so we compare the current row’s Daily_Sales to the Daily_Sales two rows back. The Null values appear because rows one and two in each partition have no row two rows back to compare with.
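To make the partition behavior concrete, here is a minimal Python sketch with a moving window of 2. It is not Snowflake syntax, and every Product_ID, date, and amount in it is hypothetical. The function name `partitioned_moving_difference` is also just a label for this sketch.

```python
from decimal import Decimal
from itertools import groupby


def partitioned_moving_difference(rows, width):
    """Compute the moving difference of Daily_Sales within each Product_ID.

    rows: iterable of (product_id, sale_date, daily_sales) tuples.
    Returns (product_id, sale_date, daily_sales, mdiff) tuples.
    """
    # Sorting by Product_ID here only groups the partitions together;
    # the text notes that Snowflake itself does not order the partitions.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    result = []
    for _, group in groupby(ordered, key=lambda r: r[0]):
        partition = list(group)
        for i, (product_id, sale_date, sales) in enumerate(partition):
            # The count restarts in every partition, so the first
            # `width` rows of each Product_ID have nothing to compare to.
            diff = None if i < width else sales - partition[i - width][2]
            result.append((product_id, sale_date, sales, diff))
    return result


# Hypothetical sample data: two products, three days each.
rows = [
    (1000, "2000-09-28", Decimal("100.00")),
    (1000, "2000-09-29", Decimal("150.00")),
    (1000, "2000-09-30", Decimal("120.00")),
    (2000, "2000-09-28", Decimal("200.00")),
    (2000, "2000-09-29", Decimal("210.00")),
    (2000, "2000-09-30", Decimal("260.00")),
]

result = partitioned_moving_difference(rows, 2)
diffs = [r[3] for r in result]

for row in result:
    print(*row)
```

Each Product_ID gets its own two leading None values. Row three of product 1000 never reaches back into another product, and row one of product 2000 never reaches back into product 1000.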
If you want to move data from any system to Snowflake, you should use the Nexus and NexusCore Server for the data movement. You can move these systems to Snowflake:
- Teradata
- Oracle
- SQL Server
- DB2
- Greenplum
- Redshift
- Azure SQL Data Warehouse
- Postgres
- MySQL
- Netezza
- Snowflake
Below is an example of how pretty and easy-to-use the NexusCore Server is to move data from another system to Snowflake. You can run this job immediately, or you can schedule it daily, weekly, monthly, yearly, or custom.
If you want to move data to Snowflake or you want to use the greatest query tool known to humankind, then use the Nexus and NexusCore Server. Download your free Nexus trial at www.CoffingDW.com or contact me for a demo.
Querying every data source, no matter how new, is now achievable with the Nexus Universal ODBC. We are proud to offer Nexus Universal ODBC, our new connection method that allows users to connect to and interact with any current or future ODBC data source in the universe. If there is a data source you want to query, and it has an ODBC driver, you can connect and query immediately. Check out the picture below to see the new tree with the Universal ODBC connection.
I hope you enjoyed today’s Snowflake analytic lesson. See you next week.
Thank you,
Tera-Tom
Tom Coffing
CEO, Coffing Data Warehousing
Direct: 513 300-0341
Website: www.CoffingDW.com
YouTube channel: CoffingDW
Email: Tom.Coffing@CoffingDW.com