I can teach you analytics!  I have never seen a database do analytics faster than Yellowbrick.  This week we are working on the Moving Difference, which is interchangeable with MDIFF. You use a Moving Difference (MAVG) to compare the difference between the value from two different rows.  For example, you can see your Daily_Sales from this Monday, and then see the difference between the Daily_Sales last Monday. 

The most challenging part about an MDIFF is that you are only comparing two rows. 

In each example, you will see an ORDER BY statement, but it will not come at the end of the query.  The ORDER BY keywords is always within the MDIFF calculation.  It is the ORDER BY statement that runs first, and once the system orders the data, the moving difference between the current row’s value and another row’s value will calculate.  It is the initial ordering of the data set that gives these analytics the name “Ordered Analytics.”  The other phrase is “Window Functions,” because they calculate within a particular window of rows. Let me summarize the MDIFF in the picture below.  Order the data first by the columns Product_ID and Sale_Date and after the data sorts, then begin with row one. Was there a value four rows ahead? No, then Null.  Move to row two.  Can we compare row two’s Daily_Sales with an amount of four rows ahead?  No, then Null.  Move to row three.  Can we compare the Daily_Sales value with a row four rows ahead?  No, then Null.  Move to row four.  Can we compare the Daily_Sales value with a row four rows ahead?  No, then Null.  Move to row five.  Can we compare row five’s Daily_Sales value with a row four rows ahead?  Yes,  We made 32800.50 in Row five.  We made 48850.40 in row one.  We lost -16049.90 when comparing the Daily_Sales in row five to row one.  Some of the best comparisons are seven and 30 because we can compare the difference between seven days or 30 days.  The reason that each corresponds four rows ahead is because of the SQL keywords BETWEEN 4 PRECEDING and 4 PRECEDING.

Each MDIFF example will have an ORDER BY statement, but sometimes you will also have a PARTITION statement.  In the example below, you see the keywords PARTITION BY Product_ID, which means the MDIFF function calculates within each Product_ID only.  Yellowbrick does not place the Product_ID partitions in ascending or descending order. 

In the picture above, we have a moving window of 2, so we will compare the current row’s Daily_Sales to the Daily_Sales two rows ahead.  The Null values represent the fact that there were no values two rows ahead for rows one and two in each partition.   

If you want to move data from any system to Yellowbrick, you should use the Nexus and NexusCore Server for the data movement.  You can move these systems to Yellowbrick:

  • Teradata
  • Oracle
  • SQL Server
  • DB2
  • Greenplum
  • Redshift
  • Azure SQL Data Warehouse
  • Postgres
  • MySQL
  • Netezza
  • Snowflake

Below is an example of how pretty and easy-to-use the NexusCore Server is to move data from another system to Yellowbrick.  Below is an example of how stunning and easy-to-use the NexusCore Server is to transfer data from another system to Yellowbrick.  You can run this job immediately, or you can schedule it daily, weekly, monthly, yearly, or custom.

Move from Snowflake to Yellowbrick

If you want to move data to Yellowbrick or you want to use the most fantastic query tool known to humankind, then use the Nexus.  Download your free Nexus trial at www.CoffingDW.com.

Watch the video of the Nexus moving data to Yellowbrick and all other systems.

Watch the video: https://youtu.be/mqNQ65H7lps

I hope you enjoyed today’s Yellowbrick analytic lesson.  See you next week. 

Yellowbrick – The only modern data warehouse for hybrid cloud

Yellowbrick is the world’s only modern data warehouse for hybrid cloud. Enterprises rely on Yellowbrick to do the impossible in data analytics: get answers to the most challenging business questions for improved profitability, better customer loyalty, and faster innovation in near real-time, and at a fraction of the cost of alternatives. Yellowbrick offers superior price/performance for thousands of concurrent users on petabytes of data, along with the unique ability to run analytic workloads on-premises, in a private cloud, and/or any public cloud and manage them in a simple, consistent way–all with predictable pricing via an annual subscription. Learn more at http://www.yellowbrick.com.

If you want to learn more about modern data analytics, sign up for the ANSWERS seminar that Yellowbrick is holding on September 29, 2020.  Coffing Data is a sponsor.

I hope you enjoyed today’s Yellowbrick analytic lesson.  See you next week. 

Thank you,

Tera-Tom

Tom Coffing
CEO, Coffing Data Warehousing
Direct: 513 300-0341
Website: www.CoffingDW.com
YouTube channel: CoffingDW
Email: Tom.Coffing@CoffingDW.com