I can teach you Yellowbrick analytics!  I have never seen a database do analytics faster than Yellowbrick.  Last week we introduced you to Lead, and this week we are teaching you Lag. You use a Lead to place the value from the next row on the answer set’s current line.  You can then see today’s value, and on the same line, see tomorrow’s value.  You do the opposite for the Lag.  You can see today’s value, and on the same line, see yesterday’s value. 

The first thing I am going to do is show you a Lead.  Then, the Lag will make more sense.    

You will see an ORDER BY statement in each example, but it will not come at the end of the query.  The ORDER BY keywords is always within the Lead or Lag calculation.  It is the ORDER BY statement that runs first, and once the system orders the data, the Lead or Lag will calculate.  The initial ordering of the data set gives these analytics the name “Ordered Analytics.” The other name is “Window Functions,” because they calculate within a particular window of rows.

Let me summarize the Lead in the picture below.  Order the data first by the column Sale_Date and after the data sorts, then begin with row one and ask, “Can we get the Daily_Sales value from the next row, and add it to the current line?” If the answer is “Yes,” then do it, but if the answer is “No,” then put in a Null.

The LEAD above allows you to see the Daily_Sales for today next to tomorrow’s value.

Check out the next picture below.  This query is the same as the query example above, but instead of LEAD, we use LAG. 

The LAG above allows you to see the Daily_Sales for today next to yesterday’s value. Notice the example where it has LAG (Daily_Sales).  We want to get the value of Daily_Sales one row up. 

In the example below, we are using two LAG window functions.  We alias the first LAG as yesterday because it is capturing the Daily_Sales value one row above.  We give the second Lag an alias of two_days_ago because it is capturing the value two rows above.  We see today’s Daily_Sales, yesterday’s Daily_Sales, and the Daily_Sales from two-days previous on the current line of the answer set.

Also, notice that the command lag(daily_sales, 1) is the same as Lag (daily_sales) because the moving window of one is the default if a number is not specified.

Now, we are going to take the next example even farther by adding a PARTITION BY statement. The PARTITION BY resets the calculation and acts much like a GROUP BY statement.  Notice that each Product_ID calculates within the Product_ID only.  When a new Product_ID appears, the LAG calculation starts over.

If you want to move data from any system to Yellowbrick, you should use the Nexus and NexusCore Server for the data movement.  You can move these systems to Yellowbrick:

  • Teradata
  • Oracle
  • SQL Server
  • DB2
  • Greenplum
  • Redshift
  • Azure SQL Data Warehouse
  • Postgres
  • MySQL
  • Netezza
  • Snowflake

Below is an example of how pretty and easy-to-use the NexusCore Server is to move data from another system to Yellowbrick.  Below is an example of how stunning and easy-to-use the NexusCore Server is to transfer data from another system to Yellowbrick.  You can run this job immediately, or you can schedule it daily, weekly, monthly, yearly, or custom.

If you want to move data to Yellowbrick or use the most fantastic query tool known to humankind, use the Nexus.  Download your free Nexus trial at www.CoffingDW.com.

Watch the video of the Nexus moving data to Yellowbrick and all other systems.

Watch the video: https://youtu.be/mqNQ65H7lps

I hope you enjoyed today’s Yellowbrick analytic lesson.  See you next week. 

Yellowbrick – The only modern data warehouse for hybrid cloud

Yellowbrick is the world’s only modern data warehouse for hybrid cloud. Enterprises rely on Yellowbrick to do the impossible in data analytics: get answers to the most challenging business questions for improved profitability, better customer loyalty, and faster innovation in near real-time, and at a fraction of the cost of alternatives.

Yellowbrick offers superior price/performance for thousands of concurrent users on petabytes of data, along with the unique ability to run analytic workloads on-premises, in a private cloud, and/or any public cloud and manage them in a simple, consistent way–all with predictable pricing via an annual subscription. Learn more at www.yellowbrick.com.

I hope you enjoyed today’s Yellowbrick analytic lesson.  See you next week. 

Thank you,

Tom

Tom Coffing
CEO, Coffing Data Warehousing
Direct: 513 300-0341
Website: www.CoffingDW.com
YouTube channel: CoffingDW
Email: Tom.Coffing@CoffingDW.com