I can teach you Yellowbrick analytics! I have never seen analytics perform faster than on Yellowbrick. Today we are going to learn about First_Value. This analytic seems too simple at first, but I will show you how to use First_Value to look for trends.
First_Value returns the value of a chosen column from the first row of the window. The ORDER BY clause determines the order of the rows, so once the data is sorted, the first row is established. First_Value then repeats that row's column value on every other row in the specified window.
Let me show you the first example, and the paragraph above will become clear.
In the picture below, notice that the column of the First_Value function is daily_sales. After we sort the data using the ORDER BY sale_date statement, the first row’s daily_sales value is 48850.40. The first value of 48850.40 repeats on each subsequent row.
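The query described above can be sketched in runnable form. This is a minimal sketch, not the exact query from the picture: it uses Python's built-in sqlite3 module as a stand-in for Yellowbrick (SQLite 3.25 or later accepts the standard FIRST_VALUE ... OVER syntax), and the table name sales_table and every row except the 48850.40 value are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows; only the 48850.40 value comes from the article.
rows = [
    ("2000-09-28", 48850.40),
    ("2000-09-29", 54500.22),
    ("2000-09-30", 36000.07),
]

conn = sqlite3.connect(":memory:")
# sales_table is an assumed table name, not one taken from the article.
conn.execute("CREATE TABLE sales_table (sale_date TEXT, daily_sales REAL)")
conn.executemany("INSERT INTO sales_table VALUES (?, ?)", rows)

query = """
SELECT sale_date,
       daily_sales,
       FIRST_VALUE(daily_sales) OVER (ORDER BY sale_date) AS first_sale
FROM sales_table
ORDER BY sale_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With these rows, the first_sale column carries 48850.40 on every line, because there is no PARTITION BY and the whole result set is one window.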
The next example is similar, but it adds a PARTITION BY clause. In the picture below, we are still using the column daily_sales in the First_Value function. We are also still ordering the data by sale_date, but notice the PARTITION BY product_id clause. The rows are partitioned by product_id, so the calculation for product_id 1000 uses only the product_id 1000 rows, and then it resets for product_id 2000.
After the rows are partitioned by product_id and sorted by sale_date within each product_id, the daily_sales value from the first row of each partition repeats on every row of that partition.
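The partitioned version can be sketched the same way. Again, this is only an illustration under assumptions: SQLite (3.25 or later) stands in for Yellowbrick, the table name sales_table is hypothetical, and the rows are invented apart from the product_id values 1000 and 2000 and the 48850.40 figure mentioned in the text.

```python
import sqlite3

# Hypothetical sample rows for two products.
rows = [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_table (product_id INTEGER, sale_date TEXT, daily_sales REAL)"
)
conn.executemany("INSERT INTO sales_table VALUES (?, ?, ?)", rows)

query = """
SELECT product_id,
       sale_date,
       daily_sales,
       FIRST_VALUE(daily_sales)
           OVER (PARTITION BY product_id ORDER BY sale_date) AS first_sale
FROM sales_table
ORDER BY product_id, sale_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Here first_sale restarts for each product_id: the 1000 rows all show the first 1000 value, and the 2000 rows all show the first 2000 value.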
The next example shows you how to look for trends. The picture below is similar to the example above, except we subtract each row’s daily_sales value from the First_Value. This subtraction compares each row with the first row. We can now see which rows did better or worse than the first row, which shows us trending. We can see whether a specific row did much better or way worse than the sale on the first day.
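A trend comparison along these lines might look like the sketch below. It rests on the same assumptions as before (SQLite 3.25 or later as a stand-in for Yellowbrick, a hypothetical sales_table, invented rows). The sign convention is also an assumption: this sketch computes daily_sales minus the first value, so a positive number means the day beat the first day. Swapping the operands gives the same information with the sign flipped.

```python
import sqlite3

# Hypothetical sample rows for two products.
rows = [
    (1000, "2000-09-28", 48850.40),
    (1000, "2000-09-29", 54500.22),
    (1000, "2000-09-30", 36000.07),
    (2000, "2000-09-28", 41888.88),
    (2000, "2000-09-29", 48000.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_table (product_id INTEGER, sale_date TEXT, daily_sales REAL)"
)
conn.executemany("INSERT INTO sales_table VALUES (?, ?, ?)", rows)

# diff_from_first > 0: better than the first day; < 0: worse.
query = """
SELECT product_id,
       sale_date,
       daily_sales,
       ROUND(
           daily_sales - FIRST_VALUE(daily_sales)
               OVER (PARTITION BY product_id ORDER BY sale_date),
           2
       ) AS diff_from_first
FROM sales_table
ORDER BY product_id, sale_date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The first row of each partition always shows a difference of zero, which makes it easy to spot where each product's comparison starts.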
News – CoffingDW becomes a Partner with Yellowbrick
I am excited to announce that Coffing Data Warehousing is now a partner of Yellowbrick. This partnership is excellent for both sides.
Yellowbrick Data Warehouse is a modern, standards-based MPP analytic database that shatters ceilings on price/performance, achieving 100X performance versus legacy alternatives at a fraction of their cost. With its unified hybrid cloud architecture, applications can deploy anywhere with the same data and performance, de-risking cloud migrations and unlocking multi-cloud innovation.
Coffing Data Warehousing’s Nexus Desktop and Nexus Server are the perfect combination for a remote workforce to manage a hybrid-cloud environment. From their desktops, users can query every system, migrate to every system, and join data across every database. Users set up everything with the click of a button, and the Nexus Server executes the work. The Nexus Desktop and Server ensure that all processing is done on a high-speed network so that users can work from any location. We are migrating millions of rows per second to Yellowbrick.
Check out the picture below. Notice how the desktops connect to the Nexus Server, and the Server controls the data movement between all database platforms.
If you want your remote workforce to be even more effective than when working at the office, here is your strategic plan.
Strategic plan #1 – Migrate Systems
Allow the migration from legacy platforms to on-premises and cloud platforms to happen in an instant.
Migration difficulties start with converting table structures and data types between platforms. Writing load utility scripts to transfer the data proves just as tricky. Finally, achieving the maximum load speed is challenging because the proper utility depends on the table’s size.
A project that involves the migration between two database platforms can take months to prepare. Resolving issues and errors during the load process can be time-consuming, and data loss can occur.
The Nexus can not only set up the migration between two platforms instantly, but it can set up the migration between all platforms immediately. Thousands of tables can be set up to move in minutes, and the transfer speeds are fast. The Nexus can not only execute the load process between two platforms with the best performance speeds, but it can execute flawlessly between any combination of systems.
Nexus lets ETL teams set up a migration with a few clicks of the mouse, and it still allows them to change the columns and rows they want to move, edit the target table definition (DDL), and transform data before or after the data transfers.
In the current environment, entire teams take months per system to migrate legacy systems.
Strategic plan #2 – Transferring data for individuals
Because companies need to analyze data across the enterprise, the process is iterative and ever-changing. As new data becomes available, users need to integrate data frequently. Users who need to transfer tables and data into their sandbox need to do so in real-time with no complexity.
Nexus makes it as simple as point-click and move. All data movement utilities and the table and data type conversions from source to target are automatic and transparent to the user, and the Nexus Server performs the movement. Users can even receive an answer set and drag it to their system tree to create a table on any system. Users can also join answer sets they receive from different platforms inside their desktop.
Business discovery by every user across the enterprise will improve time to market by orders of magnitude. The new standard will cut costs by making users independent and self-sufficient.
Strategic plan #3 – Automate SQL Creation and increase knowledge through join menus
Most users do not know which tables join together, and it can take months or even years to become proficient. A department expert can define the joins one time, and all users can use the Nexus Super Join Builder for ad hoc queries. The Super Join Builder shows a table or view visually. The columns and their data types display for the user along with a checkbox next to each column. The table also offers a menu that shows what other tables or views are eligible to join. Users can join tables together with ease. As the user places a checkmark on a column, the Nexus builds the SQL.
Providing an environment where users can rapidly learn how the business manages data gives everyone the tools they need to perform their assignments.
The current environment uses complex diagrams to display how tables are modeled and joined. Many business users do not have the skillset or desire to develop queries, so IT teams do it for them. That costs time and energy when users could do it themselves.
Strategic plan #4 – Give users the ability to join data across platforms
Joining tables and views across platforms is referred to as a federated query. The data must move to a common platform where it joins, and then the system removes the temporary tables. The Nexus Super Join Builder is the best federated query tool in the industry. Users can merge data from any number of systems, and Nexus does all of the work. The users choose the columns they want on their report, and Nexus delivers the answer set. Users can even choose where they want to process the join. The join can process transparently on the user’s desktop for smaller data sets of fewer than 10,000 rows. For large data, the join can process on the Nexus Server or on any on-premises or cloud platform within the enterprise.
Users can even use the Nexus Cache Tree to search for joining columns such as customer_number, and a new tree displays every table or view that contains customer_number on every system. This live tree is the fastest and easiest way to join data across the enterprise.
Currently, there is no ability to join data across platforms, and ETL teams spend enormous time and effort to place data in data lakes or another system so a business user can join data. Data is too large to store on a single platform, so the best alternative is to join the data from many different venues through a Nexus Server.
Strategic plan #5 – Provide a remote workforce the ability for high-speed processing
The physical location of a user should no longer matter with Nexus Server technology. Working from any site, ETL teams can prepare migrations, and business users can transfer data or set up federated queries, because all of this work can be scheduled to execute on a Nexus Server.
The only way to manage a multi-database environment is to pull data from the source through the Server and onto the target. The speeds are blistering, and because the Nexus Server uses the user’s credentials, the security remains consistent. No user can pull or see data unless they have the access rights to do so.
In today’s challenging environment, remote workers use a VPN or remote desktop to connect to the office, but these solutions are not appropriate for transferring data.
Why not try out a Proof of Concept (POC) of Yellowbrick and Nexus? The POC is free for 30 days. Please contact me for more information or check out www.Yellowbrick.com.
Tom Coffing, better known as Tera-Tom, is the founder of Coffing Data Warehousing where he has been CEO for the past 25 years. Tom has written over 75 books on all aspects of Teradata, Netezza, Yellowbrick, Snowflake, Redshift, Aurora, Vertica, SQL Server, and Greenplum. Tom has taught over 1,000 classes worldwide, and he is the designer of the Nexus Product Line.