Dive into Dazzling Databricks Date Functions

Below the video are dazzling date functions on Databricks. However, you can first witness a the amazing video that shows how our customers are using Nexus to integrate Databricks with all other relevant databases, including Snowflake, BigQuery, Amazon Redshift, and Azure Synapse.
You will soon be able to download our incredible book on Databricks, but you can now download our books on Synapse, Redshift, Google BigQuery, DB2, Oracle, Yellowbrick, Vertica, MySQL, Postgres, and more for FREE.
You can download the PDFs from our CoffingDW.com website immediately.
Here is how you do it. Go to CoffingDW.com and click on Books from the top right menu. You will then be in the CoffingDW bookstore, where you will see the Snowflake book. Add the Snowflake book to your cart.
Add any book you want to your cart.
When you check out, you use the coupon FRIEND2023.
You won’t have to enter any credit card information. Just put in your name and email, and the price will be discounted to zero. You can then download the book immediately at no cost to you.
Unlocking the Power of Databricks: Discover Essential Keywords for Date, Timestamp, and Timezone Retrieval. These keywords are reserved by Databricks and are ready at your fingertips when you need them.

This SQL query in Databricks retrieves two timestamps. It assigns them aliases: current_timestamp as timestamp1: This part of the query retrieves the current timestamp from the system clock and assigns it the alias “timestamp1.” The current_timestamp function returns the current date and time when the query is executed. now() as now1: This part of the query also retrieves the current timestamp, but it uses the now() function instead of current_timestamp. It assigns this timestamp the alias “now1.” The now() function serves the same purpose as the current_timestamp in most databases, returning the current date and time. Both timestamps are included in the query result, with their respective aliases, making it easy to reference them in subsequent calculations or for display purposes.

When you add or subtract a number from a date, you are adding or subtracting days.

This Databricks query retrieves and processes various timestamps and dates:
CURRENT_TIMESTAMP: Retrieves the current date and time.
CAST(CURRENT_TIMESTAMP as date) as date_ts: Converts the current timestamp to a date, assigning it the alias “date_ts.”
DATE(CURRENT_TIMESTAMP) as date_fn: Also converts the current timestamp to a date, assigning it the alias “date_fn.”
DATE(‘2023-12-31’) as eoy: Converts the specified date ‘2023-12-31’ to a date and assigns it the alias “eoy,” which typically stands for “end of year.”
This query demonstrates different ways to work with timestamps and dates in Databricks.

In Databricks, the TO_DATE function is used to convert a timestamp or a string representing a date into a date format. It can be helpful for manipulating and formatting date data. Here’s an explanation of the provided query:
CURRENT_TIMESTAMP: Retrieves the current date and time.
CAST(CURRENT_TIMESTAMP as date) as date_ts: Converts the current timestamp to a date using a CAST operation and assigns it the alias “date_ts.”
TO_DATE(CURRENT_TIMESTAMP) as date_fn: Converts the current timestamp to a date using the TO_DATE function and assigns it the alias “date_fn.” This is another way to achieve the same result as in step 2.
TO_DATE(‘2023-12-31’) as eoy: Converts the specified date ‘2023-12-31’ (provided as a string) to a date using the TO_DATE function and assigns it the alias “eoy,” which typically stands for “end of year.”
TO_DATE(CURRENT_TIMESTAMP, ‘yyyy-MM-dd’) as formatted: Converts the current timestamp to a date using the TO_DATE function and specifies a custom format ‘yyyy-MM-dd’ as the second argument. This allows you to format the date as per the specified pattern and assigns it the alias “formatted.”
The query showcases how the TO_DATE function can be used to convert timestamps and strings into date objects and how you can apply custom date formatting if needed.

In Databricks, the TO_TIMESTAMP function is used to convert a date or a string representing a date into a timestamp format. It is useful for converting date data into a timestamp data type. Here’s an explanation of the provided query:
CURRENT_TIMESTAMP: Retrieves the current date and time as a timestamp.
CURRENT_DATE as today: Retrieves the current date.
TO_TIMESTAMP(CURRENT_DATE) as timestamp_fn: Converts the current date into a timestamp using the TO_TIMESTAMP function and assigns it the alias “timestamp_fn.” This essentially takes the date and assumes a default time of midnight (00:00:00) for the timestamp.
TO_TIMESTAMP(CURRENT_DATE, ‘yyyy-MM-dd’) as formatted: Converts the current date into a timestamp using the TO_TIMESTAMP function while specifying a custom format ‘yyyy-MM-dd’ as the second argument. This allows you to format the timestamp with the date from CURRENT_DATE and a default time of midnight, as per the specified pattern, and assigns it the alias “formatted.”
The query demonstrates how the TO_TIMESTAMP function can be used to convert dates into timestamp data types, either using a default time or a custom format for the resulting timestamp.

This query retrieves data from the “order_table,” calculates the difference in days between the order date and the current date for each order, and presents the results in ascending order based on the order date. This can help you analyze how long ago each order was placed in days.

In Databricks, the MONTHS_BETWEEN function is used to calculate the difference in months between two dates. This query calculates the difference in months between the order date and the current date for each order in the “order_table.” It then presents the results in ascending order based on the order date. This can help you analyze how many months ago each order was placed.

In Databricks, the ADD_MONTHS function is used to add a specified number of months to a given date. It’s helpful for performing date arithmetic. This query calculates a new due date that is two months later than the order date for each order in the “Order_Table.” It also includes the order total, and the results are sorted by the order date. This could be useful, for example, in scenarios where you need to determine when payments are due for orders placed on various dates.

The Add_Months command adds months to any date. Below, we use a great technique to add one year to the order_date.

The provided query in Databricks calculates a new date by adding 12 months multiplied by 5 (i.e., 60 months or 5 years) to the “order_date” for each row in the “order_table.”

In Databricks, the EXTRACT function is used to extract specific components (e.g., year, month, day, hour, minute, etc.) from a date or timestamp. It allows you to break down a date or timestamp into its individual parts for further analysis or filtering. This query uses the EXTRACT function to break down the “order_date” into its year and day components. It then filters the results to include only orders placed in the month of September (month component equals 9). This allows you to analyze and retrieve specific date components from the orders placed in September.

In Databricks, you can use the date_part function to extract specific components from a date or timestamp. The date_part function returns a numerical value representing the specified date or time part. This query demonstrates how to use the date_part function in Databricks to extract specific date and time components such as day, minute, second, and quarter from a timestamp. It also showcases how to use date_part with an interval to extract the hour component.

In Databricks, the date_format function is used to format a date or timestamp into a specific string format. It allows you to customize the representation of date and time components.
Explanation of the query:
current_date as today: Retrieves the current date and assigns it the alias “today.”
date_format(current_date, ‘y’) as yr: Uses the date_format function to format the “current_date” into a string representing the year (e.g., ‘2023’) and assigns it the alias “yr.”
date_format(current_date, ‘M’) as mo: Formats the “current_date” into a string representing the month (e.g., ’10’ for October) and assigns it the alias “mo.”
date_format(current_date, ‘d’) as day: Formats the “current_date” into a string representing the day of the month (e.g., ’05’ for the 5th day of the month) and assigns it the alias “day.”
date_format(current_date, ‘D’) as day_of_yr: Formats the “current_date” into a string representing the day of the year (e.g., ‘278’ for the 278th day of the year) and assigns it the alias “day_of_yr.”
date_format(current_date, ‘E’) as day_of_wk: Formats the “current_date” into a string representing the day of the week (e.g., ‘Tue’ for Tuesday) and assigns it the alias “day_of_wk.”
date_format(current_date, ‘a’) as am_pm: Formats the “current_date” into a string representing whether it’s AM or PM (e.g., ‘AM’ or ‘PM’) and assigns it the alias “am_pm.”
date_format(current_date, ‘q’) as qtr: Formats the “current_date” into a string representing the quarter of the year (e.g., ‘4’ for the fourth quarter) and assigns it the alias “qtr.”
date_format(current_date, ‘G’) as era: Formats the “current_date” into a string representing the era (e.g., ‘AD’ or ‘BC’) and assigns it the alias “era.”
In summary, this query uses the date_format function in Databricks to format various date and time components of the “current_date” into different string representations, such as year, month, day of the month, and more. These formatted values can be useful for generating custom date displays or reports.

the DATEDIFF function is used to calculate the difference between two dates or timestamps in terms of a specified date part, such as years, months, days, hours, minutes, or seconds. It returns an integer representing the number of date or time units between the two dates.
Explanation of the query:
DATEDIFF(month, ‘2014-01-01’, ‘2023-01-01’) as mo_bet: Calculates the difference in months between the date ‘2014-01-01’ and ‘2023-01-01’. It assigns the result to the alias “mo_bet.” This calculates how many months are between these two dates.
DATEDIFF(year, ‘2014-01-01’, current_date) as years: Calculates the difference in years between the date ‘2014-01-01’ and the current date (represented by current_date). It assigns the result to the alias “years.” This calculates how many years have passed between the specified date and the current date.
DATEDIFF(quarter, ‘2014-01-01’, current_date) as quarters: Calculates the difference in quarters (3-month periods) between the date ‘2014-01-01’ and the current date. It assigns the result to the alias “quarters.” This calculates how many quarters have passed between the specified date and the current date.
DATEDIFF(hour, ‘2014-01-01’, current_date) as hours: Calculates the difference in hours between the date ‘2014-01-01’ and the current date. It assigns the result to the alias “hours.” This calculates how many hours have passed between the specified date and the current time.
DATEDIFF(minute, ‘2014-01-01’, current_date) as minutes: Calculates the difference in minutes between the date ‘2014-01-01’ and the current date. It assigns the result to the alias “minutes.” This calculates how many minutes have passed between the specified date and the current time.
DATEDIFF(second, ‘2014-01-01’, current_date) as seconds: Calculates the difference in seconds between the date ‘2014-01-01’ and the current date. It assigns the result to the alias “seconds.” This calculates how many seconds have passed between the specified date and the current time.
The query below uses the DATEDIFF function to calculate the differences between various date and time units (months, years, quarters, hours, minutes, and seconds) between specified dates and the current date or time. It’s useful for various time-based calculations and analysis.

In Databricks, the DATE_ADD function is used to add or subtract a specified number of date or time units to a given date or timestamp. It’s a versatile function for performing date arithmetic.
Explanation of the query:
order_date: This column represents the date of each order in the “order_table.”
cast(dateadd(day, 3, order_date) as date) as three_days: Uses the DATE_ADD function to add 3 days to the “order_date” for each row in the “order_table.” The result is cast as a date data type and assigned the alias “three_days.”
cast(dateadd(week, 1, order_date) as date) as one_week: Adds 1 week to the “order_date” for each row. The result is cast as a date and assigned the alias “one_week.”
cast(dateadd(month, 2, order_date) as date) as two_months: Adds 2 months to the “order_date” for each row. The result is cast as a date and assigned the alias “two_months.”
cast(dateadd(quarter, -2, order_date) as date) as minus_2q: Subtracts 2 quarters (6 months) from the “order_date” for each row. The result is cast as a date and assigned the alias “minus_2q.”
The query below uses the DATE_ADD function in Databricks to perform various date arithmetic operations on the “order_date” column in the “order_table.” It adds or subtracts days, weeks, months, or quarters from the order date, and the results are cast as dates. This can be useful for tasks such as calculating delivery dates, due dates, or forecasting future dates based on order dates.

These three Databricks queries are similar in structure but use the DATEADD function to add different units of time (hours, minutes, and seconds) to the current timestamp. Here’s an explanation of each query:
CURRENT_TIMESTAMP: Retrieves the current timestamp, including date and time.
DATEADD(HOUR, 2, CURRENT_TIMESTAMP()): Uses the DATEADD function to add 2 hours to the current timestamp.
The second query is similar to the first query, but it adds 2 minutes to the current timestamp using the DATEADD function.
This query adds 2 seconds to the current timestamp using the DATEADD function.
The result is assigned to the alias “twosec.”
In summary, these queries demonstrate how to use the DATEADD function in Databricks to perform date and time calculations by adding a specified number of hours, minutes, or seconds to the current timestamp. Each query provides a different unit of time added to the current timestamp and assigns the result to a named alias for easy reference.

In Databricks, the Date_Trunc function is used to truncate a timestamp or date to a specific level of precision. It allows you to “zero out” or reset certain components of a timestamp or date to a lower value.
Explanation of the query:
Date_Trunc(‘day’, current_timestamp) as day: This part of the query uses the Date_Trunc function to truncate the current_timestamp to the precision of a day. It effectively sets the time component (hours, minutes, seconds) to zero, retaining only the date information. The result is assigned the alias “day.”
Date_Trunc(‘hour’, current_timestamp) as hour: Similarly, this part of the query truncates the current_timestamp to the precision of an hour. It sets the minutes and seconds to zero, retaining only the date and hour information. The result is assigned the alias “hour.”
Date_Trunc(‘minute’, current_timestamp) as minute: This part of the query truncates the current_timestamp to the precision of a minute. It sets the seconds to zero, retaining the date, hour, and minute information. The result is assigned the alias “minute.”
In summary, the Date_Trunc function in Databricks allows you to truncate a timestamp to a specified level of precision, effectively removing higher-level time components. This query demonstrates how to truncate the current_timestamp to the precision of a day, hour, and minute, respectively, and assigns each truncated result to a named alias for reference.

In Databricks, the NEXT_DAY function is used to find the date of the next occurrence of a specified day of the week based on a given date. It’s a convenient function for finding future dates that match a specific day.
Explanation of the query:
CURRENT_DATE as today: Retrieves the current date and assigns it the alias “today.”
NEXT_DAY(today, ‘Wednesday’) as next_wed: Uses the NEXT_DAY function to find the date of the next Wednesday after the “today” date. It returns the next occurrence of Wednesday and assigns it the alias “next_wed.”
NEXT_DAY(today, ‘Th’) as next_thursday: Similarly, this part of the query uses the NEXT_DAY function to find the date of the next Thursday after the “today” date. It uses the abbreviated day name ‘Th’ to specify Thursday and assigns the result the alias “next_thursday.”
NEXT_DAY(CURRENT_DATE, ‘fr’) as next_friday: This part of the query uses the NEXT_DAY function to find the date of the next Friday based on the current date (retrieved using CURRENT_DATE). It uses the abbreviated day name ‘fr’ to specify Friday and assigns the result the alias “next_friday.”
In summary, this query demonstrates how to use the NEXT_DAY function in Databricks to find future dates that match specific days of the week (Wednesday, Thursday, and Friday) based on the current date or a given date. It assigns these future dates to named aliases for easy reference.

Tom Coffing, better known as Tera-Tom, founded Coffing Data Warehousing, where he has been CEO for the past 30 years. Tom has written over 90 books on Architecture, performance tuning, and SQL on all relevant systems. Tom has taught over 1,000 classes worldwide, and he is the designer of the Nexus, which celebrates 20 years of brilliantly integrating all databases.