DataWarehouse – Generate Date Dimension Optimally

Hi

Recently I was working on a data warehouse project and was tasked to generate a date dimension table.

Now some of the available options for me to achieve the same are listed out as below:

  1. Using the loops in SQL to iterate and populate dates
  2. Using the date wizard in SSAS (multidimensional)
  3. Copy the table data from other pre-populated table

But I stumbled upon a technique which caught my interest to populate the date dimension via the following SQL query


DECLARE @StartDate DATE = '20000101', @NumberOfYears INT = 15;
DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate);
SELECT d
FROM
(
  SELECT d = DATEADD(DAY, rn - 1, @StartDate)
  FROM 
  (
    SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate)) 
      rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_objects AS s1
    CROSS JOIN sys.all_objects AS s2
    -- on my system this would support > 5 million days
    ORDER BY s1.[object_id]
  ) AS x
) AS y;

And from the above, I found it more smarter and optimal way of achieving the same, hence I set out to share.

Hope it helps…

2 Replies to “DataWarehouse – Generate Date Dimension Optimally”

  1. When executing the steps in this query. I am running into an issue with the

    DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate);

    I get a message of Msg 137, Level 15, State 2, Line 2
    Must declare the scalar variable “”CutOffDate”.

    Should I be adding something in this syntax? Meaning “year” should be defined? or something along those lines?

Leave a Reply

Your email address will not be published. Required fields are marked *

*