Recently I was working on a data warehouse project and was tasked to generate a date dimension table.
Now some of the available options for me to achieve the same are listed out as below:
- Using the loops in SQL to iterate and populate dates
- Using the date wizard in SSAS (multidimensional)
- Copy the table data from other pre-populated table
But I stumbled upon a technique which caught my interest to populate the date dimension via the following SQL query
DECLARE @StartDate DATE = '20000101', @NumberOfYears INT = 15; DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate); SELECT d FROM ( SELECT d = DATEADD(DAY, rn - 1, @StartDate) FROM ( SELECT TOP (DATEDIFF(DAY, @StartDate, @CutoffDate)) rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id]) FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2 -- on my system this would support > 5 million days ORDER BY s1.[object_id] ) AS x ) AS y;
And from the above, I found it more smarter and optimal way of achieving the same, hence I set out to share.
Hope it helps…
2 Replies to “DataWarehouse – Generate Date Dimension Optimally”
When executing the steps in this query. I am running into an issue with the
DECLARE @CutoffDate DATE = DATEADD(YEAR, @NumberOfYears, @StartDate);
I get a message of Msg 137, Level 15, State 2, Line 2
Must declare the scalar variable “”CutOffDate”.
Should I be adding something in this syntax? Meaning “year” should be defined? or something along those lines?
Thank you for your comments. I have tested again the script and it seems to be working fine on my end.
“YEAR” is the key word used for the “DATEADD” function.
You can get more information about the same on the following MSDN link.