Alternatives to the (slow) FORMAT function
At first sight the FORMAT function introduced in SQL Server 2016 looks like a useful addition. It allows freedom in formatting dates and numbers using C#-style format strings along with locale awareness. Although formatting is strictly the preserve of the presentation layer, it’s still useful to have a straightforward and memorable way of formatting dates for data importing and general mash-ups – isn’t it?
/* FORMAT: very tempting */ SELECT FORMAT(DateValue, 'yyyy-MM-dd hh:mm:ss'), ...
Unfortunately not: FORMAT is best avoided in most situations. It’s just too slow (on current versions). It’s also non-deterministic, which rules it out from certain uses. But mostly it’s the performance – which is CPU usage – that rules it out.
Preferred alternatives to FORMAT are at the end of the article, but first the tests.
Performance test
Code
The following code creates a table DateFormatTest
containing 2m rows with a date column.
CREATE TABLE DateFormatTest (DateFormatTestId int NOT NULL, RandomNbr int NOT NULL, DateValue datetime NOT NULL) -- Load table with 2m rows, comprising Id and a date ranging over 100 year period. DECLARE @TotalRows bigint = 2000000 DECLARE @RandomNumberRange int = 4000 ;WITH Pass0 as (select 1 as C union all select 1), --2 rows Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows Pass5 as (select 1 as C from Pass4 as A, Pass4 as B),--4,294,967,296 rows Tally as (select row_number() over(order by C) as Number from Pass5) INSERT DateFormatTest (DateFormatTestId, RandomNbr, DateValue) SELECT TOP(@TotalRows) Number, (ABS(CHECKSUM(NewId())) % @RandomNumberRange) + 1, DATEADD(DAY, Number % (365 * 100), '1920-01-01') FROM Tally -- Add clustered index ALTER TABLE DateFormatTest ADD CONSTRAINT PK_DateFormatTest PRIMARY KEY CLUSTERED (DateFormatTestId)
Three tests for comparison: the first returns the rows with no formatting change, the second uses CONVERT
, the last uses FORMAT
.
SELECT DateFormatTestId, DateValue INTO #d0 FROM dbo.DateFormatTest SELECT DateFormatTestId, [yyyy-mm-dd] = CONVERT( VARCHAR(10), DateValue, 120) INTO #d1 FROM dbo.DateFormatTest SELECT DateFormatTestId, [yyyy-mm-dd] = FORMAT(DateValue, 'yyyy-MM-dd') INTO #d2 FROM dbo.DateFormatTest drop table #d0, #d1, #d2
And after the test, cleanup:
-- remove test table DROP TABLE IF EXISTS dbo.DateFormatTest
Results
I ran the test script, four times each, against an i5 laptop and against Azure SQL Database (Standard, 300DTU – currently £580/month).
- On the laptop,
FORMAT
was 23.8 times as expensive (in elapsed time) as returning the data with no transformation; and 15.04 times as expensive asCONVERT
. - on Azure SQL DB elapsed times were slower (it also took about 4 times as long to insert the test data). However the relative times were closer.
FORMAT
averaged 7.17 times as expensive asCONVERT
. - All statements went parallel. The laptop used 8 threads, Azure SQL DB used three.
Where FORMAT might be used
There are a few situations where the performance of FORMAT is not an issue and the clarity of its syntax might be an advantage:
- on the right-hand-side of a search parameter, where it will be evaluated once
- for formatting dates in error descriptions
Preferred date formatting
- yyyy-MM-dd 2021-02-18
SELECT CONVERT(CHAR(10), GETDATE(), 120)
- yyyyMMdd 20210218 (date key format)
SELECT CONVERT(CHAR(8), GETDATE(), 112) -- returns string
SELECT CONVERT(INT, CONVERT( CHAR(8), GETDATE(), 112)) -- returns integer