Mystery of the Changing CASE WHEN Output and Data Type Precedence

I was working on a T-SQL query that had some very particular output requirements for decimal places. The problem seemed fairly easy, but when I tried to form a CASE WHEN statement that produced slightly different decimal lengths for each condition, the results were inconsistent. The reason wasn’t immediately apparent but involves SQL Data Type Precedence. I thought this would be a good blog post because it is something that anyone can unwittingly stumble into without realizing what the SQL query processor is really doing. It is also hard to discover what the solution to a problem is when you don’t know what you are looking for?

REQUIREMENTS
If number < 100 always display 3 decimals
If number >=100 and < 250 always display 2 decimals
If number > 250 always display 1 decimal

And decimals are truncated (not rounded) at the desired decimal place and must display desired number of places even if it is zero. I know this is something that should be handled by the presentation layer and not a SQL query, but that is a whole other blog post we won’t get into here.

EXAMPLE:
317.0000 should display as 317.0
155.1023 should display as 155.10
90.1203 should display as 90.120

This seemed easy enough and was accomplished using a CTE to perform some initial math and then perform a CASE WHEN statement to format the numbers using a different decimal cast DECIMAL(12,3) or DECIMAL(12,2) or DECIMAL(12,1) to return the appropriate number of decimal positions. However, the CASE statement would always return 3 decimals for every output even though the WHEN statement logic was branching correctly.

ALTERNATIVE?
The data warehouse person I was working on this requirement with provided a solution to this problem by wrapping each THEN statement with extremely long SUBSTRING and CHARINDEX functions to dissect both sides of the number and reconstruct it with strings and slapping a decimal in there. It did work, but needless to say it was very ugly and since the final code would return 5 million rows I knew their had to be a better solution.

TEST CASE
I prepared a unit test to discover what was going on. What I found was that the exact CAST statement with the decimal CAST works fine when run by itself outside the CASE WHEN statement but the same CAST doesn’t seem to work within the CASE WHEN statement.

[code language=”sql” light=”true”]
declare @credits decimal(18,5);
set @credits = 317.004;
select
cast(floor(@credits *1000) /1000 as decimal(18,3)) as ‘3’,
cast(floor(@credits *100) /100 as decimal(18,2)) as ‘2’,
cast(floor(@credits *10) /10 as decimal(18,1)) as ‘1’,
case
when @credits < 100 then cast(floor(@credits * 1000) / 1000 as decimal(18,3))
when @credits >= 100 and @credits < 250 then cast(floor(@credits * 100) / 100 as decimal(18,2))
when @credits >= 250 then cast(floor(@credits * 10) / 10 as decimal(18,1))
end as ‘CASE’,
case
when @credits < 100 then ‘<100’
when @credits >= 100 and @credits < 250 then ‘>=100 and < 250’
when @credits >= 250 then ‘>=250’
end as condition
[/code]

Output
Case-When

MYSTERY REVEALED
The CASE column above should match the 1 decimal column output, but instead it has 3 decimals. Why did this happen? Why do the CAST statements alone produce the correct output but when run inside the CASE WHEN statement, they produce the wrong output. I started to think about how the query processor would look at this query. When it starts performing the CASE statement it says, I need to allocate some storage for the output from this CASE statement but I see 3 different types of decimal output, so I will pick the largest one to satisfy all possible outcomes.

Even though I wasn’t doing anything drastic like having the CASE statement return either a string or int, I believe I was still running into a problem of SQL Data Type Precedence. Which states: “When an operator combines two expressions of different data types, the rules for data type precedence specify that the data type with the lower precedence is converted to the data type with the higher precedence. If the conversion is not a supported implicit conversion, an error is returned. When both operand expressions have the same data type, the result of the operation has that data type.” Here are some more examples of this in action.

SOLUTION
So, to solve this problem, I simply let the CASE statement know all output will be VARCHAR(20) and this allowed the decimal CAST statements to return their expected results. In the example below the credits already had the math performed in the CTE and this query handles the formatting requirements. Needless to say, this solution was a lot better than performing cumbersome string manipulations to format the numbers.

[code language=”sql” light=”true”]

case
when credits < 100 then cast(cast(credits as DECIMAL(12,3)) as VARCHAR (20))
when credits >= 100 and credits < 250 then cast(cast(credits as DECIMAL(12,2)) as VARCHAR(20))
when credits >=250 then cast(cast(credits as DECIMAL(12,1)) as VARCHAR(20))
end

[/code]

This entry was posted in T-SQL. Bookmark the permalink.