Faking Null Values
I came across a weird scenario: I had two tables, one essentially a staging table and the other a slightly transformed version of it. They are more or less the same, though, and both have a large number of applications and ORMs tied to them directly. As a first step toward removing the unnecessary ETL process that moves the data from one table to the other, I planned to turn one into a view of the other.
Due to restrictions, I essentially have to do this in a way that won't break the ORM database mappings, so the view has to perfectly mimic the table: same name, same data types, same nullability for each column. Naming and data types are easy to adjust, casting or converting where needed. Changing whether a column is nullable, though, is a little more complicated.
I first started by tackling base columns that are nullable but had been made non-null during the ETL process, wrapping them in ISNULL:
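The original view definition isn't shown here, but a minimal sketch of the idea might look like this (the [Sales] table, [SalesPersonName] column, and view name are assumptions based on the output column mentioned below):

```sql
-- Sketch only: dbo.Sales, SalesPersonName, and dbo.SalesView are assumed names.
-- ISNULL with a non-NULL constant makes the view column NOT NULL,
-- even though the underlying base column allows NULLs.
CREATE OR ALTER VIEW dbo.SalesView
AS
SELECT
    ISNULL(s.[SalesPersonName], N'') AS [SalesPersonName_NotNull]
FROM dbo.[Sales] AS s;
```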
When we check the output of the view, we see that the [SalesPersonName_NotNull] column is, as expected, NOT NULL:
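A metadata check along these lines (view name assumed, as above) reports the column as NOT NULL:

```sql
-- is_nullable = 0 means the column is NOT NULL
SELECT c.name, c.is_nullable
FROM sys.columns AS c
WHERE c.[object_id] = OBJECT_ID(N'dbo.SalesView');
```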
This is a pretty quick and obvious solution that has been written about a lot, and the inverse works pretty similarly. When we want to turn a non-null column into a nullable one, a CASE expression is an easy way to achieve this:
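Something along these lines, assuming the [Amount] column on [Sales] mentioned below: because a CASE with no ELSE branch can return NULL as far as SQL Server is concerned, the resulting column is marked nullable:

```sql
-- The WHEN condition references a column, so SQL Server cannot fold it away;
-- with no ELSE branch, the CASE expression is considered potentially NULL.
SELECT
    CASE WHEN s.[Amount] IS NOT NULL THEN s.[Amount] END AS [Amount]
FROM dbo.[Sales] AS s;
```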
Which gives us a nullable [Amount] column, even though the underlying [Sales] table defines the column as non-null:
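One way to see this (names assumed as before) is to ask SQL Server to describe the result set:

```sql
-- is_nullable = 1: the wrapped column is reported as nullable
SELECT name, is_nullable
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT CASE WHEN Amount IS NOT NULL THEN Amount END AS Amount
      FROM dbo.Sales;', NULL, 0);
```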
At this point things are going great, and most use cases for adjusting the nullability of a column are covered. However, I came across a strange situation: the view needed to return a static integer value that wasn't from a table, yet needed to be nullable to fit the original table's makeup. The ETL process had just been setting the column equal to 1000 for every single row, so my view should do the same. My first thought was to use a CASE expression with a hard-coded logical comparison to fool SQL Server into making it nullable, the opposite of what I had just done above:
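The attempt presumably looked something like this (the value 1000 comes from the post; the output column name is made up for illustration):

```sql
-- The hope: a CASE with no ELSE branch is "potentially NULL",
-- so the column should come out nullable.
SELECT CASE WHEN 1 = 1 THEN 1000 END AS [StaticValue];
```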
But that didn’t work: SQL Server knows the condition will always evaluate to true and therefore leaves the column as non-null:
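Describing the result set the same way (column name again made up) shows the constant comparison being folded away:

```sql
-- SQL Server constant-folds 1 = 1, so the CASE always produces 1000
-- and the column is reported with is_nullable = 0
SELECT name, is_nullable
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT CASE WHEN 1 = 1 THEN 1000 END AS StaticValue;', NULL, 0);
```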
So next I began to think of non-deterministic logic that would, for all intents and purposes, always return true but could still theoretically be false, thereby tricking SQL Server while always giving me the desired result:
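Based on the "pre-2016" remark that follows, the trick presumably compared GETDATE() against a fixed date, something like:

```sql
-- GETDATE() is evaluated at run time, so SQL Server cannot prove the
-- condition is always true; the CASE (with no ELSE) is marked nullable,
-- yet in practice the WHEN branch always fires and returns 1000.
-- The cutoff date is an assumption based on the post's "pre-2016" remark.
SELECT CASE WHEN GETDATE() > '2016-01-01' THEN 1000 END AS [StaticValue];
```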
Lo and behold, this worked!
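Checking the metadata the same way confirms the column is now reported as nullable:

```sql
-- is_nullable = 1, even though the value is always 1000 in practice
SELECT name, is_nullable
FROM sys.dm_exec_describe_first_result_set(
    N'SELECT CASE WHEN GETDATE() > ''2016-01-01'' THEN 1000 END AS StaticValue;',
    NULL, 0);
```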
I now had my static value returning as a nullable column in my view. And while it’s pretty safe to assume neither I nor the database will be time traveling back to pre-2016, something about this approach left me a little unsettled. Who knows what obscure bug or sequence of events could end up giving me a bad date value. So what would be more reliable? I thought about using @@SERVERNAME, @@VERSION (how often do you downgrade to SQL Server 2000?!), or some other @@ value, but having to make assumptions or do string manipulation seemed like an overcomplication and a route to reduced readability, making it even harder to discern what I was trying to achieve with this crazy CASE expression.
I finally settled on something that fulfilled my criteria and didn’t leave a pit in my stomach. It uses a built-in function for low cost, requires no extraneous manipulation to compare values, and has astronomically low odds of ever accidentally producing a NULL value when I don’t want one: GUIDs.
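A sketch of the final form (output column name again assumed): two calls to NEWID() generate two independent GUIDs, so the equality comparison is effectively never true, but SQL Server can't prove that when it derives the column's nullability:

```sql
-- NEWID() = NEWID() is non-deterministic and cannot be constant-folded;
-- the NULL branch keeps the column nullable, while the ELSE branch returns
-- 1000 every time (barring an astronomically unlikely GUID collision).
SELECT CASE WHEN NEWID() = NEWID() THEN NULL ELSE 1000 END AS [StaticValue];
```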
Since the odds of two generated GUIDs matching on the same server are astronomically low, and it’s obvious right off the bat what the statement is doing, I think this is the best approach to take for this very specific, strange problem.