Application & Data Migration Blog Posts | Mobilize.Net

Migrating Scalar Functions from T-SQL to Snowflake

Written by Andrés Campos | Jun 30, 2022 8:48:40 PM

Code migration always presents unique and interest challenges. 

As the longtime leaders in migration tooling, we’ve seen a lot of those challenges. When we find one that’s interesting, we like to blog about it. For SQL Server, one of those interesting transformations we’ve seen is for Scalar Functions in T-SQL. These are functions characterized by returning one value per invocation. Think of this as returning one value per row. And as in most cases, this is a conversion that can be automated by Mobilize.Net SnowConvert for SQL Server.

 

Snowflake supports this type of function, and the body of the function can be written in three different languages (JavaScript, Java, or SQL). But the SQL body has a constraint that only allows one query inside the body of the function (enforcing the "one value per invocation” idea) since more than one query could cause the function to return multiple rows.   

 

This is where SnowConvert comes in. How can we transform multiple queries and statements found inside the body of the function into one valid query in Snowflake?   

 

Scalar Functions in SQL Server

 

Let's take a look at a quick example. Say we take some of the most common statements you can find inside a function in T-SQL such as DECLARE, SET, SELECT, and RETURN:

 

-- Basic function in SQL Server.
CREATE OR ALTER FUNCTION purchasing.GetVendorName()
RETURNS NVARCHAR(50) AS 
BEGIN
	DECLARE @result NVARCHAR
	DECLARE @BusinessEntityID INT
	
	SET @BusinessEntityID = 1492
	
	SELECT @result = Name FROM purchasing.vendor WHERE BusinessEntityID = @BusinessEntityID
	
	RETURN @result
	
END

 

While this may be a simple function with few statements, if you were to write the same code in Snowflake using the SQL body, the output would be limited to only one of those statements. The most farfetched solution would be to write one auxiliary function per statement, but that would drastically change the source code. (Not to mention confuse the user as to how the code works.) Multiple new lines of code will be required to support each of those new functions, which will increase the chance of bugs being created or unexpected behaviors in the target code.  

 

Utilizing Common Table Expressions (CTE)

 

But fear not. The Mobilize.Net Engineering Team has come up with an excellent solution that not only ensures the functional equivalence of the code but also allows us to keep all the statements in one function by using Common Table Expressions (CTE’s).  A CTE is a named subquery defined inside a WITH clause. You can think of them as a temporary view for use in the statement that defines them. The CTE defines a temporary view’s name, an optional list of columns, and a query expression, which is usually a SELECT statement. The result of the query inside the CTE is essentially a table. You can use as many CTE’s as you need inside the WITH clause, and all of them will be executed before the main SELECT. 

 

An appropriate transformation for the above SQL Server code using CTE’s would be something like this:  

 

-- The same basic function from SQL Server in Snowflake.
CREATE OR REPLACE FUNCTION purchasing.GetVendorName() 
RETURNS VARCHAR(50)
AS
$$
	WITH
		CTE1 AS (SELECT 1492 AS BusinessEntityID),
		CTE2 AS (SELECT NAME AS result FROM purchasing.vendor WHERE BusinessEntityID = (SELECT BusinessEntityID FROM CTE1))
		SELECT result FROM CTE2
$$;

 

Here are a few notes about the above code in Snowflake Scripting:

  • The declare statements without an assignment are removed.
  • Since the name of the variable is not required until it is set, the SET statements are transformed into SELECT statements.
  • SELECT does not require any transformation, and it can be added inside a Common Table Expression.  
  • If there is a need to access a previous value inside a CTE, it must be done the same way you would access any table just as it is done in the given sample with the variable BusinessEntityID inside CTE2.
  • The return statement will be the main SELECT of the function body, and the WITH clause is attached to this SELECT. 
  • Each Common Table Expression will be deployed in order before the main SELECT, so if the function body has an equivalent statement in Snowflake, this conversion can be done.  

Not all functions will be transformed this way. In fact, functions not transformed similarly to what is shown above will either have their body changed to JavaScript or the entire function changed to a stored procedure. There are some limitations to what can be transformed into this type of target code. As a result, SnowConvert analyzes the code before the transformation to appropriately convert the code.  

Some of the things SnowConvert looks for are control-of-flow statements (such as a WHILE statement), actions on the database (like INSERT or UPDATE), or references to other user-defined functions. Doing this allows for a bigger picture of the source code to be created, and provides an accurate depiction of what the target code should be.  

There are many things SnowConvert takes into consideration to provide a smooth transition for your code, and these are just a few. For an in-depth analysis of transformations like these, you can talk to an engineer like me at Mobilize.Net.

Hope this helps on your journey to a successful migration from SQL Server to Snowflake!