Spark to Snowpark Migration Guide
by Brandon Carver, on Jun 14, 2022 10:28:18 AM
It's always helpful to get advice from someone who has been where you're trying to go. When Snowflake needed to build automated migration tools for converting SQL, sprocs, and scripts to the Data Cloud, they came to the migration experts at Mobilize.Net. The result? SnowConvert and around 1 billion lines of code converted from Teradata, Oracle, and SQL Server. Snowflake and Mobilize.Net are mobilizing the world’s data together.
When we were building a suite of tools to support development across the Data Cloud, we got advice from the experts at Snowflake Professional Services on the real capabilities of the Data Cloud. The result? BlackDiamond Studio, a cross-language productivity workbench for developing and managing both your code and your data. BlackDiamond Studio is the code cloud for the Snowflake Data Cloud.
Getting a guide who has been where you’re going and can be your partner makes all the difference. That’s why we’ve recently partnered with Snowflake on a migration guide for going from Spark to Snowflake using the Snowpark API. You may have seen a lot of recent conversation around Snowpark. If you’re not familiar with it already, the advantages of Snowpark are real. We know because we’ve been there. We’ve been developing in Snowpark with Snowflake since its infancy, and we’ve been moving workloads and applications from Spark to Snowpark ever since.
Given that we’ve been there, we’re here to tell you that it is possible to take your Spark application and use it with the Snowpark API in Snowflake. You can accelerate and automate this migration with Mobilize.Net SnowConvert for Spark Scala and the soon-to-be-released SnowConvert for PySpark (Spark Python). Take your Spark application written in Scala or Python, and SnowConvert will automatically convert its references to the Spark API into references to the Snowpark API.
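To give a feel for what "converting references" means at the simplest level, here is a toy sketch in Python. It is not how SnowConvert works: a real tool parses the code and handles far more than import lines. The helper `rewrite_imports` and the table `MODULE_MAP` are hypothetical names made up for this illustration. The only things taken as given are that the `snowflake.snowpark`, `snowflake.snowpark.functions`, and `snowflake.snowpark.types` modules exist in Snowpark for Python and broadly mirror their `pyspark.sql` counterparts.

```python
import re

# Illustrative mapping only (an assumption for this sketch, not SnowConvert's rules).
# It does not handle renamed classes such as SparkSession -> Session.
MODULE_MAP = {
    "pyspark.sql.functions": "snowflake.snowpark.functions",
    "pyspark.sql.types": "snowflake.snowpark.types",
    "pyspark.sql": "snowflake.snowpark",
}

# Try the longest module names first so "pyspark.sql.functions"
# is matched before the shorter "pyspark.sql".
_IMPORT_RE = re.compile(
    r"^([ \t]*(?:from|import)[ \t]+)("
    + "|".join(re.escape(m) for m in sorted(MODULE_MAP, key=len, reverse=True))
    + r")\b",
    re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Hypothetical helper: point PySpark import lines at Snowpark modules."""
    return _IMPORT_RE.sub(lambda m: m.group(1) + MODULE_MAP[m.group(2)], source)


# A small PySpark snippet, held as text; it is rewritten, not executed.
spark_code = """\
from pyspark.sql import DataFrame
from pyspark.sql.functions import col, sum as sum_
from pyspark.sql.types import StructType

def total_by_region(df: DataFrame) -> DataFrame:
    return df.groupBy("region").agg(sum_(col("amount")).alias("total"))
"""

snowpark_code = rewrite_imports(spark_code)
print(snowpark_code)
```

In this small case the function body needs no changes, because Snowpark's DataFrame API accepts the same `groupBy`, `agg`, and `alias` calls. Real applications rarely map this cleanly, which is why automated conversion and a migration plan both matter.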
It's important to note (and of course, you can read more about this in the migration guide) that not all workloads are great candidates for migration. For example, most data pipeline, ETL, and data validation workloads are generally good candidates. Your data science workloads are likely convertible to Snowpark, but they will need a closer look. And like any other migration, it’s essential to have a plan. In our experience, getting the right plan in place is often the difference between a migration that succeeds and one that runs past the time and cost limits set at the start of the project. Combining the right plan with the automation benefits of SnowConvert for Spark Scala and Python has been the formula that has proven successful so far.
All of that to say, if you’d like a self-guided plan to get you there, we’ve worked to get one published. If you’d like to leverage automation tools to accelerate that migration, we’ve also got you covered. If you need a workbench to develop and manage your SQL, Scala, Python, and other code for the Data Cloud, BlackDiamond Studio is available now. And if you need an actual guide to partner with on your journey to Snowpark, we love talking Snowpark and have the experience to help you get the most out of your Snowflake account. Whatever guide you need for your journey, Mobilize.Net is here.