Handling month or day names in different languages in Snowflake
by Mauricio Rojas, on Sep 29, 2022 2:10:21 PM
Recently I received a request inspired by this StackOverflow question:
To convert a date-string that contains the name of the month in a data-field I could useselect to_date('01October2022', 'DDMMMMYYYY')
..I didn't find a way to convert the string if the month name is not english e.g. in german like 01Oktober2022
This is interesting especially if you are dealing with international users.
Snowflake has a plethora of DATE functions like DAYNAME or MONTHNAME but they only return English names.
So I will first address a way to handle translating names or months to different languages in Snowflake and then we will answer the original question.
For providing a solution for this task, we will create some JavaScript UDFs. Javascript provides functionality to return date values according to different locales. The locales accepted by JavaScript are on this page: https://www.npmjs.com/package/locale-codes for a list of locale codes.
I usually just use the short tags like 'pt' for Portuguese or 'de' for German, but you have many options.
Based on that we can then create these UDFs:
CREATE OR REPLACE FUNCTION GET_MONTHNAME_LONG_UDF(idx float, locale STRING) returns STRING language javascript as $$ // ( 0 - January, 1 - February, etc), this will work var objDate = new Date(); objDate.setDate(1); objDate.setMonth(IDX-1); return objDate.toLocaleString(LOCALE, { month: "long" }); $$; CREATE OR REPLACE FUNCTION GET_MONTHNAME_SHORT_UDF(idx float, locale STRING) returns STRING language javascript as $$ // ( 0 - January, 1 - Februaary, etc), this will work var objDate = new Date(); objDate.setDate(1); objDate.setMonth(IDX-1); return objDate.toLocaleString(LOCALE, { month: "short" }); $$; CREATE OR REPLACE FUNCTION GET_WEEKDAY_LONG_UDF(idx float, locale STRING) returns STRING language javascript as $$ // ( 0 - Sunday, 1 - Monday, etc), this will work: var objDate = new Date("2022/01/02"); // This is sunday objDate.setDate(objDate.getDate() + IDX); return objDate.toLocaleString(LOCALE, { weekday: "long" }); $$; CREATE OR REPLACE FUNCTION GET_WEEKDAY_SHORT_UDF(idx float, locale STRING) returns STRING language javascript as $$ // Assuming it starts on Sunday // (as in getDay: 0 - Sunday, 1 - Monday, etc), this will work: var objDate = new Date("2022/01/02"); // This is sunday objDate.setDate(objDate.getDate() + IDX); return objDate.toLocaleString(LOCALE, { weekday: "short" }); $$;
Those UDFs always assume indexes start with 0. So Sunday will be 0 and Monday 1 and so on. And For months January will be 0, February 1 and so on.
When you call them the idx parameter is just the index as described before, the locale will be a locale code (in theory you can use any code from locale table).
I provided two sets of udfs for short and long names.
Ok, I think that is good. So now we have a way to get weekdays and month names in different languages.
But the question was different. In that question, we had a string with a day in a foreign language and we needed to use a conversion from text to date.
Ok so we can do that. Here comes another UDFs to the rescue:
CREATE OR REPLACE FUNCTION TRANSLATE_MONTHNAME_LONG_UDF( datestring STRING, type STRING, replaceType String, locale STRING) returns STRING language javascript as $$ // ( 0 - January, 1 - Februaary, etc), this will work function pad2(number) { return (number < 10 ? '0' : '') + number } var replacement = "number"; if (REPLACETYPE.toUpperCase() == "TEXT") { replacement = "text"; } var objDate = new Date(); objDate.setDate(1); var month_names = []; for (var i=0;i<12;i++) { objDate.setMonth(i); month_name_locale = objDate.toLocaleString(LOCALE, { month: "long" }); var replaceValue = (replacement=="number")?pad2(i):objDate.toLocaleString("en-us", { month: "long" }); DATESTRING = DATESTRING.replace(new RegExp(month_name_locale,'i'),replaceValue); } return DATESTRING; $$;
This UDF will replace your foreign language month names and change them to English and then we can use our date functions.
So to answer the previous question, this will be a way to run it:
select '01Oktober2022' test_string, TRANSLATE_MONTHNAME_LONG_UDF('01Oktober2022','long','text','de') translate_test_string, to_date(translate_test_string,'DDMMMMYYYY');
And this is how it will look when this is run in Snowflake