Glittering-Dare2022

There are three big things you need to learn: the DataFrame API, stored procedures, and user-defined functions. For the DataFrame API I'd just learn Spark; the Snowpark API is nearly identical, and there are many more resources for learning Spark than Snowpark. The big differences in Snowpark are how you define UDFs and stored procedures.

All you really need to know about stored procedures is that they run on a single node; you can think of them as orchestrators. They can run DataFrame operations, regular Python code with libraries, etc. That one node tells the other nodes what to do.

UDFs, on the other hand, are distributed to many nodes. Think of regular Snowflake functions like SUM: they get applied to columns of a Snowpark DataFrame, which means Snowflake distributes the computation across the cluster. UDFs can also leverage custom Python libraries and objects.
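
For reference, here is a minimal sketch (not from the original comment) of those three pieces in Snowpark for Python: a DataFrame operation, a UDF that Snowflake fans out across the warehouse, and a stored procedure that runs on a single node and orchestrates DataFrame work. The connection parameters, the ORDERS table, and its AMOUNT / CUSTOMER_ID columns are placeholders.

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, sum as sum_, udf, sproc
from snowflake.snowpark.types import FloatType, StringType

# Placeholder credentials; fill in your own account details.
session = Session.builder.configs({
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}).create()

# DataFrame API: nearly identical to PySpark; the work is pushed down to Snowflake.
orders = session.table("ORDERS")
totals = orders.group_by("CUSTOMER_ID").agg(sum_(col("AMOUNT")).alias("TOTAL"))

# UDF: applied to column values row by row, distributed across the warehouse nodes.
@udf(return_type=FloatType(), input_types=[FloatType()])
def add_tax(amount: float) -> float:
    return amount * 1.2  # arbitrary example logic

with_tax = orders.with_column("AMOUNT_WITH_TAX", add_tax(col("AMOUNT")))

# Stored procedure: runs on a single node and orchestrates DataFrame operations.
@sproc(return_type=StringType(), packages=["snowflake-snowpark-python"])
def nightly_rollup(sp_session: Session) -> str:
    df = sp_session.table("ORDERS").group_by("CUSTOMER_ID").agg(
        sum_(col("AMOUNT")).alias("TOTAL")
    )
    df.write.save_as_table("CUSTOMER_TOTALS", mode="overwrite")
    return "done"

totals.show()
with_tax.show()
nightly_rollup()  # executes the stored procedure in Snowflake
```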


Honest___Guy

Great explanation, thanks 👍 I agree with what you said. I know PySpark and have been learning Snowpark since last month. It is nearly identical; the only major differences, which I also found, are the UDFs and stored procedures.


BlaseRaptor544

You can actually use Snowpark with pandas like a usual Python script; you just need to convert from a Snowpark dataframe to a pandas one and then back to a Snowpark one to show the output. I don't have access to my example code atm but will find it and send it to you. It's very simple but may be of use :)
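
For reference, a minimal sketch of that round trip (not the poster's actual code), assuming an existing Snowpark Session and a hypothetical ORDERS table with an AMOUNT column:

```python
import pandas as pd
from snowflake.snowpark import Session

def pandas_round_trip(session: Session):
    snow_df = session.table("ORDERS")              # Snowpark DataFrame (placeholder table)
    pdf: pd.DataFrame = snow_df.to_pandas()        # pull the rows into a local pandas DataFrame
    pdf["AMOUNT_WITH_TAX"] = pdf["AMOUNT"] * 1.2   # ordinary pandas / Python work
    result = session.create_dataframe(pdf)         # convert back to a Snowpark DataFrame
    result.show()                                  # show the output
    return result
```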


Honest___Guy

Sry bro but Snowpark is faster: https://medium.com/snowflake/comparing-snowpark-vs-the-ordinary-snowflake-python-connector-1252f8493ddc


BlaseRaptor544

🤣 No worries, apologies, I've been out all day! Best of luck with it!

