Running SQL queries on Spark DataFrames

SQL (Structured Query Language) is one of most popular way to process and analyze data among developers and analysts. Because of its popularity, Spark support SQL out of the box when working with data frames. We do not have to do anything different to use power and familiarity of SQL while working with Spark. In this article, we will learn how to run SQL queries on spark data frames and how to create data frame from SQL query result.

Creating Table From DataFrame

Before we can run queries on Data frame, we need to convert them to temporary tables in our spark session. These tables are defined for current session only and will be deleted once Spark session is expired.

Now that we have created a table for our data frame, we can run any SQL query on it. This is really powerful as you can go to SQL when you are comfortable.

Running more SQL queries on Spark Dataframe

Once we have created table, we can run many queries on data frame as we can do on any SQL table. Below are some examples of queries on our data frame.

You can see that how powerful is this. With SQL, we can run complex analytics queries easily on data frames.

Converting SQL results to Data Frames

You have converted data frame to table, run your queries. Now you want to come back to the data frame world. You will need to do this either when you want to save your results or use operations which are only available in data frames like (partition and coalesce). We can easily convert our query result to spark data frame.

We can see that we have got data frame back. We can perform all data frame operation on top of it. This is the power of Spark. You can use any way either data frame or SQL queries to get your job done. And you can switch between those two with no issue.

Conclusion

In this article, we have learned how to run SQL queries on Spark DataFrame. This is adds flexility to use either data frame functions or SQL queries to process data. You can find code written in this blog at GitHub. If you have any questions, let me know. I hope you have found this useful. See you in the next article. Until then, keep learning.

Similar Posts

Leave a Reply

Your email address will not be published.