Sorting in Spark Dataframe

In this blog, we are going to write code to sort a Spark dataframe. We can sort our data based on one or more columns, just like we do in SQL. Spark provides two functions to sort data: “sort” and “orderBy”.

Both of these functions work in the same way. We will mostly be using “orderBy” as it is closer to SQL syntax.

Sorting Dataframe based on Column Value

Consider our flight data: we want to sort the dataframe by the number of flights.

We can get the same output using “orderBy”. As you can see, the data is sorted in ascending order by default.

Sorting Rows Using Orderby

We can also pass column expressions or column functions to our sorting functions.

Sorting Data in Descending Order

If we want to change the default sort order for a Spark dataframe, we have to use the “desc” function.

Sorting Data in Descending Order

As seen in the output, we can sort data in descending order using Spark's built-in desc function.

I hope you found this useful. See you in the next blog.
