Pivot rows to columns in Hive

Updated On February 12, 2021 | By Mahesh Mogal

Hello everyone. In this article, we will learn how can we pivot rows to columns in the Hive.

Pivoting/transposing means we need to convert a row into columns. We need to do this to show a different view of data, to show aggregation performed on different granularity than which is present in the existing table.

Consider you have the following data from some company. It shows how much money they spent on each resource in each quarter for some projects in that company.

pivot data in hive - sample data

From this table, you want to show data like this.

pivot data in hive - final output after pivot

We will see how we can use CASE statements and COLLECT_SET to get these results from an existing table.

Step I - Using Case statments

First, we can use case statements to transpose required rows to columns.

This query will you give you result in the following way.

pivot data in hive - output after case when query

We can see that there are null values and each resource is having multiple records. This is expected as we have not grouped this on resource id yet and for each resource, there may not be an entry for each quarter.

Step II - Aggregating Data with COLLECT_SET

In this step, we will use COLLECT_SET and group records on basic of resource id to get the desired result.

pivot data in hive - step 2 output

This is almost like our desired result.

Step III - Transposing Result with Another Case Statement

We can prettify our results in step two by modifying our query with CASE statements. This step is optional and it depends on in what form do you want to achieve your final results.

This will be our output.

pivot data in hive - final output

AS we can see, by using CASE and COLLECT_SET we can easily convert / pivot rows into columns in Hive.

You can learn more about COLLECT_SET in Hive at COLLECT_SET AND COLLECT_LIST IN HIVE

pivot table - Transpose rows to columns in Hive

Mahesh Mogal

I am passionate about Cloud, Data Analytics, Machine Learning, and Artificial Intelligence. I like to learn and try out new things. I have started blogging about my experience while learning these exciting technologies.

Stay Updated with Latest Blogs

Get latest blogs delivered to your mail directly.

Recent Posts

Where and Filter in Spark Dataframes

In this blog, we will learn how to filter rows from spark dataframe using Where and Filter functions.

Where and Filter in Spark Dataframes
Read More
Distinct Rows and Distinct Count from Spark Dataframe

Getting distinct values from columns or rows is one of most used operations. We will learn how to get distinct values as well as count of distinct values.

Distinct Rows and Distinct Count from Spark Dataframe
Read More
Sorting in Spark Dataframe

In this blog, we will learn how to sort rows in spark dataframe based on some column values.

Sorting in Spark Dataframe
Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

2 comments on “Pivot rows to columns in Hive”

linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram
Share via
Copy link
Powered by Social Snap