Altering Table Partitions in Hive

Updated On August 11, 2020 | By Mahesh Mogal

We have created partitioned tables, inserted data into them. Now, what if we want to drop some partition or add a new partition to the table? Can we have one partition at different locations? Let us try to answer these questions in this blog post.

Adding Partition To Table

We can run below query to add partition to table.

ALTER TABLE order_partition_extrenal 
ADD PARTITION(year='2014', month='02')

Hive is metastore for tables. Here we are adding new information about partition to table metadata. We are telling hive this partition for this table is has its data at this location.

Does this mean we can have our partitions at diffrent locations?

Of course we can. Not just in different locations but also in different file systems.

Consider use case, you have a huge amount of data but you do not use old data that frequently (something like log data). In that case, you can set up a job that will move old data to S3 ( It is Amazons cheap store service. You can learn more about it here). And then point those old partitions to S3 location. So your latest data will be in HDFS and old partitions in S3 and you can query that hive table seamlessly.

ALTER TABLE log_messages ADD PARTITION (year = 2019, month = 12)
LOCATION 's3n://bucket_name/logs/2019/12';

Updating & Renaming Partitions in Hive Tables

With Alter table command, we can also update partition table location.

ALTER TABLE log_messages PARTITION (year = 2019, month = 12)
SET LOCATION '/maheshmogal.db/order_new/year=2019/month=12';

We can also rename existing partitions using below query.

ALTER TABLE order_partition_extrenal PARTITION (year=2013,month=7)
RENAME TO PARTITION(year=2019,month=07);

Dropping Partition from Hive Tables

We can also drop partition from hive tables.

ALTER TABLE order_partition_extrenal
DROP PARTITION (year=2019, month=7);

This will delete the partition from the table. But what about data when you have an external hive table? Hive doe not drop that data. It just removes these details from table metadata. If you also want to drop data along with partition fro external tables then you have to do it manually.

hadoop fs -rmr /maheshmogal.db/order_new/year=2019/month=7

This was a short article, but quite useful. I hope you will find it useful. Partitioning is one of the important topics in the Hive. Partitioning is also one of the core strategies to improve query performance in a hive. In the last few articles, we have covered most of the details of Partitioning in Hive. Next, we will start learning about bucketing an equally important aspect in Hive with its unique features and use cases. Hope to see you there.

altering hive table partition

Mahesh Mogal

I am passionate about Cloud, Data Analytics, Machine Learning, and Artificial Intelligence. I like to learn and try out new things. I have started blogging about my experience while learning these exciting technologies.

Stay Updated with Latest Blogs

Get latest blogs delivered to your mail directly.

Recent Posts

Select Expr in Spark Dataframe

In this blog, we will learn how to use select and expr in the Spark data frame. We will learn multiple use cases along with selectExpr.

Read More
Add, Rename, Drop Columns in Spark Dataframe

We will go through common column operations like add, rename, list, select, and dropping a column from spark dataframe.

Read More
MSCK Repair - Fixing Partitions in Hive Table

We will learn how to add multiple partitions to hive table using msck repair table command in hive.

msck repair hive
Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram
Share via
Copy link
Powered by Social Snap