We have created partitioned tables and inserted data into them. Now, what if we want to drop a partition or add a new partition to the table? Can we have partitions at different locations? Let us try to answer these questions in this blog post.
We can run the below query to add a partition to a table.
ALTER TABLE order_partition_extrenal ADD PARTITION (year='2014', month='02')
LOCATION '/apps/hive/warehouse/maheshmogal.db/order_partition/year=2014/month=02';
Hive keeps table metadata in its metastore. Here we are adding new partition information to the table's metadata: we are telling Hive that the data for this partition of the table lives at this location.
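To confirm that the metastore picked up the new partition and its custom location, we can inspect the table. A quick check, using the table and partition from the example above:

```sql
-- List all partitions registered in the metastore for this table
SHOW PARTITIONS order_partition_extrenal;

-- Show the storage location recorded for one specific partition
DESCRIBE FORMATTED order_partition_extrenal PARTITION (year='2014', month='02');
```

The `Location` field in the DESCRIBE FORMATTED output should match the path we supplied in the ADD PARTITION statement.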
Does this mean we can have our partitions at different locations?
Of course we can. Not just in different locations but also in different file systems.
Consider a use case: you have a huge amount of data, but you do not use the old data frequently (something like log data). In that case, you can set up a job that moves old data to S3 (Amazon's low-cost storage service; you can learn more about it here) and then point those old partitions to the S3 location. Your latest data stays in HDFS, your old partitions live in S3, and you can still query the Hive table seamlessly.
ALTER TABLE log_messages ADD PARTITION (year = 2019, month = 12)
LOCATION 's3n://bucket_name/logs/2019/12';
With the ALTER TABLE command, we can also update a partition's location. Note that this only changes the metadata in the metastore; any existing data files are not moved automatically.
ALTER TABLE log_messages PARTITION (year = 2019, month = 12)
SET LOCATION '/maheshmogal.db/order_new/year=2019/month=12';
We can also rename existing partitions using the below query.
ALTER TABLE order_partition_extrenal PARTITION (year=2013, month=7)
RENAME TO PARTITION (year=2019, month=07);
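To confirm the rename took effect, we can list the table's partitions again (partition specs here match the example above):

```sql
-- The old spec year=2013/month=7 should no longer appear;
-- it should now be listed as year=2019/month=07
SHOW PARTITIONS order_partition_extrenal;
```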
We can also drop partitions from Hive tables.
ALTER TABLE order_partition_extrenal DROP PARTITION (year=2019, month=7);
This will delete the partition from the table. But what about the data when you have an external Hive table? Hive does not drop that data; it only removes the partition details from the table metadata. If you also want to drop the data along with the partition for external tables, then you have to do it manually.
hadoop fs -rmr /maheshmogal.db/order_new/year=2019/month=7
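For managed tables the situation is different: dropping a partition moves its data to the user's trash directory by default. In recent Hive versions you can add the PURGE keyword to delete the files immediately instead of trashing them. A sketch, assuming a managed table named order_partition:

```sql
-- Skip the trash and delete the partition's files right away (managed tables only)
ALTER TABLE order_partition DROP PARTITION (year=2019, month=7) PURGE;
```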
This was a short article, but I hope you will find it useful. Partitioning is one of the most important topics in Hive, and one of the core strategies for improving query performance. In the last few articles, we have covered most of the details of partitioning in Hive. Next, we will start learning about bucketing, an equally important concept in Hive with its own unique features and use cases. Hope to see you there.