Big Data | Page 3 of 4 | Analyticshut

Hive

Bucketing in Hive

By Mahesh Mogal | August 24, 2020 | November 25, 2024

With bucketing in Hive, rows are assigned to a fixed number of buckets based on the value of a chosen column, so rows with the same value are always written to the same file. This improves performance when reading data and when joining two tables.
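The idea can be sketched in plain Python. This is only an illustration, not Hive itself: the function names and sample rows are hypothetical, and a simple modulo on an integer key is assumed as a stand-in for Hive's own hash function.

```python
from collections import defaultdict


def bucket_for(key: int, num_buckets: int) -> int:
    """Pick a bucket index for a key (assumption: modulo stands in for Hive's hash)."""
    return key % num_buckets


def write_buckets(rows, key_field, num_buckets):
    """Group rows by bucket index; each bucket plays the role of one output file."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_for(row[key_field], num_buckets)].append(row)
    return dict(buckets)


# Hypothetical sample data, bucketed on user_id into 4 buckets.
rows = [
    {"user_id": 1, "city": "Pune"},
    {"user_id": 2, "city": "Mumbai"},
    {"user_id": 5, "city": "Delhi"},
    {"user_id": 6, "city": "Nagpur"},
]
buckets = write_buckets(rows, "user_id", 4)
```

Because the same key always lands in the same bucket, two tables bucketed the same way on the join column only need to compare matching buckets.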

Hive

Alter Table Partitions in Hive

By Mahesh Mogal | August 19, 2020 | November 25, 2024

We have created partitioned tables and inserted data into them. Now we will learn how to drop a partition from a table or add a new one in Hive.

Hive

Static vs Dynamic Partitioning in Hive

By Mahesh Mogal | August 17, 2020 | November 25, 2024

Hive supports static and dynamic partitions. Let us understand the difference between them and their use cases.

Hive

Partitioning in Hive

By Mahesh Mogal | August 15, 2020 | November 25, 2024

Using partitioning, we can improve Hive query performance. But if we do not choose the partitioning column carefully, it can create a small-files problem.
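A rough way to see the small-files problem is to count how many rows would land in each partition for different candidate columns. The sketch below is plain Python rather than Hive, and the `orders` data and `partition_sizes` helper are hypothetical names used only for illustration.

```python
from collections import Counter


def partition_sizes(rows, partition_column):
    """Count rows per distinct value; each distinct value would become one partition."""
    return Counter(row[partition_column] for row in rows)


# Hypothetical sample data.
orders = [
    {"order_id": 101, "country": "IN"},
    {"order_id": 102, "country": "IN"},
    {"order_id": 103, "country": "US"},
    {"order_id": 104, "country": "IN"},
]

# Low-cardinality column: few partitions, several rows in each.
by_country = partition_sizes(orders, "country")

# High-cardinality column: one partition per row, i.e. many tiny files.
by_order_id = partition_sizes(orders, "order_id")
```

Here `country` yields two partitions, while `order_id` yields one partition per row, which is the pattern that leads to many small files.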

Spark

Adding Custom Schema to Spark Dataframe

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to specify a custom schema, with column names and data types, for Spark DataFrames.

Spark

Reading data from a file in Spark

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to load data from JSON, CSV, TSV, pipe-delimited, or any other type of delimited file into a Spark DataFrame.

Hive

Hive Data Manipulation – Loading Data to Hive Tables

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to load data into Hive tables. We will also learn how to copy data into Hive tables from the local file system.

Hive

Create, Alter, Delete Tables in Hive

By Mahesh Mogal | August 11, 2020 | November 25, 2024

We will learn how to create Hive tables, alter table columns, add comments and table properties, and delete Hive tables.

Hive

Creating Database in Hive

By Mahesh Mogal | August 11, 2020 | November 25, 2024

We will learn how to create databases in Hive, along with simple operations like listing databases, setting a database location in HDFS, and deleting a database.

Hive

Data Types in Hive

By Mahesh Mogal | May 27, 2020 | November 25, 2024

Like SQL, Hive supports multiple data types. On top of that, Hive has several complex data types, which make it easier to process data.

Hive

External Vs Internal(Managed) Tables in Hive

By Mahesh Mogal | May 26, 2020 | November 25, 2024

Hive has two types of tables, external and managed. In this blog, we will learn about them and decide which use cases suit each type.

HDFS

What is HDFS – Overview of Hadoop’s distributed file system

By Mahesh Mogal | December 8, 2019 | November 25, 2024

HDFS is a distributed file system modeled on the Google File System and used by Hadoop. It provides a reliable, highly available store for data processing. Let us take a look at HDFS and its architecture.