Analyticshut

Big Data

Hive

Bucketing in Hive

By Mahesh Mogal | August 24, 2020 | November 25, 2024

With bucketing in Hive, we can group similar kinds of data and write it to a single file. This improves performance when reading data and when joining two tables.

Hive

Alter Table Partitions in Hive

By Mahesh Mogal | August 19, 2020 | November 25, 2024

We have created partitioned tables and inserted data into them. Now we will learn how to drop a partition from a table in Hive or add a new one.

Hive

Static vs Dynamic Partitioning in Hive

By Mahesh Mogal | August 17, 2020 | November 25, 2024

Hive supports static and dynamic partitions. Let us understand the difference between them and their use cases.

Hive

Partitioning in Hive

By Mahesh Mogal | August 15, 2020 | November 25, 2024

Using partitioning, we can improve Hive query performance. But if we do not choose the partitioning column correctly, it can create a small-files issue.

Spark

Adding Custom Schema to Spark Dataframe

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to specify a custom schema, with column names and data types, for Spark DataFrames.

Spark

Reading data from a file in Spark

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to load data from JSON, CSV, TSV, pipe-delimited, or any other type of delimited file into a Spark DataFrame.

Hive

Hive Data Manipulation – Loading Data to Hive Tables

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to load data into Hive tables. We will also learn how to copy data to Hive tables from the local system.

Hive

Create, Alter, Delete Tables in Hive

By Mahesh Mogal | August 11, 2020 | November 25, 2024

We will learn how to create Hive tables, alter table columns, add comments and table properties, and delete Hive tables.

Hive

Creating Database in Hive

By Mahesh Mogal | August 11, 2020 | November 25, 2024

We will learn how to create databases in Hive, along with simple operations like listing databases, setting a database location in HDFS, and deleting a database.

Hive

Data Types in Hive

By Mahesh Mogal | May 27, 2020 | November 25, 2024

Like SQL, Hive supports multiple data types. On top of that, Hive has several complex data types, which make it easier to process data.

Hive

External vs Internal (Managed) Tables in Hive

By Mahesh Mogal | May 26, 2020 | November 25, 2024

Hive has two types of tables: external and managed. In this blog, we will learn about them and decide which use cases suit each type.

HDFS

What is HDFS – Overview of Hadoop’s distributed file system

By Mahesh Mogal | December 8, 2019 | November 25, 2024

HDFS is the distributed file system used by Hadoop; its design is modeled on the Google File System (GFS). It provides a reliable, highly available store for data processing. Let us take a look at HDFS and its architecture.

© 2025 Analyticshut
