Analyticshut
  • Home
  • AWS
    • IAM
    • S3
  • Big Data
    • Spark
    • Hive
    • Sqoop
    • HDFS
  • Kafka

Data Manipulation

Spark

Where and Filter in Spark Dataframes

By Mahesh Mogal | October 22, 2020 | November 25, 2024

In this blog, we will learn how to filter rows from a Spark DataFrame using the where and filter functions.

Spark

Removing White Spaces From Data in Spark

By Mahesh Mogal | October 9, 2020 | November 25, 2024

White spaces can be a headache if they are not removed before processing data. We will learn how to remove spaces from data in Spark using built-in functions.

Spark

Adding White Spaces to Data in Spark Dataframe

By Mahesh Mogal | October 6, 2020 | November 25, 2024

In this blog, we will learn how to use the rpad and lpad functions to add padding to data in a Spark DataFrame.

Spark

String Functions in Spark

By Mahesh Mogal | October 2, 2020 | November 25, 2024

This blog is intended to be a quick reference for the most commonly used string functions in Spark. It will cover all of the core string processing operations that are supported by Spark. In addition, it should serve as a useful guide for users who wish to easily integrate these into their own applications.

Spark

Select Expr in Spark Dataframe

By Mahesh Mogal | September 17, 2020 | November 25, 2024

In this blog, we will learn how to use select and expr with a Spark DataFrame. We will also walk through several use cases, including selectExpr.

Hive

Inserting Data In Hive Partitioned tables

By Mahesh Mogal | August 27, 2020 | November 25, 2024

In this blog, we will learn how to insert data into the partitions of a Hive table. We will write queries to insert data into static as well as dynamic partitions.

Spark

Adding Custom Schema to Spark Dataframe

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to specify a custom schema, with column names and data types, for Spark DataFrames.

Spark

Reading data from a file in Spark

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to load data from JSON, CSV, TSV, pipe-delimited, or any other type of delimited file into a Spark DataFrame.

Hive

Hive Data Manipulation – Loading Data to Hive Tables

By Mahesh Mogal | August 12, 2020 | November 25, 2024

We will learn how to load data into a Hive table. We will also learn how to copy data into Hive tables from the local file system.

© 2025 Analyticshut
