Working With Timestamps in Spark

Updated On September 13, 2020 | By Mahesh Mogal

Working with timestamps while processing data can be a headache sometimes. Luckily Spark has some in-built functions to make our life easier when working with timestamps. Let us go over these functions.

current_timestamp - Getting Current Timestamp

We can get current timestamp using current_timestamp function.

>>> from pyspark.sql.functions import current_date,current_timestamp
>>>
>>> df = spark.range(2) \
...     .withColumn("current_date", current_date()) \
...     .withColumn("current_timestamp", current_timestamp())
>>> df.show()
+---+------------+--------------------+
| id|current_date|   current_timestamp|
+---+------------+--------------------+
|  0|  2020-08-19|2020-08-19 11:21:...|
|  1|  2020-08-19|2020-08-19 11:21:...|
+---+------------+--------------------+

date_trunc - Truncated Date From Timestamp

We can get truncated date from timestamp using code below.

from pyspark.sql.functions import date_trunc
df.select("current_timestamp", \
    date_trunc("year", col("current_timestamp")), \
    date_trunc("yyyy", col("current_timestamp")), \
    date_trunc("month", col("current_timestamp")), \
    ).show()
timestamp_trunc

Converting Timestamp to Unix Timestamp

Spark has provided function to convert timestamp to Unix Timestamp.

>>> from pyspark.sql.functions import unix_timestamp
>>> df.select("current_timestamp", \
...     unix_timestamp(col("current_timestamp")) \  
...     ).show()
+--------------------+------------------------------------------------------+
|   current_timestamp|unix_timestamp(current_timestamp, yyyy-MM-dd HH:mm:ss)|
+--------------------+------------------------------------------------------+
|2020-08-19 11:27:...|                                            1597850861|
|2020-08-19 11:27:...|                                            1597850861|

Get Timestamp from Unix Timestamp

>>> df_unix =  df.withColumn("unix_timestamp", unix_timestamp(col("current_timestamp")))
>>> 
>>> from pyspark.sql.functions import from_unixtime
>>> df_unix.select("unix_timestamp", \
...     from_unixtime(col("unix_timestamp")) \
...     ).show()
+--------------+--------------------------------------------------+
|unix_timestamp|from_unixtime(unix_timestamp, yyyy-MM-dd HH:mm:ss)|
+--------------+--------------------------------------------------+
|    1597850920|                               2020-08-19 11:28:40|
|    1597850920|                               2020-08-19 11:28:40|
+--------------+--------------------------------------------------+

Get Timestamp from Date

We can also get from timestamp using to_timestamp function

>>> from pyspark.sql.functions import to_timestamp,col
>>> df.select("current_date", \
...     to_timestamp(col("current_date"), "yyyy-MM-dd") \
...     ).show()
+------------+------------------------------------------+
|current_date|to_timestamp(`current_date`, 'yyyy-MM-dd')|
+------------+------------------------------------------+
|  2020-08-19|                       2020-08-19 00:00:00|
|  2020-08-19|                       2020-08-19 00:00:00|
+------------+------------------------------------------+

I hope you found this helpful. See you in in next blog.

.

Mahesh Mogal

I am passionate about Cloud, Data Analytics, Machine Learning, and Artificial Intelligence. I like to learn and try out new things. I have started blogging about my experience while learning these exciting technologies.

Stay Updated with Latest Blogs

Get latest blogs delivered to your mail directly.

Recent Posts

Sorting in Spark Dataframe

In this blog, we will learn how to sort rows in spark dataframe based on some column values.

Read More
Removing White Spaces From Data in Spark

White spaces can be a headache if not removed before processing data. We will learn how to remove spaces from data in spark using inbuilt functions.

Read More
Padding Data in Spark Dataframe

In this blog, we will learn how to use rpad and lpad functions to add padding to data in spark dataframe.

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

linkedin facebook pinterest youtube rss twitter instagram facebook-blank rss-blank linkedin-blank pinterest youtube twitter instagram
Share via
Copy link
Powered by Social Snap