Identity and Access Management (IAM) is a service provided by AWS to secure your AWS account and other AWS services. IAM allows you to manage users and their level of access to AWS services. It is important to learn what IAM is and how to use it to keep AWS secure. In this article, we will understand what IAM is and walk through its best practices for securing an AWS account. If you have not created your free tier account yet, you can check this article and create your free AWS account.
IAM gives you easy ways to secure your AWS account and resources. You can create new users and assign only the required privileges to them, which protects your resources from unauthorized access. You can set up granular permissions, for example read-only access to a single S3 bucket for one user. You also have the option to create user groups and manage their privilege levels in one place.
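To make "granular permissions" concrete, here is a minimal sketch of what a read-only policy for one S3 bucket could look like. It only builds the JSON policy document locally; the helper name `s3_read_only_policy` and the bucket name are made up for illustration, and you would still need to attach the resulting document to a user or group through the console, CLI, or an SDK.

```python
import json


def s3_read_only_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read-only access to one bucket.

    The bucket name is supplied by the caller; nothing here talks to AWS.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing keys is an action on the bucket itself
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading objects is an action on the objects in the bucket
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# Hypothetical bucket name, used only for this example
policy = s3_read_only_policy("example-reports-bucket")
print(json.dumps(policy, indent=2))
```

Because the document contains no write or delete actions, a user holding only this policy can list and download objects in that bucket and nothing else.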
For extra security, you can enable multi-factor authentication from the IAM dashboard. IAM also lets you set up a password policy, such as a minimum number of characters or password expiry. IAM integrates with AWS services seamlessly and helps us manage access between different AWS services. IAM also supports PCI DSS compliance (a security standard for the payment card industry). All these features make IAM a useful, must-know service for everyone who wants to use AWS.
There are a few key terms you should know, because IAM builds its security features on them: users, groups, policies, and roles. Each of them comes up in the best practices below.
With IAM you can make your AWS account and its resources secure. There are some best practices for using IAM that improve security further. Let us take a look at them.
When you log in to your AWS account and open the IAM dashboard, you will see that there are multiple warnings. We will implement the following best practices and, along the way, pick up green ticks for all the warnings.
When you register your AWS account, you begin with one user who can sign in using the email ID that you used to create the account. This user is called the root user, and it basically has god mode active for your AWS account. It has access to all the AWS services and can change privilege levels for other users. It is advisable to avoid using the root user as much as you can. You should always create a separate user and use that to access the AWS account, even if you are the only one using the account. So let us create our first user.
To create a user, click on Users in the left column. You will see the users page. Here, click on the Add User button. On the add user page, enter a user name for the new user. Then you need to select what kind of access this user will have.
Programmatic access is used to reach AWS resources through the API (Python, Java, etc.) and the AWS CLI (Command Line Interface). AWS Management Console access is used to log in to the AWS dashboard from the browser. Select both for this user. Once you select them, choose a password for the new user. The "Require password reset" option forces the new user to change the password when they log in for the first time. You can keep this selected or change it depending on your requirements.
Next, you will be asked to add this new user to a group. We do not have a group yet, so you can skip this step, or you can create a new group using the steps described in the section on creating a group below.
You can add tags to the new user. Tags are key-value pairs that help you identify and organize the resources in your AWS account.
On the next screen, you can review the new user details. After reviewing, click on the Create user button at the bottom right corner of the screen. This will create a new user for your AWS account.
Please remember that this page is the only place where you will see the new user's access key. Once you leave this page, you will not have any way to retrieve it, so please download the user details using the Download .csv button.
Think of user groups in AWS as logical sets of users. As a real-world scenario, suppose you have an HR team, a Developer team, a Testing team, and an Admin team. The users belonging to each team will have different access requirements. For example, the Testing team will need read-only access to code, while the Developer team will need write access to code. We can easily manage access levels for teams by creating groups in IAM. Users in one group will have similar access levels.
AWS recommends managing user access by creating groups and assigning permissions to the group rather than to individual users. This makes managing user access levels easy and streamlined. In the future, if you need to change access levels for a team, you only need to make the change at the group level.
To create your group, click on Groups in the left side panel. Then click on the "Create New Group" button.
The first step is to enter a name for your group. Generally, you should choose a name that identifies the access level or business function of the users who are going to be part of that group, for example Admin, HR, Testers, etc.
On the next screen, you need to add access levels for this group. You do this by attaching a specific policy to the group. AWS provides a set of managed policies that you can attach to your group.
As we are creating the admin group, we can attach the AdministratorAccess policy. You can see what that policy looks like in AWS.
On the last screen, we can review the details of the group we are creating and finally click on the "Create Group" button. The group is created, and we can see it on the groups dashboard in IAM.
With IAM you can set up a strong password policy for users. Strong passwords reduce the risk of others guessing them.
To reach the password policy screen, click on the "Account settings" option in the left panel. Here you can set different rules to enforce strong passwords. You can change the minimum password length. You can also require passwords to contain at least one uppercase letter, one lowercase letter, or one number. Choose your password policy rules and click on the "Apply password policy" button.
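To get a feel for how these rules combine, here is a small local sketch that checks a candidate password against the same kinds of rules. It is only an illustration: the `PasswordPolicy` class, its field names, and the default values are assumptions made for this example, and this is not how AWS evaluates passwords internally.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PasswordPolicy:
    # Hypothetical settings mirroring the options described above
    min_length: int = 8
    require_uppercase: bool = True
    require_lowercase: bool = True
    require_number: bool = True


def violations(password: str, policy: PasswordPolicy) -> List[str]:
    """Return a list of the rules that the password breaks."""
    problems = []
    if len(password) < policy.min_length:
        problems.append(f"shorter than {policy.min_length} characters")
    if policy.require_uppercase and not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if policy.require_lowercase and not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if policy.require_number and not any(c.isdigit() for c in password):
        problems.append("no number")
    return problems


strict = PasswordPolicy(min_length=12)
print(violations("short", strict))           # breaks three rules
print(violations("Str0ngPassw0rd", strict))  # breaks none
```

Raising the minimum length and requiring more character classes both shrink the set of passwords an attacker can guess quickly.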
A strong password alone is not enough; you should also set up multi-factor authentication (MFA) for your account. AWS recommends setting up MFA for your root account, because it adds an extra layer of security. Setting up MFA is not as difficult as it sounds. You can take your Android or iOS device, download an authenticator app, link it with your AWS account, and you are done. If you are still a little confused, let's see how to set up MFA in a few easy steps.
On the IAM dashboard, click on Activate MFA on your root account and then on the "Manage MFA" button.
On the next screen, it will warn you that you are making changes to the root account. You can close that warning. Select Multi-Factor authentication.
Then you will have three options to choose from. As we are going to use a mobile application to set up MFA, we have to choose a virtual MFA device. It will be selected by default. Click on the Continue button on that popup.
On the next screen, you will see the QR code. Click on the "Show QR code" button if it is not visible. From your mobile app store, download Google Authenticator or any other authenticator app that you want to use. Open the authenticator app and scan the QR code. To link your app with the AWS account, you need to enter two consecutive MFA codes generated by the app. Once you have entered the two codes, click on the "Assign MFA" button. This will set up your MFA, and you will need to enter an MFA code each time you sign in to the AWS root account.
Remember to take a screenshot of this QR code. If you lose your mobile, you can scan this QR code from a new mobile and use it to log in to your account.
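It can help to know what the authenticator app is doing with that QR code. Apps of this kind typically implement time-based one-time passwords (TOTP, RFC 6238): the QR code carries a shared secret, and the app derives a new six-digit code from that secret and the current time every 30 seconds. The sketch below is a minimal standard-library version of that calculation, using the published RFC test secret rather than a real MFA seed; the exact parameters your app uses are an assumption here.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (HMAC-SHA1 variant)."""
    counter = unix_time // step  # number of 30-second windows since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


# Test secret from the RFC, NOT a real MFA seed
secret = b"12345678901234567890"

first_code = totp(secret, 59)   # last second of one 30-second window
second_code = totp(secret, 60)  # first second of the next window
print(first_code, second_code)  # 287082 359152
```

This also explains why the setup asks for two consecutive codes: two codes from neighbouring time windows show that the app holds the right secret and that its clock is in step with the server. It is also why anyone holding a copy of the QR code can generate valid codes, so keep that screenshot somewhere safe.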
When you create a new user in AWS, they have no access to any AWS service. It is best practice to start every user with no access and then add permissions only as they are required. This way, you will not give users unnecessary access that they do not need and could misuse.
After you have used AWS for some time, you will have many users who were created for a specific purpose, such as testing an application or deploying some code. Some users will also leave your organization. In such cases, we need to delete those users' credentials from AWS. Keeping unused credentials active in your AWS account is a risk, so it is recommended that you regularly delete inactive users from your AWS account.
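Finding inactive users is easy to automate once you have each user's last activity date, which you can read from the IAM console or export in a credential report. The sketch below works on a plain dictionary so it runs without AWS access; the user names, dates, and the 90-day threshold are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def inactive_users(
    last_used: Dict[str, Optional[datetime]],
    now: datetime,
    max_idle_days: int = 90,  # assumed threshold; pick what suits your team
) -> List[str]:
    """Return users whose credentials were never used or not used recently."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        name
        for name, seen in last_used.items()
        if seen is None or seen < cutoff
    )


# Hypothetical data: None means the credentials were never used
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
last_used = {
    "alice": datetime(2023, 12, 20, tzinfo=timezone.utc),
    "test-deployer": datetime(2023, 6, 1, tzinfo=timezone.utc),
    "former-employee": None,
}

print(inactive_users(last_used, now))  # ['former-employee', 'test-deployer']
```

The list this produces is a set of candidates to review, not something to delete blindly; a rarely used account may still be needed.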
More often than not, you will face a situation where one AWS service needs to use another AWS service. You will be tempted to create a user with an access key ID and secret access key and use those to communicate between the services. But this is very insecure, as anyone who has access to those AWS services can use the credentials.
That is why AWS recommends using roles. Using roles, you can easily establish a connection between different AWS services. For example, you can create a role with S3 read access and attach it to an EC2 instance, so that EC2 can read from S3.
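A role for this EC2-to-S3 example has two parts: a trust policy that says who may assume the role (here, the EC2 service) and a permissions policy that says what the role may do (here, read from S3). The sketch below assembles both as plain data. The role name and the `describe_role` helper are hypothetical; `AmazonS3ReadOnlyAccess` is an AWS managed policy, and creating the role for real would still be done through the console, CLI, or an SDK.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy letting the EC2 service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def describe_role(role_name: str) -> dict:
    """Bundle the pieces needed to define an S3 read role for EC2."""
    return {
        "RoleName": role_name,
        # The trust policy is passed to AWS as a JSON string
        "AssumeRolePolicyDocument": json.dumps(ec2_trust_policy()),
        # AWS managed policy that grants read-only access to S3
        "ManagedPolicyArns": ["arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"],
    }


role = describe_role("example-ec2-s3-reader")  # hypothetical role name
print(json.dumps(role, indent=2))
```

Notice that no access key appears anywhere. The instance receives temporary credentials through the role, so there is no long-lived secret to leak.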
You should schedule regular reviews of IAM and all of its policies and users. This way you can check whether everything is as it should be. Such reviews will help you catch small issues before they get bigger. You can find unused users or extra permissions given to some users or groups. So it is a good habit to have a review schedule for AWS IAM.
The root account is the most powerful user of your AWS account. It can do anything and everything on the account, so securing it is one of the top priorities. We should always set a strong password and MFA for the root account. Along with that, it is considered good practice to change the root account credentials regularly. This will protect your root account even if somebody somehow learns the root account password.
Though this is not a best practice as such, IAM lets you customize the sign-in link to your AWS account. You can customize that link with your organization name or any name of your choice. The only condition is that the name must be unique across all AWS accounts.
I hope you will take all the above steps and secure your AWS account. AWS has a lot of resources and any unauthorized access can cause loss of data, systems and/or unexpected bills. It is always better to prevent such things. Keep your account safe and see you in the next article.