In the cloud (AWS, Azure, GCP, etc.), cost and access management are crucial for any organization that wants to maximize profit and keep its cloud and applications secure.
In this article, I'll share some hard lessons I've learned about AWS Cloud throughout my professional career. You could call these AWS best practices.
Cost Management
In the cloud, cost matters because your organization is in business to make a profit. The dots are connected:
- The organization wants maximum profits.
- Profit will come if your product/application works smoothly.
- A resilient, smooth application comes from cloud services. You could also achieve this with the organization's own data centers, but that is much costlier than renting cloud services.
- Using cloud services costs money.
- The money comes from the organization.
So everything is connected. That is why organizations hire cloud engineers to deploy applications at minimal cost, because there are multiple ways to deploy an application:
- If you have a static website, you can host it in an S3 bucket.
- If you have a dynamic web application, you can use EC2 instances.
- If you want to implement disaster recovery, use Auto Scaling so your EC2 instances scale with the application's requirements.
- If your web application can be broken into microservices, you can jump to Elastic Kubernetes Service (EKS).
There are a lot of alternatives available depending on your requirements, such as Lightsail, Elastic Beanstalk, and Lambda functions. Still, it is the cloud or DevOps engineer's responsibility to pick the service that serves the business best.
At this point, I will share a few worst practices you should avoid, along with the best practices that will optimize your cloud spend.
- When I started learning AWS services, my main aim was simply to get the services created and roughly correct; creation was the whole focus. But as you know, the cloud model is "pay as you go", which means you also have to take care of the cost. At that moment, I had never heard of AWS Cost Explorer, a service that plays a crucial role in cloud cost management.
- In AWS Cost Explorer, you will find the detailed cost of each AWS service. It helps you keep track of your AWS resources along with their pricing.
- But there was one concern: you get your bill at the end of the month (or mid-month), while what you really want is something like an alarm in AWS. For that, AWS Budgets is the service that lets you set cost alerts for your AWS account. In my organization, we estimated the project's cost at $5,000 per month. After the estimation, we needed to keep tracking the bills and make sure we did not cross that $5,000 monthly limit, so I set up an AWS Budget for $5,000. A budget tracks three things: the predefined budget amount, the amount used to date, and the forecasted amount. You can leverage all three to keep track of your cloud costs.
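To make the budget setup concrete, here is a minimal sketch of the request body such an alert could use. The dict shapes follow what boto3's `budgets` client expects for `create_budget()`; the account ID and email address below are placeholders, not values from our setup.

```python
# Sketch: the request payload an AWS Budgets cost alert could use.
# The shapes match boto3's budgets.create_budget(); the account ID
# and email below are placeholders.

def monthly_cost_budget(name, limit_usd, email, threshold_pct=80):
    """Build a monthly cost budget that alerts at a % of the limit."""
    budget = {
        "BudgetName": name,
        "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    }
    notifications = [{
        "Notification": {
            "NotificationType": "ACTUAL",          # alert on real spend,
            "ComparisonOperator": "GREATER_THAN",  # once it crosses
            "Threshold": threshold_pct,            # this % of the limit
            "ThresholdType": "PERCENTAGE",
        },
        "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
    }]
    return budget, notifications

budget, notifications = monthly_cost_budget("project-monthly", 5000, "ops@example.com")
# To actually apply it (needs AWS credentials):
# boto3.client("budgets").create_budget(
#     AccountId="123456789012", Budget=budget,
#     NotificationsWithSubscribers=notifications)
```

Setting the threshold below 100% gives you time to react before the limit is actually breached.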
Is the Lambda Function a Game Changer in AWS?
If you're looking for advocates of AWS Lambda, you'll find many who will tell you it's a great serverless option for quickly deploying your code without needing to manage servers. That is true to an extent, but I would like to show you the dark side of Lambda functions.
First of all, let's understand the pricing structure and how AWS charges for Lambda functions.
The more requests a Lambda function receives, the higher the cost. You can check the Lambda pricing published by AWS itself.
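To see how request volume drives the bill, here is a rough back-of-the-envelope estimate. The rates in the sketch are the commonly cited us-east-1 on-demand prices (about $0.20 per million requests and $0.0000166667 per GB-second), ignore the free tier, and may change; always check AWS's pricing page for current numbers.

```python
# Rough Lambda cost estimate. The rates below are the commonly
# cited us-east-1 on-demand prices and may change; check the AWS
# pricing page. The free tier is ignored for simplicity.

PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667

def lambda_monthly_cost(requests, avg_duration_ms, memory_mb):
    request_cost = requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    # Compute cost is billed in GB-seconds: duration times memory.
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    return request_cost + compute_cost

# 10M requests/month, 200 ms average duration, 512 MB memory:
print(round(lambda_monthly_cost(10_000_000, 200, 512), 2))  # → 18.67
```

The key point: cost scales linearly with request count, so anything that multiplies invocations (like the recursion bug described below in this article's Lambda story) multiplies the bill by the same factor.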
It looks fairly cheap, and another selling point is that you don't need to manage any servers yourself to deploy your code.
But there is one issue that we encountered.
Once you select a Lambda function to deploy your code, the developer becomes the main hero of your application deployment.
We decided to use a Lambda function in a production environment and deployed our code to it. (I can't disclose which programming language we used.)
But soon after the deployment, in the middle of the night, we got multiple AWS Budget alerts. The next morning, when we checked them, the bill had jumped to around 400% of the previous month's.
Now we needed to find the root cause (RCA) behind this jump in the bill.
To find it, we simply went to Cost Explorer, looked at which service generated the bill, and found it was the Lambda function. That is one of the key takeaways: Cost Explorer is how you find the reason behind an unexpected bill.
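The same per-service breakdown we used in the console can also be pulled programmatically. The query below follows the shape boto3's Cost Explorer client (`ce.get_cost_and_usage`) expects; the dates are placeholders.

```python
# Sketch: a Cost Explorer query that groups a month's spend by
# service, the programmatic version of the console view we used.
# Shape follows boto3's ce.get_cost_and_usage; dates are placeholders.

def cost_by_service_query(start, end):
    return {
        "TimePeriod": {"Start": start, "End": end},  # ISO dates, End exclusive
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }

query = cost_by_service_query("2024-05-01", "2024-06-01")
# boto3.client("ce").get_cost_and_usage(**query)  # needs AWS credentials
```

Sorting the grouped results by cost immediately surfaces the service responsible for a spike, which is exactly how Lambda showed up in our case.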
Once we knew it was the Lambda function, we quickly went to CloudWatch to check what was happening. We saw that the function was running in a loop: because of a chunk of code, the Lambda function was calling itself, i.e., recursion. And as I said, the more requests a Lambda function receives, the bigger the bill it generates.
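One cheap safeguard against this failure mode (besides capping the blast radius with reserved concurrency) is to carry an invocation-depth counter in the event payload and refuse to re-invoke past a limit. This is a minimal sketch, not the code we actually ran: the `_depth` key, the `MAX_DEPTH` limit, and the commented-out re-invocation are all my own illustration.

```python
# Sketch of a recursion guard for a self-invoking Lambda. The
# "_depth" event key and MAX_DEPTH limit are illustrative; the
# actual re-invocation (boto3 lambda.invoke) is left commented out.

import json

MAX_DEPTH = 3  # hard cap on how many times the chain may re-invoke

def handler(event, context=None):
    depth = event.get("_depth", 0)
    if depth >= MAX_DEPTH:
        # Stop the chain instead of looping (and billing) forever.
        return {"status": "stopped", "depth": depth}

    # ... do the actual work here ...

    next_event = dict(event, _depth=depth + 1)
    # boto3.client("lambda").invoke(
    #     FunctionName=context.function_name,
    #     InvocationType="Event",
    #     Payload=json.dumps(next_event))
    return {"status": "continued", "depth": depth, "next": next_event}
```

AWS has since added recursive-loop detection for some event sources, as far as I know, but an explicit guard like this remains cheap insurance for any self-invoking design.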
So, as I told you, if a Lambda function is selected to deploy your code, your developer is the main hero. But here, the developer turned out to be the villain.
Don't take that comment too seriously. Still, developers need to write optimized code; otherwise it can introduce vulnerabilities and plenty of other problems that are not good for your applications.
Access Management
So, let me share one more story to help you understand the importance of access management in the AWS Cloud.
Story of giving God (Administrator) permission to EKS node groups:
I set up an EKS cluster through Terraform, which requires a lot of services, such as VPC, EC2, and, most importantly, IAM.
If you are not aware of the AWS EKS service, here is a small brief about it.
EKS stands for Elastic Kubernetes Service, which means managed Kubernetes on the AWS Cloud. Kubernetes then deploys your application onto its node groups.
So those nodes need a lot of permissions, such as ECR access (our Docker image was stored in ECR) and SSM access (our parameters were stored in AWS Parameter Store).
We had two ways to get the required access. The first was to configure the AWS CLI manually with an access key and secret access key; the other was to attach roles to the nodes. We chose the better way: attaching roles. To create a role, we also need to attach policies; otherwise, the role is of no use.
According to the requirement, I added Administrator Access, which was my biggest mistake. Attaching an Administrator Access policy is fine for your own practice project, for testing purposes only. If you are working in industry, you should not grant Administrator privileges. Even AWS promotes the least privilege rule when permitting any kind of access to any resource.
A few months later, when I was reading about the least privilege rule, I realized I had made a big blunder by adding Administrator access. If someone gains access to your EKS cluster, they can do anything with your AWS Cloud: say, Bitcoin mining, which eventually generates a huge cost in your account, or configuring vulnerabilities in your application, or compromising your database.
So, just after reading that article, I went to my Terraform configuration and removed the Administrator Access first. After that, I added only the required policies, ECRFullAccess and SSMReadOnlyAccess. I hope this story helps you understand the importance of least-privilege IAM permissions.
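For illustration, a scoped-down policy document for such a node role could look like the sketch below, built as a plain dict in the same shape as an IAM policy JSON. The exact action list (ECR image pull plus SSM parameter reads) and the `/myapp/*` parameter path are my assumptions about what such nodes minimally need; tailor both to your workload.

```python
# Illustrative least-privilege policy document for an EKS node role:
# pull images from ECR and read parameters from SSM, nothing more.
# The action list and the "/myapp/*" path are assumptions; adjust
# them to your workload.

import json

def node_policy(region, account_id, param_path="/myapp/*"):
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EcrPull",
                "Effect": "Allow",
                "Action": [
                    "ecr:GetAuthorizationToken",
                    "ecr:BatchGetImage",
                    "ecr:GetDownloadUrlForLayer",
                ],
                "Resource": "*",
            },
            {
                "Sid": "SsmRead",
                "Effect": "Allow",
                "Action": ["ssm:GetParameter", "ssm:GetParametersByPath"],
                # Scope reads to one parameter path, not the whole store.
                "Resource": f"arn:aws:ssm:{region}:{account_id}:parameter{param_path}",
            },
        ],
    }

print(json.dumps(node_policy("us-east-1", "123456789012"), indent=2))
```

Compared with Administrator Access, a compromised node with this policy can at worst pull your images and read one parameter subtree, not mine Bitcoin on your account.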
A few best practices related to access management and cost management that you should follow:
- Instead of using the AWS root account, create users through the IAM service following least privilege principles.
- Enable CloudTrail to log your AWS account activity; it will inform you about every operation happening in your account.
- Use Cost Explorer to track the cost of your AWS services, and make use of AWS Budgets.
- Using Cost Explorer alone won't optimize your cloud. You need to keep improving your entire infrastructure by eliminating zombie resources.
- Start using NACLs along with AWS security groups to deny specific IP ranges for your application.
- If your backend applications/servers need a load balancer, choose an internal load balancer instead of an internet-facing one.
Note: Zombie resources are resources that keep running without serving any purpose.
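As a concrete example of the NACL suggestion above, here is what a deny rule's parameters could look like. The dict follows the shape boto3's `ec2.create_network_acl_entry` accepts; the ACL ID and CIDR are placeholders.

```python
# Sketch: parameters for an inbound NACL rule that denies one CIDR.
# Shape follows boto3's ec2.create_network_acl_entry; the ACL ID and
# CIDR below are placeholders.

def deny_cidr_entry(acl_id, cidr, rule_number=90):
    return {
        "NetworkAclId": acl_id,
        "RuleNumber": rule_number,   # NACL rules are evaluated lowest-first,
                                     # so this must sort before any allow rule
        "Protocol": "-1",            # -1 means all protocols
        "RuleAction": "deny",
        "Egress": False,             # inbound rule
        "CidrBlock": cidr,
    }

params = deny_cidr_entry("acl-0123456789abcdef0", "203.0.113.0/24")
# boto3.client("ec2").create_network_acl_entry(**params)  # needs credentials
```

Unlike security groups, NACLs support explicit deny rules, which is why they are the right tool for blocking specific IP ranges.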
Conclusion
Lastly, I would like to emphasize the importance of continuous learning. The best ways to avoid unnecessary spending and excessive privileges in AWS are always changing. Keep using tools such as AWS Cost Explorer and AWS Budgets to stay aware of your expenditure.
In addition, ensure that you practice the least privilege principle for securing your resources.
Any issue you encounter with the AWS Cloud is an opportunity to learn and improve. By adhering to these best practices, you can govern costs and secure your applications, thereby maximizing the benefits of using AWS. Keep learning, stay vigilant, and continuously optimize your cloud infrastructure for a better journey and more profit.