Log Management Solutions in the Cloud ☁️ (AWS, Part 1)

Hey there!👋

As a Security Engineer, I work with logs all the time, so I don't need to explain how important it is to have the right logs from the right system at the right time. The volume of logs grows dramatically every year, and we have to handle that volume: collect, store, correlate, and process it.

Whatever your role in IT, you work with logs, and the more logs you have, the more you want to store them in a fast, reliable, convenient solution. And wouldn't it be even better if your team only had to care about the applications, while all the server issues stayed with the cloud provider?
In this series of articles, I want to show you what the main cloud providers have to offer.

AWS

The AWS Solutions portfolio includes a Centralized Logging solution that you can deploy in your account in a few minutes. But don't think of it as the only way to do log management in AWS; think of it as one of hundreds of ways to combine AWS services.
I want to show you the AWS services that you can use to build the solution that fits you best.

CloudWatch

CloudWatch is the main monitoring service in AWS. It offers great functionality such as metrics extracted from logs, dashboards, advanced log searching, and much more. It also integrates with many other AWS services. You can stream logs directly from EC2 instances as well as from on-premises servers via the CloudWatch agent. You can also export logs to S3 or Elasticsearch if you want a copy or long-term storage.
pros:
  • Built-in functionality
  • Easy to get started
  • Serverless
cons:
  • Cost. You will literally pay for everything. For example, just for sending logs to CloudWatch you will pay a Data Ingestion fee of about $0.60 per GB. This is what actually stopped me from using CloudWatch as my main log management solution.
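
To get a feel for how the ingestion fee adds up, here is a minimal back-of-the-envelope sketch. It assumes the approximate price of $0.60 per GB mentioned above (the real rate depends on the region and may have changed) and ignores storage, queries, and everything else you would also pay for. The function name and the sample volumes are made up for illustration.

```python
# Rough monthly cost of CloudWatch Logs data ingestion only.
# ASSUMPTION: the price is the approximate figure quoted in the text,
# not an official rate; check the current pricing for your region.
INGESTION_PRICE_PER_GB = 0.60  # USD


def monthly_ingestion_cost(gb_per_day: float, days: int = 30,
                           price_per_gb: float = INGESTION_PRICE_PER_GB) -> float:
    """Return the estimated ingestion cost in USD for `days` days of logs."""
    if gb_per_day < 0:
        raise ValueError("gb_per_day must be non-negative")
    return round(gb_per_day * days * price_per_gb, 2)


if __name__ == "__main__":
    # Hypothetical daily volumes, just to show how the bill scales.
    for volume in (1, 50, 500):
        cost = monthly_ingestion_cost(volume)
        print(f"{volume:>4} GB/day -> ${cost:,.2f} per month")
```

At 50 GB of logs per day, ingestion alone comes to roughly $900 per month under this assumption, before you store or query anything.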

Amazon Elasticsearch Service

Many of us already know Elasticsearch and love it. But what is the point of paying extra money to AWS when you can run the same thing on your own? Honestly, I love Elasticsearch in AWS even more. You don't need to be an Elasticsearch expert to take full advantage of it, because you are in charge of the application level only. Imagine that you deployed your Elasticsearch cluster a couple of months ago and suddenly someone says that you will have more logs than you planned...😱. How long would it take you to scale your cluster up, or, some time later, to scale it back down? In AWS it takes minutes, and all you have to do is a few clicks or API calls.

pros:
  • Solution that we already love
  • Cluster management and updates without headaches
cons:
  • You still need to manage indexes and define all the application-level things, like the number of shards and replicas. You also have to keep an eye on free storage space, but you already have a good friend for this: CloudWatch. A good practice is to create a CloudWatch Alarm that watches free space and sends a notification or scales the Elasticsearch domain up when free space drops below a certain threshold.
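
The free-space alarm mentioned above could be sketched like this. The block only builds the parameters for CloudWatch's PutMetricAlarm call, so it runs without AWS access; the actual boto3 call is left as a comment. The domain name, account ID, threshold, and SNS topic are made-up placeholders. As far as I know, the `FreeStorageSpace` metric lives in the `AWS/ES` namespace and is reported in megabytes, but treat the period, statistic, and threshold here as assumptions to tune for your own domain.

```python
def free_storage_alarm_params(domain_name: str, account_id: str,
                              threshold_mb: int, sns_topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch's PutMetricAlarm API call."""
    return {
        "AlarmName": f"{domain_name}-low-free-storage",
        "Namespace": "AWS/ES",
        "MetricName": "FreeStorageSpace",  # assumed to be in megabytes
        "Dimensions": [
            {"Name": "DomainName", "Value": domain_name},
            {"Name": "ClientId", "Value": account_id},  # the AWS account ID
        ],
        "Statistic": "Minimum",   # watch the node with the least free space
        "Period": 300,            # seconds; an assumption, tune as needed
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_mb),
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],  # e.g. notify an on-call topic
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    params = free_storage_alarm_params(
        domain_name="logs-domain",
        account_id="123456789012",
        threshold_mb=20 * 1024,  # alarm at 20 GB of free space
        sns_topic_arn="arn:aws:sns:us-east-1:123456789012:ops-alerts",
    )
    # With boto3 installed and credentials configured, the alarm
    # could then be created with:
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
    print(params["AlarmName"], params["Threshold"])
```

Pointing the alarm action at an SNS topic covers the notification case; scaling the domain up automatically would need something extra subscribed to that topic, for example a Lambda function.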

Amazon Kinesis

Amazon Kinesis is not one service; it is a family of four. All Kinesis services were built to help you with streaming data, and they may become a very important part of your data flow. I want to show you two of them that you may use for your own log management solution.
Kinesis Data Streams is a real-time data streaming service. You can put data into a stream via the API using your own code, third-party products, or the Kinesis Agent. To get data from a stream you may use Kinesis Data Firehose, Lambda, or other applications. The good thing about Kinesis Data Streams is that you pay mainly for Shard Hours. There is no per-GB ingestion fee like in CloudWatch (only a small charge per PUT payload unit), and together with a custom producer this gives you amazing flexibility. You may aggregate short messages or compress them before sending them to the stream. Then a consumer processes your data on the fly, especially if you've chosen a Lambda function as the consumer. I have found the combination of Kinesis Data Streams and Lambda very powerful, because both are highly scalable and highly available, and you keep everything as code.
pros:
  • Predictable cost
  • Amazing flexibility
  • Serverless
cons:
  • Requires software development skills and takes more time to get started
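
The "aggregate and compress in a custom producer, unpack in a Lambda consumer" idea for Kinesis Data Streams can be sketched roughly as follows. It uses only the standard library, so it runs locally; the call that would actually put the blob into a stream is shown as a comment. The function names, the JSON-array batch format, and the size cap are my own assumptions for illustration, not anything Kinesis prescribes. The one AWS-specific detail is that Lambda receives each Kinesis record's data base64-encoded under `record["kinesis"]["data"]`.

```python
import base64
import gzip
import json

# ASSUMPTION: a conservative cap for one record's data blob; check the
# current Kinesis Data Streams limits before relying on this number.
MAX_RECORD_BYTES = 1024 * 1024


def pack_batch(messages, max_bytes=MAX_RECORD_BYTES):
    """Producer side: join short log messages into one gzip-compressed blob."""
    payload = gzip.compress(json.dumps(messages).encode("utf-8"))
    if len(payload) > max_bytes:
        raise ValueError("batch too large for one record; split it first")
    return payload


def unpack_batch(payload):
    """Consumer side: reverse of pack_batch."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))


def consume_handler(event, context):
    """Lambda consumer sketch: decode every record and count its messages."""
    count = 0
    for record in event["Records"]:
        blob = base64.b64decode(record["kinesis"]["data"])
        for _message in unpack_batch(blob):
            count += 1  # real processing (parse, enrich, forward) goes here
    return {"messages": count}


if __name__ == "__main__":
    logs = [f"app started, request id {i}" for i in range(1000)]
    blob = pack_batch(logs)
    print(f"{len(json.dumps(logs))} bytes raw -> {len(blob)} bytes compressed")
    # With boto3, the producer would then send the blob with something like:
    #   boto3.client("kinesis").put_record(
    #       StreamName="my-log-stream",   # hypothetical stream name
    #       Data=blob, PartitionKey="app-1")
    fake_event = {"Records": [{"kinesis": {"data": base64.b64encode(blob).decode()}}]}
    print(consume_handler(fake_event, None))
```

Because repetitive log lines compress well, one record can carry many messages, which is where the flexibility of a custom producer pays off.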
Kinesis Data Firehose is the easy way to load your streaming data into data lakes. In effect, AWS has implemented one of the most common scenarios for us. You don't need to develop your own application if you want to store streaming data in one of the destinations that Kinesis Data Firehose supports (S3/Redshift/Elasticsearch/Splunk). It is worth mentioning that with Firehose you can transform source records before they are stored in the data lake. This gives you the ability to pre-process data, but the feature has its limits.
pros:
  • Easy to get started
  • Serverless
cons:
  • Solves a narrow, specific task
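
The record transformation in Firehose is done by a Lambda function that you attach to the delivery stream. Here is a minimal sketch of such a function. The contract, as I understand it, is that Firehose passes a batch under `event["records"]` with base64-encoded `data`, and expects every `recordId` back with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The log format (JSON with a `level` field), the rule of dropping DEBUG entries, and the added `pipeline` field are purely hypothetical.

```python
import base64
import json


def transform_handler(event, context):
    """Firehose transformation sketch: drop DEBUG logs, tag the rest."""
    output = []
    for record in event["records"]:
        try:
            entry = json.loads(base64.b64decode(record["data"]).decode("utf-8"))
            if not isinstance(entry, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Not valid UTF-8 JSON: hand the record back as failed, unchanged.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue

        if entry.get("level") == "DEBUG":  # hypothetical filter rule
            output.append({"recordId": record["recordId"],
                           "result": "Dropped",
                           "data": record["data"]})
            continue

        entry["pipeline"] = "firehose"  # hypothetical enrichment
        # Newline-delimited JSON keeps records separable at the destination.
        data = base64.b64encode((json.dumps(entry) + "\n").encode("utf-8"))
        output.append({"recordId": record["recordId"],
                       "result": "Ok",
                       "data": data.decode("ascii")})
    return {"records": output}


if __name__ == "__main__":
    sample = base64.b64encode(b'{"level": "INFO", "msg": "login ok"}').decode()
    print(transform_handler({"records": [{"recordId": "1", "data": sample}]}, None))
```

This also shows the limits mentioned above: the function only sees one batch at a time and must answer for each record, so it suits filtering and light enrichment rather than heavy processing.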
In the next article, I will show other useful AWS services that you may use. Also, I will try to show examples of how you may combine services to solve some common scenarios.