Ingest AWS CloudTrail through CloudWatch & Logstash into Elasticsearch
For a while, I faced the challenge that the AWS CloudTrail logs I was ingesting into Elasticsearch were not compliant with the Elastic Common Schema (ECS). Besides hurting the usability of the logs, this meant I had to create an index per AWS service to prevent mapping conflicts, which led to multiple issues of its own.
In this post, I will take you through how I set up log ingestion for a CloudTrail trail that logs to CloudWatch, with ECS-compliant logs as the end result, using out-of-the-box parsing.
For Elasticsearch users, CloudTrail logs have an “annoying” characteristic: every AWS service emits slightly different fields. This results in one of two things: mapping conflicts within a shared index, or a separate index per service.
Out of the box solution
To ingest CloudTrail logs, Filebeat has an out-of-the-box module that works great, provided you can use S3 and SQS. If for any reason you cannot use either, the default module is off the table… almost. The module's ingest pipeline can still parse the logs for us, and we will end up with the same result.
Depending on whether your limitation is S3 or SQS, you can go about this in different ways. My limitation was S3, so the following is tailored to that scenario; if SQS is your limitation, you can change the Logstash input to accommodate S3 as an input instead.
The first step is to create a new CloudTrail trail that delivers to CloudWatch. In your AWS account, go to CloudTrail -> Trails and configure one.
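As a sketch, the same trail can be created with the AWS CLI. The names and ARNs below are placeholders to adapt; note that CloudTrail still requires an S3 bucket for the trail itself, even when delivering to CloudWatch:

```shell
# Create the trail (an S3 bucket is still required for the trail itself)
aws cloudtrail create-trail \
  --name my-trail \
  --s3-bucket-name my-cloudtrail-bucket

# Point the trail at a CloudWatch Logs log group
# (the role must allow CloudTrail to call logs:PutLogEvents)
aws cloudtrail update-trail \
  --name my-trail \
  --cloud-watch-logs-log-group-arn arn:aws:logs:eu-west-1:123456789012:log-group:cloudtrail:* \
  --cloud-watch-logs-role-arn arn:aws:iam::123456789012:role/CloudTrailToCloudWatch

# Start delivering events
aws cloudtrail start-logging --name my-trail
```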
I assume you already have a Logstash instance running.
On your Logstash instance, create a new pipeline. The pipeline itself is relatively straightforward; I'll explain the output section later on.
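A minimal sketch of such a pipeline, assuming the Lambda function ships the log events as JSON lines over TCP on port 5044 (the port, index name, and pipeline name are assumptions to adapt to your environment):

```conf
input {
  tcp {
    port  => 5044
    codec => json_lines
  }
}

output {
  elasticsearch {
    hosts    => ["https://elasticsearch:9200"]
    # send all CloudTrail logs to one index, still matching filebeat-*
    index    => "filebeat-aws"
    # parse with the ingest pipeline loaded by "filebeat setup"
    pipeline => "filebeat-7.14.1-aws-cloudtrail-pipeline"
  }
}
```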
The next step is to get our CloudWatch logs to Logstash. This can be done in multiple ways; the one I picked was to stream the logs to a Lambda function, using the Lambda streaming code from Jon Beilke. For this you need to create a Python Lambda function with network (VPC) configuration and access to your Logstash endpoint. Also, do not forget to update the parameters in the code so it actually connects to your Logstash.
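If you prefer to see what such a function boils down to, here is a rough sketch (the host, port, and plain-TCP transport are assumptions; CloudWatch Logs hands the Lambda a base64-encoded, gzipped JSON payload):

```python
import base64
import gzip
import json
import socket

# Assumptions: adjust to your own Logstash endpoint
LOGSTASH_HOST = "logstash.internal.example"
LOGSTASH_PORT = 5044


def decode_cloudwatch_event(event):
    """CloudWatch Logs delivers events as base64-encoded, gzipped JSON."""
    payload = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(payload))


def handler(event, context):
    data = decode_cloudwatch_event(event)
    with socket.create_connection((LOGSTASH_HOST, LOGSTASH_PORT)) as sock:
        for log_event in data["logEvents"]:
            # each message is a single CloudTrail record in JSON
            sock.sendall((log_event["message"] + "\n").encode("utf-8"))
```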
Next, we can use the ingest pipeline of the Filebeat module to correctly parse our logs in Elasticsearch. This works because CloudTrail's log format is the same in both S3 and CloudWatch. On a Filebeat instance you can run
filebeat setup --modules aws to load the pipeline into Elasticsearch.
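To verify the pipeline actually made it into Elasticsearch, you can query the ingest API. The name pattern below is an assumption based on Filebeat's usual beat-version-module-fileset-pipeline naming:

```shell
curl -s 'http://localhost:9200/_ingest/pipeline/filebeat-*-aws-cloudtrail*'
```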
Coming back to the Logstash output
The Logstash output has two settings I want to explain. The first is the index option: I chose to send all CloudTrail logs to the
filebeat-aws index and set up ILM accordingly; you can still use all of the out-of-the-box features because the pattern
filebeat-* still matches. If you do not want to use this pattern, just remove that line. The second is the pipeline option, which tells Elasticsearch to parse the messages using the ingest pipeline you loaded from Filebeat. Make sure the version, in this case
7.14.1, matches your environment.
Completing the setup
Lastly, you can “open the flood gates” by going into AWS, opening the CloudWatch Logs console, and enabling streaming of the events:
- Select a log group
- Actions -> Subscription filters -> Create Lambda subscription filter
- Select the Lambda function you created earlier
- Add a subscription filter name
- “Start streaming”
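The console steps above can also be scripted with the AWS CLI. The function name, log group name, and ARN are placeholders; the add-permission call is what lets CloudWatch Logs invoke the function:

```shell
# Allow CloudWatch Logs to invoke the Lambda function
aws lambda add-permission \
  --function-name cloudwatch-to-logstash \
  --statement-id cloudwatch-logs \
  --action lambda:InvokeFunction \
  --principal logs.amazonaws.com

# Stream every event in the log group to the function
# (an empty filter pattern matches everything)
aws logs put-subscription-filter \
  --log-group-name cloudtrail \
  --filter-name cloudtrail-to-logstash \
  --filter-pattern "" \
  --destination-arn arn:aws:lambda:eu-west-1:123456789012:function:cloudwatch-to-logstash
```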
You are now able to use all of the features in the Elastic Stack that require ECS-formatted CloudTrail logs, for example the detection rules and machine learning jobs.