QubOps

AWS Cost Optimization - Amazon Macie

Tips and strategies to manage Amazon Macie costs


#aws #macie

Amazon Macie can be a powerful tool to help you discover, classify, and protect sensitive data in your AWS environment. However, it can sometimes feel as though the cost is unavoidable and spiraling as your infrastructure grows.

Here are a few things to consider when managing your Amazon Macie costs.

1. Understand the cost and usage

Amazon Macie has a few different costs associated with it, and they all scale with what you store in Amazon S3:

  1. Monthly cost per GB of data classified: Tiered pricing based on volume and region.
  2. Monthly cost per bucket analysed: This is charged in addition to the cost per GB of data classified in each bucket.
  3. Automated data discovery: Using this feature carries an additional cost based on the number of S3 objects it monitors.
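To make the three components above concrete, here is a minimal Python sketch of how they add up into a monthly figure. The rates and tier boundaries are placeholders chosen for easy arithmetic, not published AWS prices, and the function and parameter names are illustrative; substitute the figures for your region from the Macie pricing page.

```python
from typing import List, Tuple

# (tier upper bound in GB, price per GB). Placeholder numbers, NOT real AWS prices.
EXAMPLE_TIERS: List[Tuple[float, float]] = [
    (50_000, 1.00),
    (500_000, 0.50),
    (float("inf"), 0.25),
]


def tiered_cost(gb: float, tiers: List[Tuple[float, float]]) -> float:
    """Charge each slice of volume at the rate of the tier it falls in."""
    cost, lower = 0.0, 0.0
    for upper, rate in tiers:
        if gb <= lower:
            break
        cost += (min(gb, upper) - lower) * rate
        lower = upper
    return cost


def estimate_monthly_cost(
    buckets: int,
    objects_monitored: int,
    gb_classified: float,
    per_bucket_rate: float,
    per_object_rate: float,
    tiers: List[Tuple[float, float]],
) -> float:
    """Sum the per-bucket, per-object and per-GB components."""
    return (
        buckets * per_bucket_rate
        + objects_monitored * per_object_rate
        + tiered_cost(gb_classified, tiers)
    )
```

Running the estimate before and after a planned change (for example, excluding a set of buckets) gives a quick sense of which component dominates your bill.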

Please refer to the Amazon Macie Billing Codes to understand the breakdown of your bill in Cost Explorer.
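The same breakdown can be pulled programmatically. The sketch below builds a Cost Explorer request that groups Macie spend by usage type and, optionally, sends it with boto3's `get_cost_and_usage`. The service name string `"Amazon Macie"` is an assumption; check the exact value shown in your own Cost Explorer. The function names are illustrative.

```python
from datetime import date


def macie_cost_query(start: date, end: date) -> dict:
    """Build a Cost Explorer request for Macie cost grouped by usage type."""
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        # Assumed service name; confirm it in your Cost Explorer console.
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Macie"]}},
        "GroupBy": [{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
    }


def print_macie_costs(start: date, end: date) -> None:
    """Send the query and print one line per month and usage type."""
    import boto3  # imported here so the query builder works without boto3

    ce = boto3.client("ce")
    response = ce.get_cost_and_usage(**macie_cost_query(start, end))
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            usage_type = group["Keys"][0]
            amount = group["Metrics"]["UnblendedCost"]["Amount"]
            print(period["TimePeriod"]["Start"], usage_type, amount)
```

Calling `print_macie_costs(date(2024, 1, 1), date(2024, 4, 1))` requires AWS credentials with Cost Explorer access; the end date is exclusive.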

2. Remove unnecessary buckets from scans

By default, Amazon Macie scans all buckets in your account, including some that do not need to be scanned. Some of the most common examples include:

  • VPC Flow Logs
  • CloudTrail logs
  • Static content like image or video files

It is important to note that Amazon Macie is focused on finding sensitive information like PII or financial data. It also primarily focuses on text-based formats (e.g., documents, spreadsheets).

VPC Flow Logs and CloudTrail logs are generated automatically by AWS and record network traffic and API activity, so they are very unlikely to contain such data.

Similarly, Macie does not currently analyse static content such as images or videos, even though such files may in rare cases contain sensitive data. Any bucket that contains only these file types should therefore be excluded from scans.

You can exclude buckets from both automated data discovery and regular data discovery jobs.
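For discovery jobs, one way to apply this is to filter the bucket list before creating the job and to exclude unsupported file extensions in the job scope. The sketch below assumes hypothetical bucket naming patterns and an example extension list. The dictionary is intended to match the `s3JobDefinition` argument of boto3's `create_classification_job`, so verify the shape against the current API reference before relying on it.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Hypothetical naming patterns for buckets that hold only logs or media.
EXCLUDE_PATTERNS = ["*-flow-logs", "*-cloudtrail-*", "*-static-assets"]


def buckets_to_scan(all_buckets: Iterable[str], exclude_patterns: List[str]) -> List[str]:
    """Keep only the buckets that match none of the exclusion patterns."""
    return [
        bucket
        for bucket in all_buckets
        if not any(fnmatchcase(bucket, pattern) for pattern in exclude_patterns)
    ]


def s3_job_definition(account_id: str, buckets: List[str]) -> Dict:
    """Job definition limited to the given buckets, skipping media files."""
    return {
        "bucketDefinitions": [{"accountId": account_id, "buckets": buckets}],
        "scoping": {
            "excludes": {
                "and": [
                    {
                        "simpleScopeTerm": {
                            "key": "OBJECT_EXTENSION",
                            "comparator": "EQ",
                            # Example extensions only; extend to suit your data.
                            "values": ["jpg", "png", "mp4"],
                        }
                    }
                ]
            }
        },
    }
```

For automated data discovery, exclusions are managed separately in the Macie settings rather than per job.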

3. Optimise file storage

As Macie bills per GB of scanned data, there is a cost benefit to storing data in an optimal format.

For example, logs and other data can be compressed before being stored in S3.

In addition, Parquet is often a more efficient format for data storage than CSV.

By reducing the size of your S3 buckets, you can reduce both your Macie costs and the underlying S3 storage costs.
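As a small illustration of the storage side, the sketch below gzips a synthetic log and reports the compressed size as a fraction of the original. The sample line is made up and highly repetitive, so real logs will compress less well. Whether the Macie charge shrinks by the same factor depends on how Macie meters compressed objects, so confirm that against the pricing documentation.

```python
import gzip


def gzip_ratio(raw: bytes) -> float:
    """Return compressed size divided by original size."""
    return len(gzip.compress(raw)) / len(raw)


# Synthetic, repetitive log data used purely for illustration.
sample_log = b"2024-01-01T00:00:00Z GET /health 200 12ms\n" * 10_000

print(f"compressed size is {gzip_ratio(sample_log):.1%} of the original")
```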

4. Use Macie only when necessary

Amazon Macie is a powerful tool, but it may not be necessary for all of your S3 buckets to be constantly scanned for sensitive data.

Ensure that you are not scheduling scans at a higher frequency than required, as each scan adds to your costs.

Review the schedules and ensure they accurately align with your business objectives.
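As a rough illustration of how cadence affects the number of scans, the sketch below lists the approximate runs per month for each cadence and builds the matching `scheduleFrequency` value for a scheduled Macie job. The helper name and the runs-per-month figures are illustrative assumptions. The dictionary shape is intended to follow the boto3 `create_classification_job` request and should be verified against the current API reference.

```python
from typing import Optional, Union

# Approximate figures, for comparison only.
APPROX_RUNS_PER_MONTH = {"daily": 30, "weekly": 4, "monthly": 1}


def schedule_frequency(cadence: str, day: Optional[Union[str, int]] = None) -> dict:
    """Build the schedule for a job; `day` is a weekday name or day of month."""
    if cadence == "daily":
        return {"dailySchedule": {}}
    if cadence == "weekly":
        return {"weeklySchedule": {"dayOfWeek": day or "SUNDAY"}}
    if cadence == "monthly":
        return {"monthlySchedule": {"dayOfMonth": day or 1}}
    raise ValueError(f"unknown cadence: {cadence!r}")


for cadence, runs in APPROX_RUNS_PER_MONTH.items():
    print(cadence, runs, schedule_frequency(cadence))
```

Moving a job from daily to weekly cuts the number of runs by roughly a factor of seven, which is worth weighing against how quickly new data arrives in the bucket.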

5. Consider random object sampling

In some cases it may not be necessary to scan the entire bucket to understand if there is any sensitive data contained within.

One strategy is to perform random sampling over a percentage of the objects in a bucket by specifying the sampling depth.

Some examples of when this may be useful include:

  • A bucket storing logs from a single application with predictable formats or an application that is known to not handle any PII data.
  • A dataset with repeated patterns of encrypted sensitive information like PII or financial data.
  • Preliminary scans of large datasets to understand the data in a bucket before committing to a full scan.
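A preliminary sampled scan along these lines could be requested as follows. This is a minimal sketch: the job name, account ID and bucket name are placeholders, and the 1 to 100 range check is our own assumption about sensible values. `samplingPercentage` is the request field that corresponds to the sampling depth in boto3's `create_classification_job`.

```python
import uuid
from typing import List


def sampled_job_request(
    name: str, account_id: str, buckets: List[str], sampling_percentage: int
) -> dict:
    """Build a one-time job request that analyses a random sample of objects."""
    if not 1 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 1 and 100")
    return {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "jobType": "ONE_TIME",
        "name": name,
        "samplingPercentage": sampling_percentage,
        "s3JobDefinition": {
            "bucketDefinitions": [{"accountId": account_id, "buckets": buckets}]
        },
    }


# Placeholder values for illustration only.
request = sampled_job_request("preliminary-scan", "111122223333", ["app-data"], 10)
print(request["samplingPercentage"])
```

With credentials configured, the request can be submitted with `boto3.client("macie2").create_classification_job(**request)`. If the sample surfaces sensitive data, a full scan of that bucket can follow.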

6. Ensure your scan schedules align with archiving strategies

If you are using S3 Intelligent-Tiering or lifecycle policies to move data to more cost-effective storage classes, ensure that your Macie scan schedules align with these policies.

If you have a schedule that re-scans entire buckets and those buckets contain archived data, Macie can trigger a retrieval of the archived data unless you add exclusions, and that retrieval can incur additional costs.
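One possible exclusion, assuming your lifecycle policy transitions objects purely by age, is to scope the job so that objects last modified before the transition cutoff are skipped. The sketch below builds such a scope. The helper name and the 90-day figure in the usage note are illustrative, and the dictionary is intended to match the `scoping` part of `s3JobDefinition` in boto3's `create_classification_job`, which should be verified against the current API reference.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def exclude_older_than(days: int, now: Optional[datetime] = None) -> dict:
    """Scope that skips objects last modified more than `days` days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "excludes": {
            "and": [
                {
                    "simpleScopeTerm": {
                        "key": "OBJECT_LAST_MODIFIED_DATE",
                        "comparator": "LT",  # strictly before the cutoff
                        "values": [cutoff],
                    }
                }
            ]
        }
    }
```

For a lifecycle rule that archives objects after 90 days, `exclude_older_than(90)` would keep the job on data that is still in the frequently accessed storage class.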

Conclusion

Amazon Macie is a powerful tool for discovering and protecting sensitive data in your AWS environment. However, it is important to manage the costs associated with it to ensure that you are not overspending by just blanket scanning everything in S3.

If you require any assistance in managing your Amazon Macie costs, please reach out to us.
