Why is there a discrepancy between my CloudWatch metrics and AWS CLI storage metrics for Amazon S3?

I'm seeing a discrepancy between the Amazon CloudWatch metrics and AWS Command Line Interface (AWS CLI) storage metrics for Amazon Simple Storage Service (Amazon S3). Why is there such a large difference in reported storage size between the two sources?

Short description

If there's a discrepancy between your CloudWatch storage metrics for Amazon S3 and the metrics calculated using the AWS CLI, check for the following:

  • Object versioning.
    Note: The object versioning feature in Amazon S3 retains multiple versions of an object in your bucket. By default, object versioning is disabled on buckets, and you must explicitly enable the feature. Additionally, the AWS CLI storage calculations count only the current version of each object that is stored in the bucket, and only its size.
  • Incomplete multipart uploads.
    Note: Incomplete multipart uploads aren't included in the AWS CLI storage calculations, but are calculated as storage in your CloudWatch metrics.

To identify the cause of the reporting discrepancy, check whether you've enabled object versioning and look for any multipart uploads in your bucket. These two factors can result in an increased value of the calculated bucket size in CloudWatch. For more information, see Amazon S3 CloudWatch daily storage metrics for buckets.
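
For example, you can confirm whether versioning has ever been enabled on a bucket with the get-bucket-versioning command (the bucket name is a placeholder):

aws s3api get-bucket-versioning --bucket <bucket-example>

If versioning is enabled, the command returns a Status of Enabled (or Suspended). If versioning was never enabled on the bucket, the command returns no output.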

Tip: If you do have incomplete multipart uploads in Amazon S3, consider creating a lifecycle configuration rule. This lifecycle rule automatically cleans up the incomplete parts, lowering the cost of data storage. Note that lifecycle rules operate asynchronously, so there might be a delay before the parts are removed. However, as soon as the objects are marked for deletion, you're no longer billed for their storage, even if they haven't been removed yet.

Additionally, the Amazon S3 storage metrics are reported to CloudWatch only once per day, so they might not display the most up-to-date information, even though CloudWatch otherwise monitors your AWS resources and applications in real time.

Resolution

Daily storage metrics in CloudWatch

In CloudWatch, the BucketSizeBytes metric captures all Amazon S3 and Amazon S3 Glacier storage types, object versions, and any incomplete multipart uploads. This value is calculated by summing the sizes of all objects and metadata in your bucket (both current and noncurrent objects) and the sizes of any incomplete multipart upload parts. The BucketSizeBytes metric reports the amount of data (in bytes) that is stored in an S3 bucket across all of these object storage classes:

  • S3 Standard
  • S3 Intelligent-Tiering
  • S3 Standard-IA
  • S3 One Zone-IA
  • S3 Reduced Redundancy Storage
  • S3 Glacier Deep Archive
  • S3 Glacier

Additionally, the NumberOfObjects metric in CloudWatch contains the total number of objects that are stored in a bucket for all storage classes. This value counts all objects in the bucket (both current and noncurrent), along with the total number of parts for any incomplete multipart uploads. The NumberOfObjects metric also calculates the total number of objects for all versions of objects in your bucket. For example, if you have two versions of the same object, then the two versions will be counted as two separate objects. For more information, see Metrics and dimensions.
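
To retrieve these daily storage metrics with the AWS CLI, you can query CloudWatch directly. The following is a minimal sketch that assumes a placeholder bucket name and date range. The BucketSizeBytes metric requires a StorageType dimension that matches a storage class (for example, StandardStorage), while the NumberOfObjects metric uses the AllStorageTypes dimension instead:

aws cloudwatch get-metric-statistics \
  --namespace AWS/S3 \
  --metric-name BucketSizeBytes \
  --dimensions Name=BucketName,Value=<bucket-example> Name=StorageType,Value=StandardStorage \
  --start-time 2024-01-01T00:00:00Z --end-time 2024-01-03T00:00:00Z \
  --period 86400 --statistics Average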

Daily storage calculations using the AWS CLI

To calculate the storage metrics for Amazon S3 using the AWS CLI, use the following command syntax:

aws s3 ls --summarize --human-readable --recursive s3://bucketname | grep -i total

This command calculates the total number and size of objects in your Amazon S3 bucket. However, note that only the current version of each object that is stored in the bucket (and its size) is counted. Incomplete multipart uploads, delete markers, and noncurrent versions of objects aren't included in the total bucket size or the total number of objects.
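
If versioning is enabled and you want a total that is closer to the CloudWatch value, you can count every stored object version instead of only the current versions. The following sketch uses the list-object-versions command with a JMESPath query to return the version count and the combined size in bytes; it assumes the bucket contains at least one object version, and it still doesn't account for incomplete multipart upload parts:

aws s3api list-object-versions --bucket <bucket-example> --query "[length(Versions[]), sum(Versions[].Size)]" --output json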

Incomplete multipart uploads

To review the list of incomplete multipart uploads, run the list-multipart-uploads command:

aws s3api list-multipart-uploads --bucket <bucket-example>

Then, to list all the parts of a specific multipart upload, run the list-parts command with the object key and the UploadId value from the previous output:

aws s3api list-parts --bucket <bucket-example> --key large_test_file --upload-id <examplevalue>
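
To estimate how much storage an individual incomplete upload consumes, you can sum the sizes of the parts that list-parts returns. This sketch reuses the placeholder bucket, key, and upload ID from the previous command and assumes that at least one part has been uploaded:

aws s3api list-parts --bucket <bucket-example> --key large_test_file --upload-id <examplevalue> --query "sum(Parts[].Size)"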

Creating a lifecycle rule

To automatically delete incomplete multipart uploads, create a lifecycle configuration rule. Follow these steps in the Amazon S3 console (a CLI example follows the steps):

1. Open the Amazon S3 console, and then choose your bucket.

2. Choose the Management tab.

3. Choose Create lifecycle rule.

4. Enter a name for the rule, and then choose the rule's scope.

5. Under Lifecycle rule actions, select Delete expired object delete markers or incomplete multipart uploads.

6. Select Delete incomplete multipart uploads, and then enter the number of days after which incomplete parts are removed.
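
Alternatively, you can apply an equivalent rule with the AWS CLI. The following is a minimal sketch that aborts incomplete multipart uploads seven days after they're initiated; the rule ID, number of days, and bucket name are placeholders, and note that put-bucket-lifecycle-configuration replaces any existing lifecycle configuration on the bucket:

aws s3api put-bucket-lifecycle-configuration --bucket <bucket-example> --lifecycle-configuration '{
  "Rules": [
    {
      "ID": "abort-incomplete-multipart-uploads",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}
    }
  ]
}'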

Object versioning

To review and audit your Amazon S3 bucket for different versions of objects, use the Amazon S3 inventory list. An Amazon S3 inventory list file contains a list of the objects in the source bucket and metadata for each object. The inventory list file will capture metadata such as bucket name, object size, storage class, and version ID.
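
You can set up a daily inventory that includes all object versions with the put-bucket-inventory-configuration command. The following sketch assumes that a separate destination bucket already exists and has a bucket policy that allows Amazon S3 to deliver inventory files to it; the bucket names and configuration ID are placeholders:

aws s3api put-bucket-inventory-configuration --bucket <bucket-example> --id all-versions-inventory --inventory-configuration '{
  "Id": "all-versions-inventory",
  "IsEnabled": true,
  "IncludedObjectVersions": "All",
  "Schedule": {"Frequency": "Daily"},
  "OptionalFields": ["Size", "StorageClass"],
  "Destination": {
    "S3BucketDestination": {
      "Bucket": "arn:aws:s3:::<destination-bucket-example>",
      "Format": "CSV"
    }
  }
}'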


Related information

Example 8: Lifecycle configuration to abort multipart uploads

Expiring objects
