Updated by Linode
Deleting a few objects from an Object Storage bucket is quick, but when the objects number in the thousands or even millions, the time required to complete the delete operations can easily become unmanageable. When deleting a large number of objects, it’s best to use lifecycle policies. These policies can be represented in XML; here’s an (incomplete) snippet of an action that will delete objects after 1 day:
<Expiration>
    <Days>1</Days>
</Expiration>
A lifecycle policy is applied to a bucket. Policies are sets of rules that govern the management of objects after they have aged for a certain amount of time. For instance, you can create a lifecycle policy that deletes objects once they are thirty days old, or one week old. This is useful for cases where the data in a bucket becomes outdated, such as when collecting activity logs.
In This Guide
This guide will first describe when policies are enforced and will then explain how to create and delete lifecycle policies with two tools:
s3cmd command line interface (CLI): In addition to deleting objects, more complicated policies can be managed with s3cmd, including deleting old versions of objects that have been retained, and failed multipart uploads.
Cyberduck desktop application (GUI): Cyberduck does not feature as many policy options, but they can be managed through a point-and-click interface.
Before You Begin
- Familiarize yourself with Linode Object Storage by reading the How to Use Object Storage guide.
- For demonstration purposes, you can create an Object Storage bucket with a few objects that you will later delete.
When Policies are Enforced
Lifecycle policies are triggered starting at midnight of the Object Storage cluster’s local time. This means that if you set a lifecycle policy of one day, the objects will be deleted the midnight after they become 24 hours old.
For example, if an object is created at 5PM on January 1, it will reach 24 hours in age at 5PM on January 2. The policy will then be enforced on the object at 12AM on January 3.
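The timing rule above can be worked through in a few lines of Python. This is only an illustrative sketch: the function name `first_enforcement` and the year 2024 are hypothetical, the datetimes are assumed to already be in the cluster's local time, and an object that reaches the age threshold exactly at midnight is assumed to be eligible at that same midnight.

```python
from datetime import datetime, timedelta


def first_enforcement(created: datetime, days: int) -> datetime:
    """Return the first midnight at or after the object reaches `days` days of age.

    Assumes `created` is expressed in the Object Storage cluster's local time.
    """
    eligible = created + timedelta(days=days)
    # Midnight at the start of the day on which the object becomes eligible.
    midnight = eligible.replace(hour=0, minute=0, second=0, microsecond=0)
    if midnight < eligible:
        # Eligibility falls later in the day, so wait for the next midnight.
        midnight += timedelta(days=1)
    return midnight


# Object created at 5PM on January 1 with a one-day policy.
created = datetime(2024, 1, 1, 17, 0)
print(first_enforcement(created, 1))  # 2024-01-03 00:00:00
```

The result matches the example in the text: the object turns 24 hours old at 5PM on January 2 and is handled at 12AM on January 3.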
There is a chance that a lifecycle policy will not delete all of the files in a bucket the first time the lifecycle policy is triggered. This is especially true for buckets with upwards of a million objects. In cases like these, most of the objects are deleted, and any remaining objects are typically deleted during the next iteration of the lifecycle policy’s rules.
Create and Delete Lifecycle Policies
s3cmd allows users to set and manage lifecycle policies from the command line. In this section, you will find instructions on how to create and manage lifecycle policies to delete objects, previous versions of objects, and failed multipart uploads using s3cmd.
Note: If you don’t have s3cmd set up on your computer, visit the Install and Configure s3cmd section of the How to Use Linode Object Storage guide.
Creating a Lifecycle Policy File
In S3-compatible Object Storage, a lifecycle policy is represented by an XML file. You can use your preferred text editor to create this XML file. Consider the following lifecycle policy file:
<LifecycleConfiguration>
    <Rule>
        <ID>delete-all-objects</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
The above lifecycle policy deletes all objects in the bucket after one day. Each lifecycle policy file needs a LifecycleConfiguration block and a nested Rule block. The Rule block must contain Status and at least one action, like the Expiration block. It’s also a good idea to include an ID block:
| ID | Defines a name for the lifecycle policy rule. If your lifecycle policy contains multiple rules, then the ID for each should be unique. If one is not specified in your policy file, then a random alphanumeric ID will be assigned to your policy when the policy is applied to a bucket. |
| Prefix | This string is used to select objects for deletion with the same matching prefix. An empty Prefix, as in the policy above, applies the rule to all objects in the bucket. |
| Status | A string value describing the status of the lifecycle policy. To enable the policy, set this value to Enabled. |
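If you would rather generate a policy file than type it by hand, the blocks described above can be assembled with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch, not part of s3cmd or of Linode's tooling; the function name `build_expiration_policy` is hypothetical, and it only covers a single rule with an Expiration action.

```python
import xml.etree.ElementTree as ET


def build_expiration_policy(rule_id: str, prefix: str, days: int) -> str:
    """Build a one-rule lifecycle policy that expires objects after `days` days."""
    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    ET.SubElement(rule, "ID").text = rule_id
    # An empty prefix selects every object in the bucket.
    ET.SubElement(rule, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"
    expiration = ET.SubElement(rule, "Expiration")
    ET.SubElement(expiration, "Days").text = str(days)
    # Returns the document as a single-line string without an XML declaration.
    return ET.tostring(root, encoding="unicode")


policy_xml = build_expiration_policy("delete-all-objects", "", 1)
print(policy_xml)
```

The returned string can be written to a file such as lifecycle_policy.xml. The output is not pretty-printed, but it contains the same elements as the hand-written example.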
Other actions can also be specified in a rule:
NoncurrentVersionExpiration block, and its child, NoncurrentDays. These are used to control the lifecycle of objects with multiple older versions, and should only be used with buckets that have bucket versioning enabled. Using this option will delete objects that are not the newest, most current version. Below is an example of how to use NoncurrentVersionExpiration:
<LifecycleConfiguration>
    <Rule>
        <ID>delete-prior-versions</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <NoncurrentVersionExpiration>
            <NoncurrentDays>1</NoncurrentDays>
        </NoncurrentVersionExpiration>
    </Rule>
</LifecycleConfiguration>
AbortIncompleteMultipartUpload, and its child, DaysAfterInitiation. These work similarly to NoncurrentVersionExpiration, but instead of deleting previous versions of objects, they will delete failed multipart uploads. The following will delete failed multipart uploads three days after they were initiated:
<LifecycleConfiguration>
    <Rule>
        <ID>delete-incomplete-multipart-uploads</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <AbortIncompleteMultipartUpload>
            <DaysAfterInitiation>3</DaysAfterInitiation>
        </AbortIncompleteMultipartUpload>
    </Rule>
</LifecycleConfiguration>
About multipart uploads
Objects that are part of failed multipart uploads (the mechanism by which large files are uploaded) stay within Object Storage buckets, counting towards your total Object Storage costs. s3cmd will automatically initiate a multipart upload when a file is larger than 15MB. Lifecycle policies are a great way to clear out stale multipart uploads.
Multiple Actions in One Rule
More than one action can be specified in a single rule. For example, you may want to both expire the current version of an object after a set number of days and also remove old versions of it after another period of time. The following policy will delete the current version of an object after 10 days and remove any noncurrent versions of an object 3 days after they are demoted from the current version:
<LifecycleConfiguration>
    <Rule>
        <ID>delete-prior-versions</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>10</Days>
        </Expiration>
        <NoncurrentVersionExpiration>
            <NoncurrentDays>3</NoncurrentDays>
        </NoncurrentVersionExpiration>
    </Rule>
</LifecycleConfiguration>
As a reminder, if a versioned object is deleted, only the current version of the object will be deleted and all older versions will be preserved in the bucket. For this reason, the above rule has the effect of deleting any objects if they are not updated within 10 days, and then removing the remaining object versions after 3 days.
A lifecycle policy file can only contain one LifecycleConfiguration block, but the LifecycleConfiguration block can contain more than one Rule. For instance, if you had a bucket that contained both error and general output logs, you could set a lifecycle policy that saves error logs for a week but deletes standard logs at the end of every day:
<LifecycleConfiguration>
    <Rule>
        <ID>delete-error-logs</ID>
        <Prefix>error</Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>7</Days>
        </Expiration>
    </Rule>
    <Rule>
        <ID>delete-standard-logs</ID>
        <Prefix>logs</Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
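Before uploading a policy with several rules, it can help to check it against the structural requirements described in this guide: a single LifecycleConfiguration root, a Status and at least one action in every Rule, and unique IDs. The following is a rough sketch of such a check. The function name `check_policy` is hypothetical, it assumes the file is written without an XML namespace (as in the examples here), and it does not replace the validation performed by Object Storage itself.

```python
import xml.etree.ElementTree as ET

# The three actions covered in this guide.
ACTIONS = ("Expiration", "NoncurrentVersionExpiration", "AbortIncompleteMultipartUpload")


def check_policy(xml_text: str) -> list:
    """Return a list of problems found in a lifecycle policy (empty if none)."""
    root = ET.fromstring(xml_text)
    if root.tag != "LifecycleConfiguration":
        return [f"root element is {root.tag}, expected LifecycleConfiguration"]
    problems = []
    rules = root.findall("Rule")
    if not rules:
        problems.append("no Rule block found")
    seen_ids = set()
    for index, rule in enumerate(rules, start=1):
        rule_id = rule.findtext("ID")
        label = rule_id or f"rule #{index}"
        if rule_id:
            if rule_id in seen_ids:
                problems.append(f"{label}: duplicate ID")
            seen_ids.add(rule_id)
        if rule.find("Status") is None:
            problems.append(f"{label}: missing Status")
        if not any(rule.find(action) is not None for action in ACTIONS):
            problems.append(f"{label}: no action block")
    return problems


POLICY = """<LifecycleConfiguration>
    <Rule>
        <ID>delete-error-logs</ID>
        <Prefix>error</Prefix>
        <Status>Enabled</Status>
        <Expiration><Days>7</Days></Expiration>
    </Rule>
    <Rule>
        <ID>delete-standard-logs</ID>
        <Prefix>logs</Prefix>
        <Status>Enabled</Status>
        <Expiration><Days>1</Days></Expiration>
    </Rule>
</LifecycleConfiguration>"""

print(check_policy(POLICY))  # []
```

An empty list means none of these particular problems were found; it does not guarantee that the policy will behave as intended.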
Uploading the Lifecycle Policy to a Bucket
In order to apply a lifecycle policy to a bucket with s3cmd, you need to upload the lifecycle file to the bucket. This operation is not a normal PUT operation. Instead, the command to use is setlifecycle, followed by the name of the lifecycle policy file, and the name of the bucket:
s3cmd setlifecycle lifecycle_policy.xml s3://lifecycle-policy-example
You should see output like the following:
s3://lifecycle-policy-example/: Lifecycle Policy updated
Once the lifecycle policy has been uploaded, objects will be deleted according to the policy set in place.
Viewing a Bucket’s Lifecycle Policy
To view a lifecycle policy after it has been uploaded to a bucket, use the getlifecycle command and provide the bucket name:
s3cmd getlifecycle s3://lifecycle-policy-example
You should see the contents of the XML file that was uploaded:
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Rule>
        <ID>delete-all</ID>
        <Prefix/>
        <Status>Enabled</Status>
        <Expiration>
            <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>
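Note that the XML returned by getlifecycle carries an xmlns attribute, so a script that reads it has to look elements up by namespace. The sketch below shows one way to summarize such output with Python's standard library. The function name `summarize_rules` and the `SAMPLE` string are illustrative, and only Expiration actions are reported.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the getlifecycle output shown above.
NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

SAMPLE = """<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Rule>
        <ID>delete-all</ID>
        <Prefix/>
        <Status>Enabled</Status>
        <Expiration>
            <Days>1</Days>
        </Expiration>
    </Rule>
</LifecycleConfiguration>"""


def summarize_rules(xml_text: str) -> list:
    """Return one dictionary per Rule found in namespaced lifecycle XML."""
    root = ET.fromstring(xml_text)
    summary = []
    for rule in root.findall("s3:Rule", NS):
        days = rule.findtext("s3:Expiration/s3:Days", default=None, namespaces=NS)
        summary.append({
            "id": rule.findtext("s3:ID", default="", namespaces=NS),
            # An empty <Prefix/> element yields an empty string.
            "prefix": rule.findtext("s3:Prefix", default="", namespaces=NS),
            "status": rule.findtext("s3:Status", default="", namespaces=NS),
            "expire_days": int(days) if days is not None else None,
        })
    return summary


print(summarize_rules(SAMPLE))
```

A rule without an Expiration block is reported with `expire_days` set to `None`.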
Deleting a Lifecycle Policy
To delete a lifecycle policy that you’ve uploaded, effectively disabling it, use the dellifecycle command and provide the bucket name:
s3cmd dellifecycle s3://lifecycle-policy-example
You’ll see a confirmation that the lifecycle policy was deleted:
s3://lifecycle-policy-example/: Lifecycle Policy deleted
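If you manage many buckets, the three s3cmd commands covered in this section can be scripted. The following sketch only builds the argument lists and leaves execution to a separate helper. The names `lifecycle_command` and `run` are hypothetical, and `run` assumes s3cmd is installed and configured as described earlier.

```python
import subprocess

LIFECYCLE_ACTIONS = ("setlifecycle", "getlifecycle", "dellifecycle")


def lifecycle_command(action: str, bucket: str, policy_file: str = None) -> list:
    """Build the s3cmd argument list for a lifecycle action on a bucket."""
    if action not in LIFECYCLE_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    args = ["s3cmd", action]
    if action == "setlifecycle":
        # setlifecycle takes the policy file before the bucket name.
        if policy_file is None:
            raise ValueError("setlifecycle requires a policy file")
        args.append(policy_file)
    args.append(f"s3://{bucket}")
    return args


def run(args: list) -> str:
    """Run a prepared command and return its standard output.

    Assumption: s3cmd is on the PATH and already configured.
    """
    return subprocess.run(args, check=True, capture_output=True, text=True).stdout


print(lifecycle_command("setlifecycle", "lifecycle-policy-example", "lifecycle_policy.xml"))
print(lifecycle_command("getlifecycle", "lifecycle-policy-example"))
print(lifecycle_command("dellifecycle", "lifecycle-policy-example"))
```

Passing any of these lists to `run` executes the same commands shown above. Keeping construction and execution separate makes it easy to review the commands before they touch a bucket.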
Cyberduck allows less control over lifecycle policies than the s3cmd CLI. In particular, Cyberduck does not allow you to set a lifecycle policy that removes outdated versions of objects stored in buckets where versioning is enabled, nor does it allow you to delete multipart uploads. Cyberduck also limits the expiration period of a lifecycle policy to a set of commonly used time spans. Below you will learn how to set a lifecycle policy using Cyberduck.
Enable a Lifecycle Policy
Right click or control + click on the bucket for which you would like to set a lifecycle policy. This will bring up the bucket info menu.
Click on the S3 tab to open the S3 bucket settings.
Click on the checkbox labeled Delete files and select a time interval from the drop-down menu below it.
This will enable the lifecycle policy and the objects within the bucket will be deleted after the designated time.
Disable a Lifecycle Policy
To disable a lifecycle policy, uncheck the box labeled Delete files that you checked in the previous section.
This guide is published under a CC BY-ND 4.0 license.