
      Manage Objects with Lifecycle Policies



While deleting a few objects in an Object Storage bucket might not take long, when the objects number in the thousands or even millions, the time required to complete the delete operations can easily become unmanageable. When deleting a substantial number of objects, it’s best to use lifecycle policies. These policies can be represented in XML; here’s an (incomplete) snippet of an action that deletes objects after one day:

      <Expiration>
          <Days>1</Days>
      </Expiration>

A lifecycle policy is applied to a bucket. Policies are sets of rules that govern how objects are managed after they have aged for a certain amount of time. For instance, you can create a lifecycle policy that deletes objects after thirty days, or after one week. This is useful for cases where the data in a bucket becomes outdated, such as when collecting activity logs.

      In This Guide

      This guide will first describe when policies are enforced and will then explain how to create and delete lifecycle policies with two tools:

• s3cmd command line interface (CLI): In addition to deleting objects, s3cmd can manage more complicated policies, including deleting retained older versions of objects and aborting failed multipart uploads.

• Cyberduck desktop application (GUI): Cyberduck offers fewer policy options, but they can be managed through a point-and-click interface.

      Before You Begin

      • Familiarize yourself with Linode Object Storage by reading the How to Use Object Storage guide.
      • For demonstration purposes, you can create an Object Storage bucket with a few objects that you will later delete.
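If you would like to create such a test bucket with s3cmd, a minimal sketch might look like the following (the bucket and file names are placeholders):

s3cmd mb s3://lifecycle-policy-example
s3cmd put test1.txt test2.txt s3://lifecycle-policy-example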

      When Policies are Enforced

Lifecycle policies are triggered starting at midnight of the Object Storage cluster’s local time. This means that if you set a lifecycle policy of one day, objects are deleted at the first midnight after they become 24 hours old.

      For example, if an object is created at 5PM on January 1, it will reach 24 hours in age at 5PM on January 2. The policy will then be enforced on the object at 12AM on January 3.

      Note

      There is a chance that a lifecycle policy will not delete all of the files in a bucket the first time the lifecycle policy is triggered. This is especially true for buckets with upwards of a million objects. In cases like these, most of the objects are deleted, and any remaining objects are typically deleted during the next iteration of the lifecycle policy’s rules.

      Create and Delete Lifecycle Policies

      s3cmd

      s3cmd allows users to set and manage lifecycle policies from the command line. In this section, you will find instructions on how to create and manage lifecycle policies to delete objects, previous versions of objects, and failed multipart uploads using s3cmd.

      Note

      If you don’t have s3cmd set up on your computer, visit the Install and Configure s3cmd section of the How to Use Linode Object Storage guide.
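Once s3cmd is configured, you can confirm that it can reach Object Storage by listing your buckets:

s3cmd ls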

      Creating a Lifecycle Policy File

      In S3-compatible Object Storage, a lifecycle policy is represented by an XML file. You can use your preferred text editor to create this XML file. Consider the following lifecycle policy file:

      lifecycle_policy.xml
      <LifecycleConfiguration>
          <Rule>
              <ID>delete-all-objects</ID>
              <Prefix></Prefix>
              <Status>Enabled</Status>
              <Expiration>
                  <Days>1</Days>
              </Expiration>
          </Rule>
      </LifecycleConfiguration>

      The above lifecycle policy deletes all objects in the bucket after one day. Each lifecycle policy file needs a LifecycleConfiguration block and a nested Rule block. The Rule block must contain Prefix and Status, and at least one action, like the Expiration block. It’s also a good idea to include an ID block:

• ID: Defines a name for the lifecycle policy rule. If your lifecycle policy contains multiple rules, the ID for each should be unique. If one is not specified in your policy file, a random alphanumeric ID is assigned when the policy is applied to a bucket.
• Prefix: Selects objects for deletion by matching prefix. For example, objects that begin with error_report- could be targeted for deletion by providing that prefix. The Prefix can be empty if you want a rule to apply to all files in a bucket.
• Status: A string describing the status of the lifecycle policy. To enable the policy, set this value to Enabled; to disable it, set the value to Disabled.
• Expiration: Contains the Days block. Days is the number of days before the rule is enforced. In the above example, Days is set to 1, meaning that objects in the bucket are deleted after one day.

      Additional Actions

      Other actions can also be specified in a rule:

      • NoncurrentVersionExpiration block, and its child, NoncurrentDays. These are used to control the lifecycle of objects with multiple older versions, and should only be used with buckets that have bucket versioning enabled. Using this option will delete objects that are not the newest, most current version. Below is an example of how to use NoncurrentVersionExpiration:

        lifecycle_policy_noncurrent_versions.xml
        <LifecycleConfiguration>
            <Rule>
                <ID>delete-prior-versions</ID>
                <Prefix></Prefix>
                <Status>Enabled</Status>
                <NoncurrentVersionExpiration>
                    <NoncurrentDays>1</NoncurrentDays>
                </NoncurrentVersionExpiration>
            </Rule>
        </LifecycleConfiguration>
      • AbortIncompleteMultipartUpload, and its child, DaysAfterInitiation. These work similarly to NoncurrentVersionExpiration, but instead of deleting previous versions of objects, they will delete failed multipart uploads. The following will delete failed multipart uploads three days after they were initiated:

        lifecycle_policy_multipart_upload.xml
        <LifecycleConfiguration>
            <Rule>
        <ID>delete-incomplete-multipart-uploads</ID>
                <Prefix></Prefix>
                <Status>Enabled</Status>
                <AbortIncompleteMultipartUpload>
                    <DaysAfterInitiation>3</DaysAfterInitiation>
                </AbortIncompleteMultipartUpload>
            </Rule>
        </LifecycleConfiguration>



        About multipart uploads

        Objects that are part of failed multipart uploads (the mechanism by which large files are uploaded) stay within Object Storage buckets, counting towards your total Object Storage costs. s3cmd will automatically initiate a multipart upload when a file is larger than 15MB. Lifecycle policies are a great way to clear out stale multipart uploads.
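To gauge what such a rule would clean up, you can list a bucket’s in-progress multipart uploads with s3cmd’s multipart command (the bucket name is a placeholder):

s3cmd multipart s3://lifecycle-policy-example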

      Multiple Actions in One Rule

      More than one action can be specified in a single rule. For example, you may want to both expire the current version of an object after a set number of days and also remove old versions of it after another period of time. The following policy will delete the current version of an object after 10 days and remove any noncurrent versions of an object 3 days after they are demoted from the current version:

lifecycle_policy_multiple_actions.xml
      <LifecycleConfiguration>
          <Rule>
              <ID>delete-prior-versions</ID>
              <Prefix></Prefix>
              <Status>Enabled</Status>
              <Expiration>
                  <Days>10</Days>
              </Expiration>
              <NoncurrentVersionExpiration>
                  <NoncurrentDays>3</NoncurrentDays>
              </NoncurrentVersionExpiration>
          </Rule>
      </LifecycleConfiguration>

      Note

As a reminder, if a versioned object is deleted, only the current version of the object is deleted and all older versions are preserved in the bucket. For this reason, the above rule has the effect of deleting any object that has not been updated within 10 days, and then removing its remaining noncurrent versions 3 days later.

      Multiple Rules

      A lifecycle policy file can only contain one LifecycleConfiguration block, but the LifecycleConfiguration block can contain more than one Rule. For instance, if you had a bucket that contained both error and general output logs, you could set a lifecycle policy that saves error logs for a week but deletes standard logs at the end of every day:

      lifecycle_policy_error_and_standard_logs.xml
      <LifecycleConfiguration>
          <Rule>
              <ID>delete-error-logs</ID>
              <Prefix>error</Prefix>
              <Status>Enabled</Status>
              <Expiration>
                  <Days>7</Days>
              </Expiration>
          </Rule>
          <Rule>
              <ID>delete-standard-logs</ID>
              <Prefix>logs</Prefix>
              <Status>Enabled</Status>
              <Expiration>
                  <Days>1</Days>
              </Expiration>
          </Rule>
      </LifecycleConfiguration>

      Uploading the Lifecycle Policy to a Bucket

To apply a lifecycle policy to a bucket with s3cmd, you need to upload the lifecycle policy file to the bucket. This is not a normal PUT operation. Instead, use the setlifecycle command, followed by the name of the lifecycle policy file and the name of the bucket:

      s3cmd setlifecycle lifecycle_policy.xml s3://lifecycle-policy-example
      

      You should see output like the following:

      s3://lifecycle-policy-example/: Lifecycle Policy updated
      

      Once the lifecycle policy has been uploaded, objects will be deleted according to the policy set in place.
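After the policy has had a chance to run, you can confirm that the objects were removed by listing the bucket’s contents:

s3cmd ls s3://lifecycle-policy-example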

      Viewing a Bucket’s Lifecycle Policy

      To view a lifecycle policy after it has been uploaded to a bucket, use the getlifecycle command and provide the bucket name:

      s3cmd getlifecycle s3://lifecycle-policy-example
      

      You should see the contents of the XML file that was uploaded:

      <?xml version="1.0" ?>
      <LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
        <Rule>
  <ID>delete-all-objects</ID>
          <Prefix/>
          <Status>Enabled</Status>
          <Expiration>
            <Days>1</Days>
          </Expiration>
        </Rule>
      </LifecycleConfiguration>
      

      Deleting a Lifecycle Policy

      To delete a lifecycle policy that you’ve uploaded, effectively disabling it, use the dellifecycle command and provide the bucket name:

      s3cmd dellifecycle s3://lifecycle-policy-example
      

      You’ll see a confirmation that the lifecycle policy was deleted:

s3://lifecycle-policy-example/: Lifecycle Policy deleted
      

      Cyberduck

Cyberduck allows less control over lifecycle policies than the s3cmd CLI. In particular, Cyberduck does not allow you to set a lifecycle policy that removes outdated versions of objects stored in buckets with versioning enabled, nor does it allow you to delete failed multipart uploads. Cyberduck also limits the duration of a lifecycle policy to commonly used time spans. Below you will learn how to set a lifecycle policy using Cyberduck.

      Enable a Lifecycle Policy

      1. Right click or control + click on the bucket for which you would like to set a lifecycle policy. This will bring up the bucket info menu.

      2. Click on the S3 tab to open the S3 bucket settings.


      3. Click on the checkbox labeled Delete files and select a time interval from the drop-down menu below it.


This will enable the lifecycle policy, and the objects within the bucket will be deleted after the designated time.

      Disable a Lifecycle Policy

To disable a lifecycle policy, uncheck the Delete files box that you checked in the previous section.


      This guide is published under a CC BY-ND 4.0 license.




      A Beginner's Guide to Kubernetes, Part 3: Objects



In Kubernetes, there are a number of objects that are abstractions of your Kubernetes system’s desired state. These objects represent your application’s workloads, networking, and disk resources, all of which together form your application.

      In this guide you will learn about Pods, Services, Volumes, and Namespaces.

      Pods

      In Kubernetes, all containers exist within Pods. Pods are the smallest unit of the Kubernetes architecture, and can be viewed as a kind of wrapper for your container. Each Pod is given its own IP address with which it can interact with other Pods within the cluster.

      Usually, a Pod contains only one container, but a Pod can contain multiple containers if those containers need to share resources. If there is more than one container in a Pod, these containers can communicate with one another via localhost.
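As an illustration, below is a minimal sketch of a two-container Pod; the names and images are arbitrary. Because the sidecar container shares the Pod’s network, it could reach the Apache container at localhost:80.

my-multi-container-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: apache-container
    image: httpd
  - name: sidecar-container
    image: busybox
    # Keep the sidecar running; it can reach apache-container via localhost.
    command: ["sh", "-c", "sleep 3600"]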

Pods in Kubernetes are “mortal,” which means that they are created and destroyed depending on the needs of the application. For instance, you might have a web app backend that sees a spike in CPU usage. This might cause the cluster to scale up the number of backend Pods from two to ten, in which case eight new Pods would be created. Once the traffic subsides, the Pods might scale back down to two, in which case eight Pods would be destroyed.

It is important to note that Pods are destroyed without respect to which Pod was created first. And, while each Pod has its own IP address, this IP address is only available for the lifecycle of the Pod.

      Below is an example of a Pod manifest:

      my-apache-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: apache-pod
  labels:
    app: web
spec:
  containers:
  - name: apache-container
    image: httpd

      Each manifest has four necessary parts:

      • The version of the API in use
      • The kind of resource you’d like to define
      • Metadata about the resource
      • Though not required by all objects, a spec which describes the desired behavior of the resource is necessary for most objects and controllers.

      In the case of this example, the API in use is v1, and the kind is a Pod. The metadata field is used for applying a name, labels, and annotations. Names are used to differentiate resources, while labels are used to group like resources. Labels will come into play more when defining Services and Deployments. Annotations are for attaching arbitrary data to the resource.

      The spec is where the desired state of the resource is defined. In this case, a Pod with a single Apache container is desired, so the containers field is supplied with a name, ‘apache-container’, and an image, the latest version of Apache. The image is pulled from Docker Hub, as that is the default container registry for Kubernetes.
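Because no tag is specified, the latest httpd image is pulled. In practice you may want to pin a specific version; for example, the containers field could instead reference a tagged image (the 2.4 tag here is only an illustration):

  containers:
  - name: apache-container
    image: httpd:2.4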

      For more information on the type of fields you can supply in a Pod manifest, refer to the Kubernetes Pod API documentation.

      Now that you have the manifest, you can create the Pod using the create command:

      kubectl create -f my-apache-pod.yaml
      

      To view a list of your pods, use the get pods command:

      kubectl get pods
      

      You should see output like the following:

      NAME         READY   STATUS    RESTARTS   AGE
      apache-pod   1/1     Running   0          16s
      

      To quickly view which Node the Pod exists on, issue the get pods command with the -o=wide flag:

      kubectl get pods -o=wide
      

      To retrieve information about the Pod, issue the describe command:

kubectl describe pod apache-pod
      

      You should see output like the following:

      ...
      Events:
      Type    Reason     Age    From                       Message
      ----    ------     ----   ----                       -------
      Normal  Scheduled  2m38s  default-scheduler          Successfully assigned default/apache-pod to mycluster-node-1
      Normal  Pulling    2m36s  kubelet, mycluster-node-1  pulling image "httpd"
      Normal  Pulled     2m23s  kubelet, mycluster-node-1  Successfully pulled image "httpd"
      Normal  Created    2m22s  kubelet, mycluster-node-1  Created container
      Normal  Started    2m22s  kubelet, mycluster-node-1  Started container
      

      To delete the Pod, issue the delete command:

      kubectl delete pod apache-pod
      

      Services

Services group identical Pods together to provide a consistent means of accessing them. For instance, you might have three Pods that are all serving a website, and all of those Pods need to be accessible on port 80. A Service can ensure that all of the Pods are accessible at that port, and can load balance traffic between those Pods. Additionally, a Service can allow your application to be accessible from the internet.

Each Service is given an IP address and a corresponding local DNS entry. Additionally, Services exist across Nodes: if you have two replica Pods on one Node and an additional replica Pod on another Node, the Service can include all three Pods. There are four types of Service:

      • ClusterIP: Exposes the Service internally to the cluster. This is the default setting for a Service.
      • NodePort: Exposes the Service to the internet from the IP address of the Node at the specified port number. You can only use ports in the 30000-32767 range.
      • LoadBalancer: This will create a load balancer assigned to a fixed IP address in the cloud, so long as the cloud provider supports it. In the case of Linode, this is the responsibility of the Linode Cloud Controller Manager, which will create a NodeBalancer for the cluster. This is the best way to expose your cluster to the internet.
      • ExternalName: Maps the service to a DNS name by returning a CNAME record redirect. ExternalName is good for directing traffic to outside resources, such as a database that is hosted on another cloud.

      Below is an example of a Service manifest:

      my-apache-service.yaml
      apiVersion: v1
      kind: Service
      metadata:
        name: apache-service
        labels:
          app: web
      spec:
        type: NodePort
        ports:
        - port: 80
          targetPort: 80
          nodePort: 30020
        selector:
          app: web

The above example Service uses the v1 API, and its kind is Service. Like the Pod example in the previous section, this manifest has a name and a label. Unlike the Pod example, this spec uses the ports field to define the exposed port on the container (port) and the target port on the Pod (targetPort). The NodePort type unlocks the use of the nodePort field, which allows traffic to the host Node at that port. Lastly, the selector field is used to target only the Pods that have been assigned the app: web label.
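Assuming one of your Nodes has the public IP address 192.0.2.10 (a placeholder), you could then reach the Apache container from outside the cluster at the nodePort:

curl http://192.0.2.10:30020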

      For more information on Services, visit the Kubernetes Service API documentation.

      To create the Service from the YAML file, issue the create command:

      kubectl create -f my-apache-service.yaml
      

      To view a list of running services, issue the get services command:

      kubectl get services
      

      You should see output like the following:

      NAME             TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
      apache-service   NodePort    10.99.57.13   <none>        80:30020/TCP   54s
      kubernetes       ClusterIP   10.96.0.1     <none>        443/TCP        46h
      

      To retrieve more information about your Service, issue the describe command:

      kubectl describe service apache-service
      

      To delete the Service, issue the delete command:

kubectl delete service apache-service
      

      Volumes

A Volume in Kubernetes is a way to share file storage between containers in a Pod. Kubernetes Volumes differ from Docker volumes because they exist inside the Pod rather than inside the container. When a container is restarted, the Volume persists. Note, however, that these Volumes are still tied to the lifecycle of the Pod, so if the Pod is destroyed the Volume will be destroyed with it.

      Linode also offers a Container Storage Interface (CSI) driver that allows the cluster to persist data on a Block Storage volume.

      Below is an example of how to create and use a Volume by creating a Pod manifest:

      my-apache-pod-with-volume.yaml
      apiVersion: v1
      kind: Pod
      metadata:
        name: apache-with-volume
      spec:
        volumes:
        - name: apache-storage-volume
          emptyDir: {}
      
        containers:
        - name: apache-container
          image: httpd
          volumeMounts:
          - name: apache-storage-volume
            mountPath: /data/apache-data

      A Volume has two unique aspects to its definition. In this example, the first aspect is the volumes block that defines the type of Volume you want to create, which in this case is a simple empty directory (emptyDir). The second aspect is the volumeMounts field within the container’s spec. This field is given the name of the Volume you are creating and a mount path within the container.
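To see the Volume in action, you could write a file to the mount path and read it back with kubectl exec; this is just a quick sketch:

kubectl exec apache-with-volume -- sh -c 'echo "Hello from the Volume" > /data/apache-data/test.txt'
kubectl exec apache-with-volume -- cat /data/apache-data/test.txt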

      There are a number of different Volume types you could create in addition to emptyDir depending on your cloud host. For more information on Volume types, visit the Kubernetes Volumes API documentation.

      Namespaces

      Namespaces are virtual clusters that exist within the Kubernetes cluster that help to group and organize objects. Every cluster has at least three namespaces: default, kube-system, and kube-public. When interacting with the cluster it is important to know which Namespace the object you are looking for is in, as many commands will default to only showing you what exists in the default namespace. Resources created without an explicit namespace will be added to the default namespace.
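To list the Namespaces that exist on your cluster, issue the get namespaces command:

kubectl get namespaces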

Namespace names consist of lowercase alphanumeric characters and dashes (-).

      Here is an example of how to define a Namespace with a manifest:

      my-namespace.yaml
      apiVersion: v1
      kind: Namespace
      metadata:
        name: my-app

      To create the Namespace, issue the create command:

kubectl create -f my-namespace.yaml
      
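Alternatively, a Namespace can be created directly from the command line, without a manifest:

kubectl create namespace my-app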

      Below is an example of a Pod with a Namespace:

      my-apache-pod-with-namespace.yaml
      apiVersion: v1
      kind: Pod
      metadata:
        name: apache-pod
        labels:
          app: web
        namespace: my-app
      spec:
        containers:
        - name: apache-container
          image: httpd

To retrieve resources in a certain Namespace, use the -n flag:

      kubectl get pods -n my-app
      

      You should see a list of Pods within your namespace:

      NAME         READY   STATUS    RESTARTS   AGE
      apache-pod   1/1     Running   0          7s
      

To view Pods in all Namespaces, use the --all-namespaces flag:

      kubectl get pods --all-namespaces
      

      To delete a Namespace, issue the delete namespace command. Note that this will delete all resources within that Namespace:

      kubectl delete namespace my-app
      

For more information on Namespaces, visit the Kubernetes Namespaces API documentation.

      Next Steps

To continue in the Beginner’s Guide to Kubernetes series, visit part 4.



      This guide is published under a CC BY-ND 4.0 license.


