Let's talk AWS Simple Storage Service

Nicholas Martinez
4 min read · Dec 30, 2017

Simple Storage Service (S3) is a cost-effective, redundant, and highly available object-based storage solution provided by AWS. S3 offers unlimited storage space with an object size limit of 5 terabytes. S3 bucket names live in a universal namespace and must be globally unique.

https://s3.console.aws.amazon.com/s3/buckets/{bucket-name}/{sub-dir}/…/{item}

S3 is a simple key/value storage service that allows console-level or programmatic uploads with version control (more on that later). Programmatic access to S3 requires you to configure an IAM user with an S3 permissions policy. The user will be issued a set of credentials which can then be used within your code base to access your S3 services. You can read more about configuring IAM roles in the AWS documentation, and below is an example of a function that uploads an image to S3, written in Python.

import os
import boto
from boto.s3.key import Key

def upload_s3(self, img):
    # Open a connection to S3 with the credentials issued to the IAM user
    conn = boto.connect_s3(credentials["AWS_ACCESS_KEY_ID"], credentials["AWS_SECRET_ACCESS_KEY"])
    bucket = conn.get_bucket(credentials["bucket_name"])
    k = Key(bucket)  # a Key represents an object in the bucket
    k.key = '{}/{}'.format(self.get_s3_dir(), os.path.basename(img))  # object path
    k.set_contents_from_filename(img)  # upload the local file

The function above uploads an image to a previously created S3 bucket. The ‘conn’ variable opens the connection to AWS with the S3 credentials created in the IAM user process. You then open the bucket and assign it to the ‘bucket’ variable. The variable ‘k’ creates a Key object tied to the bucket we fetched, and we then set ‘k.key’ to define the directory path endpoint for the image we wish to store. ‘k.set_contents_from_filename’ takes the local file path of the image we wish to upload. The object’s data (the value) is stored in the bucket under the key we defined. Alternatively, you can copy objects from your local directory into S3 right from the command line:

aws s3 cp test.txt s3://mybucket/test2.txt

This approach requires that you configure the AWS CLI with the credentials created for the IAM user above.

S3 Data Consistency:

The AWS S3 consistency model provides read-after-write consistency for PUTs of new objects. This means that a new object may be read immediately after creation. However, S3 offers only eventual consistency for overwrite PUTs and DELETEs, so updates, overwrites, and deletes may take some time to propagate. If a retrieval attempt is made immediately after an overwrite, the user may get back an outdated version of the object. Likewise, an object may still be retrieved if an attempt is made immediately after a programmatic delete.
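As a rough, hypothetical illustration, reusing the boto Key object ‘k’ from the upload example above (the string contents are made up):

# Overwrite an existing object, then read it straight back
k.set_contents_from_string('version 2')

# Overwrite PUTs are eventually consistent, so this read may still
# return the old contents ('version 1') until the change propagates
data = k.get_contents_as_string()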

S3 supports tiered storage options and life-cycle management for all objects. Life-cycle management lets a user select how long data lives in a specific tier of storage. For example, you may wish to store an object in a standard S3 bucket for thirty days, then migrate that object to a Glacier archive after the first month. S3 also offers Infrequent Access storage (S3-IA), a service that costs less than standard S3 and maintains similar retrieval times, but has a higher per-retrieval fee structure. Glacier is an even less expensive long-term storage option than IA, but comes with much longer retrieval times (up to four hours) as well as higher retrieval costs.
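As a sketch of what such a rule can look like programmatically with boto (the rule ID and ‘logs/’ prefix here are hypothetical, and ‘bucket’ is the handle from the upload example; a similar transition could target S3-IA instead):

from boto.s3.lifecycle import Lifecycle, Rule, Transition

# After 30 days, transition objects under 'logs/' to Glacier
rule = Rule('archive-logs', prefix='logs/', status='Enabled',
            transition=Transition(days=30, storage_class='GLACIER'))

lifecycle = Lifecycle()
lifecycle.append(rule)  # a Lifecycle is a list of rules
bucket.configure_lifecycle(lifecycle)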

Cross-region replication is an additional option for your S3 buckets and objects. It allows a user to create a bucket in a new region, even on the other side of the globe, and replicate all of the source bucket’s objects into that newly created bucket. Doing so requires versioning to be enabled on both buckets. You then visit the management tab of your S3 bucket:

Management > Replication > Select Source > Enable

Depending on the size of the objects being replicated, they may take some time to propagate, but all of the data you had stored in the Virginia region will soon traverse the globe and be replicated for faster access in Sydney, or any of the other available regions you choose.
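If you would rather configure replication programmatically, here is a sketch using boto3, the successor to the boto library used above. The bucket names and IAM role ARN are hypothetical placeholders, and versioning must already be enabled on both buckets:

import boto3

s3 = boto3.client('s3')

# Replicate every object written to the source bucket into the destination
s3.put_bucket_replication(
    Bucket='my-source-bucket',
    ReplicationConfiguration={
        'Role': 'arn:aws:iam::123456789012:role/my-replication-role',
        'Rules': [{
            'ID': 'replicate-everything',
            'Prefix': '',  # an empty prefix matches all objects
            'Status': 'Enabled',
            'Destination': {'Bucket': 'arn:aws:s3:::my-destination-bucket'},
        }],
    },
)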

The last feature I will mention about S3 is versioning. From the documentation:

Versioning is a means of keeping multiple variants of an object in the same bucket. You can use versioning to preserve, retrieve, and restore every version of every object stored in your Amazon S3 bucket. With versioning, you can easily recover from both unintended user actions and application failures.

With versioning enabled on a bucket, a user is able to store objects with the same key in the same bucket, and each will be assigned a different version ID. This protects users from unwanted overwrites and careless deletes, and even allows simple rollbacks if they are required. A user can also enable multi-factor authentication for deletes on their buckets. This requires any user who wishes to delete an object from an S3 bucket to use a physical or virtual MFA device that provides a unique passcode before the deletion can be confirmed.
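As a short sketch with boto, reusing the ‘bucket’ handle from the upload example:

# Turn on versioning for the bucket
bucket.configure_versioning(True)

# Every overwrite of the same key now creates a new version;
# list_versions() iterates over all of them
for version in bucket.list_versions():
    print('{} {}'.format(version.name, version.version_id))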

If EC2 is the backbone of AWS, churning through work and processing data, then S3 is its store of supplies. AWS’s Simple Storage Service offers its users highly available data that can be transferred around the globe seamlessly. It can be used simply as a backup for your personal data, or as a store for data your users provide your business, to be called on later for use and analysis. The durability, availability, and accessibility of S3 data make it an ideal solution for just about all of your object storage needs.
