S3 Object Storage: A Great Design


Getting started with serverless software development requires knowing this story: why did AWS release S3 Object Storage in 2006, followed by AWS Lambda (serverless) in 2014? Let's explore the story behind it and hopefully inspire you to migrate your software projects to a serverless platform.

25,000 #CloudMadness voters can't be wrong. Amazon S3 is the greatest cloud service of all time.

Really, you never know how a popularity contest like this is going to go. But in this case, I think the voters made the right decision. And I have a reason!

S3 is an OG cloud service

Depending on how you define it, the Simple Storage Service may not be the first product from AWS -- SQS technically launched first, though it only went into production use some time later. Regardless, a message queue on its own was never going to convince most enterprises to consider the public cloud. A distributed storage solution that scales as massively and elastically as your data, without any additional operations and maintenance investment, could.

As AWS's Mike Deck said on Twitter during the frenzy of voting, "I don't think you appreciate what a revolution it was in the late 00s to have virtually unlimited, highly durable, on-demand storage that didn't require you to manage any hardware."

Remember the days when you had to back up your drives on a seven-day cycle? Or drive a whole truckload of backup tapes to an offsite DR location? No, the generation that grew up with S3 doesn't remember these things.

But S3 has become more than just a data repository. As a static web server, S3 serves content for thousands of websites, including Netflix, Wikipedia, and the New York Times. In fact, the world has 'standardized' on S3's API, so much so that Google's competing storage service supports the S3 API out of the box.

That's why, when S3 has a rare availability blip in a region, half the internet seems to go down with it. It's hard to imagine any other cloud service -- except perhaps a CDN, such as the closely related CloudFront -- having such an impact. At this point, S3 is basic Internet infrastructure that will continue to exist for a long time because...

S3 is an engineering marvel.

Distributed storage remains one of the most difficult problems in computer science, especially at scale. Many storage services have come and gone over the years, undone by their inability to maintain the integrity of customer data, the most valuable resource they hold.

With that in mind, S3's durability guarantee -- eleven 9s, are you kidding me? -- represents a jaw-dropping feat of engineering. To put that in perspective, you are personally about 400 times more likely to be hit by a meteor than to lose one in a million S3 objects. Check out this pretty amazing re:Invent keynote from S3 VP Mai-Lan Tomsen Bukovec and try not to be stunned by the numbers: exabytes of data, tens of trillions of objects, over 235 microservices, all distributing that data across an unknown number of physical facilities.
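As a back-of-the-envelope illustration of the durability figure above, here is a small Python sketch. It assumes the figure means an annual, per-object durability of 99.999999999% and that losses are independent, which is a simplification. The function names are made up for this example and do not come from any AWS library.

```python
from fractions import Fraction

# Assumption: the durability figure is annual and per object,
# so the chance of losing a given object in a year is 1 in 10**11.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10**11)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year, assuming independent losses."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    # Ten million stored objects.
    print(years_per_lost_object(10_000_000))
```

Under these assumptions, someone storing ten million objects would expect to lose a single object about once every 10,000 years.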

She talks about canonical 'correctness verification algorithms', checksums between loosely coupled systems, and complex actuarial models that predict when hard drives will fail. AWS runs automated 'durability audits' that repeatedly check every byte of S3 data to verify that what you retrieve is exactly what you stored. They are constantly updating these tools based on their experience from nearly 15 years of running S3 at unimaginable scale.
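The general idea of checksum-based auditing can be sketched in a few lines of Python. This is a toy in-memory illustration, not a description of how S3 is built: the `AuditedStore` class, its method names, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest of the data (SHA-256 chosen for illustration)."""
    return hashlib.sha256(data).hexdigest()


class AuditedStore:
    """Toy object store that records a checksum for each object at write time."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}
        self._checksums: dict[str, str] = {}

    def put(self, key: str, data: bytes) -> None:
        # Store the bytes and, separately, the checksum computed at write time.
        self._objects[key] = data
        self._checksums[key] = checksum(data)

    def get(self, key: str) -> bytes:
        # Verify on read so corrupted bytes are never returned silently.
        data = self._objects[key]
        if checksum(data) != self._checksums[key]:
            raise ValueError(f"checksum mismatch for {key!r}")
        return data

    def audit(self) -> list[str]:
        """Re-hash every stored object and return the keys that no longer match."""
        return sorted(
            key
            for key, data in self._objects.items()
            if checksum(data) != self._checksums[key]
        )
```

A background job could call `audit()` on a schedule and repair any flagged object from a healthy replica. A real system would also keep the checksums and the data on separate, loosely coupled components so that a single fault cannot corrupt both.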

All this so I could type 's3 sync' at the command line seven years ago to upload random source files. To be honest, it made me feel a bit unworthy at the time.

Sure, S3 has added a ton of features over the years, some more specialized than others (S3 Access Points, anyone know what that is?), but the core value proposition hasn't changed: you can put in as many objects as you want, store them for as long as you want, and they won't be lost. Drives fail, data centers go offline, but S3 is still there. That's why we developers take it for granted. You can build an entire architecture around it: just as the sun rises, S3 will be there in the morning, barring a planetary extinction event. That's what makes it truly, unquestionably great.

Without S3, you can't spell serverless

Of course, when most of us hear the word 'serverless', our brains jump to a different service -- AWS Lambda, the original FaaS (Function as a Service) that kicked off a generation of stateless applications and Hacker News threads. (It's no coincidence that Lambda was a close second in the #CloudMadness poll.)

However, Ben Kehoe, who has been building serverless applications at iRobot for years, argued strongly for S3 in the poll, telling me that "S3 is the epitome of serverless cloud computing," adding that "it solves a very difficult problem that everyone needs to solve, it has a (relatively) simple API, it scales to whatever storage traffic your service can generate, and you only pay for storage at standard rates. And it keeps getting better and better, without requiring any action from the user."

Tim Allen Wagner, who invented Lambda, says that Lambda "actually started as an offshoot of S3, not EC2." So S3 brought something else revolutionary into the world!

That's right: the whole serverless revolution started as a service that built triggers for S3 events. Wagner remembers that one of the 'scariest moments' of his career was integrating Lambda with S3: "In those days, it was like pointing a firehose at a Dixie cup. Fortunately, S3 had an amazing architecture and team, and Lambda grew into some very large 'shoes' so that it could be used at scale."
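The S3-to-Lambda trigger is still one of the most common serverless patterns. As a minimal sketch, assuming the standard S3 event notification payload, a Python Lambda function might pull the bucket and key out of each record like this. The function name `handler` and the sample bucket and key below are hypothetical.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect (bucket, key) pairs from an S3 event notification."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification payload,
        # so a space in the key shows up as "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append((bucket, key))
    return uploaded


if __name__ == "__main__":
    # Hypothetical event, trimmed to the fields the handler reads.
    sample_event = {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": "example-bucket"},
                    "object": {"key": "photos/my+cat.jpg"},
                }
            }
        ]
    }
    print(handler(sample_event, None))
```

A real function would go on to do something with each object, such as resizing an image or indexing a document. The point of the sketch is only that the function runs once per event, with no server to manage.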

"While S3 is probably the greatest cloud service ever," Kehoe added, "Lambda also deserves credit for successfully changing the focus of cloud users from 'this part of my server is a cloud service managed by someone else, and that's useful' to 'wow, I can do everything with these services.' This is an evolutionary step in managed computing, but it has completely changed how developers think."

That's the real bottom line, isn't it? S3, along with some other foundational AWS services like EC2 and Elastic Load Balancer, established some of the fundamentals that have led to a Cambrian explosion of higher-level product innovation over the past decade. While S3 may no longer be the shiny new thing, it's worth looking back to appreciate what AWS has given us over the years with a 'Simple' Storage Service that, working quietly like an engine under the hood, is actually not that simple.

After all, to see the world from above the clouds, you have to stand on the shoulders of giants. And S3 is a giant.
