Everything you ever wanted to know about serverless computing but were afraid to ask

We’ve heard the buzzword, we’ve seen the excitement, but what exactly is serverless computing, and why should you care about it?

Serverless computing means running an application in the cloud in such a way that the application owner does not have to manage the underlying servers that run it. The servers are still there, but they are managed completely, and invisibly, by the cloud service provider. From the standpoint of the application owner, the servers might as well not exist, hence the term serverless.

While it is also commonly referred to as ‘Function-as-a-Service’, a better name for serverless computing, in my opinion, would be ‘Compute-as-a-Service’ (CaaS, if that name weren’t taken already), because it offers the ability to purchase compute in small increments, not functions in small increments.

Understanding serverless computing is critical, as it is rapidly becoming a component of enterprise digital strategies. In fact, at New Relic we recently surveyed more than 500 customers on their adoption of dynamic cloud technologies and found that 64% of respondents had deployed serverless technologies in some form of production or pilot, with another 13% investigating with an eye toward a pilot.

What’s servers got to do with it?

One of the burdens that most IT organizations within fast-growing digital enterprises must deal with is deciding how many servers to allocate for a given application in the cloud. They must allocate enough servers for the application to run effectively no matter how many users try to use it. If they allocate too many servers, they waste money and resources. If they allocate too few, the application may fail, either functioning poorly or crashing completely for its users.

Additionally, if an application sees a sudden spike in traffic for some unforeseen reason – such as a news site suddenly getting a surge in visitors because of a breaking story – the additional load can overwhelm the existing servers and make the application unresponsive. We’ve all experienced this as digital consumers. We go to a website that is currently very popular, and the website is slow to respond or doesn’t respond at all. The process of making sure enough servers are available to the application at any given time is called ‘application scaling’.

If the application is run on a serverless cloud, however, IT does not have to worry about how many servers are needed to run the application. The cloud service provider will make sure that sufficient servers are always available to handle the application’s needs. As the needs of the application change, the number of allocated compute resources can be adjusted automatically.

The cloud service provider typically does this by maintaining a shared pool of servers across all of its customers and allocating those computing resources to a particular customer’s applications only when they are needed. When the application no longer needs the capacity, the computing resources are pulled back into the shared pool and made available for another customer’s use.

Server shuffle – bearing the cost of servers

For IT organizations, there are two main advantages to this approach. First, the application can respond to sudden spikes in traffic automatically, without the IT team being involved in scaling the application. This is especially useful for applications that often see sudden and unforeseeable traffic increases, such as a news site covering breaking news.

Second, IT pays only for the actual compute resources consumed; there is no paying for idle servers sitting unused during periods of low traffic. When the application is busy, they pay more for the needed compute resources. When the application is less busy, they pay less.

There are benefits for cloud service providers, too. Serverless computing allows them to manage computing resources across a larger customer set, which smooths out traffic and makes demand easier to predict: the larger the number of customers, the more uniform the aggregate traffic, and the better the provider can optimize usage. Additionally, by ‘hiding’ the servers and the implementation of the service from the consumer, providers can optimize the implementation based on their own predicted needs and requirements.

From a financial standpoint, the cloud service provider’s ability to predict demand accurately is critical to supporting its customers while maintaining very thin financial margins. Additionally, because of the extra flexibility provided to customers, cloud service providers can usually charge a premium price for these compute resources.

When to go serverless

As with any tool, making effective use of serverless computing means knowing when it is useful and when it is not. Understanding when and how to use it involves three main considerations.

Cost & traffic

Serverless computing works best when a company’s computing needs are quite variable, with very high highs and very low lows in traffic volume. Because companies pay only for the resources they actually consume, they pay more at times of high utilization and less at times of low utilization; for very spiky applications, this saves money in the long term. However, if an application’s use of computing is much more uniform, the advantages of serverless are less dramatic, and the premium price for the resources can make serverless computing significantly more expensive than managing your own servers. So serverless computing is useful mainly for applications with variable traffic profiles.
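
To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. Every price and workload figure is an illustrative placeholder, not an actual provider rate; the point is the shape of the comparison, not the numbers.

```python
# Illustrative break-even sketch: all prices below are made-up
# placeholders, not actual provider rates.

HOURS_PER_MONTH = 730

def server_cost(num_servers, price_per_server_hour=0.10):
    """Fixed cost: you pay for every server-hour, busy or idle."""
    return num_servers * price_per_server_hour * HOURS_PER_MONTH

def serverless_cost(requests_per_month, price_per_million=0.20,
                    gb_seconds_per_request=0.25,
                    price_per_gb_second=0.0000167):
    """Variable cost: you pay only for compute actually consumed."""
    return (requests_per_month / 1_000_000 * price_per_million
            + requests_per_month * gb_seconds_per_request * price_per_gb_second)

# Spiky app: fleet sized for peaks, mostly idle -> serverless wins.
print(server_cost(num_servers=4))                        # 292.00 fixed
print(serverless_cost(requests_per_month=5_000_000))     # ~21.88 variable

# Steady, heavy traffic -> the per-request premium adds up.
print(serverless_cost(requests_per_month=500_000_000))   # ~2187.50
```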

Setup & operation

Serverless computing is often seen as harder to set up and manage than traditional server-based computing. This is mostly because the tools that IT professionals have relied on for years are optimized for deploying applications to server-based environments. Newer tools are needed to make better use of serverless computing and to make large serverless applications easier to manage. Those tools will eventually be created; today, however, they are mostly immature or non-existent.

Additionally, the diagnostic tools needed to solve problems in serverless applications are fundamentally different from those used for standard server-based applications. This means new tools and capabilities must be developed to keep serverless applications running optimally. While there are tools on the market that currently support serverless computing, they must continue to evolve to meet the needs of these new compute paradigms before they can provide the same level of support they do for server-based applications.

Standardization & portability

There are also no standards today for how application owners interface with serverless computing. Each cloud service provider offers a different and unique method. AWS Lambda works very differently from Microsoft Azure Functions, which works very differently from Google Cloud Functions. This means an application owner who wants to take advantage of serverless computing can be locked into a single cloud service provider to a greater degree than if they used more standardized, traditional server-based computing.
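
As a small illustration of this divergence, compare the entry points for a trivial function on two of these providers. This is only a sketch: exact event shapes vary by trigger type, and the deployment steps differ even more than the code does.

```python
# AWS Lambda (Python): the handler receives an event dict and a
# context object, and returns a value that Lambda serializes for you.
def lambda_handler(event, context):
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"Hello, {name}"}

# Google Cloud Functions (Python, HTTP-triggered): the handler receives
# a single Flask-style request object and returns a response body.
def gcf_handler(request):
    name = request.args.get("name", "world")
    return f"Hello, {name}"

# Neither function runs unchanged on the other provider; porting means
# rewriting the entry point, the event parsing, and the deployment config.
```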

Different flavors of serverless services

When thinking about serverless, it is easy to focus on serverless computing, such as the capabilities provided by AWS Lambda, Microsoft Azure Functions, and Google Cloud Functions. However, there are many other cloud-based services that offer similar advantages, in that they allow the application owner to scale their use of the service without having to allocate reserved servers for the service to use.

Classic examples include serverless databases such as Amazon DynamoDB and Google Cloud Datastore. But other services, such as object stores (Amazon S3), queuing and notification services (Amazon SQS and SNS), and email distribution services, offer similarly scalable capabilities without the need to allocate and manage servers. Using these services involves the same considerations as serverless computing.
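
For a sense of what ‘serverless’ feels like outside of compute, here is a minimal sketch using boto3 (the AWS SDK for Python). It assumes configured AWS credentials, and the table and queue names are hypothetical.

```python
import boto3

# DynamoDB: write an item; capacity scales without provisioning servers.
table = boto3.resource("dynamodb").Table("page-visits")  # hypothetical table
table.put_item(Item={"page": "/home", "visitor": "u123"})

# SQS: send a message; the queue absorbs traffic spikes for you.
sqs = boto3.client("sqs")
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/notify",  # hypothetical
    MessageBody="new visit recorded",
)
```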

The bottom line

Serverless computing offers a valuable toolset digital enterprises can use in building their applications, especially applications with huge variability in traffic. However, like any tool, it has a use and a purpose, and it typically does not make sense to use serverless for all of an IT organization’s computing needs. Traditional server-based computing still has its advantages and uses, and that will likely remain true for some time to come.

Used properly, serverless computing can help you build your application to scale to your greatest needs without breaking the bank. But it should be used in conjunction with – not as a replacement for – other tools and computing capabilities to form a complete application solution.

This article, written by me, originally appeared in Diginomica in August 2017.

AWS Lambda v Amazon ECS — two paths to one goal, which is best?

Launched in parallel two and a half years ago by Amazon Web Services (AWS), AWS Lambda and Amazon EC2 Container Service (ECS) are two distinct services that each offer a new, leaner way of accessing compute resources. Amazon ECS lets developers tap into container technology on a pay-as-you-go basis. AWS Lambda offers what is often known as ‘serverless’ computing, or function-as-a-service — the ability to access specific functions, again on pay-as-you-go terms.

On the surface, they both serve the same goal — provide a compute environment for applications, services and microservices that allows developers to focus on the application, not on the infrastructure.

But why are there two distinct services? What’s the difference between them? And, most importantly, when would I use one versus the other?

Great questions. Let’s take a look at each service. But first, for clarity, a quick note on terminology. To avoid confusion over the term ‘service’ in this article, I will refer to applications, whether they are monolithic or broken into elemental services, as application services or simply applications. I will refer to AWS services such as AWS Lambda and Amazon ECS simply as cloud services or AWS services. OK, now that’s clear, let’s move on.

What is AWS Lambda?

AWS Lambda allows custom code to execute in response to triggers caused by activity from other AWS resources, services, and web apps. AWS Lambda provides this capability by allowing specially constructed code segments (called Lambda functions) to execute in an environment where the infrastructure becomes totally invisible and irrelevant.

Scaling and server management are handled transparently by AWS. The user isn’t even aware of, and has no visibility into, how the servers are organized to execute the functions — this is all hidden from view by AWS.
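
As a minimal sketch, here is what such a function can look like in Python, reacting to an object being uploaded to Amazon S3. The field names follow the documented S3 event structure; which bucket fires the trigger is whatever you wire up when configuring the function.

```python
# Minimal AWS Lambda function handling an S3 object-created event.
def lambda_handler(event, context):
    # 'event' carries the trigger payload; 'context' carries runtime
    # metadata, such as the remaining execution time.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        print(f"New object: s3://{bucket}/{key}")
    return {"objects": len(event["Records"])}
```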

The downside of this approach is that the code segments (functions) that run in AWS Lambda are quite limited in what they can do: they must be relatively small and simple. These requirements are enforced not only by the execution environment provided, but also by the pricing model put in place for the cloud service.

What is Amazon ECS?

Amazon ECS allows you to run Docker containers in a standardized, AWS-optimized environment. The containers can run any code or application module written in any language.

Rather than being handled by AWS, scaling and server management must be set up by the user. The containers themselves run on standard Amazon EC2 instances configured with special Amazon ECS software. The underlying EC2 instances within an individual cluster can be of any size or quantity, depending on your application’s scaling needs. The Amazon ECS software determines where, how, and how many copies of each container execute on the given cluster. The EC2 instances in the cluster must be sized and scaled by the user to handle the quantity and execution demands of the containers.
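
To make the contrast concrete, here is a sketch of the kind of setup work ECS asks of you, using boto3 (the AWS SDK for Python). All names are hypothetical, and this assumes a cluster whose EC2 instances you have already launched and sized.

```python
import boto3

ecs = boto3.client("ecs")

# Describe the container to run: image, CPU, and memory are your call.
ecs.register_task_definition(
    family="web-worker",  # hypothetical task family
    containerDefinitions=[{
        "name": "worker",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/worker:latest",
        "cpu": 256,
        "memory": 512,
        "essential": True,
    }],
)

# Place tasks on a cluster whose EC2 fleet you are responsible for.
ecs.run_task(cluster="prod-cluster", taskDefinition="web-worker", count=2)
```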

AWS Lambda v Amazon ECS

AWS Lambda and Amazon ECS are similar in many regards. The code that the two AWS services execute does not need any visibility into the underlying infrastructure. The infrastructure decisions you must make in operating the service can be made independently of application coding decisions. If constructed properly, the code on either AWS service can deliver significant scaling capabilities.

However, the two services differ in some very substantial ways. AWS Lambda does not provide any visibility into the server infrastructure environment used to run the application code, while Amazon ECS actively exposes the servers used in the cluster as standard Amazon EC2 instances and allows (or more correctly requires) the user to size and scale their fleet themselves.

AWS Lambda functions must be written in one of a handful of supported languages and are restricted in the type of actions they can perform. Amazon ECS, on the other hand, can run any container using any code that is capable of running in a container (which is almost any application that runs on a typical Linux operating system).

AWS Lambda is optimized for simple, quick functions. Larger and more complex functions add execution complexity (and significant execution cost) for the user. Amazon ECS, on the other hand, can be used with containers of any reasonable size and complexity.

With AWS Lambda, all scaling and sizing decisions are made automatically and continuously by AWS. This allows a completely hands-off solution where the user can ignore most scaling issues. Amazon ECS, on the other hand, requires the user to understand the required server fleet sizing and make active decisions to resize the fleet as scaling needs change.
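
For example, if the cluster’s EC2 instances are backed by an Auto Scaling group (a common setup, though an assumption here, with a hypothetical group name), resizing the fleet is an explicit, operator-driven call:

```python
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.set_desired_capacity(
    AutoScalingGroupName="ecs-prod-cluster-asg",  # hypothetical group
    DesiredCapacity=8,  # the operator, not AWS, decides this number
)
```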

Which AWS service should I use?

Either of these services can be used to run applications or application services. So, which AWS service should you use for a particular purpose? The answer depends on the needs of the application. If you want to run very small, relatively simple actions, AWS Lambda provides a compelling hands-off path to a highly scalable application. If your application or application services have any real complexity to them, Lambda may be too restrictive and too expensive to operate, and Amazon ECS may provide better options for you.

Of course, it is perfectly reasonable for different application services within a single application to use either of these two AWS services. As such, some of your application may run in AWS Lambda while other parts run in Amazon ECS.

I personally would like to see another option. I believe AWS should support a hybrid service: one with the infrastructure opacity and ease of management that Lambda provides, but which allows the executed code to be written and run within a container environment. This would combine the best of each offering, the versatility of container-based applications with the simplified infrastructure management of AWS Lambda, and I hope AWS is considering such a service.

Originally published at diginomica.com on June 29, 2017.

The London Sunday Times: Raconteur: Serverless computing

Serverless computing is one of the hottest trends in tech; however, it’s also one of the most misunderstood. From the article:

Lee Atchison, senior director at analytics platform New Relic, warns: “Each service provides a different and unique method for offering serverless computing. This means that an IT professional who wants to take advantage of serverless computing will find they are locked into a single cloud service provider to a greater degree than if they use more standardised traditional server-based computing.”

Read More

Building Right-Sized Application Services: The Goldilocks Calculation

In the world of applications, services are standalone components that, when connected and working together, create an application that performs some business purpose. But services come in a wide variety of sizes, from tiny, super-specialized microservices up to services big and complete enough to form their own monolithic applications. Just like Goldilocks looking for the perfect fit, it’s not always easy to determine the right size for the services you need to build your organization’s apps and meet your business goals.

Read More

Goldilocks, serverless and DevOps: Five predictions for IT in 2017

Technological innovation drives every business, industry and sector – mostly positively, but not always. 2016 was no exception: from the first long-haul driverless cargo delivery to automated retail locations to the stiffening competition among ‘smart assistants’, we’re seeing big technological leaps at a breakneck pace.

At the same time, many of the enterprise trends of the last few years are continuing, such as traditional businesses leading big digital transformation and the move to public clouds, with the continued market dominance of Amazon’s $13B AWS business.

As 2016 draws to a close, it’s time to once again consider how the IT industry will grow, adapt, evolve and transform in the coming year, and what lies in store for 2017. Here, I set out my top five predictions for what we can expect over the next 12 months and beyond.

Read More

Visibility into the Migration From Static to Dynamic Infrastructure [Video]

When I look back at my career over the last 30 years, it’s amazing to see how much the world has changed when it comes to building, running, and managing software. At my first job, for example, our company was trying to reduce its development cycle to less than a year. Nowadays, with cloud architectures, we’re seeing development cycles of just weeks, days, or even hours. But that’s not to say that all cloud environments are dynamic and rapidly changing.

Read More

Why I Wrote the Book on ‘Architecting for Scale’

As applications grow, two things begin to happen: they become significantly more complicated (and hence brittle), and they handle significantly larger traffic volume (which requires more novel and complex mechanisms to manage). This can lead to a death spiral for an application, with users experiencing brownouts, blackouts, and other quality-of-service and availability problems. “But your customers don’t care. They just want to use your application to do the job they expect it to do. If your application is down, slow, or inconsistent, customers will simply abandon it and seek out competitors that can handle their business.” That’s how my new book, Architecting for Scale: High Availability for Your Growing Applications, begins.

Read More

Distributing the Cloud - AWS Architecture - Part 3

We all know the value of distributing an application across multiple data centers. The same philosophy applies to the cloud. As we put our applications into the cloud, we need to watch where in the cloud they are located. How our applications are distributed, both geographically and in network topology, is just as important as with traditional data centers. While Amazon AWS won’t tell you specifically where your application is running, they do give you enough information to make diversification decisions. Interpreting and understanding this information, and using it to your advantage, requires an understanding of how AWS is architected. In part 1 of this article, we talked about the AWS architecture of regions and availability zones. In part 2, we went into more detail about how availability zones are structured and how we can utilize this information. In this final part, we discuss the availability zone to data center mapping, why it is important, and how to use all this information to get the highest diversification possible for your application.

Read More

Distributing the Cloud - AWS Architecture - Part 2

We all know the value of distributing an application across multiple data centers. The same philosophy applies to the cloud. As we put our applications into the cloud, we need to watch where in the cloud they are located. How our applications are distributed, both geographically and in network topology, is just as important as with traditional data centers. While Amazon AWS won’t tell you specifically where your application is running, they do give you enough information to make diversification decisions. Interpreting and understanding this information, and using it to your advantage, requires an understanding of how AWS is architected. In part 1 of this article, we talked about the AWS architecture of regions and availability zones. In part 2, we will go into more detail about how availability zones are structured and how we can utilize this information.

Read More

Distributing the Cloud - AWS Architecture - Part 1

We all know the value of distributing an application across multiple data centers. The same philosophy applies to the cloud. As we put our applications into the cloud, we need to watch where in the cloud they are located. How our applications are distributed, both geographically and in network topology, is just as important as with traditional data centers. However, the cloud makes knowing where your application is located harder, and it makes it harder to proactively make your application more distributed. Some cloud providers don’t even expose enough information to let you know where, geographically, your application is running. Luckily, larger providers like AWS are better. AWS won’t tell you specifically where, geographically, your application is running, since they do not disclose their actual data center locations (I worked at AWS, and I have no idea, specifically, where the data centers are located). But they do give you enough information to make diversification decisions. Interpreting and understanding this information, and using it to your advantage, requires an understanding of how AWS is architected.

Read More

Scaling with Availability

One of the most important topics in architecting for scalable systems is availability. While there are some companies and some services where a certain amount of downtime is reasonable and expected, most businesses cannot have any downtime at all without it impacting their customers’ satisfaction and, ultimately, the company’s bottom line. How do you keep your customers happily using your service and keep your company’s revenue coming in? You keep your service operational as much as possible. There is a direct and meaningful correlation between system availability and customer satisfaction.

Read More

Why Use Microservices?

Traditionally, software companies created large, monolithic applications. The monolith encompassed all business activities for a single application. As the company grew, so did the monolith. In this model, implementing an improved piece of business functionality requires developers to make changes within the single application, often alongside many other developers attempting changes to the same application. Developers can easily step on each other’s toes and make conflicting changes that result in problems and outages. Development organizations get stuck in the muck, and applications slow down and become unreliable. The companies, as a result, end up losing customers and money. But the muck is not inevitable: you can build and rearchitect your application to scale with your company, not against it.

Read More

Welcome!

Scaling web applications isn’t easy. As web applications grow, two things begin to happen. First, they become significantly more complicated and hence brittle. Second, they handle significantly larger traffic volume, requiring more novel and complicated mechanisms to handle it. This can lead to a death spiral for an application, resulting in brownouts, blackouts, and other quality-of-service and availability problems. My purpose for this blog is to provide techniques, guidance, and best practices for building web applications that scale to significant traffic volumes.

Read More