Cloud Native: Using Containers, Functions, and Data to Build Next-Generation Applications by Boris Scholl, Trent Swanson, Peter Jausovec
A book about cloud design patterns. I got bored about halfway through, mostly because of its heavy focus on distributed systems and coordinating multiple services over cloud infrastructure; those topics matter little for smaller-scale projects and only become relevant as a product scales up.
Chapter 1: This book covers the principles of designing highly distributed applications that run natively on the cloud. Distributed applications must deal with issues like unreliable networks and constantly changing infrastructure.
Chapter 2: Containers are useful as encapsulated, deployable components. Docker is the most popular option, as it is lightweight and sits on top of the operating system, but cloud providers use other runtimes to provide secure multi-tenancy. Kubernetes manages container orchestration. It has control-plane components such as the scheduler and etcd, and node components on the data plane that configure networking and run containers. Technically, Kubernetes does not depend on Docker; it talks to container runtimes through the Container Runtime Interface (CRI).
Functions are more lightweight than containers. With Functions as a Service (FaaS) offerings like AWS Lambda, the platform executes a function in response to a specific event and shuts it down immediately afterward. App modernization typically involves first containerizing the application and then breaking it into microservices, which requires container orchestration. The benefit of microservices is clearer service boundaries, but they add engineering challenges: the system becomes more distributed and harder to deploy.
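To make the FaaS model concrete, here is a minimal sketch of a Lambda-style handler. The `event`/`context` signature mirrors AWS Lambda's Python convention, but the handler body and field names are illustrative, not from the book:

```python
import json

def handler(event, context=None):
    """Toy Lambda-style handler: runs in response to a single event,
    returns a result, and keeps no state between invocations."""
    name = event.get("name", "world")  # tolerate a missing field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }

# A FaaS platform would invoke this once per event; locally we can call it directly:
result = handler({"name": "cloud"})
print(result["body"])
```

The key property is that nothing survives between calls; any state a function needs has to live in an external data service.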
Chapter 3: Fundamental principles of cloud applications. Defense in depth means you should have multiple redundant layers of security and access controls. In general, applications should be stateless. This way, you don’t have to worry about routing a user to the same server; instead, the data lives in its own layer outside the service. Service choreography, where services react to events from an eventing system instead of making the synchronous request-response calls of a traditional architecture, is recommended.
Functions are stateless, triggered by events, and store data in a separate data service if needed. The benefits include high scalability, although cold start latency might be an issue. However, they are more difficult to orchestrate and develop.
API versioning is challenging because it’s hard to get everyone to upgrade to the latest version, and maintaining multiple versions is costly. The best compromise is to maintain some level of backward compatibility whenever possible, for example, by making newly added fields optional rather than required. Services can communicate with each other over HTTP and JSON, or over more optimized binary protocols.
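The backward-compatibility idea can be sketched as a tolerant reader: consumers ignore fields they don't know and supply defaults for fields added after their version. The field names and versions here are hypothetical:

```python
def parse_user_v1(payload: dict) -> dict:
    """Tolerant reader: ignore unknown fields, default new optional ones,
    so old clients keep working as the API evolves."""
    return {
        "id": payload["id"],                       # required in every version
        "name": payload.get("name", ""),           # optional since v1
        "locale": payload.get("locale", "en-US"),  # added in v2, defaulted
    }

old_message = {"id": 1, "name": "Ada"}                # produced by a v1 service
new_message = {"id": 2, "name": "Lin",
               "locale": "de-DE", "avatar_url": "x"}  # v2 producer, extra fields
print(parse_user_v1(old_message))
print(parse_user_v1(new_message))
```

Both messages parse successfully, so neither side is forced to upgrade in lockstep.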
Publish-subscribe is an alternative to the traditional request-response architecture. In this model, the publisher sends a message to a broker like Redis or Kafka, and multiple subscribers receive the event and handle it. Since the network may be unreliable and a message can be delivered more than once, handlers should be idempotent. This enables looser coupling between services than the request-response model.
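A tiny in-memory sketch of the pattern, with a deduplication set standing in for real idempotency bookkeeping (the broker class and event shape are my own stand-ins for Kafka/Redis, not the book's):

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory pub/sub broker (stand-in for Kafka or Redis)."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)

processed_ids = set()
balance = {"amount": 0}

def credit_account(message):
    """Idempotent subscriber: a redelivered message (same id) is a no-op."""
    if message["id"] in processed_ids:
        return
    processed_ids.add(message["id"])
    balance["amount"] += message["amount"]

broker = Broker()
broker.subscribe("payments", credit_account)
broker.publish("payments", {"id": "evt-1", "amount": 50})
broker.publish("payments", {"id": "evt-1", "amount": 50})  # duplicate delivery
print(balance["amount"])  # 50, not 100
```

The publisher never knows who consumes the event, which is where the loose coupling comes from.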
A gateway is a proxy that handles cross-cutting concerns like routing, compression, caching, and SSL, but no application-specific functionality. A service mesh is a different kind of proxy layer that sits alongside services written in different languages and handles communication logic like traffic control and failure handling, so that logic does not need to be reimplemented in every language.
Chapter 4: Working with Data. Cloud applications need to deal with data that is spread across many places and generally prefer object storage services, which are cheaper than file system or disk block storage. The options for database engines include relational, key-value, document, and graph. Streams and queues are used for communication between services; streams are generally immutable and can handle a large amount of data, while queues are more short-term, and messages can be modified.
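The stream-versus-queue distinction can be sketched in a few lines; this is my own toy illustration, with a `deque` as the queue and a plain list as the append-only log:

```python
from collections import deque

# Queue: consuming a message removes it; good for short-lived work items.
queue = deque()
queue.append("task-1")
queue.append("task-2")
first = queue.popleft()   # "task-1" is now gone from the queue

# Stream: an append-only, immutable log; each consumer tracks its own
# offset, so many consumers can independently read the same events.
stream = []
stream.append({"event": "order_placed"})
stream.append({"event": "order_shipped"})

offset_analytics = 0  # each consumer keeps an independent position
offset_billing = 0
events_for_analytics = stream[offset_analytics:]  # sees every event
events_for_billing = stream[offset_billing:]      # independently sees them too
```

Nothing is ever removed from the stream, which is what lets it serve as a durable record for multiple consumers.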
Data often exists in different systems, and in these cases, change data capture (CDC) is often needed to synchronize data across services: you write to one database and propagate the changes elsewhere. A transaction supervisor is a service that monitors the data for consistency issues, e.g., a checkout service might save the order with a pending status, then call the payment API, and then set the status to complete, and the supervisor service checks for discrepancies.
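The checkout example can be sketched as follows. The dictionaries, statuses, and timing threshold are all illustrative assumptions, not the book's code:

```python
import time

orders = {}       # order_id -> {"status": ..., "created": ...}
payments = set()  # order_ids with a successful payment

def checkout(order_id):
    orders[order_id] = {"status": "pending", "created": time.time()}
    payments.add(order_id)                 # call to the payment API succeeded
    orders[order_id]["status"] = "complete"
    # if the service crashed between the two lines above,
    # the order would be stuck in "pending" despite a successful payment

def supervise(max_age_seconds=0.0):
    """Transaction supervisor: find pending orders old enough to be suspect
    and reconcile them against the payment system."""
    fixed = []
    for order_id, order in orders.items():
        if order["status"] != "pending":
            continue
        if time.time() - order["created"] < max_age_seconds:
            continue
        order["status"] = "complete" if order_id in payments else "failed"
        fixed.append(order_id)
    return fixed

checkout("A")
orders["B"] = {"status": "pending", "created": 0}  # simulate a crash mid-checkout
payments.add("B")                                  # ...after payment succeeded
print(supervise())  # order B is reconciled to "complete"
```

The supervisor runs out of band, so the checkout path itself stays simple and fast.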
A service might use an operational store for its internal operations that it can change easily, and a different integration store for consumption by other services like data analytics, and the integration store has a more stable contract that others can rely on.
How clients may access data – clients might go through a service to get the data, but this places a heavy load on the service. Instead, the service might grant a limited access token for the client to access the data store directly. Some query layers, such as GraphQL APIs, can also enforce authentication and fine-grained access controls, letting clients query directly.
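The limited-access-token idea is what services like Azure SAS tokens or S3 presigned URLs provide; here is a heavily simplified sketch of the mechanism using an HMAC-signed, expiring token (the functions, secret, and token format are hypothetical, not any cloud provider's real API):

```python
import hashlib, hmac, time

SECRET = b"server-side-secret"  # held by the service, never given to clients

def grant_read_token(blob_name, ttl_seconds=300):
    """Service-side: mint a short-lived token scoped to one object."""
    expires = int(time.time()) + ttl_seconds
    payload = f"{blob_name}:{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"

def storage_check(blob_name, token):
    """Storage-side: verify scope, expiry, and signature before serving."""
    name, expires, sig = token.rsplit(":", 2)
    payload = f"{name}:{expires}"
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return (name == blob_name
            and int(expires) > time.time()
            and hmac.compare_digest(sig, expected))

token = grant_read_token("reports/q3.pdf")
print(storage_check("reports/q3.pdf", token))    # token works for its object...
print(storage_check("secrets/keys.txt", token))  # ...but not for any other
```

The service only does the cheap signing step; the heavy data transfer happens directly between the client and the storage layer.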
Some strategies to scale the data layer are sharding, a cache layer, or a CDN to serve static assets. A data lake contains raw, unstructured data, whereas a data warehouse contains structured data with a schema; both are usually queried with a distributed engine, either open source or hosted by a cloud provider. For storage volumes, Kubernetes has a concept of persistent volumes, which have a lifecycle independent of containers and are mounted into pods.
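Sharding, the first of the scaling strategies above, can be sketched as hash-based routing of keys to shards. The shard names and the choice of MD5 are illustrative; the point is only that every reader and writer deterministically agrees on where a key lives:

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # three hypothetical database shards

def shard_for(key: str) -> str:
    """Route a key to a shard by hashing it. A stable hash keeps the same
    key on the same shard across processes (unlike Python's builtin hash,
    which is randomized per process)."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

# Every writer and reader computes the same shard for a given user id:
print(shard_for("user-42"))
```

Real systems often use consistent hashing instead of plain modulo, so that adding a shard remaps only a fraction of the keys.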