Friday, February 26, 2021

Kubernetes 101

Immutable infrastructure is a practice where servers, once deployed, are never modified.

Containers offer a way to package code, runtime, system tools, system libraries, and configuration all together. The resulting package is a lightweight, standalone executable unit.

Kubernetes provides the ability to run dynamically scaling, containerized applications, and exposes an API for managing them.

Kubernetes has become the standard for running containerized applications in the cloud, with the main cloud providers (AWS, Azure, Google Cloud, IBM, and Oracle) now offering managed Kubernetes services.

K8s objects

  • Pod. A group of one or more containers.
  • Service. An abstraction that defines a logical set of pods as well as the policy for accessing them.
  • Volume. An abstraction that lets us persist data. (Containers are ephemeral: files written inside a container are lost when the container is deleted.)
  • Namespace. A segment of the cluster dedicated to a certain purpose, for example a particular project.
  • Node. A virtual or physical machine on which pods run.
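To make these objects a little more concrete, here is a minimal sketch of a Pod and a Service written as plain Python dictionaries that mirror the manifest structure (kubectl accepts JSON as well as YAML). The names "web" and "demo" and the image tag are assumptions chosen for illustration, and the selects helper is a hypothetical function that only imitates how a Service selector matches pod labels.

```python
import json

# Hypothetical names ("web", "demo") and image tag, for illustration only.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "namespace": "demo", "labels": {"app": "web"}},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [{"name": "scratch", "mountPath": "/data"}],
            }
        ],
        # An emptyDir volume lives as long as the pod, so data in it
        # survives a restart of the container inside the pod.
        "volumes": [{"name": "scratch", "emptyDir": {}}],
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web", "namespace": "demo"},
    "spec": {
        # The Service targets every pod carrying these labels.
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 80}],
    },
}


def selects(svc, candidate_pod):
    """Return True if every selector label of the service is set on the pod."""
    labels = candidate_pod["metadata"].get("labels", {})
    return all(labels.get(k) == v for k, v in svc["spec"]["selector"].items())


if __name__ == "__main__":
    print(json.dumps(pod, indent=2))
    print(selects(service, pod))
```

Saving the JSON output to a file and passing it to kubectl apply -f would be one way to try it against a test cluster.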

K8s controllers

  • ReplicaSet (RS). Ensures that the desired number of pod replicas is running.
  • Deployment. Offers declarative updates for pods and ReplicaSets.
  • StatefulSet. A workload API object that manages stateful applications, such as databases.
  • DaemonSet. Ensures that all or some worker nodes run a copy of a pod.
  • Job. Creates one or more pods and runs a task (or tasks) to completion. Finished pods are kept, for example so their logs can be inspected, until the Job is deleted or cleaned up.
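All of these controllers follow the same idea: compare the desired state with the actual state and act to close the gap. The toy loop below sketches that idea for a ReplicaSet. It is a simplified illustration, not the real controller code, and the reconcile function and the pod naming scheme are assumptions.

```python
def reconcile(desired_replicas, running_pods, next_id):
    """One reconciliation pass: return (pods, next_id) matching the desired count."""
    pods = list(running_pods)
    # Too few pods: create replacements until the count matches.
    while len(pods) < desired_replicas:
        pods.append(f"pod-{next_id}")
        next_id += 1
    # Too many pods: remove the surplus.
    while len(pods) > desired_replicas:
        pods.pop()
    return pods, next_id


if __name__ == "__main__":
    pods, next_id = reconcile(3, [], 0)        # initial rollout
    pods.remove("pod-1")                       # simulate a crashed pod
    pods, next_id = reconcile(3, pods, next_id)
    print(pods)
```

A real controller runs this comparison continuously, which is what makes self-healing possible.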

A Docker container image is an executable package containing everything you need to run your application: application code, libraries, a runtime, environment variables, and configuration files. At runtime, a container image becomes a container, which runs everything that is packaged into that image.

Key k8s features that help containerized applications scale efficiently:

  • Horizontal scaling. Scale your application as needed from the command line or UI.
  • Automated rollouts and rollbacks.
  • Service discovery and load balancing.
  • Storage orchestration.
  • Secret and configuration management.
  • Self-healing.
  • Batch execution.
  • Automatic binpacking.
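Horizontal scaling can also be automated. The sketch below applies the scaling rule described in the Horizontal Pod Autoscaler documentation, desired = ceil(current replicas * current metric / target metric), and clamps the result to a range. The function name and the default bounds are assumptions, and the tolerance band and stabilization logic of the real autoscaler are left out.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to how far the metric is from target."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Keep the result within the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 replicas at 90% average CPU with a 60% target
    print(desired_replicas(4, 90, 60))
```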

Kafka 101

When I interview candidates, they often mention Kafka, so I ask them about its architecture. More often than not, they cannot answer properly or completely, so here I summarize some key Kafka concepts.

A Kafka cluster typically consists of multiple brokers. The brokers use ZooKeeper to maintain the cluster state. ZooKeeper is also used to elect the controller broker, which in turn manages partition leader election.

Producers push messages to brokers. Consumers pull messages from brokers and use the partition offset to keep track of how far they have read. The broker does not maintain per-message delivery state for each consumer; a consumer's position is just an offset, which it can commit back to Kafka.

Kafka's core APIs include the Producer API, Consumer API, Streams API, and Connect API. There is also an Admin API for managing topics, brokers, and other Kafka objects.

A Kafka topic is a logical channel to which producers publish messages and from which consumers receive them. A topic is identified by its name, which must be unique within the cluster. Kafka imposes no hard limit on the number of topics.

Topics are split into partitions, which are replicated across brokers. There is no hard limit on the number of partitions either. Within a partition, messages are stored in sequence, and each message is assigned an incremental id called the offset.
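The relationship between a partition, its offsets, and a consumer's position can be sketched with a toy in-memory model. This is not the Kafka client API; the Partition and Consumer classes below are hypothetical and only illustrate the append-only log idea.

```python
class Partition:
    """An append-only log; the offset is simply the index of a message."""

    def __init__(self):
        self._log = []

    def append(self, message):
        offset = len(self._log)  # next incremental id
        self._log.append(message)
        return offset

    def read(self, offset, max_messages=10):
        return self._log[offset:offset + max_messages]


class Consumer:
    """Tracks its own position in the partition."""

    def __init__(self, partition):
        self.partition = partition
        self.position = 0  # offset of the next message to read

    def poll(self, max_messages=10):
        batch = self.partition.read(self.position, max_messages)
        self.position += len(batch)
        return batch


if __name__ == "__main__":
    p = Partition()
    for m in ["a", "b", "c"]:
        p.append(m)
    c = Consumer(p)
    print(c.poll(2), c.position)
```

Because reading does not remove anything from the log, a second consumer starting at offset 0 would see the same messages again.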

Replication takes place at the partition level. For a given partition, only one broker is the leader at any time; the other brokers holding a copy are followers, and those that are caught up with the leader form the in-sync replica (ISR) set.
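When a leader fails, a new leader is normally chosen from the in-sync replicas. The function below is a simplified sketch of that rule, assuming the replica list is ordered by preference; elect_leader is a hypothetical helper, not Kafka's actual election code.

```python
def elect_leader(replicas, isr, failed):
    """Return the first surviving in-sync replica, or None if there is none.

    replicas: broker ids in preference order
    isr:      set of broker ids currently in sync with the leader
    failed:   set of broker ids that are down
    """
    for broker in replicas:
        if broker in isr and broker not in failed:
            return broker
    return None  # no in-sync replica is available


if __name__ == "__main__":
    # Broker 1 was the leader and has just failed; broker 3 is lagging.
    print(elect_leader([1, 2, 3], {1, 2}, {1}))
```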

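If we add a key to a message, all messages with the same key end up in the same partition (as long as the number of partitions does not change). Since Kafka guarantees ordering only within a partition, this gives a per-key ordering guarantee. Without a key, the producer spreads messages across partitions, so no ordering across them should be assumed.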
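Key-based partitioning boils down to hashing the key and taking the result modulo the number of partitions. The sketch below uses zlib.crc32 as a stand-in hash purely for illustration (Kafka's default partitioner uses a different hash function), and partition_for is a hypothetical helper.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; the same key always maps to the same partition."""
    # crc32 is a stand-in hash for this sketch, not what Kafka uses.
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    for key in [b"order-1", b"order-2", b"order-1"]:
        print(key, partition_for(key, 6))
```

The modulo also shows why adding partitions later can move a key to a different partition.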

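A consumer group can have multiple consumer processes/instances running. Each partition of a subscribed topic is assigned to exactly one consumer in the group, so the consumers share the work.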
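How partitions might be shared within a group can be sketched with a simple round-robin assignment. Kafka ships several assignment strategies; this toy version only illustrates the idea, and the assign function and consumer names are assumptions.

```python
def assign(partitions, consumers):
    """Distribute partitions over the consumers of one group, round-robin."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    print(assign([0, 1, 2, 3], ["c1", "c2"]))
    # More consumers than partitions: the extra consumer stays idle.
    print(assign([0, 1], ["c1", "c2", "c3"]))
```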

Reading Notes - What It Takes

In this book, Blackstone CEO Stephen Schwarzman shares lessons on the pursuit of excellence.