KUBERNETES

Dynamically managing PVs on local disks in Kubernetes

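Preface: using local disks as PVs

Starting with version 1.10, where the feature reached beta (it became GA in 1.14), Kubernetes supports local volumes. Workloads, not only StatefulSets, can use them to take advantage of local disks and get better performance than remote volumes such as NAS, NFS, CephFS, or RBD.

Before local volumes existed, a StatefulSet could still use a local disk by configuring hostPath and pinning the Pod to a specific node with nodeSelector or nodeAffinity. The drawback of hostPath is that the administrator has to manage the directories on every node by hand, which is inconvenient.

Neither hostPath nor local volumes support dynamic expansion, and adopting them requires fairly large changes to the application. Our project needed PVs/PVCs to be created and expanded dynamically.

This article draws on two open-source projects:

https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner
https://github.com/rancher/local-path-provisioner

Test results:

The kubernetes-sigs provisioner does not support dynamic expansion or dynamic provisioning, and the mount points must be created and mounted on each node by hand in advance.
Rancher's local-path-provisioner creates both the mount points and the PVs dynamically.

Both approaches are covered below, including installation and usage. The recommendation is to use local-path-provisioner, described in Chapter 2, to create PVs dynamically.

Chapter 1: Using sig-storage-local-static-provisioner

1.1 Clone the official source and install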

git clone https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner.git
cd sig-storage-local-static-provisioner/
git checkout tags/v2.6.0 -b v2.6.0
helm template ./helm/provisioner -f ./helm/provisioner/values.yaml > local-volume-provisioner.generated.yaml
kubectl create -f local-volume-provisioner.generated.yaml

1.2 Create the StorageClass

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast-disks
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer

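1.3 Mount the disks

The provisioner itself does not provide local volumes. Instead, the provisioner running on each node dynamically "discovers" mount points under a discovery directory. When the provisioner on a node finds a mount point under /mnt/fast-disks, it creates a PV whose local.path is that mount point and whose nodeAffinity is set to that node. The following script creates and mounts the disks with mount --bind: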

#!/bin/bash
for i in $(seq 1 5); do
  mkdir -p /mnt/fast-disks-bind/vol${i}
  mkdir -p /mnt/fast-disks/vol${i}
  mount --bind /mnt/fast-disks-bind/vol${i} /mnt/fast-disks/vol${i}
done
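
Bind mounts created with mount --bind do not survive a reboot, so the discovery directory would be empty again after the node restarts. One common way to make them persistent is to add matching entries to /etc/fstab. The sketch below is not part of the original setup; fstab_bind_entries is a hypothetical helper that only prints the entries for review, using the same directory names as the script above.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/fstab bind-mount entries for vol1..volN.
# It only writes to stdout; nothing on the system is modified.
fstab_bind_entries() {
  i=1
  while [ "$i" -le "$1" ]; do
    echo "/mnt/fast-disks-bind/vol$i /mnt/fast-disks/vol$i none bind 0 0"
    i=$((i + 1))
  done
}

# Preview the entries for two volumes.
fstab_bind_entries 2
# prints:
#   /mnt/fast-disks-bind/vol1 /mnt/fast-disks/vol1 none bind 0 0
#   /mnt/fast-disks-bind/vol2 /mnt/fast-disks/vol2 none bind 0 0
```

After checking the output, the entries can be appended to /etc/fstab on each node, for example with fstab_bind_entries 5 >> /etc/fstab run as root.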

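Run the mount script from section 1.3 on every node to create the mount points. After it finishes, wait a moment and list the PVs (for example with kubectl get pv); they will have been created automatically.

1.4 Test that Pods can run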

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: local-test
spec:
  serviceName: "local-service"
  replicas: 3
  selector:
    matchLabels:
      app: local-test
  template:
    metadata:
      labels:
        app: local-test
    spec:
      containers:
      - name: test-container
        image: busybox
        command:
        - "/bin/sh"
        args:
        - "-c"
        - "sleep 100000"
        volumeMounts:
        - name: local-vol
          mountPath: /tmp
  volumeClaimTemplates:
  - metadata:
      name: local-vol
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "fast-disks"
      resources:
        requests:
          storage: 2Gi

All three Pods come up and run normally.

Chapter 2: Using local-path-provisioner

2.1 Download the YAML file

wget https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.25/deploy/local-path-storage.yaml

2.2 Modify the file

A few places in the file need to be changed.

2.2.1 Disable debug mode

Remove --debug.

2.2.2 Change reclaimPolicy

Read more…
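
The --debug removal from 2.2.1 can be scripted instead of edited by hand. This is a minimal sketch, assuming the flag appears in the downloaded manifest as a YAML list item on its own line (- --debug); strip_debug_flag is a hypothetical helper name, and the sample input below is illustrative rather than copied from the real file.

```shell
#!/bin/sh
# Hypothetical helper: drop any YAML list item that consists only of --debug.
# Reads the manifest on stdin and writes the result to stdout.
strip_debug_flag() {
  sed '/^[[:space:]]*-[[:space:]]*--debug[[:space:]]*$/d'
}

# Example on an illustrative argument list:
printf '%s\n' '- local-path-provisioner' '- --debug' '- start' | strip_debug_flag
# prints:
#   - local-path-provisioner
#   - start
```

Applied to the real file, this would look like strip_debug_flag < local-path-storage.yaml > local-path-storage.nodebug.yaml, where the output file name is only an example.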

By sean, 5 months ago (11/23/2023)

KUBERNETES

kubectl tips and tricks

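kubectl is the most important command-line tool for Kubernetes. At Flant, we share kubectl tricks with each other on our wiki and in Slack (we also have a search engine, but that is another story). Over the years we have accumulated many kubectl techniques and now want to share some of them with the community. I expect many readers already know these commands well; still, I hope this article offers something useful and helps you be more productive. Some of the items below come from our engineers and others from the Internet. We tested the latter and confirmed that they work. Let's get started.

Getting Pods and nodes

I assume you know how to list the Pods in all namespaces of a Kubernetes cluster: use --all-namespaces. What many people do not know is that this flag now has a short form, -A.

How do you find Pods that are not in the Running state?

kubectl get pods -A --field-selector=status.phase!=Running | grep -v Complete

By the way, --field-selector is a parameter worth exploring in more depth.

How do you get the list of nodes together with their memory capacity?

kubectl get no -o Read more…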

By sean, 8 months ago

KUBERNETES

Building Docker images that support multiple system architectures

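Recently, in Xinchuang (信创) projects, we often need to build images that support both amd64 and arm64. In some cases the nodes of a single Kubernetes cluster have mixed architectures: some nodes have x86 CPUs and others have ARM CPUs. The simplest way to run our images in such an environment is to label the nodes by type and build a separate image per architecture, for example demo:v1-amd64 and demo:v1-arm64. Two sets of YAML are then needed: one uses the demo:v1-amd64 image and selects x86 nodes with nodeSelector, the other uses the demo:v1-arm64 image and selects ARM nodes with nodeSelector. This is clearly tedious and hard to manage, and the maintenance cost multiplies if the cluster gains nodes of yet another architecture.

As you may know, every Docker image is described by a manifest, which holds the image's basic information: its mediaType, size, digest, and the details of each layer. Use docker manifest inspect to view the manifest of an image: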

$ docker manifest inspect aneasystone/hello-actuator:v1
{
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
                "mediaType": "application/vnd.docker.container.image.v1+json",
                "size": 3061,
                "digest": "sha256:d6d5f18d524ce43346098c5d5775de4572773146ce9c0c65485d60b8755c0014"
        },
        "layers": [
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 2811478,
                        "digest": "sha256:5843afab387455b37944e709ee8c78d7520df80f8d01cf7f861aae63beeddb6b"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 928436,
                        "digest": "sha256:53c9466125e464fed5626bde7b7a0f91aab09905f0a07e9ad4e930ae72e0fc63"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 186798299,
                        "digest": "sha256:d8d715783b80cab158f5bf9726bcada5265c1624b64ca2bb46f42f94998d4662"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 19609795,
                        "digest": "sha256:112ce4ba7a4e8c2b5bcf3f898ae40a61b416101eba468397bb426186ee435281"
                }
        ]
}

Add --verbose to see more detail, including the image tag and the architecture that the manifest refers to:

$ docker manifest inspect --verbose aneasystone/hello-actuator:v1
{
        "Ref": "docker.io/aneasystone/hello-actuator:v1",
        "Descriptor": {
                "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
                "digest": "sha256:f16a1fcd331a6d196574a0c0721688360bf53906ce0569bda529ba09335316a2",
                "size": 1163,
                "platform": {
                        "architecture": "amd64",
                        "os": "linux"
                }
        },
        "SchemaV2Manifest": {
                ...
        }
}

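We normally do not use a manifest directly; it is associated with a tag so that people can refer to it easily. The output above shows that this manifest is associated with the image tag docker.io/aneasystone/hello-actuator:v1, that the supported platform is linux/amd64, and that the image has four layers. Note also the mediaType field: its value, application/vnd.docker.distribution.manifest.v2+json, indicates the Docker image format (application/vnd.oci.image.manifest.v1+json would indicate an OCI image).

This image tag is associated with a single manifest, and a manifest corresponds to a single architecture. If one image tag could be associated with several manifests, each for a different architecture, then when we start a container from that tag the container engine could pick the manifest that matches the current system's architecture and download the corresponding image. In fact, this is what a multi-architecture image is ( multi-arch Read more…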
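
Matching a manifest to the host depends on knowing the host's architecture under the names used in the platform field. As a small illustration that is not taken from the original article, the hypothetical helper docker_arch below maps the machine names reported by uname -m to those names, covering only the two architectures discussed here.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` machine name to the architecture
# name used in a manifest's "platform" field (amd64 and arm64 only).
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "unsupported machine type: $1" >&2; return 1 ;;
  esac
}

# Example: build the platform string for an ARM host.
echo "linux/$(docker_arch aarch64)"
# prints linux/arm64
```

On a real node, docker_arch "$(uname -m)" gives the value to compare against the architecture shown by docker manifest inspect --verbose.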

By sean, 11 months ago (06/09/2023)

KUBERNETES

Docker image pull fails with: unknown method AddResource for service containerd.services.leases.v1.Leases: not implemented

Summary: at the very end of an image pull, Docker reports unknown method AddResource: not implemented

+ docker buildx build --output=type=local,name=openeuler,dest=./resources --platform=linux/amd64,linux/arm64 -f build/Dockerfile.os.openeuler .
#1 [internal] booting buildkit
#1 pulling image moby/buildkit:buildx-stable-1
#1 pulling image moby/buildkit:buildx-stable-1 11.0s done
#1 creating container buildx_buildkit_multi-platform0 0.0s done
#1 ERROR: Error response from daemon: No such image: moby/buildkit:buildx-stable-1

Cause: after Docker has been installed and started, the containerd service also needs to be restarted:

sudo systemctl restart containerd.service
sudo systemctl restart docker

Reference: https://blog.csdn.net/sinat_14840559/article/details/114399166

By sean, 11 months ago (06/07/2023)

KUBERNETES

Kubernetes networking

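Overview

This article explores the network models in Kubernetes and analyzes each of them.

Underlay Network Model

What is an underlay network

As the name suggests, the underlay network is the network device infrastructure: switches, routers, and DWDM equipment linked by network media into a physical topology that carries packets between networks.

Figure: Underlay network topology

An underlay network can be layer 2 or layer 3. The typical layer 2 example is Ethernet; the typical layer 3 example is the Internet. VLAN is a technology that works at layer 2, while layer 3 is built from protocols such as OSPF and BGP.

Underlay networks in k8s

In Kubernetes, a typical underlay setup treats each host as a router, and the Pod network learns route entries to achieve cross-node communication.

Figure: underlay network topology in kubernetes

Typical examples of this model are flannel's host-gw mode and calico Read more…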

By sean, 1 year ago (03/08/2023)

KUBERNETES

Installing docker-ce on Xinchuang (信创) systems

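When installing docker-ce on UnionTech OS (统信), note that the official repositories usually provide only docker-engine. On systems managed with yum, docker-ce has to be installed from CentOS repositories.

Browsing https://mirrors.aliyun.com/centos/ shows that, in the Aliyun mirror, only CentOS 8 provides an aarch64 repository, so the version 8 Aliyun repo file must be used: https://mirrors.aliyun.com/repo/

1. First, configure the Aliyun mirror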

curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-8.repo
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

Centos-8.repo can also be replaced with the Centos-vault-8.5.2111 repo:

curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-vault-8.5.2111.repo

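Define the yum variable and modify the repo files

Replace $releasever in the CentOS and Docker repo files with the CentOS version (8 here). The reason is that on the UnionTech server operating system $releasever has been changed to 10, while we need the CentOS 8 mirrors. Without this replacement, almost every repository URL returns 404.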

sed -i 's/$releasever/8/g' /etc/yum.repos.d/docker-ce.repo
sed -i 's/$releasever/8/g' /etc/yum.repos.d/CentOS-Base.repo
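
To see what the substitution does before touching the files under /etc/yum.repos.d, the same replacement can be run on a single line of text. The sketch below uses a hypothetical helper, pin_releasever, and an illustrative baseurl line that is not copied from the real repo file.

```shell
#!/bin/sh
# Hypothetical helper: replace the literal string $releasever with a fixed
# CentOS major version. Reads repo text on stdin, writes to stdout.
pin_releasever() {
  # \$ matches a literal dollar sign; the single quotes keep the shell
  # from expanding $releasever as a shell variable.
  sed 's/\$releasever/'"$1"'/g'
}

# Example on an illustrative baseurl line:
echo 'baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable' | pin_releasever 8
# prints baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/8/$basearch/stable
```

Only $releasever is rewritten; other yum variables such as $basearch are left for yum to resolve.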

yum makecache

Install Docker:

yum install docker-ce-20.10.17 docker-ce-cli-20.10.17 containerd.io-1.6.7

By sean, 1 year ago (03/02/2023)

KUBERNETES

Building Kubernetes components with GitHub Actions

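When working with Kubernetes, some requirements mean modifying the official k8s source code and recompiling it. Using the example of changing kubeadm so that the certificates it generates are valid for 10 years by default, this article explains how to compile and publish the resulting binaries with GitHub Actions.

Build

Clone the repo

Fork the official Kubernetes source into your own repo: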

$ git clone https://github.com/rainingwalk/kubernetes.git 
$ cd kubernetes 
$ git remote add upstream https://github.com/kubernetes/kubernetes.git 
$ git fetch --all 
$ git checkout upstream/release-1.21 
$ git checkout -B kubeadm-1.21

Workflow

.github/workflows/kubeadm.yaml

---
name: Build kubeadm binary

on:
  push:
    tags:
      - 'v*'
jobs:
  build:
    runs-on: ubuntu-20.04
    # Give this workflow write permission for releases; otherwise it fails with "GitHub release failed with status: 403 undefined"
    permissions:
      contents: write
    # Run the job only when it is triggered by a tag
    if: startsWith(github.ref, 'refs/tags/')
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      # Run the build/run.sh build script to compile the binaries for each target platform
      - name: Build kubeadm binary
        shell: bash
        run: |
          bash -x build/run.sh make kubeadm KUBE_BUILD_PLATFORMS=linux/amd64
          bash -x build/run.sh make kubeadm KUBE_BUILD_PLATFORMS=linux/arm64

      # The built binaries are placed under _output/dockerized/bin/
      # Name each binary after its target OS and CPU architecture
      - name: Prepare for upload
        shell: bash
        run: |
          mv _output/dockerized/bin/linux/amd64/kubeadm kubeadm-linux-amd64
          mv _output/dockerized/bin/linux/arm64/kubeadm kubeadm-linux-arm64
          sha256sum kubeadm-linux-{amd64,arm64} > sha256sum.txt
          
      # Use softprops/action-gh-release to upload the build artifacts to the GitHub release
      - name: Release and upload packages
        uses: softprops/action-gh-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          files: |
            sha256sum.txt
            kubeadm-linux-amd64
            kubeadm-linux-arm64
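
The workflow publishes sha256sum.txt next to the binaries, so anyone downloading the release can check the files against it. A minimal sketch, assuming sha256sum from GNU coreutils and that all release assets were saved into one directory; verify_release is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: verify every file listed in sha256sum.txt inside
# the given directory. Returns non-zero if any checksum does not match.
verify_release() {
  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$1" && sha256sum -c sha256sum.txt )
}

# Usage, after downloading the release assets into ./release:
#   verify_release ./release
```

sha256sum -c prints one line per file and exits with a non-zero status when a file is missing or altered, which makes the helper easy to use in scripts.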

build/run.sh : Run a command in a build docker container. Common invocations: build/run.sh make: Build just linux binaries in the container. Pass options and Read more…

By sean, 1 year ago (02/06/2023)

KUBERNETES

Why Kubernetes environments require bridge-nf-call-iptables to be enabled

Background

Kubernetes environments very often require the bridge-nf-call-iptables kernel parameter to be enabled on the nodes:

sysctl -w net.bridge.bridge-nf-call-iptables=1
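
sysctl -w only changes the running kernel, so the value is lost on reboot. One common way to persist it is a drop-in file under /etc/sysctl.d, plus loading the br_netfilter module at boot through /etc/modules-load.d on systemd-based systems. The sketch below is an addition, not part of the original article; the helper name write_bridge_nf_conf and the file name k8s.conf are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: write the persistent configuration below a root
# directory. Pass / on a real node, or a scratch directory for a dry run.
write_bridge_nf_conf() {
  root=${1%/}
  mkdir -p "$root/etc/modules-load.d" "$root/etc/sysctl.d" || return 1
  # The net.bridge.* keys only exist once the br_netfilter module is loaded.
  echo 'br_netfilter' > "$root/etc/modules-load.d/k8s.conf"
  echo 'net.bridge.bridge-nf-call-iptables = 1' > "$root/etc/sysctl.d/k8s.conf"
}

# On a node, as root:
#   write_bridge_nf_conf / && modprobe br_netfilter && sysctl --system
```

Running it against a scratch directory first makes it possible to inspect the two generated files before anything under /etc is changed.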

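See Network Plugin Requirements in the official documentation.

If the parameter is not enabled, or is switched off along the way by some operation, odd network problems can appear that are very hard to troubleshoot. Why does it need to be enabled? This article explains in detail.

Bridge-based container networks

There are many implementations of Kubernetes cluster networking, and a large share of them use a Linux bridge:

Each Pod's network interface is a veth device, and the other end of the veth pair is attached to a bridge on the host. Because the bridge is a virtual layer 2 device, Pods on the same node communicate by layer 2 forwarding; only cross-node traffic goes through the host's eth0.

Same-node Service communication problem

In both iptables and ipvs forwarding modes, access to a Service in Kubernetes goes through DNAT: a packet addressed to ClusterIP:Port is DNATed to one of the Service's Endpoint Read more…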

By sean, 1 year ago (01/12/2023)

KUBERNETES

High availability considerations for Kubernetes

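High availability considerations

This document contains community-provided considerations for setting up a highly available Kubernetes cluster. If anything is incomplete or unclear, or more information is needed, feel free to leave a comment.

Overview

High availability (the ability of a cluster to keep running when some control plane or worker nodes fail) is a must for production clusters. For worker nodes, it is assumed that there are enough of them. When planning and setting up the cluster, redundancy of the control plane nodes and etcd instances must also be taken into account.

kubeadm supports setting up clusters with multiple control plane nodes and multiple etcd members. Some aspects still need to be considered and set up that are not part of Kubernetes itself and are therefore not covered in the project documentation. This document provides additional information and examples that are useful when planning and bootstrapping an HA cluster with kubeadm.

Options for software load balancing

When creating a cluster with multiple control plane nodes, the API Server instances can be placed behind a load balancer, and the --control-plane-endpoint option of kubeadm init makes the new cluster use it for higher availability.

The load balancer itself should of course be highly available as well. This is usually achieved by adding redundancy to the load balancer: set up a cluster of hosts that manage a virtual IP, each running an instance of the load balancer, so that the load balancer on the host currently holding the vIP is always used while the other hosts are on standby.

In some environments, such as data centers with dedicated load balancing components (for example, provided by some cloud providers), this functionality may already be available. If not, user-managed load balancing can be used. In that case some preparation is needed before bootstrapping the cluster. Since this is not part of Kubernetes or kubeadm, it must be handled separately. The following sections give some examples; dozens of other configurations are of course possible.

keepalived and haproxy

For providing load balancing from a virtual IP, the combination of keepalived and haproxy has been around for a long time and can be considered well known and well tested.

keepalived provides a virtual IP managed by a configurable health check. Because of the way the virtual IP is implemented, all hosts between which the virtual IP is negotiated must be in the same IP subnet.
The haproxy service can be configured for simple stream-based load balancing, which leaves TLS termination to the API Server instances behind it.

This combination can run either as services on the operating system or as static Pods on the control plane hosts. The service configuration is the same in both cases.

keepalived configuration

The keepalived configuration consists of two files: the service configuration file and a health check script that is called periodically to verify that the node holding the virtual IP is still operational.

These files are located in the /etc/keepalived directory. Note, however, that some Linux distributions may put them elsewhere. The following configuration has been used successfully with keepalived version 1.3.5.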

! /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script check_apiserver {
  script "/etc/keepalived/check_apiserver.sh"
  interval 3
  weight -2
  fall 10
  rise 2
}

vrrp_instance VI_1 {
    state ${STATE}
    interface ${INTERFACE}
    virtual_router_id ${ROUTER_ID}
    priority ${PRIORITY}
    authentication {
        auth_type PASS
        auth_pass ${AUTH_PASS}
    }
    virtual_ipaddress {
        ${APISERVER_VIP}
    }
    track_script {
        check_apiserver
    }
}

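Some placeholders in bash variable style need to be filled in:

${STATE} is MASTER for one host and BACKUP for all other hosts, so the virtual IP is initially assigned to the MASTER.
${INTERFACE} is the network interface taking part in the negotiation of the virtual IP, for example eth0.
${ROUTER_ID} should be the same for all keepalived cluster hosts and unique among all clusters in the same subnet. Many distributions preconfigure its value as 51.
${PRIORITY} should be higher on the master than on the backups, so 101 and 100 respectively are enough.
${AUTH_PASS} should be the same for all keepalived cluster hosts, for example 42.
${APISERVER_VIP} is the virtual IP address negotiated between the keepalived cluster hosts.

The keepalived configuration above uses a health check script, /etc/keepalived/check_apiserver.sh, which makes sure that the API Server is available on the node holding the virtual IP. The script could look like this: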

#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl --silent --max-time 2 --insecure https://localhost:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://localhost:${APISERVER_DEST_PORT}/"
if ip addr | grep -q ${APISERVER_VIP}; then
    curl --silent --max-time 2 --insecure https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/ -o /dev/null || errorExit "Error GET https://${APISERVER_VIP}:${APISERVER_DEST_PORT}/"
fi

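Some placeholders in bash variable style need to be filled in:

${APISERVER_VIP} is the virtual IP address negotiated between the keepalived cluster hosts.
${APISERVER_DEST_PORT} is the port through which Kubernetes talks to the API Server.

haproxy configuration

The haproxy configuration consists of one file: the service configuration file, located in the /etc/haproxy directory. Note, however, that some Linux distributions may put it elsewhere. The following configuration has been used successfully with haproxy version 2.1.4: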

# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    log /dev/log local0
    log /dev/log local1 notice
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          20s
    timeout server          20s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxys to the masters
#---------------------------------------------------------------------
frontend apiserver
    bind *:${APISERVER_DEST_PORT}
    mode tcp
    option tcplog
    default_backend apiserver

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance     roundrobin
        server ${HOST1_ID} ${HOST1_ADDRESS}:${APISERVER_SRC_PORT} check
        # [...]

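Again, some placeholders in bash variable style need to be replaced:

${APISERVER_DEST_PORT} is the port through which Kubernetes talks to the API Server.
${APISERVER_SRC_PORT} is the port used by the API Server instances.
${HOST1_ID} is a symbolic name for the first load-balanced API Server host.
${HOST1_ADDRESS} is a resolvable address (DNS name or IP address) for the first load-balanced API Server host.

Read more…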
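
The keepalived and haproxy files both use ${NAME} placeholders, so they can be filled in mechanically rather than by hand. The sketch below is one possible way to do that with sed; render_placeholders is a hypothetical helper, the sample values are invented, and it assumes the values contain no '|', '&', or backslash characters because they are pasted into a sed script.

```shell
#!/bin/sh
# Hypothetical helper: fill ${NAME} placeholders from NAME=value arguments.
# Reads a template on stdin and writes the rendered text to stdout.
# Assumption: values contain no '|', '&' or backslash characters.
render_placeholders() {
  script=''
  for pair in "$@"; do
    name=${pair%%=*}
    value=${pair#*=}
    # Builds one sed command per placeholder, e.g. s|\${STATE}|MASTER|g;
    script="${script}s|\\\${${name}}|${value}|g;"
  done
  sed -e "$script"
}

# Example on two lines in the style of keepalived.conf:
printf 'state ${STATE}\npriority ${PRIORITY}\n' | render_placeholders STATE=MASTER PRIORITY=101
# prints:
#   state MASTER
#   priority 101
```

A full file could be rendered the same way, for example render_placeholders STATE=BACKUP PRIORITY=100 < keepalived.conf.tmpl > keepalived.conf, where the template file name is only an example. Placeholders without a matching argument are left untouched, which makes missing values easy to spot.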

By sean, 2 years ago (11/06/2021)

KUBERNETES

Using kube-vip for apiserver HA with kubespray

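When a k8s cluster is built with kubespray and there is no external LB for high availability, kubespray by default sets up nginx or haproxy on the worker nodes to provide HA for the apiserver. On the master nodes, the local apiserver is reached at 127.0.0.1:6443, and programs that call the apiserver get no HA either. The traditional HA approach is to deploy keepalived and haproxy; this time HA is done with kube-vip.

kube-vip's official architecture documentation for HA:

https://github.com/kube-vip/kube-vip/blob/main/kubernetes-control-plane.md
https://kube-vip.io/architecture/

Following the official method at https://kube-vip.io/install_static/ failed every time. The configuration generated by docker run --network host --rm ghcr.io/kube-vip/kube-vip:v0.3.8 manifest pod uses /etc/kubernetes/admin.conf for the backend by default, but kubespray also uses admin.conf for authentication when it initializes the cluster, which ends in an endless loop. Configuring kube-vip directly through a configuration file worked in the end.

References:

https://github.com/kube-vip/kube-vip/blob/main/kubernetes-control-plane.md
https://www.codeleading.com/article/58065570523/

The ansible setup is as follows.

Note: the file templates/kube-vip.yaml.j2 was generated with:

docker run --network host --rm 172.20.48.169:81/yks/kube-vip:v0.3.9 sample config | sudo tee /etc/kubernetes/manifests/kube-vip.yaml

kubespray's default listening port for the k8s apiserver is 6443 (variable: kube_apiserver_port). The domain name bound to the kube-vip VIP here is k8s.apiserver.io, Read more…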

By sean, 2 years ago (11/06/2021)