k8s从v1.8开始使用metrics server来监控集群的资源使用情况,无论是Prometheus监控还是dashboard使用都是通过metrics server的api来进行。
在新版本集群部署中metrics server会出现一些意外情况,导致部署不成功,这里手动修改部署。
metrics server-yaml下载地址
https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
谷歌官方镜像v0.5.0
metrics-server:v0.5.0
镜像地址:https://download.csdn.net/download/w2909526/19358778
也可以自行下载
docker镜像需要翻墙下载之后load到本地,或者在dockerhub上搜索下载之后改名称使用
yaml文件有两处改动的地方,否则部署不成功
几处参数说明
–metric-resolution=15s:从 kubelet 采集数据的周期;
–kubelet-preferred-address-types:优先使用 InternalIP 来访问 kubelet,这样可以避免节点名称没有 DNS 解析记录时,通过节点名称调用节点 kubelet API 失败的情况(未配置时默认的情况);
–kubelet-insecure-tls:kubelet 的10250端口使用的是https协议,连接需要验证tls证书。–kubelet-insecure-tls不验证客户端证书
修改之后的文件
apiVersion: v1kind: ServiceAccountmetadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-system---apiVersion: rbac.authorization.k8s.io/v1kind: ClusterRolemetadata: labels: k8s-app: metrics-server rbac.authorization.k8s.io/aggregate-to-admin: "true" rbac.authorization.k8s.io/aggregate-to-edit: "true" rbac.authorization.k8s.io/aggregate-to-view: "true" name: system:aggregated-metrics-readerrules:- apiGroups: - metrics.k8s.io resources: - pods - nodes verbs: - get - list - watch---apiVersion: rbac.authorization.k8s.io/v1kind: ClusterRolemetadata: labels: k8s-app: metrics-server name: system:metrics-serverrules:- apiGroups: - "" resources: - pods - nodes - nodes/stats - namespaces - configmaps verbs: - get - list - watch---apiVersion: rbac.authorization.k8s.io/v1kind: RoleBindingmetadata: labels: k8s-app: metrics-server name: metrics-server-auth-reader namespace: kube-systemroleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: extension-apiserver-authentication-readersubjects:- kind: ServiceAccount name: metrics-server namespace: kube-system---apiVersion: rbac.authorization.k8s.io/v1kind: ClusterRoleBindingmetadata: labels: k8s-app: metrics-server name: metrics-server:system:auth-delegatorroleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:auth-delegatorsubjects:- kind: ServiceAccount name: metrics-server namespace: kube-system---apiVersion: rbac.authorization.k8s.io/v1kind: ClusterRoleBindingmetadata: labels: k8s-app: metrics-server name: system:metrics-serverroleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:metrics-serversubjects:- kind: ServiceAccount name: metrics-server namespace: kube-system---apiVersion: v1kind: Servicemetadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-systemspec: ports: - name: https port: 443 protocol: TCP targetPort: https selector: k8s-app: metrics-server---apiVersion: apps/v1kind: Deploymentmetadata: labels: k8s-app: metrics-server name: metrics-server namespace: kube-systemspec: selector: matchLabels: k8s-app: metrics-server strategy: rollingUpdate: maxUnavailable: 0 template: metadata: labels: k8s-app: metrics-server spec: containers: - args: - --cert-dir=/tmp - --secure-port=443 - --kubelet-insecure-tls #这里增加 - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname # - --kubelet-use-node-status-port 删除这里 - --metric-resolution=15s #采集数据的周期 image: k8s.gcr.io/metrics-server/metrics-server:v0.5.0 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /livez port: https scheme: HTTPS periodSeconds: 10 name: metrics-server ports: - containerPort: 443 name: https protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readyz port: https scheme: HTTPS initialDelaySeconds: 20 periodSeconds: 10 resources: requests: cpu: 100m memory: 200Mi securityContext: readOnlyRootFilesystem: true runAsNonRoot: true runAsUser: 1000 volumeMounts: - mountPath: /tmp name: tmp-dir nodeSelector: kubernetes.io/os: linux priorityClassName: system-cluster-critical serviceAccountName: metrics-server volumes: - emptyDir: {} name: tmp-dir---apiVersion: apiregistration.k8s.io/v1kind: APIServicemetadata: labels: k8s-app: metrics-server name: v1beta1.metrics.k8s.iospec: group: metrics.k8s.io groupPriorityMinimum: 100 insecureSkipTLSVerify: true service: name: metrics-server namespace: kube-system version: v1beta1 versionPriority: 100
文件修改之后执行
镜像要导入到集群的每台服务器上,否则国内拉取不到镜像也会导致部署失败
kubectl apply -f components.yaml
版权声明:内容来源于互联网和用户投稿 如有侵权请联系删除