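# Monitoring k8s Cluster Resources with an External Prometheus

## Monitoring k8s resources with Prometheus <a href="#prometheus-e7-9b-91-e6-8e-a7k8s-e8-b5-84-e6-ba-90" id="prometheus-e7-9b-91-e6-8e-a7k8s-e8-b5-84-e6-ba-90"></a>

Use Prometheus to monitor the resources in a k8s cluster, such as microservices and container resource metrics, and display them in Grafana.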

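## Approach <a href="#e6-80-9d-e8-b7-af" id="e6-80-9d-e8-b7-af"></a>

* An external Prometheus can connect to the apiserver to monitor metrics inside the k8s cluster. (Prerequisite: the required exporters are already installed in the cluster.)
* Alternatively, deploy kube-prometheus (a complete monitoring stack running inside the cluster) and have the external Prometheus pull from it through federation.

This guide takes the first route: an external Prometheus scrapes cadvisor and kube-state-metrics to collect the k8s cluster metrics.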

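## Preparation <a href="#e5-87-86-e5-a4-87-e5-b7-a5-e4-bd-9c" id="e5-87-86-e5-a4-87-e5-b7-a5-e4-bd-9c"></a>

#### 1. Component overview <a href="#id-1-e3-80-81-e6-8f-92-e4-bb-b6-e4-bb-8b-e7-bb-8d" id="id-1-e3-80-81-e6-8f-92-e4-bb-b6-e4-bb-8b-e7-bb-8d"></a>

To monitor k8s resources comprehensively, the cluster needs the right exporters. This guide relies on cadvisor and kube-state-metrics.

1. cadvisor: built into the kubelet, so it does not need a separate installation. It collects CPU, memory, and other metrics for the containers in the cluster.
2. kube-state-metrics: polls the api-server and watches add, delete, and update events. The basic metrics from cadvisor alone do not cover enough dimensions:\
   k8s objects such as Deployments, Pods, DaemonSets, and CronJobs are not monitored. For example, how many replicas does a Deployment currently have? Is a Pod pending or running? cadvisor does not track these objects, so an additional exporter, kube-state-metrics, is needed to expose those metrics.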
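
As a quick check that the cadvisor metrics are already served by the kubelet, the sketch below reads them through the apiserver proxy. The function name `cadvisor_sample` and the node name `master1` in the usage line are illustrative assumptions; `kubectl get --raw` and the `/api/v1/nodes/<node>/proxy/metrics/cadvisor` path are standard.

```bash
# Print the first few container CPU samples that the kubelet's built-in
# cadvisor exposes for one node, fetched through the apiserver proxy.
cadvisor_sample() {
  local node="$1"
  kubectl get --raw "/api/v1/nodes/${node}/proxy/metrics/cadvisor" \
    | grep '^container_cpu_usage_seconds_total' \
    | head -n 5
}

# Usage (the node name is an assumption; list yours with `kubectl get nodes`):
#   cadvisor_sample master1
```

Metrics about Deployments or Pod phases will not appear in this output, which is the gap kube-state-metrics fills.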

#### 2. Installing and deploying kube-state-metrics <a href="#id-2-e3-80-81kube-state-metrics-e5-ae-89-e8-a3-85-e9-83-a8-e7-bd-b2" id="id-2-e3-80-81kube-state-metrics-e5-ae-89-e8-a3-85-e9-83-a8-e7-bd-b2"></a>

1. Download the kube-state-metrics package

> Note: my k8s version is v1.23.6. Check the compatibility notes on GitHub and choose the kube-state-metrics version that matches your own k8s version.

Version used here: `kube-state-metrics_v2.2.1`\
Download location:

```
https://github.com/starsliao/Prometheus/tree/master/kubernetes
```

Upload the package to the server.

2. Deploy kube-state-metrics

```bash
[root@master1 kube-state-metrics_v2.2.1]# pwd
/tmp/Prometheus/kubernetes/kube-state-metrics_v2.2.1

[root@master1 kube-state-metrics_v2.2.1]# tree
.
├── cluster-role-binding.yaml
├── cluster-role.yaml
├── deployment.yaml
├── service-account.yaml
└── service.yaml

0 directories, 5 files
```

Adjust the exposed ports in `service.yaml` as needed. After the change it looks like this:

```yaml
apiVersion: v1
kind: Service
metadata:
#  annotations:
#    prometheus.io/scrape: 'true'
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: v2.2.1
  name: kube-state-metrics
  namespace: ops-monit
spec:
  type: NodePort
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
    nodePort: 30866
  - name: telemetry
    port: 8081
    targetPort: telemetry
    nodePort: 30867
  selector:
    app.kubernetes.io/name: kube-state-metrics

```

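* Deploy to the k8s cluster

```bash
# Create the namespace referenced by the manifests
kubectl create namespace ops-monit
# Change into the directory that holds the five manifests listed above
cd /tmp/Prometheus/kubernetes/kube-state-metrics_v2.2.1
# Apply every manifest in the directory
kubectl apply -f .
```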
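
After applying the manifests, a check along these lines can confirm that the rollout finished and that the Service exposes the expected NodePorts. This is a sketch: the helper name `verify_ksm` is made up for illustration, and it assumes the Deployment in `deployment.yaml` is named `kube-state-metrics`, like the Service and ServiceAccount.

```bash
# Wait for the kube-state-metrics Deployment to become ready, then show
# the Service so the NodePort mapping (30866/30867) can be confirmed.
verify_ksm() {
  local ns="${1:-ops-monit}"
  kubectl -n "$ns" rollout status deployment/kube-state-metrics --timeout=120s \
    && kubectl -n "$ns" get svc kube-state-metrics
}

# Usage:
#   verify_ksm ops-monit
```

Once the Service is up, `curl http://<node-ip>:30866/metrics` from the Prometheus host should return lines starting with `kube_`.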

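* Expose the kubelet read-only port (used to scrape the cadvisor metrics)

> Append `--read-only-port=10255` to `KUBELET_KUBEADM_ARGS` in `/var/lib/kubelet/kubeadm-flags.env`, then restart the kubelet so the flag takes effect. The file then looks like this: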

```bash
(base) [root@master2 ansible]# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.6 --read-only-port=10255"   
```
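
The edit can also be scripted so that it is safe to repeat on every node. The helper below is a sketch, not part of the original procedure: `add_read_only_port` is a hypothetical name, and it assumes GNU `sed` and a single-line `KUBELET_KUBEADM_ARGS="..."` entry as shown above.

```bash
# Append --read-only-port=10255 to KUBELET_KUBEADM_ARGS in the given env
# file unless a --read-only-port flag is already present.
add_read_only_port() {
  local env_file="$1"
  if grep -q -- '--read-only-port=' "$env_file"; then
    return 0   # already configured, leave the file untouched
  fi
  # Insert the flag just before the closing quote of the args string.
  sed -i -E 's/^(KUBELET_KUBEADM_ARGS=".*)"[[:space:]]*$/\1 --read-only-port=10255"/' "$env_file"
}

# Usage on each node (restart so the kubelet picks up the new flag):
#   add_read_only_port /var/lib/kubelet/kubeadm-flags.env
#   systemctl restart kubelet
#   curl -s http://127.0.0.1:10255/metrics/cadvisor | head
```

The read-only port serves metrics without authentication, so it is worth restricting access to it at the network level to the Prometheus host.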

* Get the service account name

```bash
[root@master1 kube-state-metrics_v2.2.1]# more service-account.yaml 
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: v2.2.1
  name: kube-state-metrics          <----- the service account to look up
  namespace: ops-monit
```

* Get the Secret name

```bash
[root@master1 kube-state-metrics_v2.2.1]# kubectl get sa kube-state-metrics -n ops-monit -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"labels":{"app.kubernetes.io/name":"kube-state-metrics","app.kubernetes.io/version":"v2.2.1"},"name":"kube-state-metrics","namespace":"ops-monit"}}
  creationTimestamp: "2024-03-16T05:27:36Z"
  labels:
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: v2.2.1
  name: kube-state-metrics
  namespace: ops-monit
  resourceVersion: "34964"
  uid: 35302daa-1b1a-4310-8629-127f63da54eb
secrets:
- name: kube-state-metrics-token-n2lbl       <------ the next step reads the token from this Secret
```

* Get the token

```bash
[root@master1 kube-state-metrics_v2.2.1]# kubectl describe secret kube-state-metrics-token-n2lbl -n ops-monit
Name:         kube-state-metrics-token-n2lbl
Namespace:    ops-monit
Labels:       <none>
Annotations:  kubernetes.io/service-account.name: kube-state-metrics
              kubernetes.io/service-account.uid: 35302daa-1b1a-4310-8629-127f63da54eb

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1099 bytes
namespace:  9 bytes
token:      eyJhbGciOiJSUzI1NiIsImtpZCI6IklOLTlyYnM0TjZvOThaSUhaRTNMWmlfb01QbWptNTB2QTVTdXVqOU5DWlUifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJvcHMtbW9uaXQiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlY3JldC5uYW1lIjoia3ViZS1zdGF0ZS1tZXRyaWNzLXRva2VuLW4ybGJsIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6Imt1YmUtc3RhdGUtbWV0cmljcyIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6IjM1MzAyZGFhLTFiMWEtNDMxMC04NjI5LTEyN2Y2M2RhNTRlYiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpvcHMtbW9uaXQ6a3ViZS1zdGF0ZS1tZXRyaWNzIn0.ce0mj6th5fLBVj_BXSbiTADYOe33pdtT8_k-KVb0khIIGoV0q0UhklXBTsyKBeBa2eoUBAlvkiqyP-YVy8eLlMVxvKeevA_BE0Z-fDuPvzFTbdlrhkNDER44QRNzFHkdSmKR_Z3IM_3QxDhZz6KOuepKOq_Kn6_CcHcLnlkTQgJ8IzkoLa7Nl2TNVarBD-QHpNeEdu1b9i-sU1Ro3Yzva2AlRsmU9AzfZgyY-ZJIsjtKCWUxrbZNw0_287cxKafLQ2ETP57MucODr0WCagpyOMpAZMqDbcgzQH1MoyzcXhsJ9XrroHw_eIrxD3_Ikd7Gy3wBLXrqqi7jfo6u_R8B6Q         <------ record this token; the external Prometheus will use it
```
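
Instead of copying the token out of the `describe` output by hand, it can be written straight to a file. The sketch below uses a hypothetical helper name, `save_sa_token`; the `jsonpath` query and the base64 decoding reflect how Secret data is stored, and GNU `base64 -d` is assumed.

```bash
# Write the service account token stored in a Secret to a file.
# Secret data is base64-encoded, so it is decoded before saving.
save_sa_token() {
  local secret="$1" ns="$2" out="$3"
  kubectl -n "$ns" get secret "$secret" -o jsonpath='{.data.token}' \
    | base64 -d > "$out"
}

# Usage:
#   save_sa_token kube-state-metrics-token-n2lbl ops-monit k8s.token
```

This avoids stray whitespace or a truncated line ending up in the token file.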

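#### **3. Making the token available to Prometheus** <a href="#id-3-e3-80-81-e7-b5-a6-e4-ba-88prometheus-e8-aa-bf-e7-94-a8" id="id-3-e3-80-81-e7-b5-a6-e4-ba-88prometheus-e8-aa-bf-e7-94-a8"></a>

* In the directory that contains the main configuration file `prometheus.yml`, create a file named `k8s.token` and paste the token from above into it.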
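
To confirm that the saved file really holds the kube-state-metrics token, the JWT payload can be decoded locally. This is an optional sketch; `jwt_payload` is a hypothetical helper and GNU `base64 -d` is assumed.

```bash
# Print the decoded payload (second dot-separated segment) of a JWT.
jwt_payload() {
  local seg
  # JWT segments use URL-safe base64 without padding: map the alphabet
  # back to standard base64 and restore the '=' padding first.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
}

# Usage:
#   jwt_payload "$(cat k8s.token)"
```

For the token above, the `sub` field of the payload should read `system:serviceaccount:ops-monit:kube-state-metrics`.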

<figure><img src="/files/WElKFzMhAXHh6WfBuSdG" alt=""><figcaption></figcaption></figure>

#### 4. Configuring Prometheus <a href="#id-4-e3-80-81-e9-85-8d-e7-bd-aeprometheus" id="id-4-e3-80-81-e9-85-8d-e7-bd-aeprometheus"></a>

* Edit `prometheus.yml` (for example with `vim prometheus.yml`)

Add the following jobs under `scrape_configs`:

```yaml
  - job_name: 'k8s-cadvisor'
    scrape_interval: 60s
    scrape_timeout: 60s
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:  # Kubernetes service discovery
    - api_server: https://192.168.1.21:6443  # apiserver address
      role: node  # discover one target per cluster node
      namespaces:
        names:
        - ops-monit
      bearer_token_file: k8s.token
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: k8s.token
    tls_config:
      insecure_skip_verify: true
    relabel_configs:
    - source_labels: [__address__]
      regex: '(.*):10250'
      replacement: '${1}:10255'
      target_label: __address__
      action: replace
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)

    metric_relabel_configs:
    - source_labels: [instance]
      separator: ;
      regex: (.+)
      target_label: node
      replacement: $1
      action: replace

    - source_labels: [pod_name]
      separator: ;
      regex: (.+)
      target_label: pod
      replacement: $1
      action: replace
    - source_labels: [container_name]
      separator: ;
      regex: (.+)
      target_label: container
      replacement: $1
      action: replace

  - job_name: kube-state-metrics-1
    kubernetes_sd_configs:
    - api_server: https://192.168.1.21:6443  # apiserver address
      role: endpoints  # discover targets from Service endpoints
      namespaces:
        names:
        - ops-monit      
      bearer_token_file: k8s.token
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: k8s.token
    tls_config:
      insecure_skip_verify: true
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 192.168.1.21:30866
    - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
      regex: kube-state-metrics
      replacement: $1
      action: keep
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: k8s_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: k8s_sname
      
  - job_name: kube-state-metrics-2
    kubernetes_sd_configs:
    - api_server: https://192.168.1.21:6443  # apiserver address
      role: endpoints  # discover targets from Service endpoints
      namespaces:
        names:
        - ops-monit
      bearer_token_file: k8s.token
      tls_config:
        insecure_skip_verify: true
    bearer_token_file: k8s.token
    tls_config:
      insecure_skip_verify: true
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - separator: ;
      regex: (.*)
      target_label: __address__
      replacement: 192.168.1.21:30867
    - source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
      regex: kube-state-metrics
      replacement: $1
      action: keep
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: k8s_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: k8s_sname

```
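
The first relabel rule of the `k8s-cadvisor` job is what redirects the scrape from the kubelet's secure port to the read-only port. The sketch below mimics that rule with `sed` to show its effect on a discovered address; `rewrite_kubelet_port` is a made-up name and the snippet only illustrates the regex, it is not how Prometheus is invoked.

```bash
# Mimic the first relabel rule of the k8s-cadvisor job: an address that
# ends in :10250 is rewritten to :10255, anything else passes unchanged.
rewrite_kubelet_port() {
  printf '%s\n' "$1" | sed -E 's/^(.*):10250$/\1:10255/'
}

rewrite_kubelet_port 192.168.1.21:10250   # prints 192.168.1.21:10255
rewrite_kubelet_port 192.168.1.21:30866   # prints 192.168.1.21:30866
```

Before reloading Prometheus, `promtool check config prometheus.yml` can be used to catch indentation or syntax mistakes in the new jobs.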

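* A few notes on the settings above, which connect the external Prometheus to the k8s cluster:
  * `api_server`: the address of the apiserver.
  * `namespaces`: the namespace where kube-state-metrics runs.
  * `bearer_token_file`: the token used for authentication. A file named `k8s.token` was created next to `prometheus.yml`; it holds the token from the Secret `kube-state-metrics-token-n2lbl` in the `ops-monit` namespace.

```yaml
    kubernetes_sd_configs:
    - api_server: https://192.168.1.21:6443  # apiserver address
      role: endpoints  # discover targets from Service endpoints
      namespaces:
        names:
        - ops-monit
      # Credentials used for service discovery against the apiserver
      bearer_token_file: k8s.token
      tls_config:
        insecure_skip_verify: true
    # Credentials sent with the scrape requests themselves
    bearer_token_file: k8s.token
    tls_config:
      insecure_skip_verify: true
```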
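
If service discovery finds no targets, it helps to test the token against the apiserver directly from the Prometheus host. The sketch below is an assumption-laden check rather than part of the original setup: `check_api_access` is a hypothetical helper, and `-k` mirrors `insecure_skip_verify: true`.

```bash
# Ask the apiserver for the endpoints in a namespace using the saved
# token; prints only the HTTP status code returned by the apiserver.
check_api_access() {
  local api="$1" token_file="$2" ns="$3"
  curl -sk -o /dev/null -w '%{http_code}\n' \
    -H "Authorization: Bearer $(cat "$token_file")" \
    "${api}/api/v1/namespaces/${ns}/endpoints"
}

# Usage:
#   check_api_access https://192.168.1.21:6443 k8s.token ops-monit
```

A `200` suggests the token and its RBAC permissions are fine; `401` points at a bad token and `403` at missing permissions on the service account.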

#### 5. Configuring the dashboard <a href="#id-5-e3-80-81-e9-85-8d-e7-bd-ae-e4-bb-aa-e8-a1-a8-e7-9b-98" id="id-5-e3-80-81-e9-85-8d-e7-bd-ae-e4-bb-aa-e8-a1-a8-e7-9b-98"></a>

In Grafana 8.5.27, import dashboard ID 13105.

For configuration details, see the dashboard's description: [https://grafana.com/grafana/dashboards/13105-1-k8s-for-prometheus-dashboard-20211010/](https://grafana.com/grafana/dashboards/13105-1-k8s-for-prometheus-dashboard-20211010/)

<figure><img src="/files/8RnW26vcylfZcXaPovoL" alt=""><figcaption></figcaption></figure>
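
If panels stay empty after the import, querying Prometheus directly shows whether the two data sources are being scraped at all. The sketch below uses the Prometheus HTTP API; `prom_query` is a hypothetical helper and `http://localhost:9090` is an assumed Prometheus address.

```bash
# Run an instant PromQL query against the Prometheus HTTP API and
# print the raw JSON response.
prom_query() {
  local base="$1" expr="$2"
  curl -s -G "${base}/api/v1/query" --data-urlencode "query=${expr}"
}

# Usage (the Prometheus address is an assumption):
#   prom_query http://localhost:9090 'count(kube_pod_info)'                         # kube-state-metrics
#   prom_query http://localhost:9090 'count(container_cpu_usage_seconds_total)'    # cadvisor
```

An empty `result` array for either query means the corresponding job is not delivering samples yet.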


