# Monitoring Linux Servers

## Links

Installing Prometheus + Alertmanager + node\_exporter on Linux: [www.dmosk.ru](https://www.dmosk.ru/instruktions.php?object=prometheus-linux#node)

Cheat sheet for installing and configuring various Prometheus exporters on Linux: [www.dmosk.ru](https://www.dmosk.ru/miniinstruktions.php?mini=exporter-prometheus)

"There are many of you and only one of me": an overview monitoring system built on Prometheus and Grafana: [habr.com](https://habr.com/ru/companies/doubletapp/articles/736602/)

doubletapp / habr-dt-prometheus: [github.com](https://github.com/doubletapp/habr-dt-prometheus/blob/master/prometheus/alerts.yml)

Configuring alerting rules in Grafana: [platform-docs.v-serv.ru](https://platform-docs.v-serv.ru/online-documentation/home/user/maintain-tools/monitoring/2.1.0/alerting/alert-plugin-configure/)

## Overview

Connecting a Linux server to Prometheus monitoring takes two steps:

1. Install node\_exporter on the Linux server.
2. In the Prometheus configuration, add a target that points to the Linux server to be monitored.

## Installing node\_exporter

All steps in this section are performed on the Linux server that is being connected to monitoring.

Perform the following steps.

On the [download page](https://prometheus.io/download/#node_exporter), get the download link for node\_exporter.

Run on the Linux server being connected:

```bash
sudo su
mkdir -p "$HOME/tmp"
cd "$HOME/tmp"
# Download the release archive
wget https://github.com/prometheus/node_exporter/releases/download/v1.10.2/node_exporter-1.10.2.linux-amd64.tar.gz
```

Install:

```bash
# Unpack the downloaded archive:
tar -zxf node_exporter-*.linux-amd64.tar.gz
# and change into the unpacked directory:
cd node_exporter-*.linux-amd64
# Copy the executable to bin:
cp node_exporter /usr/local/bin/
# Leave the directory and remove the unpacked files and the archive:
cd .. && rm -rf node_exporter-*.linux-amd64/ && rm -f node_exporter-*.linux-amd64.tar.gz
```

Set permissions:

```bash
# Create the nodeusr user:
useradd --no-create-home --shell /bin/false nodeusr
# Set the owner of the executable:
chown nodeusr:nodeusr /usr/local/bin/node_exporter
```

Configure autostart:

```bash
# Create the node_exporter.service unit file for systemd:
nano /etc/systemd/system/node_exporter.service
```

Contents of `/etc/systemd/system/node_exporter.service`:

```
[Unit]
Description=Node Exporter Service
After=network.target

[Service]
User=nodeusr
Group=nodeusr
Type=simple
ExecStart=/usr/local/bin/node_exporter --collector.systemd
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

{% hint style="info" %}
Note: `ExecStart` can be set in one of two ways.

* Option 1 (metrics are also collected per systemd unit; the `NginxDown` alert rule on this page relies on these metrics):

```
ExecStart=/usr/local/bin/node_exporter --collector.systemd
```

* Option 2 (metrics are collected for the server as a whole only):

```
ExecStart=/usr/local/bin/node_exporter
```

Source: [www.dmosk.ru](https://www.dmosk.ru/instruktions.php?object=prometheus-linux#metric)
{% endhint %}

Enable and start the service:

```bash
# Reload systemd so that it picks up the new unit file:
systemctl daemon-reload
# Enable autostart:
systemctl enable node_exporter
# Start the service:
systemctl start node_exporter
# If the unit file is changed later, reload and restart:
systemctl daemon-reload
systemctl restart node_exporter
```

Verification. Open a web browser and go to the path `/metrics` on port 9100 of the server. The page shows the metrics collected by node\_exporter:

![Metrics collected by node\_exporter](https://www.dmosk.ru/img/instruktions/prometheus-linux-06.jpg)

```
http://192.168.1.3:9100/metrics
http://192.168.1.4:9100/metrics
http://192.168.1.5:9100/metrics
http://192.168.3.0:9100/metrics
http://192.168.4.0:9100/metrics
http://192.168.4.9:9100/metrics
http://192.168.4.10:9100/metrics
http://192.168.4.11:9100/metrics
```

Installation is complete.
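The browser check can also be run from a shell. The following is a minimal sketch, assuming `curl` is installed and the exporter listens on its default port 9100. The helper names `metrics_has` and `check_exporter` are illustrative and not part of node\_exporter; `node_exporter_build_info` is a metric the exporter reports about itself.

```shell
#!/usr/bin/env bash
# Sketch: check a node_exporter endpoint from the command line.
# metrics_has and check_exporter are illustrative helper names.

# Succeeds if the metrics text on stdin contains a sample of the given metric.
# A sample line starts with the metric name followed by "{" or a space.
metrics_has() {
    local metric="$1"
    grep -Eq "^${metric}(\{| )"
}

# Fetches /metrics from <host>:9100 and looks for node_exporter_build_info.
# Fails if the request fails or the metric is missing.
check_exporter() {
    local host="$1"
    curl -fsS --max-time 5 "http://${host}:9100/metrics" \
        | metrics_has node_exporter_build_info
}

# Usage (address taken from the list above):
# check_exporter 192.168.4.9 && echo "exporter is up"
```

Splitting the text check from the HTTP request keeps the parsing testable without a running exporter.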

## Configuring Prometheus

### Registering a target

Edit the file `argocd\env-prod\apps\monitoring\prometheus\values.yaml`.

Add the server's IP address to the `prometheus\prometheusSpec\additionalScrapeConfigs` section:

```yaml
additionalScrapeConfigs: |
  - job_name: 'node_exporter_clients'
    scrape_interval: 5s
    static_configs:
      - targets:
          - 192.168.4.9:9100
          # the address of the server being added goes before the port
          - :9100
```

Commit the updated `values.yaml` to the GitHub repository.

Wait for Argo CD to update the k8s-prometheus application.
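When several servers are added at once, the target list can be generated instead of typed by hand. The following is a minimal sketch that prints the `additionalScrapeConfigs` fragment for the hosts passed as arguments. The helper name `render_scrape_config` is illustrative, and port 9100 is assumed for every host.

```shell
#!/usr/bin/env bash
# Sketch: print the additionalScrapeConfigs fragment for a list of hosts.
# render_scrape_config is an illustrative helper name; every host is
# assumed to run node_exporter on the default port 9100.
render_scrape_config() {
    printf '%s\n' "additionalScrapeConfigs: |"
    printf '%s\n' "  - job_name: 'node_exporter_clients'"
    printf '%s\n' "    scrape_interval: 5s"
    printf '%s\n' "    static_configs:"
    printf '%s\n' "      - targets:"
    local host
    for host in "$@"; do
        printf '          - %s:9100\n' "$host"
    done
}

# Usage:
# render_scrape_config 192.168.4.9 192.168.4.10 192.168.4.11
```

The output is meant to be pasted into `values.yaml` under `prometheusSpec`, with indentation adjusted to match the surrounding file.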

### Configuring alert rules

Edit the file `argocd\env-prod\apps\monitoring\prometheus\values.yaml`.

Change the `additionalPrometheusRulesMap` section:

```yaml
additionalPrometheusRulesMap:
  utm.alert.rules:
    groups:
      # Alert group for servers
      - name: utm_alert_group_nodes
        rules:
          # Server is unreachable
          - alert: InstanceDown
            expr: up == 0
            for: 1m
            labels:
              severity: critical
            annotations:
              description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute.'
              summary: 'Instance {{ $labels.instance }} down'
          # Server is down (matches only targets that carry an instance_type label)
          - alert: NodeDown
            expr: up{instance_type='node'} == 0
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "Node {{ $labels.instance }} has been down for more than 5 minutes"
          # Server is low on free disk space
          - alert: NodeLowDiskSpace
            expr: node_filesystem_avail_bytes{mountpoint='/'} / node_filesystem_size_bytes * 100 <= 5
            labels:
              severity: critical
            annotations:
              summary: "Node {{ $labels.instance }} is low on disk space"
          # Server is low on free memory
          - alert: NodeLowMemory
            expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes <= 0.05
            for: 5m
            labels:
              severity: critical
            annotations:
              description: 'Node {{ $labels.instance }} of job {{ $labels.job }} is low on RAM for more than 5 minutes.'
              summary: 'Node {{ $labels.instance }} is low on RAM for more than 5 minutes'
          # Server has high CPU usage
          # (node_cpu_seconds_total is the current name of the former node_cpu metric;
          # the job label matches the scrape job registered above)
          - alert: NodeHighCPUUsage
            expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{job="node_exporter_clients",mode="idle"}[1m])) * 100) >= 80
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Node {{ $labels.instance }} is high on CPU usage for more than 5 minutes"
      # Alert group for applications
      - name: utm_alert_group_services
        rules:
          # Nginx is unavailable (requires node_exporter to run with --collector.systemd)
          - alert: NginxDown
            expr: node_systemd_unit_state{name="nginx.service",state="active"} == 0
            for: 1s
            labels:
              severity: critical
            annotations:
              description: '{{ $labels.instance }} of job {{ $labels.job }} is down.'
              summary: 'Instance {{ $labels.instance }} is down'
          # Application is down (matches only targets that carry an instance_type label)
          - alert: ServiceDown
            expr: up{instance_type='service'} == 0
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: "Service {{ $labels.instance }} has been down for more than 5 minutes"
          # Application is unavailable (the proxy responds with 502 Bad Gateway)
          # (sum by (instance) keeps the instance label that the summary refers to)
          - alert: ServiceNotAvailable
            expr: floor(sum by (instance) (increase(nginx_http_response_count_total{status='502'}[1m]))) > 0
            for: 3m
            labels:
              severity: critical
            annotations:
              summary: "Service {{ $labels.instance }} has not been available for more than 3 minutes."
          # Application returns 5xx response codes
          - alert: ServiceError
            expr: floor(sum by (instance) (increase(nginx_http_response_count_total{status=~'5..',status!='502'}[1m]))) > 0
            labels:
              severity: critical
            annotations:
              summary: "Service {{ $labels.instance }} has just thrown 5xx error."
```

Commit the updated `values.yaml` to the GitHub repository.

Wait for Argo CD to update the k8s-prometheus application.
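The `NodeLowDiskSpace` rule fires when the space available on `/` is at or below 5% of the filesystem size. To sanity-check that threshold on a server, the same ratio can be computed locally from `df`. The following is a minimal sketch; `free_percent` and `root_free_percent` are illustrative helper names. It assumes that the POSIX `df -P` columns for size and available space correspond to `node_filesystem_size_bytes` and `node_filesystem_avail_bytes`.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the NodeLowDiskSpace ratio locally.
# free_percent and root_free_percent are illustrative helper names.

# Prints available space as a percentage of filesystem size.
# Arguments: <available> <size>, in the same unit.
free_percent() {
    LC_ALL=C awk -v avail="$1" -v size="$2" \
        'BEGIN { printf "%.1f\n", avail / size * 100 }'
}

# Reads the root filesystem from df in POSIX format (1K blocks):
# column 2 is the size, column 4 is the space available to non-root users.
root_free_percent() {
    df -P / | LC_ALL=C awk 'NR == 2 { printf "%.1f\n", $4 / $2 * 100 }'
}

# Usage:
# root_free_percent        # compare the printed value with the 5% threshold
```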

## Configuring Grafana

Add the following to the file `argocd\env-prod\apps\monitoring\grafana\values.yaml`:

```yaml
dashboardProviders:
  dashboardproviders4.yaml:
    apiVersion: 1
    providers:
      - name: 'linux'
        orgId: 1
        folder: 'Linux'
        type: file
        disableDeletion: false
        editable: true
        options:
          path: /var/lib/grafana/dashboards/linux
          foldersFromFilesStructure: false
dashboards:
  linux:
    1860-node-exporter-full:
      gnetId: 1860
      revision: 42
      datasource: Prometheus
    22413-k8s-node-metrics:
      gnetId: 22413
      revision: 6
      datasource: Prometheus
    15172-node-exporter:
      gnetId: 15172
      revision: 6
      datasource: Prometheus
```

Commit the updated `values.yaml` to the GitHub repository.

Wait for Argo CD to update the k8s-grafana application.
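To inspect a dashboard's JSON before it is provisioned, the `gnetId` and `revision` values can be turned into a download URL. The following is a minimal sketch; `dashboard_url` is an illustrative helper name. The URL pattern is an assumption about the public grafana.com API and is not stated on this page.

```shell
#!/usr/bin/env bash
# Sketch: build the grafana.com download URL for a dashboard revision.
# dashboard_url is an illustrative helper name; the URL pattern is an
# assumption about the public grafana.com API.
dashboard_url() {
    local id="$1" revision="$2"
    printf 'https://grafana.com/api/dashboards/%s/revisions/%s/download\n' \
        "$id" "$revision"
}

# Usage: fetch the three dashboards listed in values.yaml.
# for spec in 1860:42 22413:6 15172:6; do
#     curl -fsSL "$(dashboard_url "${spec%%:*}" "${spec##*:}")" \
#         -o "dashboard-${spec%%:*}.json"
# done
```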

#### Dashboards

Descriptions of the imported dashboards.

#### Node Exporter Full

ID: 1860

Link: [grafana.com](https://grafana.com/grafana/dashboards/1860-node-exporter-full/)

#### K8s Node Metrics / Multi Clusters (Node Exporter, Prometheus, Grafana11, 2025, EN)

ID: 22413

Link: [grafana.com](https://grafana.com/grafana/dashboards/22413-k8s-node-metrics-multi-clusters-node-exporter-prometheus-grafana11-2025-en/)

#### Node Exporter for Prometheus Dashboard based on 11074

ID: 15172

Link: [grafana.com](https://grafana.com/grafana/dashboards/15172-node-exporter-for-prometheus-dashboard-based-on-11074/)

### Manual dashboard import by ID

In Grafana: Dashboards → Import → paste the ID → select the Prometheus data source.

Dashboard folder: `Linux`

#### Dashboards

Descriptions of the imported dashboards.

None yet.
