Version: 6.4.1

Monitoring and Metrics: Prometheus and Grafana

This page describes the recommended installation and configuration of Prometheus and Grafana for monitoring your DCI cluster.

CognitiveScale recommends Prometheus for metrics collection and Grafana for visualization (dashboards). Both components can be installed into a Kubernetes cluster with the kube-prometheus-stack Helm chart.

Prerequisites

Add the prometheus-community Helm repository to your local Helm cache by running:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Optional: Configure the subcomponents of the kube-prometheus-stack chart

The kube-prometheus-stack Helm chart bundles several components (prometheus-operator, grafana, alertmanager, and the default metrics exporters), each with its own configuration options.

Customize your deployment by following the documentation for the kube-prometheus-stack.

Optional: Expose Grafana via Kubernetes Ingress

To make Grafana externally available, set the grafana.ingress.enabled option to true before deploying the kube-prometheus-stack Helm chart. This deploys an Ingress resource that allows external access to the Grafana service. Additional options for the Grafana subchart are described in the Grafana Helm chart documentation. If you do not deploy an Ingress, you can still access Grafana using the kubectl method described below.
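As a minimal sketch, the option can be set in a values file that is passed to helm install with an extra -f flag. The file name grafana-ingress-values.yaml and the host grafana.example.com are placeholders, not values from this guide, and the hosts key is assumed to follow the Grafana subchart's ingress settings, so check it against the chart version you deploy.

```shell
# Write a values fragment that enables the Grafana Ingress.
# The file name and hostname are illustrative placeholders.
cat > grafana-ingress-values.yaml <<'EOF'
grafana:
  ingress:
    enabled: true
    hosts:
      - grafana.example.com
EOF
```

Helm merges multiple -f files in order, so this fragment can be supplied alongside prometheus-values.yaml, or its contents can be merged into that file.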

Configure the kube-prometheus-stack to watch resources in other namespaces

The following values snippet allows the Prometheus component installed by the kube-prometheus-stack to detect and monitor resources in other namespaces (such as cortex):

prometheusOperator:
  namespaces:
    releaseNamespace: true
    additional:
      - kube-system
      - cortex
      - cortex-compute
      - istio-system
      - monitoring
prometheus:
  prometheusSpec:
    ruleSelector: {}
    ruleSelectorNilUsesHelmValues: false
    ruleNamespaceSelector: {}
    podMonitorSelector: {}
    podMonitorSelectorNilUsesHelmValues: false
    podMonitorNamespaceSelector: {}
    probeSelector: {}
    probeSelectorNilUsesHelmValues: false
    probeNamespaceSelector: {}
    serviceMonitorSelector: {}
    serviceMonitorSelectorNilUsesHelmValues: false
    serviceMonitorNamespaceSelector: {}
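
With the selectors left empty and the NilUsesHelmValues flags set to false, Prometheus is not limited to monitors labelled by this Helm release. As an illustration only, the sketch below writes a ServiceMonitor manifest for a hypothetical application named example-app in the cortex namespace. The application name, its app label, and the metrics port name are all assumptions made for the example.

```shell
# Write a sample ServiceMonitor for a hypothetical "example-app" Service.
# The Service is assumed to carry the label app=example-app and to expose
# a port named "metrics".
cat > example-servicemonitor.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: cortex
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
    - port: metrics
      interval: 30s
EOF
```

Once the stack is installed, the manifest could be applied with kubectl apply -f example-servicemonitor.yaml, after which the target should appear in the Prometheus targets list.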

Install the Prometheus-Operator

Install the kube-prometheus-stack Helm chart (formerly published as prometheus-operator) into a namespace. The example below uses the monitoring namespace, names the Helm release prometheus-stack, and reads chart values from prometheus-values.yaml.

helm install prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring -f prometheus-values.yaml --version 36.0.3
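
The command above fails if the monitoring namespace does not exist or if the release is already installed. One possible variant, sketched here as a small shell function, uses helm upgrade --install with --create-namespace so that it can be re-run safely. The function name install_monitoring_stack and its positional parameters are hypothetical; the chart, version, and default names mirror the command above.

```shell
# Hypothetical helper: install or upgrade the kube-prometheus-stack release.
# Arguments (all optional): release name, namespace, values file.
install_monitoring_stack() {
  local release="${1:-prometheus-stack}"
  local namespace="${2:-monitoring}"
  local values_file="${3:-prometheus-values.yaml}"
  # "upgrade --install" installs the release if absent and upgrades it otherwise;
  # "--create-namespace" creates the target namespace when it is missing.
  helm upgrade --install "$release" prometheus-community/kube-prometheus-stack \
    --namespace "$namespace" --create-namespace \
    -f "$values_file" --version 36.0.3
}
```

Calling install_monitoring_stack with no arguments installs or upgrades the prometheus-stack release in the monitoring namespace. Running kubectl get pods -n monitoring afterwards shows whether the pods have started.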

Access Grafana Dashboards via kubectl

Use this method if you have not enabled an Ingress.

The following command forwards local port 3000 to a Grafana pod deployed by the kube-prometheus-stack in the monitoring namespace:

kubectl port-forward $(kubectl get pods --selector=app=grafana -n monitoring --output=jsonpath="{.items..metadata.name}") -n monitoring 3000

Run the command, then open http://localhost:3000 in a web browser to view metrics and dashboards for the entire cluster. Grafana comes bundled with many default dashboards that let you switch between macro and micro views of resources, pods, namespaces, and the cluster. Additional community-maintained dashboards are published on the Grafana website.
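
If the label selector in the command above matches no pod for your chart version, forwarding to the Grafana Service is an alternative. This sketch assumes the Service is named after the Helm release with a -grafana suffix and listens on port 80. These are believed to be the chart defaults, but confirm them with kubectl get svc -n monitoring. The function names are hypothetical.

```shell
# Derive the assumed Grafana Service name from the Helm release name.
grafana_service_name() {
  printf '%s-grafana\n' "${1:-prometheus-stack}"
}

# Forward a local port (default 3000) to port 80 of the Grafana Service.
# Arguments (all optional): namespace, local port.
grafana_port_forward() {
  local namespace="${1:-monitoring}"
  local local_port="${2:-3000}"
  kubectl port-forward "svc/$(grafana_service_name)" "${local_port}:80" --namespace "$namespace"
}
```

With the defaults, grafana_port_forward makes Grafana reachable at http://localhost:3000 until the command is interrupted.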

Enable Fabric6 and Infrastructure Metrics

The following example YAML creates ServiceMonitor resources for the Fabric components as part of the Helm installation.

cortex:
  env:
    # cortex.env.FEATURE_METRICS_ENABLED: enable/disable metric endpoints and ServiceMonitor Custom Resources
    FEATURE_METRICS_ENABLED: true
minio:
  metrics:
    serviceMonitor:
      enabled: true
mongodb:
  metrics:
    enabled: true
redis:
  metrics:
    enabled: true
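
After the Fabric Helm release has been installed or upgraded with these values, one way to confirm the result is to list the ServiceMonitor resources. This is only a sketch: the helper name list_service_monitors is hypothetical, and it assumes the resources are created in the cortex namespace.

```shell
# Hypothetical helper: list ServiceMonitor custom resources in a namespace
# (default: cortex, which is an assumption about where Fabric creates them).
list_service_monitors() {
  kubectl get servicemonitors.monitoring.coreos.com --namespace "${1:-cortex}"
}
```

An empty result may mean that the values were not applied or that the resources live in a different namespace; passing another namespace as the first argument checks there instead.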