## Setup

### Prerequisites
- Prometheus Operator or kube-prometheus-stack installed in the cluster (provides the `ServiceMonitor` and `PrometheusRule` CRDs)
- Grafana with the sidecar enabled for dashboards and datasources
- The Grafana datasource sidecar must watch Secrets (or ConfigMaps) labeled with `grafana_datasource`
The built-in Grafana resources are designed for the common Kubernetes pattern
used by kube-prometheus-stack and the Grafana Helm chart: Grafana runs in the
cluster and discovers dashboards and datasources via labeled ConfigMaps and
Secrets. If you use external Grafana, Grafana Cloud, or Grafana Operator, keep
the Prometheus resources enabled but plan to manage the dashboard and datasource
through your existing Grafana workflow instead of relying on sidecar discovery.
### Enabling Monitoring

Set `monitoring.enabled` to `true` in your Helm values:
```yaml
monitoring:
  enabled: true
```
All sub-resources (ServiceMonitors, PrometheusRules, Grafana dashboard, Grafana datasource) are enabled by default once the top-level flag is set. You can selectively disable any of them:
```yaml
monitoring:
  enabled: true
  serviceMonitors:
    enabled: true      # ServiceMonitors for Prometheus scraping
  prometheusRules:
    enabled: true      # Built-in alert rules
  grafanaDashboard:
    enabled: true      # Grafana dashboard ConfigMap
  grafanaDatasource:
    enabled: true      # Grafana PostgreSQL datasource Secret
```
If you use only one of Argo CD and Kargo, disable the alerts for the product you are not using to avoid false-positive alerts:
```yaml
monitoring:
  enabled: true
  alerts:
    kargo:
      enabled: false   # disable if not using Kargo
```
If your Grafana deployment does not use sidecar discovery, disable the chart-managed Grafana resources and import or provision them separately:
```yaml
monitoring:
  enabled: true
  grafanaDashboard:
    enabled: false
  grafanaDatasource:
    enabled: false
```
For the full list of monitoring.* parameters with defaults and descriptions,
see the
Monitoring Parameters
section of the Helm values reference. That page is auto-generated from the
chart's
values.yaml,
which also documents every option inline.
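
If you want to browse those defaults locally, `helm show values` prints the chart's `values.yaml` with its inline comments. The repository and chart names below are placeholders; substitute whatever you installed the chart from:

```shell
# Print the chart's default values (including the monitoring.* block) with inline docs.
# <repo>/<chart> is a placeholder for your chart repository and chart name.
helm show values <repo>/<chart>
```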
### Prometheus Selector Labels
Many Prometheus Operator installations use label selectors to filter which
ServiceMonitors, PodMonitors, and PrometheusRules to discover. The chart
defaults additionalLabels to release: kube-prometheus-stack on all three
resource types, which matches the default selector used by
kube-prometheus-stack.
If your kube-prometheus-stack Helm release has a different name, override the label to match:
```yaml
monitoring:
  enabled: true
  serviceMonitors:
    additionalLabels:
      release: my-prometheus   # change to match your Helm release name
  podMonitors:
    additionalLabels:
      release: my-prometheus
  prometheusRules:
    additionalLabels:
      release: my-prometheus
```
If your ServiceMonitors or PodMonitors are being created but Prometheus is not
scraping them, a label mismatch is almost always the cause. Check your
Prometheus custom resource for serviceMonitorSelector, podMonitorSelector,
and ruleSelector to see what labels are required.
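
As a quick check, you can read those selectors straight off the Prometheus custom resource. The namespace below is an example; point the command at wherever your Prometheus is deployed:

```shell
# Show which labels Prometheus uses to select ServiceMonitors, PodMonitors, and rules.
# "monitoring" is an example namespace; adjust it to your installation.
kubectl -n monitoring get prometheus -o yaml | grep -A2 -E 'serviceMonitorSelector|podMonitorSelector|ruleSelector'
```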
### Shared Monitoring Namespace
If your Prometheus, Grafana, and Alertmanager run in a dedicated monitoring
namespace, you can place all monitoring resources there instead of the Akuity
Platform release namespace. This keeps everything co-located and avoids
broadening Grafana sidecar permissions:
```yaml
monitoring:
  enabled: true
  serviceMonitors:
    namespace: monitoring
  podMonitors:
    namespace: monitoring
  prometheusRules:
    namespace: monitoring
  grafanaDashboard:
    namespace: monitoring
  grafanaDatasource:
    namespace: monitoring
```
The release: kube-prometheus-stack label is included by default, so no
additionalLabels override is needed unless your Helm release name differs
(see Prometheus Selector Labels).
This is a common approach for
kube-prometheus-stack
installations because it keeps dashboards, rules, and scrape configuration
alongside the monitoring stack. If your Grafana sidecar is scoped to specific
namespaces instead of ALL, it also avoids extra cross-namespace configuration.
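
For context, namespace scoping in the upstream Grafana chart (and the `grafana:` block of kube-prometheus-stack) is typically controlled by the sidecar `searchNamespace` settings. The sketch below assumes those upstream value names; verify them against your Grafana chart version:

```yaml
# Grafana chart values (sketch): let the dashboard and datasource sidecars
# watch the dedicated monitoring namespace instead of ALL namespaces.
sidecar:
  dashboards:
    searchNamespace: monitoring
  datasources:
    searchNamespace: monitoring
```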
### Grafana Datasource Provisioning
By default, the chart provisions a Grafana PostgreSQL datasource Secret for the dashboard panels that query the portal database directly. The provisioned datasource:
- Uses `database.readOnlyHost` when set, otherwise `database.host`
- Uses `database.port`, `database.dbname`, `database.user`, `database.password`, and `database.sslmode`
- Is created in `monitoring.grafanaDatasource.namespace`, defaulting to `monitoring.grafanaDashboard.namespace` and then the Helm release namespace
If you already manage a Grafana datasource outside the chart, disable the built-in one:
```yaml
monitoring:
  enabled: true
  grafanaDatasource:
    enabled: false
```
If you use a non-default schema or need custom datasource options beyond the
chart defaults, manage the datasource separately in Grafana and keep
monitoring.grafanaDatasource.enabled: false.
This is also the recommended approach if you use external Grafana, Grafana Cloud, or Grafana Operator instead of a sidecar-based in-cluster Grafana deployment.
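
If you go that route, a file-provisioned PostgreSQL datasource in Grafana is one way to replace the chart-managed Secret. The snippet below is a sketch in Grafana's standard datasource provisioning format; the connection values are placeholders that you should align with your `database.*` settings and with the datasource the dashboard expects:

```yaml
# Grafana datasource provisioning (sketch) -- connection values are placeholders.
apiVersion: 1
datasources:
  - name: Akuity Portal DB        # display name; match what your dashboard references
    type: postgres
    url: <database-host>:<port>   # prefer the read-only host if you have one
    user: <database-user>
    secureJsonData:
      password: <database-password>
    jsonData:
      database: <database-name>
      sslmode: require            # match your database.sslmode
```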
### Grafana Dashboard Folder
To organize the dashboard into a specific Grafana folder, set
monitoring.grafanaDashboard.folder. The chart writes a grafana_folder
annotation on the dashboard ConfigMap, which the Grafana sidecar uses to
place the dashboard in the named folder.
This requires your Grafana installation to have
sidecar.dashboards.folderAnnotation set to grafana_folder. The upstream
Grafana Helm chart does not set this by default. If you use
kube-prometheus-stack
or the standalone
Grafana chart,
add the following to your Grafana values:
```yaml
sidecar:
  dashboards:
    folderAnnotation: grafana_folder
```
Without this, the annotation is ignored and the dashboard lands in the General folder.
Then set the folder name in your chart values:

```yaml
monitoring:
  enabled: true
  grafanaDashboard:
    folder: "Akuity Platform"
```
### Verifying Your Setup
After enabling monitoring and deploying, verify each component is working.
Monitoring resource names are prefixed with the Helm release name (e.g.,
<release>-platform-controller). The examples below assume the default release
name akuity-platform.
#### 1. Check monitoring resources are created
All chart-managed monitoring resources carry the label
app.kubernetes.io/part-of: akuity-platform, so you can list them in one
command:
```shell
kubectl get servicemonitor,podmonitor,prometheusrule,configmap,secret \
  -l app.kubernetes.io/part-of=akuity-platform -n <your-namespace>
```
You should see ServiceMonitors for each enabled platform component (e.g.
<release>-platform-controller, <release>-portal-server), PodMonitors for
repo-server-delegate and repo-server-proxy, a PrometheusRule, a Grafana
dashboard ConfigMap, and a Grafana datasource Secret.
#### 2. Verify Prometheus is scraping targets
Open the Prometheus UI (typically at http://<prometheus-host>:9090) and
navigate to Status > Targets. Look for targets matching the Akuity Platform
ServiceMonitors and PodMonitors. All targets should show the UP state.
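
If Prometheus is not exposed outside the cluster, a port-forward is enough for this check. The namespace and Service name below assume a standard Prometheus Operator install (`prometheus-operated` is the headless Service the operator creates); adjust as needed:

```shell
# Forward the Prometheus UI to localhost, then open http://localhost:9090/targets
kubectl -n monitoring port-forward svc/prometheus-operated 9090:9090
```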
ServiceMonitor targets appear as serviceMonitor/<namespace>/<name> and
PodMonitor targets appear as podMonitor/<namespace>/<name>. The PodMonitor
targets (<release>-repo-server-proxy, <release>-repo-server-delegate) may
show zero active targets if no Argo CD instances are using the Repo Server
Delegate feature: this is expected.
New targets may briefly appear as UNKNOWN immediately after the monitoring
resources are created. This is expected until the first scrape completes.
With the default monitoring.serviceMonitors.interval: 60s, allow up to one
minute before treating this as a failure.
If targets are missing, check that your ServiceMonitors and PodMonitors have
the correct additionalLabels to match your Prometheus selectors (see
Prometheus Selector Labels above).
#### 3. Verify alerts are loaded
```shell
kubectl get prometheusrule -n <your-namespace>
```
You should see akuity-platform-rules (or <release>-rules if you used a
custom release name). To verify Prometheus has loaded the rules, navigate to
Status > Rules in the Prometheus UI and search for Akuity.
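
You can also confirm this from the command line through the same port-forward used in step 2; the `/api/v1/rules` endpoint is part of Prometheus's standard HTTP API, and `jq` here is only for readability:

```shell
# List loaded rule groups and filter for the Akuity Platform rules.
curl -s http://localhost:9090/api/v1/rules | jq -r '.data.groups[].name' | grep -i akuity
```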
#### 4. Find the Grafana dashboard
Open Grafana and search for "Akuity Platform" in the dashboard search. If the dashboard does not appear, verify:
- The Grafana sidecar is enabled and configured to watch ConfigMaps with the `grafana_dashboard` label
- The dashboard ConfigMap is in a namespace the sidecar watches (see Shared Monitoring Namespace for the recommended setup)
If you intentionally disabled monitoring.grafanaDashboard.enabled, import
the bundled dashboard JSON into Grafana using your normal workflow instead.
#### 5. Verify the Grafana datasource exists
Open Grafana and navigate to Connections > Data sources. You should see a
PostgreSQL datasource named <release> Portal DB unless you overrode
monitoring.grafanaDatasource.datasourceName.
If it is missing, verify:
- The Grafana datasource sidecar is enabled
- The sidecar watches Secrets labeled with `grafana_datasource`
- The datasource Secret is in a namespace the sidecar watches
If you intentionally disabled monitoring.grafanaDatasource.enabled, provision
an equivalent PostgreSQL datasource in Grafana yourself and ensure its UID
matches the one referenced by the dashboard, or update the dashboard to point
at your datasource.
## Scraped Components

### ServiceMonitors
ServiceMonitors are created for each platform component that exposes a metrics
endpoint. Components gated by an enabled flag only get a ServiceMonitor when
that component is also enabled.
| Component | Metrics Port | Condition |
|---|---|---|
| platform-controller | 9500 | Always |
| portal-server | 9501 | Always |
| notification-controller | 9505 | Only when notificationController.enabled: true |
| addon-controller | 9506 | Only when addonController.enabled: true |
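
To spot-check an endpoint directly, you can port-forward a component's metrics port and curl it. The Service name below assumes the `<release>-platform-controller` naming described in Verifying Your Setup with the default `akuity-platform` release name; substitute your own release name and namespace:

```shell
# Port-forward the platform-controller metrics port (9500, per the table above)
# and confirm it serves Prometheus metrics.
kubectl -n <your-namespace> port-forward svc/akuity-platform-platform-controller 9500:9500 &
sleep 2 && curl -s http://localhost:9500/metrics | head
```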
### PodMonitors
PodMonitors scrape metrics from pods in Argo CD instance namespaces
(argocd-*). Unlike ServiceMonitors, they use namespaceSelector.any: true
to discover pods across all namespaces.
| PodMonitor | Selector | Port | Notes |
|---|---|---|---|
| repo-server-delegate | akuity.io/repo-server-delegate label exists | metrics | Only produces targets when instances use the Repo Server Delegate feature |
| repo-server-proxy | akuity.io/repo-server-proxy: "true" | akuity-metrics | Drops the high-cardinality repo_server_proxy_method_duration_seconds_bucket metric via metricRelabeling to control storage costs |
Both PodMonitors are enabled by default when monitoring.podMonitors.enabled
is true. It is safe to leave them enabled even when no instances use the
Repo Server Delegate feature: the PodMonitors simply match zero pods.
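
A quick way to see whether the PodMonitors currently have anything to select is to list pods carrying the labels from the table above; an empty result just means no instance is using the Repo Server Delegate feature yet:

```shell
# Pods matched by the repo-server-delegate PodMonitor (label-exists selector)
kubectl get pods -A -l akuity.io/repo-server-delegate
# Pods matched by the repo-server-proxy PodMonitor
kubectl get pods -A -l akuity.io/repo-server-proxy=true
```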
## Grafana Dashboard
The bundled Grafana dashboard provides visibility into:
- Argo CD Instances: health distribution, reconciliation status, instance counts, tables of unhealthy/unreconciled instances
- Argo CD Clusters: connection status, reconciliation, health breakdown
- Kargo Instances: health, reconciliation, instance counts
- Kargo Agents: connection status, reconciliation, health breakdown
- Control Plane Operations: controller workqueue depth and duration, OOM-killed containers, database connection pool stats, persistent volume usage, CPU throttling
- Argo CD Repo Server Delegate (Optional): reverse-proxy latency, request rate, and pending requests for instances using the Repo Server Delegate feature
The dashboard includes configurable template variables:
- `DS_PROMETHEUS`: Prometheus datasource for metrics panels (health gauges, time series, alert-derived stats)
- `DS_PORTAL_DB`: PostgreSQL datasource for table panels that query the portal database directly (instance lists, cluster details, org breakdowns). The chart provisions this datasource by default through `monitoring.grafanaDatasource.*`. If you disable that datasource or manage your own, those panels will fail until `DS_PORTAL_DB` resolves to a working PostgreSQL datasource.
- `namespace`: filters metrics to the selected Kubernetes namespace
- `ThrottlingRatio`: threshold used by the CPU-throttled containers panel
The Argo CD Repo Server Delegate (Optional) row may be empty. It only shows data when both of the following are true:
- the Argo CD instance is configured to use `repoServerDelegate` in either `controlPlane` or `managedCluster` mode
- instance-level Prometheus metrics are enabled on the platform controller (for self-hosted installs this is typically done by setting `platformController.env.ENABLE_INSTANCE_PROMETHEUS_MONITORING: "true"`; see the sketch below)
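
A minimal values sketch for that second requirement, assuming `platformController.env` is a plain map of environment variables in your chart version (check `values.yaml` to confirm):

```yaml
# Values sketch: enable instance-level Prometheus metrics on the platform controller.
# Assumes platformController.env is passed through as container environment variables.
platformController:
  env:
    ENABLE_INSTANCE_PROMETHEUS_MONITORING: "true"
```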
The bundled chart includes PodMonitors (monitoring.podMonitors.enabled)
that scrape repo-server-delegate and repo-server-proxy metrics across
instance namespaces (argocd-*). If an instance uses the default
"all managed clusters" manifest generation layout, this row will remain
empty because there is no delegated repo server reverse-proxy traffic to
display.