You're viewing docs for v0.1.x.

Configuration

Full Helm values reference for the Clu chart. The complete default-values document ships with the chart artifact (run helm show values oci://709825985650.dkr.ecr.us-east-1.amazonaws.com/cloudology/clu-ops-agent --version 0.1.2 to dump it locally). This page walks through each block, then covers the settings-to-env-var mapping that the ConfigMap template wires up.

Capabilities

modules:
  cluster:
    enabled: true              # Always true — the Core is the baseline
    subFeatures:
      knowledgeGraph: true
  cloud:
    enabled: false             # Set to true once the Cloud Agent entitlement is active
    provider: aws              # aws | azure | gcp
    aws:
      regionOverride: ""       # Optional: pin a different region from the
                               # cluster's ambient one (Bedrock region is
                               # separate — see agent.model.region below).
  idp:
    enabled: false
    writeOperations:
      enabled: false           # Separate toggle: Core Plus entitled with
                               # write-ops disabled is a valid
                               # "scaffolding only" mode.
    auditLog: true

Module gating follows this priority order (see Modules for the long version):

  1. Marketplace entitlement (authoritative, re-verified hourly).
  2. Helm values (can opt out of entitled features; never opt in to unentitled ones).
  3. Auto-detection (optional integrations only — Prometheus, ArgoCD, metrics-server, etc.).
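
As a rough illustration of the second rule, a values overlay can narrow what the entitlement grants but cannot widen it. The sketch below uses only keys from the Capabilities block above; the file name is hypothetical, and the comments restate the priority order rather than documented chart output.

```yaml
# values-override.yaml (hypothetical file name)
modules:
  cloud:
    enabled: true        # Takes effect only if the Marketplace entitlement
                         # already covers the Cloud Agent; Helm values
                         # cannot opt in to an unentitled module.
    provider: aws
  idp:
    enabled: true
    writeOperations:
      enabled: false     # Opting out of an entitled feature is allowed.
```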

Agent runtime

agent:
  model:
    scheduled: fast            # Haiku — background scans, low-stakes calls
    interactive: smart         # Sonnet — chat, complex reasoning
    region: us-east-1          # Bedrock region. Different from cluster
                               # region — Bedrock isn't in every region.
  schedule:
    healthCheckInterval: 30m   # Reporter cadence
    knowledgeGraphRefresh: 1h  # Scanner re-run
    entitlementCheck: 1h       # Marketplace re-verification
  namespaces:
    include: []                # Scope the scan. Empty = all namespaces
                               # minus the excluded list.
    exclude:
      - kube-system
      - kube-public
      - kube-node-lease
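
To scope the scan to a few namespaces, an overlay along these lines should work. The namespace names are hypothetical, and how include interacts with exclude when both are non-empty is an assumption here, not something this page states.

```yaml
agent:
  namespaces:
    include:
      - payments         # hypothetical namespace
      - checkout         # hypothetical namespace
    exclude:             # repeated in full, see note below
      - kube-system
      - kube-public
      - kube-node-lease
```

Helm replaces list values wholesale rather than merging them, so an override of exclude has to repeat any default entries you still want.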

Integrations (optional, auto-detected)

integrations:
  prometheus:
    enabled: "auto"            # auto | true | false
    url: ""                    # Override auto-discovery; e.g.
                               # http://prometheus-kube-prometheus-prometheus.monitoring.svc:9090
  metricsServer:
    enabled: "auto"
  externalSecrets:
    enabled: "auto"
  argocd:
    enabled: "auto"

auto defers to detection: the scanner walks Services / CRDs / APIServices and flips the integration on when it sees the expected shape.
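
When detection picks the wrong Service, or you want an integration off regardless of what is installed, the tri-state can be pinned explicitly. A minimal sketch, reusing the example URL from the comment above; whether url is also honored while enabled is left at "auto" is not stated here, so this sets enabled to true alongside it.

```yaml
integrations:
  prometheus:
    enabled: true        # Skip detection and force the integration on
    url: "http://prometheus-kube-prometheus-prometheus.monitoring.svc:9090"
  argocd:
    enabled: false       # Stay off even if ArgoCD CRDs are present
```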

Reports + approvals

reports:
  target: configmap            # configmap | s3
  s3Bucket: ""
  retention: 30d
  recommenderMode: true        # true: findings carry recommendations, the
                               #   operator stages them manually in the UI.
                               # false: the Reporter auto-creates
                               #   ApprovalRequests at cron time so the
                               #   Approvals view populates overnight.
  maxAutoStagedPerTick: 10     # Only applies when recommenderMode=false.

approvals:
  ttlSeconds: 86400            # 24h. Bump to 172800 (48h) for weekend
                               # coverage. Was 5 min pre-v0.0.1 — the old
                               # value only worked for actively-watched
                               # chat approvals.
  configmapName: clu-ops-approvals
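
For unattended operation over a weekend, the two blocks are typically changed together. The overlay below is a sketch under that assumption; the bucket name is hypothetical and the per-tick cap is an arbitrary example value.

```yaml
reports:
  target: s3
  s3Bucket: "example-clu-reports"   # hypothetical bucket name
  recommenderMode: false            # Reporter auto-creates ApprovalRequests
  maxAutoStagedPerTick: 5           # example cap; the default is 10
approvals:
  ttlSeconds: 172800                # 48 h * 3600 s/h, so Friday-night
                                    # requests are still open on Sunday
```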

Persistence

Four operator-visible stores (reports, approvals, snoozes, and the audit log) are backed by dedicated ConfigMaps in the pod's namespace so that state survives pod restarts:

persistence:
  snoozesConfigmapName: clu-ops-snoozes
  auditConfigmapName: clu-ops-audit
  auditMaxRetained: 500        # Cap at this many entries. Stays well
                               # under the 1 MiB ConfigMap size limit.

The reports ConfigMap name is set by reports.configmapName (default: clu-ops-reports), and the approvals ConfigMap name by approvals.configmapName. The scan cache uses clu-ops-scan, which is not currently parameterized.

Notifications

Outbound webhooks only — Clu never receives inbound traffic from these:

notifications:
  slack:
    enabled: false
    webhookUrl: ""
  teams:
    enabled: false
    webhookUrl: ""

Notifications fire only for critical findings that are new since the previous report tick, so a steady-state crashloop produces one notification rather than one per scan interval.

Resources

resources:
  backend:
    requests:
      cpu: 200m
      memory: 512Mi
    limits:
      cpu: 1000m
      memory: 1Gi
  frontend:
    requests:
      cpu: 50m
      memory: 64Mi
    limits:
      cpu: 200m
      memory: 128Mi

The backend's memory floor is driven by the LLM client buffering streams in flight. The frontend is a static nginx server, so its defaults are generous.

ServiceAccount + RBAC

serviceAccount:
  create: true
  name: ""                     # Defaults to the chart's fullname
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/clu-irsa  # Placeholder: substitute your IRSA role ARN

RBAC roles the chart installs:

  • clu-ops-agent-reader: ClusterRole with get/list/watch on standard resources (Pods, Deployments, Services, ConfigMaps, Secrets, Events, etc.). Always installed.
  • clu-ops-agent-writer: ClusterRole with create/update/patch on workload resources. No delete, no cluster-admin, no secret-content reads. Installed only when modules.idp.writeOperations.enabled=true.
  • clu-ops-agent-state: namespaced Role for the agent's own state ConfigMaps (reports, approvals, snoozes, audit, scan), scoped narrowly by resource name. Always installed.

Marketplace + license

marketplace:
  enabled: true
  productCode: ""              # Marketplace-assigned product code

license:
  key: ""                      # JWT license for non-Marketplace installs
  existingSecret: ""           # Or reference an existing Secret holding
                               # the key under data.license-key

For non-AWS installs, set marketplace.enabled=false and populate license.key (or existingSecret). For local dev, set license.key: dev — it bypasses entitlement checks entirely.
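
A sketch of the existingSecret route for a non-Marketplace install. The Secret name is hypothetical, and it is assumed that the Secret must live in the release namespace (clu-ops is used here to match the helm example later on this page).

```yaml
# license-secret.yaml: create this before installing the chart
apiVersion: v1
kind: Secret
metadata:
  name: clu-license                  # hypothetical Secret name
  namespace: clu-ops                 # assumed release namespace
type: Opaque
stringData:
  license-key: "<your-license-jwt>"  # the API server stores this
                                     # base64-encoded under data.license-key
```

The matching values overlay then references the Secret instead of carrying the key inline:

```yaml
marketplace:
  enabled: false
license:
  existingSecret: clu-license        # same hypothetical name as above
```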

Settings-to-env mapping

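The chart's ConfigMap template flattens the Helm values into env vars that the backend's Pydantic Settings class reads. Names are upper snake case with camelCase segments split into words; some top-level prefixes are singularized (modules becomes MODULE, approvals becomes APPROVAL), so look names up rather than deriving them by hand. Examples: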

  • modules.cloud.enabled → MODULE_CLOUD_ENABLED
  • agent.model.scheduled → AGENT_MODEL_SCHEDULED
  • integrations.prometheus.url → INTEGRATION_PROMETHEUS_URL
  • reports.recommenderMode → REPORTS_RECOMMENDER_MODE
  • approvals.ttlSeconds → APPROVAL_TTL_SECONDS

To see the full mapping for an installed release, fetch its stored manifest and inspect the rendered ConfigMap:

helm get manifest <release-name> -n clu-ops \
  | yq 'select(.kind == "ConfigMap" and (.metadata.name | contains("clu-ops-agent")))'

To override just one value without rendering the full chart, set the env var directly on the Deployment via your own overlay.
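
One way to express such an overlay is a strategic-merge patch applied with Kustomize or a Helm post-renderer. The Deployment and container names below are assumptions (check them with kubectl get deploy -n clu-ops); only the env var name comes from the mapping above.

```yaml
# deployment-env-patch.yaml (hypothetical file name)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: clu-ops-agent-backend        # assumption: verify the real name
  namespace: clu-ops
spec:
  template:
    spec:
      containers:
        - name: backend              # assumption: must match the chart's
                                     # container name for the merge to apply
          env:
            - name: APPROVAL_TTL_SECONDS
              value: "172800"        # env values are always strings
```

If the chart loads the ConfigMap through envFrom (an assumption), Kubernetes gives an explicit env entry precedence over the same key from envFrom, so the patched value wins. A patch applied by hand with kubectl is likely to be overwritten by the next helm upgrade, which is why it belongs in an overlay.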