Permissions: what the pod can do
Clu runs under two stacked identity systems. Both matter. Neither replaces the other.
| Layer | Identity | Grants access to |
|---|---|---|
| AWS IAM | IRSA role (via OIDC federation) or EKS Pod Identity | AWS APIs — Bedrock, IAM, RDS, S3, CloudWatch, Secrets Manager, EC2, Cost Explorer |
| Kubernetes RBAC | ServiceAccount clu-ops:clu-ops-agent | K8s API — Pods, Deployments, Services, ConfigMaps, etc. |
A bedrock:InvokeModel call inside the pod uses the IAM identity.
A kubectl get pods-equivalent call uses the K8s identity. They're
independent — denying one doesn't deny the other.
This doc is the full map of what each layer grants.
Layer 1 — AWS IAM
The pod gets AWS credentials through one of two mechanisms:
IRSA (IAM Roles for Service Accounts)
Default on EKS 1.19+. The pod's ServiceAccount carries an
eks.amazonaws.com/role-arn annotation; the Amazon EKS Pod Identity
webhook mutates the pod spec at admission time to:
- Project an OIDC token into the pod at /var/run/secrets/eks.amazonaws.com/serviceaccount/token.
- Set the AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN env vars.
When the pod's boto3 client makes its first AWS call, it reads the
env vars + token, calls sts:AssumeRoleWithWebIdentity, and gets
temporary credentials scoped to the named role. Credentials auto-
rotate on a ~1-hour cycle.
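The credential exchange described above can be sketched without the SDK. This is a minimal illustration, not what boto3 literally runs: `build_assume_role_request` and the default session name are hypothetical names invented for this example, and the real SDK also handles caching, refresh, and retries.

```python
def build_assume_role_request(env, session_name="clu-example-session"):
    """Build parameters for sts:AssumeRoleWithWebIdentity from the IRSA env vars.

    `env` is a mapping such as os.environ. `session_name` is a placeholder
    chosen for this sketch.
    """
    role_arn = env["AWS_ROLE_ARN"]
    token_path = env["AWS_WEB_IDENTITY_TOKEN_FILE"]
    # Read the token on every call: the projected file is rotated in place,
    # so a cached copy would eventually expire.
    with open(token_path, encoding="utf-8") as fh:
        token = fh.read().strip()
    return {
        "RoleArn": role_arn,
        "RoleSessionName": session_name,
        "WebIdentityToken": token,
    }
```

With boto3 installed, the returned dict has the shape expected by `boto3.client("sts").assume_role_with_web_identity(**params)`. In the pod this call is never made by hand; the SDK's default credential chain performs it when both env vars are present.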
Trust-policy condition: the role's trust policy must match
system:serviceaccount:<namespace>:<sa-name>. For Clu that's
system:serviceaccount:clu-ops:clu-ops-agent. The trust-policy
template is in IAM setup.
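As a rough illustration of that condition, the sketch below builds a trust policy pinned to Clu's ServiceAccount. The account ID and OIDC issuer are placeholders, `irsa_trust_policy` is a hypothetical helper, and the template in IAM setup remains the canonical text.

```python
import json


def irsa_trust_policy(account_id, oidc_issuer,
                      namespace="clu-ops", sa_name="clu-ops-agent"):
    """Return an IRSA trust policy that only one ServiceAccount can assume.

    `oidc_issuer` is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # The subject claim must name exactly this ServiceAccount.
                        f"{oidc_issuer}:sub": f"system:serviceaccount:{namespace}:{sa_name}",
                        f"{oidc_issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Placeholder account ID and issuer, for illustration only.
policy = irsa_trust_policy("111122223333", "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE")
print(json.dumps(policy, indent=2))
```

If the `sub` value does not match the pod's actual namespace and ServiceAccount name, the AssumeRoleWithWebIdentity call is denied and the pod gets no AWS credentials.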
EKS Pod Identity (EKS 1.27+)
The newer alternative. Requires the eks-pod-identity-agent addon
installed on the cluster. Replaces OIDC federation with a direct
EKS → IAM binding managed by EKS itself — no OIDC provider to
configure, no AssumeRoleWithWebIdentity round-trip.
Clu works with either. Pod Identity is simpler to set up if you're starting fresh; IRSA is the de-facto standard if you already have other workloads using it.
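For comparison with IRSA, a Pod Identity role trusts the EKS service principal rather than an OIDC provider. The sketch below shows the general shape of such a trust policy. `pod_identity_trust_policy` is a hypothetical helper for this example, and your organization may add further conditions.

```python
def pod_identity_trust_policy():
    """Return a trust policy of the shape used by EKS Pod Identity roles."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # No OIDC provider ARN: the EKS Pod Identity service assumes the role.
                "Principal": {"Service": "pods.eks.amazonaws.com"},
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }
```

The role-to-ServiceAccount binding then lives on the EKS side (for example, created with `aws eks create-pod-identity-association`) instead of in a ServiceAccount annotation.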
What the IRSA role grants (per capability)
The full inline JSON for each policy lives in IAM setup. One paste-ready document per capability tier you've enabled in Helm. This page summarizes the intent; the actual policy text is canonical there.
Core (always attached):
| Service | Actions | Why |
|---|---|---|
| Bedrock | InvokeModel, InvokeModelWithResponseStream, Converse, ConverseStream | LLM inference for chat + scheduled reports |
| CloudWatch | GetMetricData, GetMetricStatistics, ListMetrics, DescribeAlarms, logs:StartQuery, logs:GetQueryResults, logs:DescribeLog* | Metrics + logs context on health-rule findings |
| AWS Marketplace | RegisterUsage | Entitlement metering at startup (bypassed with JWT license) |
Resources are scoped: Bedrock is narrowed to specific model-family ARN patterns (Anthropic Claude, Llama, gpt-oss, Mistral); CloudWatch + Logs are cluster-wide (read-only); the Marketplace action takes no resource scoping.
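To make the Bedrock scoping concrete, the sketch below checks whether a model ARN falls under a set of wildcard patterns. The patterns are illustrative assumptions, not the shipped list (the canonical one is in IAM setup), and `fnmatchcase` only approximates IAM's wildcard matching.

```python
from fnmatch import fnmatchcase

# Illustrative patterns only; the real policy in IAM setup is authoritative.
ALLOWED_MODEL_PATTERNS = [
    "arn:aws:bedrock:*::foundation-model/anthropic.claude-*",
    "arn:aws:bedrock:*::foundation-model/meta.llama*",
    "arn:aws:bedrock:*::foundation-model/mistral.*",
]


def model_in_scope(model_arn, patterns=ALLOWED_MODEL_PATTERNS):
    """Return True if the ARN matches at least one allowed pattern."""
    return any(fnmatchcase(model_arn, pattern) for pattern in patterns)
```

A model outside every pattern is denied at the IAM layer even though the action itself (for example `InvokeModel`) is granted.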
Cloud (added when modules.cloud.enabled=true):
| Service | Actions | Tool using it |
|---|---|---|
| IAM | ListRoles, GetRole, ListAttachedRolePolicies | aws_iam_roles, aws_irsa_mapping |
| RDS | DescribeDBInstances, DescribeDBClusters | aws_rds |
| ElastiCache | DescribeCacheClusters, DescribeReplicationGroups | aws_elasticache |
| S3 | ListAllMyBuckets, GetBucketLocation, GetBucketTagging, GetBucketPolicy, GetBucketEncryption | aws_s3 (never reads bucket content) |
| Secrets Manager | ListSecrets, DescribeSecret | aws_secrets (never reads secret values) |
| ECR | DescribeRepositories, DescribeImages, ListImages | aws_ecr |
| EC2 | DescribeVpcs, DescribeSubnets, DescribeSecurityGroups, DescribeRouteTables, DescribeNatGateways, DescribeInternetGateways, DescribeInstances, DescribeVolumes | aws_vpc_networking, savings detectors |
| ELB | DescribeLoadBalancers, DescribeTargetGroups, DescribeTargetHealth | savings detectors (idle ALBs/NLBs) |
| Cost Explorer | GetCostAndUsage, GetCostForecast, GetTags, GetDimensionValues | aws_cost_summary |
Explicit omissions that matter:
- No GetSecretValue: Clu never reads secret content. The aws_secrets tool surfaces names + ARNs so the agent can reason about secret existence, not contents.
- No S3 GetObject: Clu lists buckets but doesn't read objects.
- No EC2 write actions: describe-only, never Run*, Terminate*, Create*.
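These omissions can be checked mechanically against whatever policy is actually attached. The sketch below is one possible check, not part of Clu: `forbidden_actions` is a hypothetical helper, and the service prefixes (`secretsmanager:`, `s3:`, `ec2:`) are written out in full here, whereas the tables above omit them.

```python
from fnmatch import fnmatchcase

# Actions this page says the Cloud policy never contains.
FORBIDDEN_PATTERNS = [
    "secretsmanager:GetSecretValue",
    "s3:GetObject",
    "ec2:Run*",
    "ec2:Terminate*",
    "ec2:Create*",
]


def _as_list(value):
    """IAM allows a single string or a list for Statement and Action."""
    return value if isinstance(value, list) else [value]


def forbidden_actions(policy):
    """Return granted actions that overlap any forbidden pattern."""
    hits = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            granted = action.lower()  # IAM action names are case-insensitive
            for pattern in FORBIDDEN_PATTERNS:
                wanted = pattern.lower()
                # Match in both directions: a granted wildcard ("s3:*") can
                # cover a forbidden action, and a granted concrete action
                # ("ec2:RunInstances") can fall under a forbidden pattern.
                if fnmatchcase(wanted, granted) or fnmatchcase(granted, wanted):
                    hits.append(action)
                    break
    return hits
```

An empty result means none of the listed omissions has crept into the policy. It does not prove the policy is otherwise minimal.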
Core Plus (added when modules.corePlus.enabled=true):
The IAM actions are identical to Cloud. Writes happen K8s-side (through the
chart's writer ClusterRole below), not cloud-side. The IDP policy is
attached separately so operators can toggle Core Plus
independently without re-attaching the Cloud JSON.
Layer 2 — Kubernetes RBAC
The pod's ServiceAccount is clu-ops:clu-ops-agent. Every K8s API
call the pod makes authenticates as this SA. Three RBAC bindings
govern what it can do.
Cluster-wide reader (always installed)
Granted by the chart at install time. Lets the agent observe every resource it needs to reason about cluster shape, without granting any write or admin privilege.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: clu-ops-agent-reader
rules:
  # Core resources
  - apiGroups: [""]
    resources:
      - pods
      - pods/log
      - services
      - endpoints
      - configmaps
      - namespaces
      - nodes
      - persistentvolumeclaims
      - persistentvolumes
      - replicationcontrollers
      - serviceaccounts
      - resourcequotas
      - limitranges
      - events
    verbs: ["get", "list", "watch"]
  # Secrets: narrower scope than the other core resources because
  # content reads reveal application data. We list them so Helm 3
  # release discovery works (Helm stores each release-revision as a
  # Secret labelled ``owner=helm``), and we grant ``get`` so
  # ``helm_values`` can decode the release payload. The writer
  # ClusterRole explicitly does NOT grant any verb on Secrets; write
  # paths never touch them.
  - apiGroups: [""]
    resources: [secrets]
    verbs: ["get", "list", "watch"]
  # Workloads
  - apiGroups: ["apps"]
    resources:
      - deployments
      - replicasets
      - statefulsets
      - daemonsets
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources: [jobs, cronjobs]
    verbs: ["get", "list", "watch"]
  # Networking
  - apiGroups: ["networking.k8s.io"]
    resources: [networkpolicies, ingresses, ingressclasses]
    verbs: ["get", "list", "watch"]
  # RBAC (read-only; needed for IAM/RBAC analysis)
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: [roles, rolebindings, clusterroles, clusterrolebindings]
    verbs: ["get", "list", "watch"]
  # Policy
  - apiGroups: ["policy"]
    resources: [poddisruptionbudgets]
    verbs: ["get", "list", "watch"]
  # Storage
  - apiGroups: ["storage.k8s.io"]
    resources: [storageclasses, volumeattachments, csinodes, csidrivers]
    verbs: ["get", "list", "watch"]
  # Autoscaling
  - apiGroups: ["autoscaling"]
    resources: [horizontalpodautoscalers]
    verbs: ["get", "list", "watch"]
  # Metrics (when metrics-server present)
  - apiGroups: ["metrics.k8s.io"]
    resources: [pods, nodes]
    verbs: ["get", "list"]
  # API discovery
  - apiGroups: ["apiextensions.k8s.io"]
    resources: [customresourcedefinitions]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["admissionregistration.k8s.io"]
    resources: [validatingwebhookconfigurations, mutatingwebhookconfigurations]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apiregistration.k8s.io"]
    resources: [apiservices]
    verbs: ["get", "list", "watch"]
Notable inclusions:
- Secrets get/list/watch is required because Helm 3 stores release manifests as labeled Secrets. Without it, helm_list returns silently empty. The reader ClusterRole grants only these read verbs; no write path touches Secrets (see the explicit non-grants below).
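To show why `get` is needed on top of `list`, here is a minimal sketch of decoding a Helm 3 release payload. It assumes the usual Helm 3 storage layout (the Secret's `release` key holds gzipped JSON that Helm base64-encodes, and the Kubernetes API base64-encodes it once more). The function names are hypothetical and are not Clu's `helm_values` implementation.

```python
import base64
import gzip
import json


def decode_helm_release(secret_data_release):
    """Decode the `release` value of a Helm 3 release Secret as the K8s API returns it."""
    helm_blob = base64.b64decode(secret_data_release)  # undo the K8s Secret encoding
    gzipped = base64.b64decode(helm_blob)              # undo Helm's own base64 layer
    return json.loads(gzip.decompress(gzipped))        # gunzip, then parse the JSON


def encode_helm_release(release):
    """Inverse of decode_helm_release; only used to build sample data."""
    gzipped = gzip.compress(json.dumps(release).encode("utf-8"))
    return base64.b64encode(base64.b64encode(gzipped)).decode("ascii")
```

Listing Secrets with the `owner=helm` label is enough to discover releases. Reading the payload above requires fetching the Secret body, which is what the `get` verb allows.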
Notable omissions (security-relevant):
- No create, update, patch, delete on anything. Writes live exclusively in the writer ClusterRole.
- No impersonation (no impersonate verb on users, groups, or serviceaccounts).
- No tokenreview / subjectaccessreview: the pod can't introspect arbitrary identities.
- No CRD discovery beyond the registered CRD shapes the scanner walks (apiextensions.k8s.io is limited to get/list/watch on customresourcedefinitions specifically, not arbitrary resources).
Cluster-wide writer (installed when Core Plus is active)
Granted by the chart only when modules.corePlus.writeOperations.enabled=true
in Helm values. Every action under this role flows through the in-
product approval gate before a real apply.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: clu-ops-agent-writer
rules:
  # Workload writes: no delete, no cluster-admin.
  - apiGroups: ["apps"]
    resources: [deployments, statefulsets, daemonsets]
    verbs: ["create", "update", "patch"]
  - apiGroups: ["batch"]
    resources: [jobs, cronjobs]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: [services, configmaps, serviceaccounts, namespaces]
    verbs: ["create", "update", "patch"]
  - apiGroups: ["networking.k8s.io"]
    resources: [ingresses, networkpolicies]
    verbs: ["create", "update", "patch"]
  - apiGroups: ["autoscaling"]
    resources: [horizontalpodautoscalers]
    verbs: ["create", "update", "patch"]
  # Subresources for scale + restart (rollouts).
  - apiGroups: ["apps"]
    resources: [deployments/scale, statefulsets/scale]
    verbs: ["update", "patch"]
  # Cordon (a node patch), narrowly scoped.
  - apiGroups: [""]
    resources: [nodes]
    verbs: ["patch"]
Critical non-grants (invariants, not configurable):
- No delete verb anywhere. Core Plus doesn't delete. Cleanup is an operator responsibility via kubectl, helm, or GitOps.
- No cluster-admin. No wildcard * rules, no */scale subresource on unexpected kinds, no secrets/* beyond the reader grant.
- No Secret access from the writer. k8s_apply on a manifest that contains a Secret would require create/update/patch on secrets, which the writer does NOT include. Customers who need to ship Secret content must do so via ExternalSecrets or another out-of-band path.
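The non-grants above follow directly from the rule list. As a rough illustration, the sketch below transcribes the writer ClusterRole into Python and answers "would this verb be allowed?" by exact match. `writer_allows` is a hypothetical helper, and it skips wildcard handling because the writer role contains no wildcards.

```python
WRITE_VERBS = ["create", "update", "patch"]

# Transcribed from the clu-ops-agent-writer ClusterRole shown above.
WRITER_RULES = [
    {"apiGroups": ["apps"], "resources": ["deployments", "statefulsets", "daemonsets"], "verbs": WRITE_VERBS},
    {"apiGroups": ["batch"], "resources": ["jobs", "cronjobs"], "verbs": WRITE_VERBS},
    {"apiGroups": [""], "resources": ["services", "configmaps", "serviceaccounts", "namespaces"], "verbs": WRITE_VERBS},
    {"apiGroups": ["networking.k8s.io"], "resources": ["ingresses", "networkpolicies"], "verbs": WRITE_VERBS},
    {"apiGroups": ["autoscaling"], "resources": ["horizontalpodautoscalers"], "verbs": WRITE_VERBS},
    {"apiGroups": ["apps"], "resources": ["deployments/scale", "statefulsets/scale"], "verbs": ["update", "patch"]},
    {"apiGroups": [""], "resources": ["nodes"], "verbs": ["patch"]},
]


def writer_allows(verb, api_group, resource, rules=WRITER_RULES):
    """Return True if any rule grants `verb` on `resource` in `api_group`."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in rules
    )
```

For example, `writer_allows("create", "", "secrets")` and `writer_allows("delete", "apps", "deployments")` are both false, which is why applying a manifest containing a Secret, or deleting a workload, fails at the RBAC layer.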
Namespaced state Role (always installed)
Scoped to the pod's own namespace. Governs Clu's self-state — the four ConfigMaps that persist operator-visible state (reports, approvals, snoozes, audit) plus the scan cache.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: clu-ops-agent-state
  namespace: clu-ops
rules:
  # ``create`` cannot be scoped by ``resourceNames``: it applies to
  # any ConfigMap in the namespace, but the agent only creates the
  # five named below.
  - apiGroups: [""]
    resources: [configmaps]
    verbs: [create]
  # Read + update on the specific store ConfigMaps the agent owns.
  - apiGroups: [""]
    resources: [configmaps]
    resourceNames:
      - clu-ops-reports
      - clu-ops-approvals
      - clu-ops-snoozes
      - clu-ops-audit
      - clu-ops-scan
    verbs: [get, update, patch]
  # No ``delete``: store-side retention caps handle cleanup, not
  # K8s-side TTL.
This is a Role (namespaced), not a ClusterRole. Blast radius is
the pod's own namespace only. If this Role is missing, the factory
falls back to InMemory for those stores + logs a WARN on startup —
the agent works but state vanishes on restart.
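The fallback behaviour can be pictured with a small factory. Everything in this sketch is hypothetical (class names, the `read_configmap` client method, the logger name). It only illustrates the pattern of probing for access at startup and degrading to an in-memory store with a warning.

```python
import logging

logger = logging.getLogger("clu.stores")  # hypothetical logger name


class InMemoryStore:
    """Process-local fallback: contents are lost when the pod restarts."""

    def __init__(self, name):
        self.name = name
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class ConfigMapStore:
    """Hypothetical ConfigMap-backed store.

    `client` is any object exposing read_configmap(name) that raises
    PermissionError when RBAC denies the call.
    """

    def __init__(self, name, client):
        self.name = name
        self._client = client
        # Probe once at construction so a missing Role surfaces at startup.
        self._client.read_configmap(name)


def make_store(name, client):
    """Return a ConfigMap-backed store, or an in-memory one if access is denied."""
    try:
        return ConfigMapStore(name, client)
    except PermissionError:
        logger.warning("no RBAC access to ConfigMap %s; using in-memory store", name)
        return InMemoryStore(name)
```

The trade-off is availability over durability: the agent keeps serving requests, but reports, approvals, snoozes, and audit entries written to the fallback do not survive a restart.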
Protected namespaces
Hard-coded in the backend (not Helm-configurable):
- kube-system
- kube-public
- kube-node-lease
- clu-ops
Every write tool calls the backend's WriteSafetyGate, which rejects any write targeting these namespaces, regardless of RBAC. Even if an operator widened Clu's RBAC all the way to cluster-admin, writes to these namespaces would still fail at the gate.
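A gate of this kind is small. The sketch below is an assumption about its general shape, not the backend's actual WriteSafetyGate: `check_write` and `ProtectedNamespaceError` are hypothetical names, and letting cluster-scoped objects (no namespace) through is a choice made for this example only.

```python
PROTECTED_NAMESPACES = frozenset(
    {"kube-system", "kube-public", "kube-node-lease", "clu-ops"}
)


class ProtectedNamespaceError(Exception):
    """Raised when a write targets a protected namespace."""


def check_write(namespace):
    """Reject writes to protected namespaces before any API call is made.

    `namespace` is None for cluster-scoped objects, which this sketch lets through.
    """
    if namespace in PROTECTED_NAMESPACES:
        raise ProtectedNamespaceError(
            f"writes to namespace {namespace!r} are blocked by the safety gate"
        )
```

Because the check runs in the application before the Kubernetes API is called, it holds no matter what RBAC would have allowed.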
Layer 3 (optional) — the human operator
When you kubectl get pods against an EKS cluster, the cluster
needs to authorize your human IAM identity, not the pod's. EKS
resolves this via either:
- Access entries (modern, recommended) — IAM role ARN → K8s group or access-policy mapping, stored in EKS control plane.
- aws-auth ConfigMap (legacy) — same mapping via a ConfigMap in kube-system that EKS watches.
For first install, the AmazonEKSClusterAdminPolicy access policy
is the simplest mapping:
your IAM role ─► access entry ─► AmazonEKSClusterAdminPolicy ─► cluster-admin
cluster-admin is the built-in ClusterRole with * on *. You
need that to bootstrap the cluster + install Clu; day-to-day access
should be narrower (namespace admin, dev-namespace viewer, etc.).
Your human IAM permissions (what you can do in the AWS console +
CLI) are independent from both the pod's IRSA role AND the access
entry. AmazonEKSClusterAdminPolicy is a K8s-side policy mapped
through EKS; your IAM policy is whatever your SSO role grants you at
the AWS API layer.
How the three layers compose in practice
You ask "why can't Clu list my Secrets?" and the answer depends on which layer is denying:
- Clu's SA missing RBAC → chat shows error_kind: permission_denied ... list on Secret ... forbidden by RBAC, paste-ready fix targets the reader ClusterRole above.
- Clu's IRSA role missing IAM → same shape, error_kind: permission_denied ... aws:secretsmanager, paste-ready fix targets the Cloud JSON in IAM setup.
- Your kubectl user missing access → EKS returns 401/403 on your kubectl call; Clu never sees the request. Check your access entry: aws eks list-access-entries --cluster-name <name>.
The three are distinguishable because Clu's chat surfaces layer 1 and layer 2 denials via the structured error taxonomy, while layer 3 denials show up in your terminal (not in Clu) before the request even reaches the pod.
How to audit what's granted right now
# Layer 1: IAM role attached policies
role_arn=$(kubectl get sa -n clu-ops clu-ops-agent \
  -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}')
role_name=${role_arn##*/}
aws iam list-role-policies --role-name "$role_name"           # inline policies
aws iam list-attached-role-policies --role-name "$role_name"  # managed

# Layer 2: RBAC on the pod's SA
kubectl auth can-i --as=system:serviceaccount:clu-ops:clu-ops-agent --list
# Or for a specific verb:
kubectl auth can-i --as=system:serviceaccount:clu-ops:clu-ops-agent \
  list secrets -n default

# Layer 3: your own access
kubectl auth can-i --list  # what your current kubeconfig user can do
aws eks list-access-entries --cluster-name <name> --region <region>
aws eks list-associated-access-policies --cluster-name <name> \
  --principal-arn <your-role-arn> --region <region>
Clu does not currently run the Layer 1 and Layer 2 checks itself: self-introspection of RBAC is not a tool yet, so the permission errors Clu reports come from the actual calls failing.