Automate secret rotation in Kubernetes, then get out of the way!

Márk Sági-Kazár


2023-05-12 @ Open Source Summit NA 2023

whoami

Márk Sági-Kazár

Open Source Tech Lead @ Cisco

CNCF Ambassador




@sagikazarmark

https://sagikazarmark.hu

hello@sagikazarmark.hu

Empowering engineering teams to focus on their core business objectives, while seamlessly running their applications on Kubernetes.

Once upon a time…

I was in a debugging session in the middle of the night. There was an AWS permission issue in one of the applications, but as it turned out, the application was calling something that it shouldn’t have. I grabbed the credentials from the dev environment.

One of the most common misconceptions about Kubernetes Secrets is that they are insecure because they use base64 encoding.

This is actually one of my favorites.
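Why base64 is beside the point: it is an encoding that lets a Secret carry arbitrary bytes, not a protection mechanism. Anyone who can read the Secret object can recover the value without a key, so access control and encryption at rest are what matter. A minimal sketch (the sample value is made up):

```python
import base64

# Kubernetes stores Secret values base64-encoded in the `data` field.
# The value below is a made-up example, not a real credential.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")


def decode_secret_value(value: str) -> str:
    """Decode a base64-encoded Secret value; no key is required."""
    return base64.b64decode(value).decode("utf-8")


if __name__ == "__main__":
    print(encoded)
    print(decode_secret_value(encoded))
```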

Your secrets WILL be compromised…

When?

What are you going to do about it?

  • Well, not all
  • Short-lived tokens (e.g. workload identity) are OK
  • Long-lived secrets (e.g. static credentials) are problematic

Why is secret rotation important?

  • Maintain security of sensitive information
  • Meet compliance requirements
  • Reduce the risk of a data breach

Challenges of secret rotation

  • Complexity
  • Time-consuming and error-prone process
  • Disruption of service availability

Multi-cluster, multi-environment, multi-tenant setups

Secret rotation should be…

  • possible
  • automated
  • periodic

Hard-coded secrets make it harder

Secret rotation flow

Sequence diagram (participants: Operator, Secret provider, Secret store, ???, Production):

  1. Watch for changes
  2. Generate new secret
  3. Return new secret
  4. Rotate secret in store
  5. Notice secret change
  6. Deploy new secret

Secret rotation in Kubernetes

⚠️ Plug the holes first! ⚠️

  • Turn on encryption at rest
  • Configure least-privilege access to Secrets
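As an illustration of least-privilege access, the sketch below builds a namespaced RBAC Role that only allows `get` on explicitly named Secrets and prints it as JSON (which `kubectl apply -f` accepts). The namespace, role name, and Secret name are hypothetical placeholders.

```python
from __future__ import annotations

import json


def secret_reader_role(namespace: str, role_name: str, secret_names: list[str]) -> dict:
    """Build a namespaced Role that allows `get` on the named Secrets only."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],  # "" is the core API group, where Secrets live
                "resources": ["secrets"],
                "resourceNames": secret_names,  # explicit names, no wildcard
                "verbs": ["get"],  # list/watch are left out on purpose
            }
        ],
    }


if __name__ == "__main__":
    # All names here are placeholders for illustration.
    role = secret_reader_role("team-a", "db-credentials-reader", ["db-credentials"])
    print(json.dumps(role, indent=2))
```

The Role still has to be bound to a ServiceAccount with a RoleBinding; that part is omitted here.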


Official guide: Good practices for Kubernetes Secrets

Deploying secrets to Kubernetes

  • External Secrets Operator (ESO): https://external-secrets.io
  • Synchronize secrets from an external store to Kubernetes
  • Mount secrets as usual (env var, file)
  • Configure sync via CR
  • Works well with GitOps

  • Go into details, will be important later
  • Secret store config (managed by platform team)
  • External secret (managed by dev team)
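A rough sketch of the dev-team side of this split: an ExternalSecret that points at a store owned by the platform team. It assumes the `external-secrets.io/v1beta1` API that was current around the time of this talk; the store name, namespace, remote key, and refresh interval are hypothetical. The store's provider configuration is left out because it depends on the backend.

```python
import json


def external_secret(name: str, namespace: str, store_name: str,
                    remote_key: str, refresh_interval: str = "1h") -> dict:
    """Build an ExternalSecret that syncs one remote entry into a Kubernetes Secret."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",  # assumption: v1beta1 API
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # How often ESO re-reads the external store.
            "refreshInterval": refresh_interval,
            # Reference to the store object managed by the platform team.
            "secretStoreRef": {"kind": "ClusterSecretStore", "name": store_name},
            # Name of the Kubernetes Secret that ESO creates and updates.
            "target": {"name": name},
            # Copy every key of the remote entry into the target Secret.
            "dataFrom": [{"extract": {"key": remote_key}}],
        },
    }


if __name__ == "__main__":
    # Placeholder names for illustration only.
    manifest = external_secret("db-credentials", "team-a", "platform-store", "prod/db")
    print(json.dumps(manifest, indent=2))
```

The refresh interval is worth noting: it bounds how long a rotated secret can take to reach the cluster.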

Alternatives

  • Sealed Secrets
  • SOPS (+ operator/Helm secrets)
  • They don’t scale well
  • Encryption requires the user to know the secret

  • File changes must be detected by applications
  • Env vars cannot change
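What "detecting file changes" can look like inside an application: a minimal polling sketch that compares a digest of a mounted secret file between checks. The class name and the polling approach are illustrative choices; a real application might use filesystem notifications instead.

```python
import hashlib
from pathlib import Path
from typing import Optional, Union


class SecretFileWatcher:
    """Detect changes to a mounted secret file by comparing content digests."""

    def __init__(self, path: Union[str, Path]) -> None:
        self._path = Path(path)
        self._digest = hashlib.sha256(self._path.read_bytes()).hexdigest()

    def poll(self) -> Optional[bytes]:
        """Return the new contents if the file changed since the last poll, else None."""
        data = self._path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if digest == self._digest:
            return None
        self._digest = digest
        return data
```

The application would call `poll()` periodically and rebuild whatever depends on the secret (a database connection pool, for example) when it returns new contents. An environment variable offers no equivalent hook, because its value is fixed when the process starts.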

Triggering workload rollout

  • Reloader: https://github.com/stakater/Reloader
  • Detects secret changes
  • Triggers rollout for workloads referencing changed secrets
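One common way to turn a secret change into a rollout is to write a checksum of the secret into the pod template, so that any change produces a new template and therefore a rolling update. The sketch below shows that general technique only; it is not taken from Reloader's code, and the annotation key is hypothetical.

```python
import hashlib
import json
from typing import Dict

# Hypothetical annotation key, used only for this illustration.
CHECKSUM_ANNOTATION = "example.com/secret-checksum"


def secret_checksum(data: Dict[str, str]) -> str:
    """Hash the secret data in a key-order-independent way."""
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rollout_patch(checksum: str) -> dict:
    """Build a merge patch that changes the pod template.

    Changing anything under spec.template makes the Deployment
    controller start a rolling update.
    """
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {CHECKSUM_ANNOTATION: checksum}}
            }
        }
    }
```

With Reloader itself none of this is written by hand: workloads opt in through annotations such as `reloader.stakater.com/auto: "true"`.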

Sequence diagram (participants: Secret store, External secrets, Kubernetes, Reloader):

  1. Watch for changes
  2. Watch for changes
  3. Notice secret change
  4. Deploy new secret
  5. Notice secret change
  6. Trigger workload rollout

What could possibly go wrong?

Who knows, so monitor everything

  • Metrics and SLI recommendations
  • Grafana dashboard (needs improvements)

Warning

Potential high cardinality labels (drop metrics/labels you don’t need)

Changes take effect with a delay

  1. Change some configuration ✏️
  2. Wait until the next secret sync period 🤞
  3. Hope nothing breaks 🙏

Solution: create (and modify) test secrets at the same time.

Cascading effect of an outage 1

Requirement: Use store validation.

  1. Provider goes down for a long time (i.e. hours) ❌
  2. Store validation reaches a backoff of hours ⏳
  3. Secret synchronization essentially stops 😱

Solution: Bump every (Cluster)SecretStore after an outage.
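To see how a retry delay can grow to hours during a long outage, here is a generic capped exponential backoff sketch. The base delay, factor, and cap are hypothetical numbers picked for illustration, not ESO's actual settings.

```python
from typing import List


def backoff_delays(base_seconds: float, factor: float,
                   cap_seconds: float, failures: int) -> List[float]:
    """Return the wait before each retry for a run of consecutive failures."""
    delays = []
    delay = base_seconds
    for _ in range(failures):
        delays.append(min(delay, cap_seconds))
        delay *= factor
    return delays


if __name__ == "__main__":
    # Hypothetical settings: start at 1 second, double each time, cap at 4 hours.
    for attempt, wait in enumerate(backoff_delays(1, 2, 4 * 3600, 15), start=1):
        print(f"failure {attempt}: next retry in {wait:.0f}s")
```

Under these assumed numbers the delay reaches the 4-hour cap by the fifteenth consecutive failure. Even after the provider recovers, the next validation attempt may still be hours away, which is why touching the stores manually helps.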



  [1] May not be a problem anymore

To sum up ESO

  • Understand how (and when) changes will take effect
  • Monitor and alert for failures

Kubernetes without secrets 😱

Access secret store directly

  • Integrated into the application

OR

  • “Inject” secrets into the application
  • Makes the application less portable
  • It’s better to keep config management separate

Secret injection in Kubernetes

  • Inject a custom init into Pods using a mutating admission webhook
  • Get secrets from secret store in the custom init
  • Inject secrets as environment variables
  • There are file based solutions, but mostly env

Bank-Vaults

  • Started at Banzai Cloud
  • Vault Swiss Army knife

https://bank-vaults.dev

Bank-Vaults secret injection

  • Secret references: vault:path/to/secret#KEY
  • Mutating webhook
    • Detect secret references
    • Mutate Pods
  • Custom init replaces secret references with actual values
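A simplified sketch of what "replacing secret references with actual values" means. It handles only the `vault:path/to/secret#KEY` form shown above and takes the lookup as a plain function; it is not Bank-Vaults' implementation, and the function names are made up for this example.

```python
from typing import Callable, Dict, Optional, Tuple

PREFIX = "vault:"


def parse_reference(value: str) -> Optional[Tuple[str, str]]:
    """Split 'vault:path/to/secret#KEY' into (path, key); None if not a reference."""
    if not value.startswith(PREFIX):
        return None
    path, sep, key = value[len(PREFIX):].partition("#")
    if not sep or not path or not key:
        return None
    return path, key


def resolve_env(env: Dict[str, str],
                lookup: Callable[[str, str], str]) -> Dict[str, str]:
    """Return a copy of env with every secret reference replaced by its value."""
    resolved = {}
    for name, value in env.items():
        ref = parse_reference(value)
        resolved[name] = value if ref is None else lookup(*ref)
    return resolved


if __name__ == "__main__":
    # Stand-in for a real secret store client; the data is made up.
    fake_store = {("path/to/secret", "KEY"): "resolved-value"}
    env = {"APP_SECRET": "vault:path/to/secret#KEY", "LOG_LEVEL": "info"}
    print(resolve_env(env, lambda path, key: fake_store[(path, key)]))
```

In the injection model the custom init does this at process start and then launches the real application with the resolved environment, so the secret values never need to exist as Kubernetes Secrets.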

Warning

Secret changes do not take effect (i.e. trigger a workload reload) at the moment.

Risks and mitigations

Risk: Secret store is a SPOF

Mitigation: Maintain a cluster-local instance


Risk: Webhook is a SPOF

Mitigation: Configure webhook according to best practices

Alternatives

  • Kamus
  • Secrets Store CSI Driver

Bank-Vaults Roadmap

  • Moving to a new GitHub organization
  • Workload reload on secret change
  • Support for more providers
  • Secret synchronization between providers
  • Your desired feature (submit a new feature request)

Demo

https://github.com/sagikazarmark/demo-oss-na-2023-kube-secret-rotation

Final thoughts

It seems wisest to assume the worst from the beginning…and let anything better come as a surprise.

— Jules Verne

  • May not be the best life advice/key to happiness
  • Prepare for the worst
  • Understand the trade-offs and limitations of each solution

Thank you

Any questions?



@sagikazarmark

https://sagikazarmark.hu

hello@sagikazarmark.hu
