Secure Terraform With GitHub OIDC: A Step-by-Step Guide

by Felix Dubois

Hey guys! Let's dive into how we can set up secure Terraform workflows using GitHub OIDC. No more static AWS keys: GitHub Actions will use OpenID Connect (OIDC) to assume an AWS IAM role and pick up short-lived credentials for each run. This setup is perfect for managing your infrastructure as code securely. So, let's break it down!

Context

The main goal here is to enable GitHub Actions to run Terraform without relying on static AWS keys. Each workflow run presents a GitHub-issued OpenID Connect (OIDC) token to AWS STS, which hands back temporary credentials for an IAM role (specifically, terraform-apply). This role needs permissions to access our backend (S3, DynamoDB, and KMS) and to manage Identity Center resources. It's all about that security, you know?

Scope

Here's what we're aiming to achieve:

  • An IAM role called terraform-apply with an OIDC trust policy. This policy will be tightly scoped to our repo and the specific branch or environment we're working with.
  • Two IAM policies attached to this role:
    1. TerraformStateAccess: This policy will grant permissions to access our S3 bucket, DynamoDB lock table, and KMS CMK.
    2. Identity Center minimal: Initially, this will be a broader set of actions, which we'll narrow down later to follow the principle of least privilege.
  • A KMS key policy update to allow our terraform-apply role to use the alias/tf-state key.
  • Minimal GitHub Actions workflows:
    • terraform-plan on pull requests, which will post the plan as a comment.
    • terraform-apply on push to the main branch, which will apply the changes.

Note: We're keeping things focused for now. Fine-grained least-privilege permissions per action, multi-account federation, and production approvals are out of scope for this initial setup but can definitely be follow-ups.

Pre-reqs

Before we jump into the implementation, let's make sure we have a couple of things in place:

  • An OIDC identity provider should already exist in your AWS account. This provider needs to have:
    • Issuer: https://token.actions.githubusercontent.com
    • Audience: sts.amazonaws.com
  • You'll also need the backend ARN values available. These should be coming from your bootstrap outputs.
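
If you want to double-check the provider before going further, something like the sketch below could be run with credentials for the target account. The account ID is a placeholder and the function names are made up for illustration; `aws iam get-open-id-connect-provider` is the standard AWS CLI call for reading a provider's issuer URL and client ID (audience) list.

```shell
#!/usr/bin/env bash
# Build the ARN that IAM gives the GitHub OIDC provider in an account.
# $1 = AWS account ID (placeholder value in the example below).
github_oidc_provider_arn() {
  printf 'arn:aws:iam::%s:oidc-provider/token.actions.githubusercontent.com\n' "$1"
}

# Print the provider's issuer URL and allowed audiences (client IDs).
# Needs AWS credentials; the CLI call fails if the provider does not exist.
show_github_oidc_provider() {
  aws iam get-open-id-connect-provider \
    --open-id-connect-provider-arn "$(github_oidc_provider_arn "$1")" \
    --query '{Issuer: Url, Audiences: ClientIDList}' \
    --output json
}

# Safe to run anywhere: only prints the ARN for a made-up account ID.
github_oidc_provider_arn 111122223333
```

Calling `show_github_oidc_provider <acct>` should list sts.amazonaws.com among the audiences if the provider matches the pre-reqs above.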

Implementation Steps

Alright, let's get our hands dirty! Here's a step-by-step breakdown of what we need to do.

IAM Terraform (infra/ci-oidc/)

First up, we're going to define our IAM resources using Terraform. This will live under the infra/ci-oidc/ directory in our repository.

  • [ ] aws_iam_role.terraform_apply with trust policy:

    We need to create an IAM role named terraform-apply (the Terraform resource is aws_iam_role.terraform_apply). The crucial part here is the trust policy. This policy defines who can assume the role. We're using OIDC, so the policy needs to trust tokens issued by GitHub Actions.

    • Principal.Federated = arn:aws:iam::<acct>:oidc-provider/token.actions.githubusercontent.com

      This specifies that the principal is a federated entity, in this case, our OIDC provider.

    • Condition.StringEquals["token.actions...:aud"] = "sts.amazonaws.com"

      This condition ensures that the audience of the OIDC token is sts.amazonaws.com, which is what we expect for AWS STS.

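    • Condition.StringLike["token.actions...:sub"] = "repo:<org>/<repo>:ref:refs/heads/main" (or environment:<name>)

      This is where we scope the trust to our repository and branch. We can also scope it to a specific environment if needed. One catch: runs triggered by pull_request present the subject repo:<org>/<repo>:pull_request rather than a branch ref, so the plan workflow can only assume the role if that subject is allowed as well.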

  • [ ] aws_iam_policy.TerraformStateAccess

    Next, we'll define a policy that grants access to our Terraform state backend. This includes:

    • S3 list/get/put/delete on the state bucket.

      We need permissions to list objects in the bucket, get objects (the state file), put objects (update the state), and delete objects (in some cases).

    • DynamoDB GetItem/PutItem/DeleteItem/UpdateItem on the lock table

      DynamoDB is used for state locking, so we need permissions to interact with the lock table.

    • KMS Encrypt/Decrypt/GenerateDataKey/DescribeKey on the state key

      If our state is encrypted with KMS, we need these permissions to encrypt and decrypt the state.

  • [ ] aws_iam_policy.TerraformIdentityCenterMinimal

    This policy will grant the necessary permissions to manage Identity Center resources. Initially, we'll start with broader actions like identitystore:*, sso:*, and ssoadmin:*. The goal is to get things working and then narrow down the permissions later to follow the principle of least privilege. It’s like starting with a sledgehammer and then switching to a scalpel, you know?

  • [ ] Attach both policies to the role.

    Finally, we need to attach both the TerraformStateAccess and TerraformIdentityCenterMinimal policies to the terraform-apply role.
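
To make the checklist above a bit more concrete, here's a rough sketch of the two JSON documents involved: the OIDC trust policy and the TerraformStateAccess policy. It's plain shell that prints JSON, which you could paste into the role's assume_role_policy and the policy's policy argument in Terraform. The function names, account ID, org/repo, and ARNs are all placeholders I made up; the action list follows the bullets above rather than a verified least-privilege set.

```shell
#!/usr/bin/env bash
# Emit the OIDC trust policy for the terraform-apply role.
# $1 = AWS account ID, $2 = allowed subject pattern (both placeholders).
trust_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::$1:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
        },
        "StringLike": {
          "token.actions.githubusercontent.com:sub": "$2"
        }
      }
    }
  ]
}
EOF
}

# Emit the TerraformStateAccess policy.
# $1 = state bucket ARN, $2 = lock table ARN, $3 = KMS key ARN (placeholders).
state_access_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListStateBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "$1"
    },
    {
      "Sid": "ReadWriteStateObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "$1/*"
    },
    {
      "Sid": "StateLockTable",
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem", "dynamodb:UpdateItem"],
      "Resource": "$2"
    },
    {
      "Sid": "StateKey",
      "Effect": "Allow",
      "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
      "Resource": "$3"
    }
  ]
}
EOF
}

# Example with made-up values: trust only the main branch of one repo.
trust_policy 111122223333 "repo:example-org/example-repo:ref:refs/heads/main"
```

Note that s3:ListBucket applies to the bucket ARN itself while the object actions apply to the bucket ARN followed by /*, which is why the S3 permissions are split into two statements.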

KMS Key Policy Update (in bootstrap)

We need to update the KMS key policy to allow our terraform-apply role to use the CMK that encrypts our Terraform state.

  • [ ] Allow arn:aws:iam::<acct>:role/terraform-apply to use the CMK.

    We'll add a statement to the KMS key policy that allows the terraform-apply role to call kms:Encrypt, kms:Decrypt, kms:GenerateDataKey, and kms:DescribeKey on the key.

  • [ ] Keep root admin statement to avoid lockout.

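    It's crucial to keep the root admin statement in the KMS key policy to prevent accidental lockouts. The arn:aws:iam::<acct>:root principal stands for the account as a whole, so this statement keeps the key manageable from within the account and lets IAM policies grant access to it.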
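
Here's a sketch of what the resulting key policy could look like with both statements in place. The function name, Sids, and account ID are placeholders for illustration, and the action list for the role just mirrors the state-access needs described earlier. Keep in mind that a key policy is replaced as a whole when it's updated, so the root statement has to be in the new document too.

```shell
#!/usr/bin/env bash
# Emit a KMS key policy that keeps the account-root admin statement and
# adds usage rights for the terraform-apply role.
# $1 = AWS account ID (placeholder).
state_key_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RootAccountAdmin",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::$1:root" },
      "Action": "kms:*",
      "Resource": "*"
    },
    {
      "Sid": "AllowTerraformApplyUse",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::$1:role/terraform-apply" },
      "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
      "Resource": "*"
    }
  ]
}
EOF
}

# Example with a made-up account ID.
state_key_policy 111122223333
```

In a key policy, "Resource": "*" refers to the key the policy is attached to, not to every key in the account.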

GitHub Actions

Now, let's set up our GitHub Actions workflows. We'll create two workflows: one for planning and one for applying.

  • [ ] .github/workflows/plan.yml: PR trigger → OIDC assume → init/plan for envs/sandbox → comment plan.

    This workflow will trigger on pull requests. It will:

    1. Assume the terraform-apply role using OIDC. The job needs the id-token: write permission, otherwise GitHub won't issue an OIDC token for it.
    2. Run terraform init and terraform plan for the envs/sandbox environment.
    3. Post the plan as a comment on the pull request. This is super helpful for reviewing changes before they're applied.
  • [ ] .github/workflows/apply.yml: push to main → OIDC assume → init/apply for envs/sandbox.

    This workflow will trigger when we push to the main branch. It will:

    1. Assume the terraform-apply role using OIDC.
    2. Run terraform init and terraform apply for the envs/sandbox environment. This is where the actual changes to our infrastructure are applied.
  • [ ] (Optional) add Protected Environment to require manual approval for apply.

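    For added safety, we can add a protected environment in GitHub. This will require manual approval before the apply workflow can run. Keep in mind that a job tied to an environment presents the subject repo:<org>/<repo>:environment:<name>, so the trust policy has to allow that form. It’s like having a safety net, you know?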
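
The shell side of the plan workflow could look roughly like the sketch below, run in a step after the job has already assumed terraform-apply (commonly done with the aws-actions/configure-aws-credentials action and its role-to-assume input). The function names, the plan.txt and comment.md file names, the PR_NUMBER variable, and the 60000-byte cutoff are my own assumptions; the terraform and gh flags used are standard ones.

```shell
#!/usr/bin/env bash
# Run init and plan for envs/sandbox and save a text rendering of the plan.
# Assumes AWS credentials for terraform-apply are already in the environment.
run_plan() {
  terraform -chdir=envs/sandbox init -input=false
  terraform -chdir=envs/sandbox plan -input=false -no-color -out=tfplan
  terraform -chdir=envs/sandbox show -no-color tfplan > plan.txt
}

# Wrap the plan text in a fenced block for a PR comment.
# $1 = file holding the plan text. Long plans are cut at 60000 bytes, an
# assumed safe margin so the comment body is not rejected as too large.
plan_comment_body() {
  printf '### Terraform plan (envs/sandbox)\n\n'
  printf '%s\n' '```'
  head -c 60000 "$1"
  printf '\n%s\n' '```'
}

# Post the comment with the GitHub CLI.
# Assumes GH_TOKEN is set and PR_NUMBER holds the pull request number.
post_plan_comment() {
  plan_comment_body plan.txt > comment.md
  gh pr comment "$PR_NUMBER" --body-file comment.md
}
```

A workflow step would then call run_plan followed by post_plan_comment. For the comment to go through, the job's token also needs write access to pull requests.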

Acceptance Criteria (Definition of Done)

How do we know we've done a good job? Here are the acceptance criteria:

  • [ ] From PR: workflow posts a plan comment.

    When a pull request is created, the terraform-plan workflow should run and post a comment with the Terraform plan.

  • [ ] From merge to main: workflow applies successfully.

    When changes are merged into the main branch, the terraform-apply workflow should run and apply the changes without errors.

  • [ ] CloudTrail shows AssumeRoleWithWebIdentity for terraform-apply.

    We should be able to see in CloudTrail that the terraform-apply role is being assumed using AssumeRoleWithWebIdentity, which confirms that OIDC is working as expected. This is like seeing the footprints of our process, right?

  • [ ] terraform init/plan succeed (no KMS AccessDenied).

    Running terraform init and terraform plan should succeed without any KMS AccessDenied errors. This verifies that our role has the necessary permissions to access the KMS key.

  • [ ] OIDC trust is scoped to this repo and desired branch/env.

    We need to ensure that the OIDC trust policy is correctly scoped to our repository and the specific branch or environment we're working with. This is a critical security measure to prevent unauthorized access.
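
Two of these checks are easy to try from a terminal. The sketch below uses shell glob matching as a rough stand-in for how StringLike compares the token's subject against the pattern in the trust policy, and wraps the standard `aws cloudtrail lookup-events` call for spotting AssumeRoleWithWebIdentity events. The function names and the example org/repo are invented for illustration, and shell globs are not identical to IAM matching (square brackets are special in the shell, for instance), so treat it as a quick sanity check only.

```shell
#!/usr/bin/env bash
# Return 0 if subject $1 matches pattern $2, using shell globbing
# (* and ?) as an approximation of IAM's StringLike operator.
sub_matches() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

# List recent AssumeRoleWithWebIdentity events (needs AWS credentials and
# must be run against the region where the STS call was logged).
recent_web_identity_events() {
  aws cloudtrail lookup-events \
    --lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRoleWithWebIdentity \
    --max-results 5 \
    --query 'Events[].{Time: EventTime, User: Username}' \
    --output table
}

# Try a few subjects against a main-branch-only pattern (made-up repo).
pattern='repo:example-org/example-repo:ref:refs/heads/main'
for sub in \
  'repo:example-org/example-repo:ref:refs/heads/main' \
  'repo:example-org/example-repo:pull_request' \
  'repo:someone-else/fork:ref:refs/heads/main'; do
  if sub_matches "$sub" "$pattern"; then
    echo "allow $sub"
  else
    echo "deny  $sub"
  fi
done
```

With this pattern only the first subject is allowed. The pull_request subject and the other repository are both denied, which is the behavior we want from a tightly scoped trust policy.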

Validation Commands

To validate our setup, we can use the following commands:

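# Identify role ARN
aws iam get-role --role-name terraform-apply --query 'Role.Arn' --output text

# Verify KMS policy includes the role
# (get-key-policy takes a key ID or key ARN, not an alias, so resolve the alias first)
KEY_ID=$(aws kms describe-key --key-id alias/tf-state --query 'KeyMetadata.KeyId' --output text)
aws kms get-key-policy --key-id "$KEY_ID" --policy-name default --query Policy --output text | jq .

# Run a local plan using the same role (SSO or test assume)
cd envs/sandbox
terraform init
terraform plan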

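These commands will help us verify that the terraform-apply role exists, that the KMS key policy includes the role, and that Terraform can reach the backend from a local run. One caveat: with an OIDC-only trust policy, a local principal can't assume terraform-apply, so the local plan runs under your own credentials unless you temporarily add a trust statement for them. The CI run is what proves the role itself works.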

So there you have it, guys! A comprehensive guide to setting up secure Terraform workflows with GitHub OIDC. This approach not only enhances security by eliminating static AWS keys but also streamlines the infrastructure management process. Happy Terraforming!