Terraform Stacks

Links#

https://developer.hashicorp.com/terraform/language/stacks
https://developer.hashicorp.com/terraform/language/stacks/create
https://developer.hashicorp.com/terraform/language/block/stack/tfcomponent
https://developer.hashicorp.com/terraform/language/block/stack/tfdeploy
https://developer.hashicorp.com/terraform/cli/commands/stacks
https://developer.hashicorp.com/terraform/cli/commands/stacks/validate
https://developer.hashicorp.com/terraform/cli/commands/stacks/providers-lock
https://developer.hashicorp.com/terraform/cloud-docs/workspaces/dynamic-provider-credentials/aws-configuration

1. Important Points#

Terraform Stacks is a higher-level deployment model for Terraform / HCP Terraform:
    component:
        reusable infrastructure unit
        usually points to a Terraform module

    deployment:
        one instance of the stack
        usually maps to env / region / account

    stack:
        components + deployments + inputs + provider wiring

good fit:
    multi-env infrastructure
    multi-region / multi-account deployment
    same modules deployed many times with different inputs
    platform team wants standardized dependency graph

poor fit:
    tiny one-off Terraform project
    team only needs one root module and one workspace
    infra still changes manually outside Terraform

core principles:
    module owns resource implementation
    component wires module + providers + inputs
    deployment owns environment-specific values
    identity token / OIDC should replace static cloud keys
    component dependency must be explicit
    stacks should be validated in CI before apply

2. Service Configuration#

file types#

File	Purpose
`*.tfcomponent.hcl`	declare providers, components, inputs, outputs for the stack
`*.tfdeploy.hcl`	declare deployments and deployment-specific values
`.terraform.lock.hcl`	provider version lock file
module `*.tf`	normal Terraform module implementation

mental model:
    module:
        how to create VPC / ECS / RDS / S3

    component:
        this stack uses that module with these providers and inputs

    deployment:
        create prod-ap-east-1 / dev-ap-east-1 / prod-us-east-1 from the same stack

project structure#

order-platform-stacks
├── README.md
├── providers.tfcomponent.hcl
├── variables.tfcomponent.hcl
├── network.tfcomponent.hcl
├── app.tfcomponent.hcl
├── deployments.tfdeploy.hcl
└── modules
    ├── network
    │   ├── main.tf
    │   ├── variables.tf
    │   └── outputs.tf
    └── app
        ├── main.tf
        ├── variables.tf
        └── outputs.tf

structure rules:
    keep stack wiring in root
    keep resource implementation in modules
    use one module per meaningful capability
    do not put large resource logic directly in component files
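
To make the split between stack wiring and resource implementation concrete, here is a minimal sketch of what modules/network/main.tf could contain. The resource names (aws_vpc.this, aws_subnet.private), the subnet count, and the subnet sizing are assumptions made for illustration, not part of the original layout.

```hcl
# modules/network/main.tf (illustrative; resource names are assumptions)
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# var.name and var.vpc_cidr are expected to be declared in variables.tf.
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "this" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true

  tags = {
    Name = var.name
  }
}

# Two private subnets, one per availability zone, carved out of the VPC CIDR.
resource "aws_subnet" "private" {
  count = 2

  vpc_id            = aws_vpc.this.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 4, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "${var.name}-private-${count.index}"
  }
}
```

The module declares required_providers but no provider block; the provider configuration is handed in by the component.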

3. Core Concepts#

required providers#

# providers.tfcomponent.hcl
required_providers {
  aws = {
    source  = "hashicorp/aws"
    version = "~> 5.0"
  }
}

required_providers:
    declares provider source and version
    provider versions are pinned in .terraform.lock.hcl by terraform stacks providers-lock

provider#

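# providers.tfcomponent.hcl
variable "identity_token" {
  type      = string
  ephemeral = true # the JWT is short-lived and should not be persisted
}

provider "aws" "this" {
  config {
    region = var.aws_region

    assume_role_with_web_identity {
      role_arn           = var.aws_role_arn
      web_identity_token = var.identity_token
    }
  }
}

# deployments.tfdeploy.hcl
# identity_token blocks belong in deployment configuration, not in
# *.tfcomponent.hcl. Each deployment passes the JWT to the stack with
# identity_token = identity_token.aws.jwt in its inputs.
identity_token "aws" {
  audience = ["aws.workload.identity"]
}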

provider rules:
    use OIDC / dynamic credentials in HCP Terraform
    avoid long-lived AWS access keys
    keep provider aliases explicit
    pass provider to each component intentionally
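
When one deployment has to cover several regions, the same rules can be followed by creating one provider configuration per region and selecting it explicitly in the component. The following is a rough sketch of a multi-region variant of the network component; var.regions and the label by_region are hypothetical names introduced here, and the authentication settings are left out for brevity.

```hcl
# Hypothetical multi-region wiring; var.regions is an assumed stack input.
variable "regions" {
  type = set(string)
}

provider "aws" "by_region" {
  for_each = var.regions

  config {
    region = each.value
    # Authentication settings omitted; they would mirror provider "aws" "this".
  }
}

component "network" {
  for_each = var.regions
  source   = "./modules/network"

  # Each component instance receives exactly one regional provider.
  providers = {
    aws = provider.aws.by_region[each.value]
  }

  inputs = {
    name     = "order-${var.env}-${each.value}"
    vpc_cidr = var.vpc_cidr
  }
}
```

Indexing the provider by each.value keeps the choice of region visible at the point where the component is declared.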

input variables#

# variables.tfcomponent.hcl
variable "env" {
  type = string
}

variable "aws_region" {
  type = string
}

variable "aws_role_arn" {
  type      = string
  sensitive = true
}

variable "vpc_cidr" {
  type = string
}

input rules:
    variable blocks define the input shape shared by every deployment
    env/region/account values go in deployment
    mark role ARNs and secrets as sensitive when appropriate; prefer ephemeral for short-lived tokens
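
The stack variables above only fix types. If stricter checks are wanted, one option is to validate the value inside the module that consumes it. This is a minimal sketch of modules/network/variables.tf under that assumption; the descriptions and the error message are illustrative.

```hcl
# modules/network/variables.tf (illustrative)
variable "name" {
  type        = string
  description = "Name prefix applied to network resources."
}

variable "vpc_cidr" {
  type        = string
  description = "IPv4 CIDR block for the VPC, for example 10.10.0.0/16."

  validation {
    # cidrhost fails on malformed CIDR strings; can() turns that failure into false.
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "vpc_cidr must be a valid CIDR block such as 10.10.0.0/16."
  }
}
```

With this in place a mistyped CIDR in a deployment is rejected at plan time rather than by the AWS API during apply.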

component#

# network.tfcomponent.hcl
component "network" {
  source = "./modules/network"

  providers = {
    aws = provider.aws.this
  }

  inputs = {
    name     = "order-${var.env}"
    vpc_cidr = var.vpc_cidr
  }
}

component:
    wraps one Terraform module
    declares providers and inputs
    exposes its module's outputs to other components as component.<name>.<output>
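
The file types table notes that *.tfcomponent.hcl files can also declare outputs for the stack as a whole. A minimal sketch of such an output follows; the file name outputs.tfcomponent.hcl is hypothetical, and the example assumes the network module exports vpc_id.

```hcl
# outputs.tfcomponent.hcl (hypothetical file name)
output "vpc_id" {
  type        = string
  description = "ID of the VPC created by the network component."
  value       = component.network.vpc_id
}
```

Unlike outputs in a regular module, the stack-level output here declares its type explicitly.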

dependency between components#

# app.tfcomponent.hcl
component "app" {
  source = "./modules/app"

  providers = {
    aws = provider.aws.this
  }

  inputs = {
    env                = var.env
    vpc_id             = component.network.vpc_id
    private_subnet_ids = component.network.private_subnet_ids
  }
}

dependency:
    app depends on network because it reads component.network outputs
    keep dependencies explicit
    avoid circular component dependency
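
The references component.network.vpc_id and component.network.private_subnet_ids only resolve if the network module declares outputs with those names. A minimal sketch of modules/network/outputs.tf is shown here; the resource addresses aws_vpc.this and aws_subnet.private are assumptions about how the module is written.

```hcl
# modules/network/outputs.tf (illustrative; resource addresses are assumptions)
output "vpc_id" {
  description = "ID of the VPC."
  value       = aws_vpc.this.id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets."
  value       = aws_subnet.private[*].id
}
```

The output names form the contract between the two components, so renaming one is a breaking change for app.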

deployment#

# deployments.tfdeploy.hcl
deployment "dev_ap_east_1" {
  inputs = {
    env            = "dev"
    aws_region     = "ap-east-1"
    aws_role_arn   = "arn:aws:iam::111111111111:role/hcp-terraform-dev"
    vpc_cidr       = "10.10.0.0/16"
    identity_token = identity_token.aws.jwt # JWT consumed by the AWS provider
  }
}

deployment "prod_ap_east_1" {
  inputs = {
    env            = "prod"
    aws_region     = "ap-east-1"
    aws_role_arn   = "arn:aws:iam::222222222222:role/hcp-terraform-prod"
    vpc_cidr       = "10.20.0.0/16"
    identity_token = identity_token.aws.jwt # JWT consumed by the AWS provider
  }
}

deployment:
    one deployment target of the stack
    usually maps to env + region + account
    should not contain resource implementation

4. Workflow Best Practices#

local validation#

terraform stacks init
terraform stacks validate
terraform stacks providers-lock -platform=linux_amd64 -platform=darwin_arm64

CI baseline:
    terraform fmt -check
    terraform stacks validate
    re-run terraform stacks providers-lock and confirm any .terraform.lock.hcl change is expected

HCP Terraform workflow#

typical flow:
    1. push stack config to VCS
    2. HCP Terraform detects stack changes
    3. plan runs per deployment / component
    4. review plan
    5. apply approved changes

review focus:
    component graph
    deployment count
    provider role/account
    destructive changes
    cross-environment dependency

modules vs components#

Layer	Owns	Example
Module	resource implementation	`aws_vpc`, `aws_subnet`, route tables
Component	module instance in stack	`component "network"`
Deployment	env/account/region values	`prod_ap_east_1`

rule:
    keep module reusable
    keep component boring
    keep deployment values explicit

5. Security Best Practices#

credentials:
    prefer dynamic provider credentials / OIDC
    avoid static AWS access key in variables
    separate dev/prod AWS roles
    scope IAM role to component responsibility when possible

state:
    do not output secrets
    mark sensitive outputs
    review who can read plans/state
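
As an illustration of the state rules, a module can expose a reference to a secret instead of its value, and mark anything that should stay out of plan output as sensitive. This is a sketch of modules/app/outputs.tf; the resources aws_secretsmanager_secret.db and aws_db_instance.this are hypothetical and not part of the original module layout.

```hcl
# modules/app/outputs.tf (illustrative; resource addresses are assumptions)

# Expose where the secret lives rather than the secret value itself.
output "db_secret_arn" {
  description = "ARN of the Secrets Manager secret that holds the database credentials."
  value       = aws_secretsmanager_secret.db.arn
}

# Redacted in plan and apply output, but still stored in state.
output "db_endpoint" {
  description = "Connection endpoint of the application database."
  value       = aws_db_instance.this.endpoint
  sensitive   = true
}
```

The sensitive flag only hides a value from rendered output, which is why reviewing who can read state still matters.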

review:
    production deployment requires approval
    destructive plan requires explicit review
    provider role ARN must match target account

AWS trust policy sample#

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::222222222222:oidc-provider/app.terraform.io"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "app.terraform.io:aud": "aws.workload.identity"
        },
        "StringLike": {
          "app.terraform.io:sub": "organization:my-org:project:platform:stack:order-platform:*"
        }
      }
    }
  ]
}

replace:
    222222222222:
        AWS account ID

    my-org:
        HCP Terraform organization

    platform:
        HCP Terraform project

    order-platform:
        stack name
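
The trust policy can also be managed as code. The sketch below is one possible bootstrap configuration, applied once per AWS account outside the stack itself; it assumes the hashicorp/aws and hashicorp/tls providers, reuses the placeholder organization, project, and stack names from the sample, and the resource labels are hypothetical.

```hcl
# Illustrative bootstrap config, applied outside the stack itself.
data "tls_certificate" "hcp_terraform" {
  url = "https://app.terraform.io"
}

# Registers HCP Terraform as an OIDC identity provider in the account.
resource "aws_iam_openid_connect_provider" "hcp_terraform" {
  url             = data.tls_certificate.hcp_terraform.url
  client_id_list  = ["aws.workload.identity"]
  thumbprint_list = [data.tls_certificate.hcp_terraform.certificates[0].sha1_fingerprint]
}

# Role that the stack's AWS provider assumes with its identity token.
resource "aws_iam_role" "stack" {
  name = "hcp-terraform-prod"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Federated = aws_iam_openid_connect_provider.hcp_terraform.arn }
      Action    = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringEquals = { "app.terraform.io:aud" = "aws.workload.identity" }
        StringLike   = { "app.terraform.io:sub" = "organization:my-org:project:platform:stack:order-platform:*" }
      }
    }]
  })
}
```

Permissions policies for the role are deliberately left out; they would be attached separately and scoped to what the stack's components need.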

6. Reliability / Change Management#

change safety:
    split high-blast-radius resources into separate components
    avoid one deployment that mixes unrelated environments
    keep prod and dev role/account separated
    use clear component names
    keep module version changes reviewable

deployment design:
    dev:
        fast iteration

    staging:
        production-like role and config

    prod:
        explicit approval
        stricter IAM
        slower rollout

rollback:
    revert stack config commit
    re-run plan
    review whether resource deletion/recreation is involved
    restore state only as last resort

7. Monitoring / Operations#

what to watch:
    failed plan
    failed apply
    pending approval
    drift
    provider credential error
    component dependency failure
    long-running apply

runbook should include:
    how to find deployment
    how to inspect component plan
    how to approve/reject apply
    how to rotate provider credentials
    how to revert a bad stack config commit

8. Hands-on#

create stack skeleton#

mkdir -p order-platform-stacks/modules/network order-platform-stacks/modules/app
cd order-platform-stacks
touch providers.tfcomponent.hcl variables.tfcomponent.hcl network.tfcomponent.hcl app.tfcomponent.hcl deployments.tfdeploy.hcl

validate stack#

terraform stacks init
terraform stacks validate

lock providers#

terraform stacks providers-lock -platform=linux_amd64 -platform=darwin_arm64

format#

terraform fmt -recursive

CI example#

name: terraform-stacks

on:
  pull_request:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform stacks init
      - run: terraform stacks validate

9. Production Checklist#

structure:
    modules contain resource implementation
    components are small and named clearly
    deployments map to env/region/account
    component dependencies are explicit

security:
    dynamic provider credentials enabled
    no static cloud keys in repo
    prod role separated from dev role
    state/plan access reviewed

workflow:
    fmt / validate in CI
    provider lock file committed
    production apply requires review
    destructive changes require manual approval

operations:
    failed plan/apply runbook exists
    rollback process documented
    provider credential rotation documented
    module version upgrade process documented