Context
I need a reliable place to store Terraform state files (.tfstate).
Remote state storage is a necessity. To achieve my Maximize Velocity and Ship Incrementally principles, infrastructure changes must happen via CI/CD pipelines. This requires a shared, accessible state backend that the runner can verify and update.
The traditional enterprise solution for state storage is an object store (like AWS S3) plus a locking mechanism (like DynamoDB). Historically, this required managing two separate pieces of infrastructure.
New Capabilities: As of Terraform 1.10, the S3 backend supports native state locking using S3 conditional writes, removing the strict requirement for DynamoDB. Cloudflare R2 (which is S3-compatible) also supports this mechanism.
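As a rough sketch of what native locking looks like in configuration, the S3 backend exposes a use_lockfile argument. The bucket name, state key, and region below are placeholders for illustration, not values from this project.

```hcl
terraform {
  # Assumes Terraform 1.10 or later, where S3-native locking is available.
  backend "s3" {
    bucket       = "example-terraform-state" # hypothetical bucket name
    key          = "infra/terraform.tfstate" # hypothetical state path
    region       = "us-east-1"               # placeholder region
    use_lockfile = true                      # lock via an S3 lock object instead of a DynamoDB table
  }
}
```

Pointing the same backend at Cloudflare R2 would additionally need a custom endpoint and related settings, which are omitted here.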
However, even with native locking, "self-hosting" state in S3/R2 requires:
- Provisioning and managing the bucket.
- Managing authentication credentials (access keys) for the CI runner.
- Configuring the backend in code.
This cuts against my Minimize Platforms and Maximize Velocity principles by adding "meta-infrastructure" that has to be maintained just to maintain the actual infrastructure.
Decision
I will use Terraform Cloud (Free Tier) for state storage, but I will NOT use its remote execution capabilities for CI/CD.
This decision prioritizes "zero maintenance" over "self-hosting."
Alternatives Considered
Self-Hosted State (S3/R2 with Native Locking)
- Pros: Complete ownership of data. No third-party platform (TFC). Native locking removes the complexity of DynamoDB. Cloudflare R2 is a compelling low-cost option here.
- Cons: Still requires provisioning a bucket and managing securely scoped credentials for GitHub Actions.
- Status: Rejected for now, but kept as a viable "Escape Hatch". If Terraform Cloud ever changes its free tier or terms, migrating to R2 with native locking is the fallback plan.
Terraform Cloud
- Pros: Zero configuration (no buckets to create). Free for small teams. Good UI for state history.
- Cons: Introduces a third-party dependency.
- Decision: Accepted for velocity.
Implementation Details
Specifically:
- State Backend: Terraform Cloud will host the state.
- Execution Mode: I will configure the workspace Execution Mode to Local. This means Terraform Cloud acts only as smart backend storage, while the actual terraform plan and terraform apply commands run on my machine or in my own CI environment.
- CI/CD: I will use GitHub Actions to run Terraform operations.
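A minimal sketch of the corresponding configuration, assuming the cloud block is used to connect the working directory to Terraform Cloud. The organization and workspace names are hypothetical.

```hcl
terraform {
  cloud {
    organization = "example-org" # hypothetical organization name

    workspaces {
      name = "example-infra" # hypothetical workspace name
    }
  }
}
```

The execution mode is not set in this block; it is a property of the workspace itself. In CI, the CLI can pick up the API token from the TF_TOKEN_app_terraform_io environment variable, so a GitHub Actions job would only need that value stored as a repository secret.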
Consequences
Positive
- Zero Maintenance State Storage: No need to provision or manage S3 buckets or DynamoDB tables.
- Unified CI: By running Terraform in GitHub Actions (via mise and the terraform CLI), I keep my build, test, and deploy pipeline in one place. I can see terraform plan outputs directly in Pull Request comments (via the setup-terraform action or similar) rather than having to click out to the Terraform Cloud UI. This reduces context switching.
- Faster Feedback: GitHub Actions has better integration with my repository (workflow triggers, checking out code) than triggering runs in TFC.
Negative
- Another Account: Requires a Terraform Cloud account, technically adding a "platform," but it is a low-touch utility rather than a daily driver.
- Configuration: Requires setting execution_mode = "local" in the Terraform Cloud workspace settings, which is a manual step (or a "chicken and egg" problem to automate).
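If the manual step is ever automated, one possible shape is to manage the workspace with the hashicorp/tfe provider. This is a sketch under assumptions: the resource and workspace names are hypothetical, and the tfe_workspace_settings resource and its arguments should be checked against the provider version in use.

```hcl
terraform {
  required_providers {
    tfe = {
      source = "hashicorp/tfe"
    }
  }
}

# Hypothetical workspace that stores state for the main infrastructure.
resource "tfe_workspace" "infra" {
  name         = "example-infra" # hypothetical workspace name
  organization = "example-org"   # hypothetical organization name
}

# Switch the workspace to local execution so Terraform Cloud only stores state.
resource "tfe_workspace_settings" "infra" {
  workspace_id   = tfe_workspace.infra.id
  execution_mode = "local"
}
```

The "chicken and egg" problem remains: this bootstrap configuration needs somewhere to keep its own state, which is why a one-time manual setting is the simpler choice for now.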
Risks
- Vendor Lock-in: Migrating state out of Terraform Cloud is possible (just change the backend config and run terraform init -migrate-state), but it is slightly stickier than a raw S3 file.