Cleanup after a partially failed Terraform apply

What are the best-practice recommendations for cleaning up after a Terraform apply fails midway (for example, 10 resources to be changed, but 3 succeeded and the 4th one failed)? I recognize that Terraform itself may not see this as a problem, because on the next apply (after fixing the root cause of the error) it can pick up from the 4th change onwards, but the infrastructure seems to be left in an inconsistent state from the infrastructure engineer’s point of view.

Should the Terraform developer first clean up manually to bring the infrastructure back to a previously consistent state, and only then run the full ‘apply’ again?

This is unfortunately a big weakness of Terraform. If it fails partway through, it can fail to record in the state file the changes it made, which leaves things in an inconsistent state. If you then re-run apply, you’ll either get an error because some resource already exists with that name (Terraform doesn’t know about it because it was never saved in state) or, for resources that don’t enforce unique identities, you’ll get duplicate copies.
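
Before re-running apply, it may help to check what Terraform itself believes is still outstanding. `terraform plan -detailed-exitcode` exits with 0 when nothing would change, 1 on error, and 2 when changes are pending. The following is only a minimal sketch: `explain_plan_exit` is a hypothetical helper name, not part of Terraform.

```shell
# Hypothetical helper (not part of Terraform): translate the exit code of
# `terraform plan -detailed-exitcode` into a short human-readable message.
explain_plan_exit() {
  case "$1" in
    0) echo "no changes pending" ;;
    1) echo "plan failed: fix the error before applying" ;;
    2) echo "changes pending: check the plan for resources that already exist" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# Usage (assumes terraform is on PATH and the directory is initialized):
#   terraform plan -detailed-exitcode; explain_plan_exit "$?"
```

Note that a resource which was created but never recorded in state will still appear in the plan as "to be created", so the plan output has to be compared by hand against what actually exists in the provider.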

There are no good solutions at the moment. Your only options are ugly workarounds such as:

  1. Manually find everything that got deployed, delete it, and re-run terraform apply.
  2. Manually find everything that got deployed, and for each such resource, run terraform import.
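
The second workaround could be sketched roughly as follows. This is an illustration under assumptions, not an official procedure: `print_import_commands` and the two input files are hypothetical, and the list of orphaned resources still has to be collected by hand. The helper compares that list against the output of `terraform state list` and prints a `terraform import ADDRESS ID` command for each resource that is missing from state.

```shell
# Hypothetical helper (not part of Terraform): print one `terraform import`
# command for every "ADDRESS ID" pair whose address is not yet in state.
#   $1: file with one "ADDRESS ID" pair per line, collected by hand
#   $2: file holding the output of `terraform state list`
print_import_commands() {
  while read -r address id; do
    [ -n "$address" ] || continue   # skip blank lines
    # -x matches the whole line, -F treats the address as a literal string
    if ! grep -qxF -- "$address" "$2"; then
      printf "terraform import '%s' '%s'\n" "$address" "$id"
    fi
  done < "$1"
}

# Usage (assumes an initialized working directory):
#   terraform state list > in_state.txt
#   print_import_commands orphans.txt in_state.txt
```

The commands are printed rather than executed so that each one can be reviewed before it touches the state file.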

See also https://github.com/hashicorp/terraform/issues/20718.

Thanks, Jim! What seems so intuitive with databases and transaction management is indeed harder when the underlying ‘data’ being changed is actually infrastructure.