Category Archives: Cloud

GCP + Terraform: Running Terraform Commands with a Service Account

PROBLEM

When running these commands…

gcloud auth login
gcloud auth application-default login

… terraform apply can provision the infrastructure using your user credentials.

However, sometimes there’s a need to run Terraform using a service account.

SOLUTION

First, identify the service account you want to use… for example: my-service-account@my-project.iam.gserviceaccount.com.

Then, create and download the private key for the service account.

Command:

gcloud iam service-accounts keys create key.json --iam-account my-service-account@my-project.iam.gserviceaccount.com

Output:

created key [xxxxxxxx] of type [json] as [key.json] for [my-service-account@my-project.iam.gserviceaccount.com]

With this service account’s private key, we can now activate its credentials in gcloud.

Command:

gcloud auth activate-service-account --key-file key.json  

Output:

Activated service account credentials for: [my-service-account@my-project.iam.gserviceaccount.com]

You can verify that the right account is active.

Command:

gcloud auth list

Output:

                      Credentialed Accounts
ACTIVE  ACCOUNT
*       my-service-account@my-project.iam.gserviceaccount.com
        user@myshittycode.com

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

In this case, the * marks the active account.

Now, you can run terraform apply to provision the infrastructure using the selected service account.
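Alternatively, if you’d rather not switch the active gcloud account, Terraform’s Google provider also honors the GOOGLE_APPLICATION_CREDENTIALS environment variable. A minimal sketch, assuming key.json is the file created above:

```shell
# Point Terraform (and other Google SDK clients) at the service account
# key without changing the active gcloud account.
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/key.json"

# Terraform will pick up these credentials automatically.
echo "$GOOGLE_APPLICATION_CREDENTIALS"
```

This is handy in CI pipelines where switching gcloud accounts between steps is undesirable.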

GCP + Kitchen Terraform: Local Development Workflow

INTRODUCTION

Here’s a typical workflow for implementing and running Kitchen Terraform tests outside of the GCP environment, for example, from an IDE on a Mac laptop.

Enable “gcloud” Access

Command:

gcloud auth login

The first step is to ensure we can interact with GCP via the gcloud command using our user credentials. This is needed because the tests use gcloud commands to retrieve GCP resource information for the assertions.

Enable SDK Access

Command:

gcloud auth application-default login

This ensures our Terraform code can call the GCP SDK successfully without a service account; it will use our user credentials instead.

Without this command, we may get the following error when running the Terraform code:

Response: {
 "error": "invalid_grant",
 "error_description": "reauth related error (invalid_rapt)",
 "error_subtype": "invalid_rapt"
}

Display All Kitchen Test Suites

Command:

bundle exec kitchen list    

This command displays a list of Kitchen test suites defined in kitchen.yml.
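As a rough sketch of what such a kitchen.yml might look like (suite and directory names here are illustrative, not taken from a real project), a kitchen-terraform configuration wires the Terraform driver, provisioner, and verifier together:

```yaml
# Illustrative kitchen.yml for the kitchen-terraform gem.
driver:
  name: terraform
  root_module_directory: test/fixtures/router

provisioner:
  name: terraform

verifier:
  name: terraform
  systems:
    - name: local
      backend: local

platforms:
  - name: terraform

suites:
  - name: router-with-nat-local
```

Each entry under suites becomes one instance in the kitchen list output.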

The output looks something like this:

Instance                            Driver     Provisioner  Verifier   Transport  Last Action    Last Error
router-all-subnets-ip-ranges-local  Terraform  Terraform    Terraform  Ssh          
router-interface-local              Terraform  Terraform    Terraform  Ssh          
router-no-bgp-no-nat-local          Terraform  Terraform    Terraform  Ssh          
router-with-bgp-local               Terraform  Terraform    Terraform  Ssh          
router-with-nat-local               Terraform  Terraform    Terraform  Ssh          

Run a Specific Test Suite

Command:

bundle exec kitchen test [INSTANCE_NAME]    

# For example:-
bundle exec kitchen test router-with-nat-local

This command runs a specific test suite and handles the entire Terraform lifecycle… i.e., setting up the infrastructure, running the tests, and destroying the infrastructure.

This is especially helpful when we need to run just the test suite that is currently under development. It runs faster because we don’t have to provision/deprovision the cloud infrastructure for the other test suites, and it also reduces the incurred cost.

Run a Specific Test Suite with Finer Controls

There are times when running bundle exec kitchen test [INSTANCE_NAME] is still very time-consuming and expensive, especially when debugging failed assertions or adding a few assertions at a time.

To provision the infrastructure once, run the following command:

bundle exec kitchen converge [INSTANCE_NAME]    

# For example:-
bundle exec kitchen converge router-with-nat-local

To run the assertions, run the following command as many times as needed until all the assertions pass:

bundle exec kitchen verify [INSTANCE_NAME]    

# For example:-
bundle exec kitchen verify router-with-nat-local

Finally, once the test suite is implemented properly, we can now deprovision the infrastructure:

bundle exec kitchen destroy [INSTANCE_NAME]    

# For example:-
bundle exec kitchen destroy router-with-nat-local

Terragrunt: “plan-all” while Passing Outputs between Modules

PROBLEM

Terragrunt has a feature that allows one module to pass outputs to another module.

For example, if the “project-prod” module wants to consume the “subfolders” output from the “folder” module, it can be done like this in the “project-prod” module’s terragrunt.hcl:-

include {
    path = find_in_parent_folders()
}

dependency "folder" {
    config_path = "../folder"
}

inputs = {
    env_folders = dependency.folder.outputs.subfolders
}

The challenge is that when running commands such as plan-all, it fails with the following error:-

Cannot process module Module [...] because one of its 
dependencies, [...], finished with an error: /my/path/folder/terragrunt.hcl 
is a dependency of /my/path/project-prod/terragrunt.hcl 
but detected no outputs. Either the target module has not 
been applied yet, or the module has no outputs. If this 
is expected, set the skip_outputs flag to true on the 
dependency block.

SOLUTION

This error occurs because the generated plan for the “folder” module has not been applied yet (i.e., the infrastructure does not exist), hence there are no outputs to pass to the “project-prod” module to satisfy plan-all.

To fix this, mock outputs can be supplied:-

include {
    path = find_in_parent_folders()
}

dependency "folder" {
    config_path = "../folder"

    mock_outputs = {
        subfolders = {
            "dev" = {
                "id" = "temp-folder-id"
            }
            "prod" = {
                "id" = "temp-folder-id"
            }
            "uat" = {
                "id" = "temp-folder-id"
            }
        }
    }
}

inputs = {
    env_folders = dependency.folder.outputs.subfolders
}

Finally, when running apply-all, Terragrunt will use the runtime outputs instead of the mock outputs to build the rest of the infrastructure.
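If you want the mocks used only for plan-style commands, so a real apply still fails fast when the dependency genuinely has no outputs, newer Terragrunt versions let you restrict them. A sketch (mock values here are placeholders):

```hcl
dependency "folder" {
    config_path = "../folder"

    mock_outputs = {
        subfolders = {}
    }

    # Only fall back to the mocks for these commands; apply still
    # requires real outputs from the "folder" module.
    mock_outputs_allowed_terraform_commands = ["plan", "validate"]
}
```

This keeps the convenience of plan-all without masking a missing dependency during an actual apply.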

GCP + Terraform: “google: could not find default credentials” Error

PROBLEM

When running any Terraform commands (init, plan, etc) from a different server, the following error is thrown:-

Error: google: could not find default credentials. 
See https://developers.google.com/accounts/docs/application-default-credentials 
for more information.

  on  line 0:
  (source code not available)

SOLUTION

One recommended way is to set up a service account by following the instructions at the link above.

Another way, for development purposes, is to install the Google Cloud SDK and run the following gcloud command, which generates an Application Default Credentials (ADC) JSON file based on your user account and stores it where the SDK can find it automatically (on Linux/macOS, typically ~/.config/gcloud/application_default_credentials.json):-

gcloud auth application-default login

Terraform: “Error acquiring the state lock” Error

PROBLEM

When running terraform plan, the following error is thrown:-

Acquiring state lock. This may take a few moments...

Error: Error locking state: Error acquiring the state lock: writing "gs://my/bucket/terraform.tfstate/default.tflock" failed: googleapi: Error 412: Precondition Failed, conditionNotMet
Lock Info:
  ID:        1234567890
  Path:      gs://my/bucket/folder/terraform.tfstate/default.tflock
  Operation: migration destination state
  Who:       mike@machine
  Version:   0.12.12
  Created:   2019-10-30 12:44:36.410366 +0000 UTC
  Info:      


Terraform acquires a state lock to protect the state from being written
by multiple users at the same time. Please resolve the issue above and try
again. For most commands, you can disable locking with the "-lock=false"
flag, but this is not recommended.

SOLUTION

One way is to disable locking by passing -lock=false flag.

However, if you are sure the lock wasn’t properly released (for example, a previous run was interrupted), run this command to perform a force unlock:

terraform force-unlock [LOCK_ID]

In this case…

terraform force-unlock 1234567890

Azure: Deploying WAR File to Tomcat

PROBLEM

Typically, when using ZipDeploy to push a WAR file (e.g., my-app.war) to an Azure instance, we need to:-

  • Rename my-app.war to ROOT.war.
  • Place ROOT.war under webapps/.
  • Zip up webapps/.
  • Use ZipDeploy to push the zip file to an Azure instance.

This zip file will automatically be unzipped under site/wwwroot:-

D:\home\site\wwwroot
└── webapps
    └── ROOT.war

Tomcat detects ROOT.war and will try to unpack the WAR file under ROOT/:-

D:\home\site\wwwroot
└── webapps
    ├── ROOT
    │   ├── META-INF
    │   │   └── ...
    │   └── WEB-INF
    │       └── ...
    └── ROOT.war

The problem is that sometimes Tomcat is unable to fully unpack ROOT.war because some files may be locked by running processes. As a result, the web page fails to load and shows a 404 error.

SOLUTION

A better and more stable solution is to use WarDeploy to push the WAR file to an Azure instance because:-

  • It simplifies the deployment process: there’s no need to create a zip file containing webapps/ with a WAR file named ROOT.war.
  • Instead of relying on Tomcat to unpack the WAR file, WarDeploy does the unpacking elsewhere before copying the result to site/wwwroot/webapps/.

To use WarDeploy, you can use a curl command or a PowerShell script.
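For example, a curl invocation might look like this (the host name is a placeholder, and [USER]/[PASSWORD] are the deployment credentials; the wardeploy endpoint takes the WAR file as the request body with basic authentication):

```shell
# Push the WAR file to the Kudu wardeploy endpoint with basic auth.
curl -X POST \
     -u '[USER]:[PASSWORD]' \
     --data-binary @my-app.war \
     https://my-app.scm.azurewebsites.net/api/wardeploy
```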

Here’s an example of the PowerShell script:-

Param(
    [string]$filePath,
    [string]$apiUrl,
    [string]$username,
    [string]$password
)

$base64AuthInfo = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(("{0}:{1}" -f $username, $password)))

Invoke-RestMethod -Uri $apiUrl -Headers @{ Authorization = ("Basic {0}" -f $base64AuthInfo) } -Method POST -InFile $filePath -ContentType "multipart/form-data"

If the PowerShell script above is called deploy.ps1, to push my-app.war to my-app Azure instance:-

deploy.ps1 -filePath D:\my-app.war -apiUrl https://my-app.scm.azurewebsites.net/api/wardeploy -username [USER] -password [PASSWORD]

Depending on the size of the WAR file, you may get a “The operation has timed out” error after 100 seconds:-

Invoke-RestMethod : The operation has timed out
At D:\agent\_work\2ea6e947a\my-app\deploy.ps1:18 char:1
+ Invoke-RestMethod -Uri $apiUrl -Headers @{Authorization=("Basic {0}"  ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-RestMethod], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeRestMethodCommand

This is most likely to occur if the WAR file is large (80MB+).

To fix this, increase the timeout duration by adding the -TimeoutSec option to the Invoke-RestMethod call:-

Invoke-RestMethod [OTHER_OPTIONS...] -TimeoutSec 300