Manage Amazon Redshift provisioned clusters with Terraform

Amazon Redshift is a fast, scalable, secure, and fully managed cloud data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing extract, transform, and load (ETL); business intelligence (BI); and reporting tools. Tens of thousands of customers use Amazon Redshift to process exabytes of data per day and power analytics workloads such as BI, predictive analytics, and real-time streaming analytics.

HashiCorp Terraform is an infrastructure as code (IaC) tool that lets you define cloud resources in human-readable configuration files that you can version, reuse, and share. You can then use a consistent workflow to provision and manage your infrastructure throughout its lifecycle.

In this post, we demonstrate how to use Terraform to manage common Redshift cluster operations, such as:

  • Creating a new provisioned Redshift cluster using Terraform code and adding an AWS Identity and Access Management (IAM) role to it
  • Scheduling pause, resume, and resize operations for the Redshift cluster

Solution overview

The following diagram illustrates the solution architecture for provisioning a Redshift cluster using Terraform.


In addition to Amazon Redshift, the solution uses the following AWS services:

  • Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 750 instances and choice of the latest processors, storage, networking, operating system (OS), and purchase model to help you best match the needs of your workload. For this post, we use an m5.xlarge instance with Windows Server 2022 Datacenter Edition. The choice of instance type and Windows OS is flexible; you can choose a configuration that suits your use case.
  • IAM allows you to securely manage identities and access to AWS services and resources. We use IAM roles and policies to securely access services and perform relevant operations. An IAM role is an AWS identity that you can assume to gain temporary access to AWS services and resources. Each IAM role has a set of permissions defined by IAM policies. These policies determine the actions and resources the role can access.
  • AWS Secrets Manager allows you to securely store the user name and password needed to log in to Amazon Redshift.

In this post, we demonstrate how to set up an environment that connects AWS and Terraform. The following are the high-level tasks involved:

  1. Set up an EC2 instance with Windows OS in AWS.
  2. Install Terraform on the instance.
  3. Configure your environment variables (Windows OS).
  4. Define an IAM policy to have minimal access to perform actions on a Redshift cluster, including pause, resume, and resize.
  5. Establish an IAM role using the policy you created.
  6. Create a provisioned Redshift cluster using Terraform code.
  7. Attach the IAM role you created to the Redshift cluster.
  8. Write the Terraform code to schedule cluster operations like pause, resume, and resize.

Prerequisites

To complete the actions described in this post, you need an AWS account and administrator privileges on the account to use the key AWS services and create the necessary IAM roles.

Create an EC2 instance

We begin by creating an EC2 instance. Complete the following steps to create a Windows OS EC2 instance:

  1. On the Amazon EC2 console, choose Launch instance.
  2. Choose a Windows Server Amazon Machine Image (AMI) that suits your requirements.
  3. Select an appropriate instance type for your use case.
  4. Configure the instance details:
    1. Choose the VPC and subnet where you want to launch the instance.
    2. Enable Auto-assign Public IP.
    3. For Add storage, configure the desired storage options for your instance.
    4. Add any necessary tags to the instance.
  5. For Configure security group, select or create a security group that allows the necessary inbound and outbound traffic for your instance.
  6. Review the instance configuration and choose Launch to start the instance creation process.
  7. For Select an existing key pair or create a new key pair, choose an existing key pair or create a new one.
  8. Choose Launch instance.
  9. When the instance is running, you can connect to it using Remote Desktop Protocol (RDP) and the administrator password obtained from the Get Windows password option on the Amazon EC2 console.

Install Terraform on the EC2 instance

Install Terraform on the Windows EC2 instance using the following steps:

  1. RDP into the EC2 instance you created.
  2. Install Terraform on the EC2 instance.

You need to update the environment variables to point to the directory where the Terraform executable is available.

  3. Under System Properties, on the Advanced tab, choose Environment Variables.
  4. Choose the Path variable.
  5. Choose New and enter the path where Terraform is installed. For this post, it's in the C: directory.
  6. Confirm Terraform is installed by entering the following command:

terraform -v


Optionally, you can use an editor like Visual Studio Code (VS Code) and add the Terraform extension to it.

Create a user for accessing AWS through code (AWS CLI and Terraform)

Next, we create an administrator user in IAM, which performs the operations on AWS through Terraform and the AWS Command Line Interface (AWS CLI). Complete the following steps:

  1. Create a new IAM user.
  2. On the IAM console, download and save the access key ID and secret access key.


  3. Install the AWS CLI.
  4. Launch the AWS CLI, run aws configure, and pass the access key ID, secret access key, and default AWS Region.

This prevents the AWS credentials from appearing in plain text in the Terraform code and prevents accidental sharing when the code is committed to a code repository.
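
For reference, aws configure prompts for four values; an illustrative session with placeholder values looks like the following:

aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: ****************************************
Default region name [None]: us-east-1
Default output format [None]: json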


Create a user for accessing Amazon Redshift through code (Terraform)

Because we're creating a Redshift cluster and performing subsequent operations, the administrator user name and password required for these processes (different from the admin user we created earlier for logging in to the AWS Management Console) need to be invoked in the code. To do this securely, we use Secrets Manager to store the user name and password. We write code in Terraform to access these credentials during the cluster create operation. Complete the following steps:

  1. On the Secrets Manager console, choose Secrets in the navigation pane.
  2. Choose Store a new secret.


  3. For Secret type, select Credentials for Amazon Redshift data warehouse.
  4. Enter your credentials.


Set up Terraform

Complete the following steps to set up Terraform:

  1. Create a folder or directory for storing all your Terraform code.
  2. Open the VS Code editor and browse to your folder.
  3. Choose New File and enter a name for the file using the .tf extension.

Now we're ready to start writing our code, beginning with defining providers. The provider definition is how Terraform gets the APIs it needs to interact with AWS.

  4. Configure a provider for Terraform:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.53.0"
    }
  }
}

# Configure the AWS provider
provider "aws" {
  region = "us-east-1"
}

  5. Access the admin credentials for the Amazon Redshift admin user:

data "aws_secretsmanager_secret_version" "creds" {
  # Fill in the name you gave to your secret
  secret_id = "terraform-creds"
}

/* json decode to parse the secret */
locals {
  terraform-creds = jsondecode(
    data.aws_secretsmanager_secret_version.creds.secret_string
  )
}
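
The locals block above assumes the secret string is a JSON object with username and password keys, which is what the console flow from the previous section produces. If you prefer to manage the secret itself as code, the following is a minimal sketch under that assumption; the variable names are illustrative:

# Illustrative sketch: create the same secret with Terraform instead of the console
variable "redshift_admin_username" {
  type = string
}

variable "redshift_admin_password" {
  type      = string
  sensitive = true
}

resource "aws_secretsmanager_secret" "terraform_creds" {
  name = "terraform-creds" # must match the secret_id read by the data source above
}

resource "aws_secretsmanager_secret_version" "terraform_creds" {
  secret_id = aws_secretsmanager_secret.terraform_creds.id
  # The keys must match what the locals block reads (username, password)
  secret_string = jsonencode({
    username = var.redshift_admin_username
    password = var.redshift_admin_password
  })
}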

Create a Redshift cluster

To create a Redshift cluster, use the aws_redshift_cluster resource:

# Create an encrypted Amazon Redshift cluster

resource "aws_redshift_cluster" "dw_cluster" {
  cluster_identifier   = "tf-example-redshift-cluster"
  database_name        = "dev"
  master_username      = local.terraform-creds.username
  master_password      = local.terraform-creds.password
  node_type            = "ra3.xlplus"
  cluster_type         = "multi-node"
  publicly_accessible  = false
  number_of_nodes      = 2
  encrypted            = true
  # local.RedshiftClusterEncryptionKeySecret is parsed from a second secret
  # that stores the KMS key ARN; see the complete example later in this post
  kms_key_id           = local.RedshiftClusterEncryptionKeySecret.arn
  enhanced_vpc_routing = true
  cluster_subnet_group_name = "<<your-cluster-subnet-groupname>>"
}

In this example, we create a two-node Redshift cluster called tf-example-redshift-cluster using the ra3.xlplus node type. We use the credentials from Secrets Manager and jsondecode to access those values. This makes sure the user name and password aren't passed in plain text.
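
Optionally, you can surface the cluster's connection endpoint after apply with an output block; this addition is illustrative and not part of the original walkthrough:

# Print the cluster connection endpoint after `terraform apply`
output "redshift_endpoint" {
  description = "Connection endpoint of the Redshift cluster"
  value       = aws_redshift_cluster.dw_cluster.endpoint
}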

Add an IAM role to the cluster

Because we didn't have the option to associate an IAM role during cluster creation, we do so now with the following code:

resource "aws_redshift_cluster_iam_roles" "cluster_iam_role" {
  cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
  iam_role_arns      = ["arn:aws:iam::yourawsaccountId:role/service-role/yourIAMrolename"]
}
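
As an alternative to hardcoding your AWS account ID in the ARN, you can look it up at plan time with the aws_caller_identity data source. The following sketch replaces the snippet above; yourIAMrolename remains a placeholder:

# Alternative to the preceding snippet: look up the current account ID
# instead of hardcoding it
data "aws_caller_identity" "current" {}

resource "aws_redshift_cluster_iam_roles" "cluster_iam_role" {
  cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
  iam_role_arns = [
    "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/service-role/yourIAMrolename"
  ]
}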

Enable Redshift cluster operations

Performing operations on the Redshift cluster, such as resize, pause, and resume, on a schedule offers a more practical use of these operations. Therefore, we create two policies: one that allows the Amazon Redshift scheduler service, and one that allows the cluster pause, resume, and resize operations. Then we create a role that has both policies attached to it.

You can also perform these steps directly on the console and then reference them in the Terraform code. The following example demonstrates the code snippets to create the policies and a role, and then attach the policy to the role.

  1. Create the Amazon Redshift scheduler policy document and create the role that assumes this policy:

# define a policy document to establish the trust relationship between the
# role and the entity (Redshift scheduler)

data "aws_iam_policy_document" "assume_role_scheduling" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["scheduler.redshift.amazonaws.com"]
    }
    actions = ["sts:AssumeRole"]
  }
}

# create a role with the above trust relationship attached, so that it can
# invoke the Redshift scheduling service
resource "aws_iam_role" "scheduling_role" {
  name               = "redshift_scheduled_action_role"
  assume_role_policy = data.aws_iam_policy_document.assume_role_scheduling.json
}

  2. Create a policy document and policy for the Amazon Redshift operations:

/* define the policy document for the Redshift operations */

data "aws_iam_policy_document" "redshift_operations_policy_definition" {
  statement {
    effect = "Allow"
    actions = [
      "redshift:PauseCluster",
      "redshift:ResumeCluster",
      "redshift:ResizeCluster",
    ]
    resources = ["arn:aws:redshift:*:youraccountid:cluster:*"]
  }
}

/* create the policy and add the above data (JSON) to the policy */
resource "aws_iam_policy" "scheduling_actions_policy" {
  name   = "redshift_scheduled_action_policy"
  policy = data.aws_iam_policy_document.redshift_operations_policy_definition.json
}

  3. Attach the policy to the IAM role:

/* connect the policy and the role */
resource "aws_iam_role_policy_attachment" "role_policy_attach" {
  policy_arn = aws_iam_policy.scheduling_actions_policy.arn
  role       = aws_iam_role.scheduling_role.name
}

  4. Pause the Redshift cluster:

# pause a cluster
resource "aws_redshift_scheduled_action" "pause_operation" {
  name     = "tf-redshift-scheduled-action-pause"
  schedule = "cron(00 22 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    pause_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

In the preceding example, we created a scheduled action called tf-redshift-scheduled-action-pause that pauses the cluster at 10:00 PM (UTC) every day as a cost-saving action.

  1. Resume the Redshift cluster:
title     = "tf-redshift-scheduled-action-resume"
schedule = "cron(15 07 * * ? *)"
iam_role = aws_iam_role.scheduling_role.arn
target_action {
resume_cluster {
cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
}
}
}

In the preceding example, we created a scheduled action called tf-redshift-scheduled-action-resume that resumes the cluster at 7:15 AM (UTC) every day, in time for business operations to start using the Redshift cluster.

  6. Resize the Redshift cluster:

# resize a cluster
resource "aws_redshift_scheduled_action" "resize_operation" {
  name     = "tf-redshift-scheduled-action-resize"
  schedule = "cron(15 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resize_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
      cluster_type       = "multi-node"
      node_type          = "ra3.xlplus"
      number_of_nodes    = 4    /* increase the number of nodes with the resize operation */
      classic            = true /* elastic resize is the default; set classic = true to use classic resize */
    }
  }
}

In the preceding example, we created a scheduled action called tf-redshift-scheduled-action-resize that increases the nodes from 2 to 4. You can perform other operations, such as changing the node type, as well. By default, elastic resize is used; if you want classic resize instead, you have to pass the parameter classic = true as shown in the preceding code. You can schedule this action to anticipate the needs of peak periods and resize appropriately for that duration, then downsize during off-peak times using similar code.
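
The following is a minimal sketch of such a downsize action; the resource name downsize_operation and the 8:00 PM UTC schedule are illustrative assumptions:

# A sketch: scale back down to 2 nodes after the peak period
resource "aws_redshift_scheduled_action" "downsize_operation" {
  name     = "tf-redshift-scheduled-action-downsize"
  schedule = "cron(00 20 * * ? *)" # 8:00 PM UTC; adjust for your workload
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resize_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
      cluster_type       = "multi-node"
      node_type          = "ra3.xlplus"
      number_of_nodes    = 2 /* elastic resize by default */
    }
  }
}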

Test the solution

We apply the following code to test the solution. Change the resource details accordingly, such as the account ID and Region.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "5.53.0"
    }
  }
}

# Configure the AWS provider
provider "aws" {
  region = "us-east-1"
}

# access secrets stored in Secrets Manager
data "aws_secretsmanager_secret_version" "creds" {
  # Fill in the name you gave to your secret
  secret_id = "terraform-creds"
}

/* json decode to parse the secret */
locals {
  terraform-creds = jsondecode(
    data.aws_secretsmanager_secret_version.creds.secret_string
  )
}

# Store the ARN of the KMS key to be used for encrypting the Redshift cluster

data "aws_secretsmanager_secret_version" "encryptioncreds" {
  secret_id = "RedshiftClusterEncryptionKeySecret"
}
locals {
  RedshiftClusterEncryptionKeySecret = jsondecode(
    data.aws_secretsmanager_secret_version.encryptioncreds.secret_string
  )
}

# Create an encrypted Amazon Redshift cluster
resource "aws_redshift_cluster" "dw_cluster" {
  cluster_identifier   = "tf-example-redshift-cluster"
  database_name        = "dev"
  master_username      = local.terraform-creds.username
  master_password      = local.terraform-creds.password
  node_type            = "ra3.xlplus"
  cluster_type         = "multi-node"
  publicly_accessible  = false
  number_of_nodes      = 2
  encrypted            = true
  kms_key_id           = local.RedshiftClusterEncryptionKeySecret.arn
  enhanced_vpc_routing = true
  cluster_subnet_group_name = "redshiftclustersubnetgroup-yuu4sywme0bk"
}

# add an IAM role to the Redshift cluster

resource "aws_redshift_cluster_iam_roles" "cluster_iam_role" {
  cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
  iam_role_arns      = ["arn:aws:iam::youraccountid:role/service-role/yourrolename"]
}

# for audit logging, create an S3 bucket that grants read/write privileges to
# the Redshift service; this example doesn't include the S3 bucket creation

resource "aws_redshift_logging" "redshiftauditlogging" {
  cluster_identifier   = aws_redshift_cluster.dw_cluster.cluster_identifier
  log_destination_type = "s3"
  bucket_name          = "your-s3-bucket-name"
}

# to perform operations like pause, resume, and resize on a schedule, we first
# need to create a role that has permissions to perform these operations on
# the cluster

# define a policy document to establish the trust relationship between the
# role and the entity (Redshift scheduler)

data "aws_iam_policy_document" "assume_role_scheduling" {
  statement {
    effect = "Allow"
    principals {
      type        = "Service"
      identifiers = ["scheduler.redshift.amazonaws.com"]
    }

    actions = ["sts:AssumeRole"]
  }
}

# create a role with the above trust relationship attached, so that it can
# invoke the Redshift scheduling service
resource "aws_iam_role" "scheduling_role" {
  name               = "redshift_scheduled_action_role"
  assume_role_policy = data.aws_iam_policy_document.assume_role_scheduling.json
}

/* define the policy document for the Redshift operations */

data "aws_iam_policy_document" "redshift_operations_policy_definition" {
  statement {
    effect = "Allow"
    actions = [
      "redshift:PauseCluster",
      "redshift:ResumeCluster",
      "redshift:ResizeCluster",
    ]

    resources = ["arn:aws:redshift:*:youraccountid:cluster:*"]
  }
}

/* create the policy and add the above data (JSON) to the policy */

resource "aws_iam_policy" "scheduling_actions_policy" {
  name   = "redshift_scheduled_action_policy"
  policy = data.aws_iam_policy_document.redshift_operations_policy_definition.json
}

/* connect the policy and the role */

resource "aws_iam_role_policy_attachment" "role_policy_attach" {
  policy_arn = aws_iam_policy.scheduling_actions_policy.arn
  role       = aws_iam_role.scheduling_role.name
}

# pause a cluster

resource "aws_redshift_scheduled_action" "pause_operation" {
  name     = "tf-redshift-scheduled-action-pause"
  schedule = "cron(00 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    pause_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

# resume a cluster

resource "aws_redshift_scheduled_action" "resume_operation" {
  name     = "tf-redshift-scheduled-action-resume"
  schedule = "cron(15 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resume_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
    }
  }
}

# resize a cluster

resource "aws_redshift_scheduled_action" "resize_operation" {
  name     = "tf-redshift-scheduled-action-resize"
  schedule = "cron(15 14 * * ? *)"
  iam_role = aws_iam_role.scheduling_role.arn
  target_action {
    resize_cluster {
      cluster_identifier = aws_redshift_cluster.dw_cluster.cluster_identifier
      cluster_type       = "multi-node"
      node_type          = "ra3.xlplus"
      number_of_nodes    = 4    /* increase the number of nodes with the resize operation */
      classic            = true /* elastic resize is the default; set classic = true to use classic resize */
    }
  }
}

Run terraform plan to see a list of the changes that will be made, as shown in the following screenshot.
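
If this is the first run in the directory, initialize it before planning so Terraform downloads the AWS provider declared in the configuration:

terraform init   # one-time setup: download the AWS provider
terraform plan   # preview the changes Terraform will make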


After you have reviewed the changes, use terraform apply to create the resources you defined.


You will be asked to enter yes or no before Terraform starts creating the resources.


You can confirm that the cluster is being created on the Amazon Redshift console.


After the cluster is created, the IAM roles and the schedules for the pause, resume, and resize operations are added, as shown in the following screenshot.


You can also view these scheduled operations on the Amazon Redshift console.


Clean up

If you deployed resources such as the Redshift cluster and IAM roles, or any of the other associated resources, by running terraform apply, then to avoid incurring charges on your AWS account, run terraform destroy to tear those resources down and clean up your environment.

Conclusion

Terraform offers a powerful and flexible solution for managing your infrastructure as code using a declarative approach, with a cloud-agnostic nature, resource orchestration capabilities, and strong community support. This post provided a comprehensive guide to using Terraform to deploy a Redshift cluster and perform important operations such as resize, resume, and pause on the cluster. Embracing IaC and using the right tools, such as VS Code and Terraform, will help you build scalable and maintainable distributed applications and automate processes.


About the Authors

Amit Ghodke is an Analytics Specialist Solutions Architect based out of Austin. He has worked with databases, data warehouses, and analytical applications for the past 16 years. He loves to help customers implement analytical solutions at scale to derive maximum business value.

Ritesh Kumar Sinha is an Analytics Specialist Solutions Architect based out of San Francisco. He has helped customers build scalable data warehousing and big data solutions for over 16 years. He loves to design and build efficient end-to-end solutions on AWS. In his spare time, he loves reading, walking, and doing yoga.
