Adding S3-to-SQS notifications

Overview

  • In certain deployment scenarios, you may need to send data to CloudQuery Asset Inventory via ClickHouse rather than using the sync process directly.

  • In those scenarios, you typically place files (e.g. Parquet files) into S3, and ClickHouse then reads from S3 to ingest the data.

  • SQS notifications allow ClickHouse to ingest the data as soon as it is placed into S3, instead of ClickHouse having to poll S3 for new objects.

Installation

This guide provides one installation option:

  • a complete S3-to-SQS deployment using Terraform. It works with your existing S3 bucket and adds a new SQS queue, the SQS queue policy required to allow S3 to write to the queue, and a bucket notification configuration to achieve the required behaviour.

Prerequisites

AWS Permissions

The AWS user/role executing the installation needs permissions for:

  • IAM

    • iam:CreatePolicy

  • S3

    • s3:PutBucketNotification

    • s3:PutBucketPolicy

  • SQS

    • sqs:CreateQueue

    • sqs:SetQueueAttributes

Example main.tf

To get started, create a working directory (or reuse your existing Terraform working directory if you are comfortable doing so, and skip the parts of the next section that do not apply).

Please place the example main.tf into your working directory:

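A minimal sketch of what such a main.tf typically contains is shown below. The resource address aws_s3_bucket.source_bucket, the variables source_bucket_name and sqs_queue_name, and the output s3_created_events_sqs_queue_arn match the commands used later in this guide; the other resource names and the region are illustrative and may differ from the file distributed with this guide.

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# Add your authorisation details (and, if appropriate, a backend block
# for Terraform state storage) here.
provider "aws" {
  region = "eu-west-1" # illustrative; use your own region
}

variable "source_bucket_name" {
  description = "Name of the existing S3 bucket that receives the data files"
  type        = string
}

variable "sqs_queue_name" {
  description = "Name of the SQS queue that will receive s3:ObjectCreated events"
  type        = string
}

# The existing bucket, imported with `terraform import` (see below).
# prevent_destroy guards against accidental deletion of the bucket.
resource "aws_s3_bucket" "source_bucket" {
  bucket = var.source_bucket_name

  lifecycle {
    prevent_destroy = true
  }
}

# The queue that will receive the object-created notifications.
resource "aws_sqs_queue" "s3_created_events" {
  name = var.sqs_queue_name
}

# Allow S3 (for this bucket only) to send messages to the queue.
resource "aws_sqs_queue_policy" "allow_s3" {
  queue_url = aws_sqs_queue.s3_created_events.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sqs:SendMessage"
      Resource  = aws_sqs_queue.s3_created_events.arn
      Condition = {
        ArnEquals = { "aws:SourceArn" = aws_s3_bucket.source_bucket.arn }
      }
    }]
  })
}

# Notify the queue whenever an object is created in the bucket.
resource "aws_s3_bucket_notification" "created_events" {
  bucket = aws_s3_bucket.source_bucket.id

  queue {
    queue_arn = aws_sqs_queue.s3_created_events.arn
    events    = ["s3:ObjectCreated:*"]
  }

  # The queue policy must exist before S3 validates the notification target.
  depends_on = [aws_sqs_queue_policy.allow_s3]
}

output "s3_created_events_sqs_queue_arn" {
  value = aws_sqs_queue.s3_created_events.arn
}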

Terraform

  • Please edit main.tf to add your provider "aws" authorisation details and your Terraform state backend configuration (if appropriate).

  • Initialise your working directory with terraform init.

  • Import the existing bucket (changing your-existing-s3-bucket-name):

terraform import -var="source_bucket_name=your-existing-s3-bucket-name" \
                 -var="sqs_queue_name=your-existing-s3-bucket-name-created" \
                 aws_s3_bucket.source_bucket your-existing-s3-bucket-name
  • Run the main apply:

terraform apply -var="source_bucket_name=your-existing-s3-bucket-name" \
                -var="sqs_queue_name=your-existing-s3-bucket-name-created"
  • Please note the s3_created_events_sqs_queue_arn output.
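
If you need the ARN again later, you can read it back from the Terraform state at any time:

terraform output s3_created_events_sqs_queue_arn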

Next Steps

The s3_created_events_sqs_queue_arn output from Terraform can be used to configure ClickHouse to consume from the SQS queue.

Removal

  • The following command reverses the terraform import, so that Terraform no longer tracks the bucket. This is important because the bucket is marked prevent_destroy to protect it from accidental deletion, and a terraform destroy that still included it would fail:

terraform state rm aws_s3_bucket.source_bucket
  • You can then safely destroy the rest of the resources:

terraform destroy -var="source_bucket_name=your-existing-s3-bucket-name" \
                  -var="sqs_queue_name=your-existing-s3-bucket-name-created"
