2022-10-01

Redshift Data API

What is Amazon Redshift Data API

Amazon Redshift Data API is a service provided by Amazon Web Services (AWS) that enables developers to more easily and securely run SQL queries on their Amazon Redshift data warehouses. The service simplifies the process of accessing, ingesting, and egressing data for programming languages supported by AWS SDKs, such as Python, Go, Java, Node.js, PHP, Ruby, and C++.

One of the primary benefits of the Redshift Data API is that it eliminates the need for complex database connection management. Instead of establishing a JDBC or ODBC connection, developers can simply use the API to run their SQL commands.

The Data API is asynchronous, which means it can be used to run longer queries without keeping an open connection. It stores query results for 24 hours, and can directly export these results to an S3 bucket. It also integrates well with AWS Secrets Manager and AWS Identity and Access Management (IAM), offering easier and more secure management of database credentials.

Available Commands

Here are some examples of commands available in the Amazon Redshift Data API.

Run a SQL Statement

bash
$ aws redshift-data execute-statement
    --region us-west-2
    --secret arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn
    --cluster-identifier mycluster-test
    --sql "SELECT :colname, COUNT(*) FROM demo_table WHERE trip_distance > :distance"
    --parameters "[{\"name\": \"colname\", \"value\": \"ratecode\"}, \ {\"name\": \"distance\", \"value\": \"5\"}]"
    --database dev
{
  "ClusterIdentifier": "mycluster-test",
  "CreatedAt": 1598323175.823,
  "Database": "dev",
  "Id": "c016234e-5c6c-4bc5-bb16-2c5b8ff61814",
  "SecretArn": "arn:aws:secretsmanager:us-west-2:123456789012:secret:yanruiz-secret-hKgPWn"
}

Run Multiple SQL Statements

bash
$ aws redshift-data batch-execute-statement
    --region us-west-2
    --db-user myuser
    --cluster-identifier mycluster-test
    --database dev
    --sqls "set timezone to BST" "select * from mytable" "select * from another_table"
{
  "ClusterIdentifier": "mycluster-test",
  "CreatedAt": 1598306924.632,
  "Database": "dev",
  "DbUser": "myuser",
  "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766"
}

Cancel a Running Query

bash
$ aws redshift-data cancel-statement
    --id d9b6c0c9-0747-4bf4-b142-e8883122f766

Fetch the Results of a SQL Statement

bash
$ aws redshift-data get-statement-result
    --id d9b6c0c9-0747-4bf4-b142-e8883122f766
    --region us-west-2
{
  "ColumnMetadata": [
    {
      "isCaseSensitive": false,
      "isCurrency": false,
      "isSigned": true,
      "label": "userid",
      "length": 0,
      "name": "userid",
      "nullable": 0,
      "precision": 10,
      "scale": 0,
      "schemaName": "",
      "tableName": "stll_query",
      "typeName": "int4"
    },
    {
      "isCaseSensitive": true,
      "isCurrency": false,
      "isSigned": false,
      "label": "label",
      "length": 0,
      "name": "label",
      "nullable": 0,
      "precision": 320,
      "scale": 0,
      "schemaName": "",
      "tableName": "stll_query",
      "typeName": "bpchar"
    }
  ],
  "Records": [
    [
      {
        "longValue": 1
      },
      {
        "stringValue": "health"
      }
    ]
  ],
  "TotalNumRows": 1
}

Describe Metadata about a SQL Statement

bash
$ aws redshift-data describe-statement
    --id d9b6c0c9-0747-4bf4-b142-e8883122f766
    --region us-west-2
{
  "ClusterIdentifier": "mycluster-test",
  "CreatedAt": 1598306924.632,
  "Duration": 1095981511,
  "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766",
  "QueryString": "select * from stl_query limit 1",
  "RedshiftPid": 20859,
  "RedshiftQueryId": 48879,
  "ResultRows": 1,
  "ResultSize": 4489,
  "Status": "FINISHED",
  "UpdatedAt": 1598306926.667
}

Describe a Table

bash
$ aws redshift-data describe-table
    --region us-west-2
    --cluster-identifier mycluster-test
    --database dev
    --schema information_schema
    --table sql_features
    --secret arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn
{
  "ColumnList": [
    {
      "isCaseSensitive": false,
      "isCurrency": false,
      "isSigned": false,
      "length": 2147483647,
      "name": "feature_id",
      "nullable": 1,
      "precision": 2147483647,
      "scale": 0,
      "schemaName": "information_schema",
      "tableName": "sql_features",
      "typeName": "character_data"
    },
    {
      "isCaseSensitive": false,
      "isCurrency": false,
      "isSigned": false,
      "length": 2147483647,
      "name": "feature_name",
      "nullable": 1,
      "precision": 2147483647,
      "scale": 0,
      "schemaName": "information_schema",
      "tableName": "sql_features",
      "typeName": "character_data"
    }
  ]
}

List Metadata about SQL Statements

bash
$ aws redshift-data list-statements
    --region us-west-2
    --status ALL
{
  "Statements": [
    {
      "CreatedAt": 1598306924.632,
      "Id": "d9b6c0c9-0747-4bf4-b142-e8883122f766",
      "QueryString": "select * from stl_query limit 1",
      "Status": "FINISHED",
      "UpdatedAt": 1598306926.667
    },
    {
      "CreatedAt": 1598311717.437,
      "Id": "e0ebd578-58b3-46cc-8e52-8163fd7e01aa",
      "QueryString": "select * from stl_query limit 1",
      "Status": "FAILED",
      "UpdatedAt": 1598311719.008
    },
    {
      "CreatedAt": 1598313683.65,
      "Id": "c361d4f7-8c53-4343-8c45-6b2b1166330c",
      "QueryString": "select * from stl_query limit 1",
      "Status": "ABORTED",
      "UpdatedAt": 1598313685.495
    },
    {
      "CreatedAt": 1598306653.333,
      "Id": "a512b7bd-98c7-45d5-985b-a715f3cfde7f",
      "QueryString": "select 1",
      "Status": "FINISHED",
      "UpdatedAt": 1598306653.992
    }
  ]
}

List the Databases in a Cluster

bash
$ aws redshift-data list-databases
    --region us-west-2
    --secret arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn
    --cluster-identifier mycluster-test
    --database dev
{
  "Databases": ["dev"]
}

List the Schemas in a Database

bash
$ aws redshift-data list-schemas
    --region us-west-2
    --secret arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn
    --cluster-identifier mycluster-test
    --database dev
{
  "Schemas": ["information_schema", "pg_catalog", "pg_internal", "public"]
}

List the Tables in a Database

bash
$ aws redshift-data list-tables
    --region us-west-2
    --secret arn:aws:secretsmanager:us-west-2:123456789012:secret:myuser-secret-hKgPWn
    --cluster-identifier mycluster-test
    --database dev
    --schema information_schema
{
  "Tables": [
    {
      "name": "sql_features",
      "schema": "information_schema",
      "type": "SYSTEM TABLE"
    },
    {
      "name": "sql_implementation_info",
      "schema": "information_schema",
      "type": "SYSTEM TABLE"
    }
  ]
}

References

https://docs.aws.amazon.com/redshift/latest/mgmt/data-api.html
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/redshift-data/index.html
https://docs.aws.amazon.com/redshift/latest/mgmt/data-api-calling.html

Ryusei Kakujo

researchgatelinkedingithub

Focusing on data science for mobility

Bench Press 100kg!