S3

Supports:

  • ✅ Models
  • ✅ Model sync destination
  • ✅ Bulk sync source
  • ✅ Bulk sync destination

Connection

Configuration

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| auth_mode | string | Authentication Method. How to authenticate with AWS. Defaults to Access Key and Secret. Accepted values: access_key_and_secret, iam_role | true |
| aws_access_key_id | string | AWS Access Key ID. Access Key ID with read/write access to a bucket. (Required if auth_mode is “access_key_and_secret”.) | false |
| aws_secret_access_key | string | AWS Secret Access Key. (Required if auth_mode is “access_key_and_secret”.) | false |
| directory_glob_pattern | string | Tables glob path | false |
| iam_role_arn | string | IAM Role ARN. (Required if auth_mode is “iam_role”.) | false |
| is_directory_snapshot | boolean | Multi-directory multi-table | false |
| is_single_table | boolean | Files are time-based snapshots. Treat the files as a single table. | false |
| s3_bucket_name | string | S3 Bucket Name. Bucket name (folder optional); ex: s3://polytomic/dataset | true |
| s3_bucket_region | string | S3 Bucket Region | true |
| single_table_file_format | string | File format. Accepted values: csv, json, parquet | false |
| single_table_name | string | Collection name | false |
| skip_lines | integer | Skip first lines. Skip first N lines of each CSV file. | false |
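The conditional requirements in the table above can be illustrated with a role-based configuration. This is a minimal sketch, assuming that with auth_mode set to iam_role only iam_role_arn and the two required bucket fields need to be supplied. The connection name, role ARN, and account ID are placeholders for illustration, not values from a real account.

```json
{
  "name": "S3 connection (IAM role)",
  "type": "s3",
  "configuration": {
    "auth_mode": "iam_role",
    "iam_role_arn": "arn:aws:iam::123456789012:role/example-polytomic-s3",
    "s3_bucket_name": "s3://polytomic/dataset",
    "s3_bucket_region": "us-east-1"
  }
}
```

The access key fields are left out here because the table marks them as required only for access_key_and_secret.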

Example

{
  "name": "S3 connection",
  "type": "s3",
  "configuration": {
    "auth_mode": "access_key_and_secret",
    "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",
    "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "directory_glob_pattern": "",
    "iam_role_arn": "",
    "is_directory_snapshot": false,
    "is_single_table": false,
    "s3_bucket_name": "s3://polytomic/dataset",
    "s3_bucket_region": "us-east-1",
    "single_table_file_format": "csv",
    "single_table_name": "collection",
    "skip_lines": 0
  }
}
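
A variant of the example with is_single_table enabled may also be useful. This is a sketch under the assumption that single_table_file_format and single_table_name take effect when is_single_table is true, which the configuration table does not state explicitly. The connection name and the collection name "events" are hypothetical.

```json
{
  "name": "S3 connection (single table)",
  "type": "s3",
  "configuration": {
    "auth_mode": "access_key_and_secret",
    "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",
    "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "is_single_table": true,
    "s3_bucket_name": "s3://polytomic/dataset",
    "s3_bucket_region": "us-east-1",
    "single_table_file_format": "parquet",
    "single_table_name": "events"
  }
}
```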

Read-only properties

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| aws_user | string | User ARN | false |
| external_id | string | External ID for the IAM role | false |

Model Sync

Source

Configuration

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| file_format | string | File format. Accepted values: csv, json, parquet | false |
| key | string | Object key. The key of the object in the bucket to read from. | false |
| model_from | string | Files. The model is generated from a single file or a multi-file archive. Accepted values: single_file, multi_file_archive | true |
| skip_lines | integer | Skip first lines. Skip first N lines of each CSV file. | false |
| subfolder | string | Subfolder to read files from (optional) | false |

Example

{
  ...
  "configuration": {
    "file_format": "",
    "key": "",
    "model_from": "",
    "skip_lines": 0,
    "subfolder": ""
  }
}
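
For comparison, a filled-in source configuration might look like the following. It is a sketch only: it shows just the "configuration" object (the surrounding model fields are omitted), uses accepted values from the table, and assumes a hypothetical object key "exports/users.csv" whose first line is skipped.

```json
{
  "configuration": {
    "file_format": "csv",
    "key": "exports/users.csv",
    "model_from": "single_file",
    "skip_lines": 1
  }
}
```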

Target

S3 connections may be used as the destination in a model sync.

All targets

Configuration

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| format | string | Output format. Output file encoding. Accepted values: csv, json-doc, json, parquet | false |

Example

{
  ...
  "target": {
    "configuration": {
      "format": "csv"
    }
  }
}

Bulk Sync

Destination

Configuration

| Name | Type | Description | Required |
| --- | --- | --- | --- |
| format | string | Output file encoding | false |
| subfolder | string | Subfolder to write to (optional) | false |

Example

{
  ...
  "destination_configuration": {
    "format": "csv",
    "subfolder": "reports"
  }
}