Action item: The September 2024 release of Experience Platform introduces the option to set an endTime
date for export dataset dataflows. Adobe is also introducing a default end date of May 1st 2025 for all dataset export dataflows created prior to the September release. For any of those dataflows, you need to update the end date in the dataflow manually before the end date, otherwise your exports for stop on that date. Use the Experience Platform UI to view which dataflows will be set to stop on May 1st.
Similarly, for any dataflows that you create without specifying an endTime
date, these will default to an end time six months from the time they are created.
This article explains the workflow required to use the Flow Service API to export datasets from Adobe Experience Platform to your preferred cloud storage location, such as Amazon S3, SFTP locations, or Google Cloud Storage.
You can also use the Experience Platform user interface to export datasets. Read the export datasets UI tutorial for more information.
The datasets that you can export depend on the Experience Platform application (Real-Time CDP, Adobe Journey Optimizer), the tier (Prime or Ultimate), and any add-ons that you purchased (for example: Data Distiller).
Refer to the table on the UI tutorial page to understand which datasets you can export.
Currently, you can export datasets to the cloud storage destinations highlighted in the screenshot and listed below.
This guide requires a working understanding of the following components of Adobe Experience Platform:
The following sections provide additional information that you must know in order to export datasets to cloud storage destinations in Platform.
To export datasets, you need the View Destinations, View Datasets, and Manage and Activate Dataset Destinations access control permissions. Read the access control overview or contact your product administrator to obtain the required permissions.
To ensure that you have the necessary permissions to export datasets and that the destination supports exporting datasets, browse the destinations catalog. If a destination has an Activate or an Export datasets control, then you have the appropriate permissions.
This tutorial provides example API calls to demonstrate how to format your requests. These include paths, required headers, and properly formatted request payloads. Sample JSON returned in API responses is also provided. For information on the conventions used in documentation for sample API calls, see the section on how to read example API calls in the Experience Platform troubleshooting guide.
In order to make calls to Platform APIs, you must first complete the Experience Platform authentication tutorial. Completing the authentication tutorial provides the values for each of the required headers in all Experience Platform API calls, as shown below:
{ACCESS_TOKEN}
{API_KEY}
{ORG_ID}
Resources in Experience Platform can be isolated to specific virtual sandboxes. In requests to Platform APIs, you can specify the name and ID of the sandbox that the operation will take place in. These are optional parameters.
{SANDBOX_NAME}
For more information on sandboxes in Experience Platform, see the sandbox overview documentation.
All requests that contain a payload (POST, PUT, PATCH) require an additional media type header:
application/json
You can find accompanying reference documentation for all the API operations in this tutorial. Refer to the Flow Service - Destinations API documentation on the Adobe Developer website. We recommend that you use this tutorial and the API reference documentation in parallel.
For descriptions of the terms that you will be encountering in this API tutorial, read the glossary section of the API reference documentation.
Before starting the workflow to export a dataset, identify the connection spec and flow spec IDs of the destination to which you are intending to export datasets to. Use the table below for reference.
Destination | Connection spec | Flow spec |
---|---|---|
Amazon S3 | 4fce964d-3f37-408f-9778-e597338a21ee |
269ba276-16fc-47db-92b0-c1049a3c131f |
Azure Blob Storage | 6d6b59bf-fb58-4107-9064-4d246c0e5bb2 |
95bd8965-fc8a-4119-b9c3-944c2c2df6d2 |
Azure Data Lake Gen 2(ADLS Gen2) | be2c3209-53bc-47e7-ab25-145db8b873e1 |
17be2013-2549-41ce-96e7-a70363bec293 |
Data Landing Zone(DLZ) | 10440537-2a7b-4583-ac39-ed38d4b848e8 |
cd2fc47e-e838-4f38-a581-8fff2f99b63a |
Google Cloud Storage | c5d93acb-ea8b-4b14-8f53-02138444ae99 |
585c15c4-6cbf-4126-8f87-e26bff78b657 |
SFTP | 36965a81-b1c6-401b-99f8-22508f1e6a26 |
354d6aad-4754-46e4-a576-1b384561c440 |
You need these IDs to construct various Flow Service entities. You also need to refer to parts of the Connection Spec itself to set up certain entities so you can retrieve the Connection Spec from Flow Service APIs. See the examples below of retrieving connection specs for all the destinations in the table:
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/4fce964d-3f37-408f-9778-e597338a21ee' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "4fce964d-3f37-408f-9778-e597338a21ee",
"name": "Amazon S3",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/6d6b59bf-fb58-4107-9064-4d246c0e5bb2' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2",
"name": "Azure Blob Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/be2c3209-53bc-47e7-ab25-145db8b873e1' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1",
"name": "Azure Data Lake Gen2",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/10440537-2a7b-4583-ac39-ed38d4b848e8' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8",
"name": "Data Landing Zone",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/c5d93acb-ea8b-4b14-8f53-02138444ae99' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99",
"name": "Google Cloud Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/36965a81-b1c6-401b-99f8-22508f1e6a26' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Response
{
"items": [
{
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26",
"name": "SFTP",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
//...
Follow the steps below to set up a dataset dataflow to a cloud storage destination. For some steps, the requests and responses differ between the various cloud storage destinations. In those cases, use the tabs on the page to retrieve the requests and responses specific to the destination that you want to connect and export datasets to. Be sure to use the correct connection spec and flow spec for the destination you are configuring.
To retrieve a list of datasets eligible for activation, start by making an API call to the below endpoint.
Request
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/connectionSpecs/23598e46-f560-407b-88d5-ea6207e49db0/configs?outputType=activationDatasets&outputField=datasets&start=0&limit=20&properties=name,state' \
--header 'accept: application/json' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-api-key: {API_KEY}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}'
Note that to retrieve eligible datasets, the connection spec ID used in the request URL must be the data lake source connection spec ID, 23598e46-f560-407b-88d5-ea6207e49db0
, and the two query parameters outputField=datasets
and outputType=activationDatasets
must be specified. All other query parameters are the standard ones supported by the Catalog Service API.
Response
{
"items": [
{
"id": "5ef3e324052581191aa6a466",
"name": "AAM Authenticated Profiles Meta Data",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e3259ad2a1191ab7dd7d",
"name": "AAM Devices Data",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e325582424191b1beb42",
"name": "AAM Devices Profile Meta Data",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e328582424191b1beb44",
"name": "AAM Realtime",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
},
{
"id": "5ef3e328fe742a191b2b3ea5",
"name": "AAM Realtime Profile Updates",
"description": "Activation profile export dataset",
"fileDescription": {
"persisted": true,
"containerFormat": "parquet",
"format": "parquet"
},
"aspect": "production",
"state": "DRAFT"
}
],
"pageInfo": {
"start": 0,
"end": 4,
"total": 149,
"hasNext": true
}
}
A successful response contains a list of datasets eligible for activation. These datasets can be used when constructing the source connection in the next step.
For information about the various response parameters for each returned dataset, refer to the Datasets API developer documentation.
After retrieving the list of datasets that you want to export, you can create a source connection using those dataset IDs.
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/sourceConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Connecting to Data Lake",
"description": "Data Lake source connection to export datasets",
"connectionSpec": {
"id": "23598e46-f560-407b-88d5-ea6207e49db0", // this connection spec ID is always the same for Source Connections
"version": "1.0"
},
"params": {
"datasets": [ // datasets to activate
{
"dataSetId": "5ef3e3259ad2a1191ab7dd7d",
"name": "AAM Devices Data"
}
]
}
}'
Response
{
"id": "900df191-b983-45cd-90d5-4c7a0326d650",
"etag": "\"0500ebe1-0000-0200-0000-63e28d060000\""
}
A successful response returns the ID (id
) of the newly created source connection and an etag
. Note down the source connection ID as you will need it later when creating the dataflow.
Please also remember that:
A base connection securely stores the credentials to your destination. Depending on the destination type, the credentials needed to authenticate against that destination can vary. To find these authentication parameters, first retrieve the connection spec for your desired destination as described in the section Gather connection specs and flow specs and then look at the authSpec
of the response. Reference the tabs below for the authSpec
properties of all supported destinations.
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "4fce964d-3f37-408f-9778-e597338a21ee",
"name": "Amazon S3",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "Access Key",
"type": "KeyBased",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "Defines auth params required for connecting to amazon-s3",
"type": "object",
"properties": {
"s3AccessKey": {
"description": "Access key id",
"type": "string",
"pattern": "^[A-Z2-7]{20}$"
},
"s3SecretKey": {
"description": "Secret access key for the user account",
"type": "string",
"format": "password",
"pattern": "^[A-Za-z0-9\/\\+]{40}$"
}
},
"required": [
"s3SecretKey",
"s3AccessKey"
]
}
}
],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2",
"name": "Azure Blob Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "ConnectionString",
"type": "ConnectionString",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "Connection String for Azure Blob based destinations",
"type": "object",
"properties": {
"connectionString": {
"description": "connection string for login",
"type": "string",
"format": "password"
}
},
"required": [
"connectionString"
]
}
}
],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1",
"name": "Azure Data Lake Gen2",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "Azure Service Principal Auth",
"type": "AzureServicePrincipal",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to adlsgen2 using service principal",
"type": "object",
"properties": {
"url": {
"description": "Endpoint for Azure Data Lake Storage Gen2.",
"type": "string"
},
"servicePrincipalId": {
"description": "Service Principal Id to connect to ADLSGen2.",
"type": "string"
},
"servicePrincipalKey": {
"description": "Service Principal Key to connect to ADLSGen2.",
"type": "string",
"format": "password"
},
"tenant": {
"description": "Tenant information(domain name or tenant ID).",
"type": "string"
}
},
"required": [
"servicePrincipalKey",
"url",
"tenant",
"servicePrincipalId"
]
}
}
],
//...
The Data Landing Zone destination does not require an auth spec.
{
"items": [
{
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8",
"name": "Data Landing Zone",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [],
//...
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99",
"name": "Google Cloud Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "Google Cloud Storage authentication credentials",
"type": "GoogleCloudStorageAuth",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to google cloud storage connector.",
"type": "object",
"properties": {
"accessKeyId": {
"description": "Access Key Id for the user account",
"type": "string"
},
"secretAccessKey": {
"description": "Secret Access Key for the user account",
"type": "string",
"format": "password"
}
},
"required": [
"accessKeyId",
"secretAccessKey"
]
}
}
],
//...
The SFTP destination contains two separate items in the auth spec, as it supports both password and SSH key authentication.
Note the highlighted line with inline comments in the connection spec example below, which provide additional information about where to find the authentication parameters in the connection spec.
{
"items": [
{
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26",
"name": "SFTP",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [ // describes the authentication parameters
{
"name": "SFTP with Password",
"type": "SFTP",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to sftp locations with a password",
"type": "object",
"properties": {
"domain": {
"description": "Domain of server",
"type": "string"
},
"username": {
"description": "Username",
"type": "string"
},
"password": {
"description": "Password",
"type": "string",
"format": "password"
}
},
"required": [
"password",
"domain",
"username"
]
}
},
{
"name": "SFTP with SSH Key",
"type": "SFTP",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "defines auth params required for connecting to sftp locations using SSH Key",
"type": "object",
"properties": {
"domain": {
"description": "Domain of server",
"type": "string"
},
"username": {
"description": "Username",
"type": "string"
},
"sshKey": {
"description": "Base64 string of the private SSH key",
"type": "string",
"format": "password",
"contentEncoding": "base64",
"uiAttributes": {
"tooltip": {
"id": "platform_destinations_connect_sftp_ssh",
"fallbackUrl": "http://www.adobe.com/go/destinations-sftp-connection-parameters-en "
}
}
}
},
"required": [
"sshKey",
"domain",
"username"
]
}
}
],
//...
Using the properties specified in the authentication spec (i.e. authSpec
from the response) you can create a base connection with the required credentials, specific to each destination type, as shown in the examples below:
Request
For information on how to obtain the required authentication credentials, refer to the authenticate to destination section of the Amazon S3 destination documentation page.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Amazon S3 Base Connection",
"auth": {
"specName": "Access Key",
"params": {
"s3SecretKey": "<Add secret key>",
"s3AccessKey": "<Add access key>"
}
},
"connectionSpec": {
"id": "4fce964d-3f37-408f-9778-e597338a21ee", // Amazon S3 connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required authentication credentials, refer to the authenticate to destination section of the Azure Blob Storage destination documentation page.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Azure Blob Storage Base Connection",
"auth": {
"specName": "ConnectionString",
"params": {
"connectionString": "<Add Azure Blob connection string>"
}
},
"connectionSpec": {
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2", // Azure Blob Storage connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required authentication credentials, refer to the authenticate to destination section of the Azure Data Lake Gen 2(ADLS Gen2) destination documentation page.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Azure Data Lake Gen 2(ADLS Gen2) Base Connection",
"auth": {
"specName": "Azure Service Principal Auth",
"params": {
"servicePrincipalKey": "<Add servicePrincipalKey>",
"url": "<Add url>",
"tenant": "<Add tenant>",
"servicePrincipalId": "<Add servicePrincipalId>"
}
},
"connectionSpec": {
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1", // Azure Data Lake Gen 2(ADLS Gen2) connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
No authentication credentials are required for the Data Landing Zone destination. For more information, refer to the authenticate to destination section of the Data Landing Zone destination documentation page.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Data Landing Zone Base Connection",
"connectionSpec": {
"id": "3567r537-2a7b-4583-ac39-ed38d4b848e8",
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required authentication credentials, refer to the authenticate to destination section of the Google Cloud Storage destination documentation page.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "Google Cloud Storage Base Connection",
"auth": {
"specName": "Google Cloud Storage authentication credentials",
"params": {
"accessKeyId": "<Add accessKeyId>",
"secretAccessKey": "<Add secret Access Key>"
}
},
"connectionSpec": {
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99", // Google Cloud Storage connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required authentication credentials, refer to the authenticate to destination section of the SFTP destination documentation page.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "SFTP with password Base Connection",
"auth": {
"specName": "SFTP with Password",
"params": {
"domain": "<Add domain>",
"username": "<Add username>",
"password": "<Add password>"
}
},
"connectionSpec": {
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26", // SFTP connection spec
"version": "1.0"
}
}'
For information on how to obtain the required authentication credentials, refer to the authenticate to destination section of the SFTP destination documentation page.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/connections' \
--header 'accept: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--header 'x-api-key: <API-KEY>' \
--header 'x-gw-ims-org-id: <IMS-ORG-ID>' \
--header 'x-sandbox-name: <SANDBOX-NAME>' \
--header 'Content-Type: application/json' \
--data-raw '{
"name": "SFTP with SSH key Base Connection",
"auth": {
"specName": "SFTP with SSH Key",
"params": {
"domain": "<Add domain>",
"username": "<Add username>",
"sshKey": "<Add SSH key>"
}
},
"connectionSpec": {
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26", // SFTP connection spec
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Note the connection ID from the response. This ID will be required in the next step when creating the target connection.
Next, you need to create a target connection which stores the export parameters for your datasets. Export parameters include location, file format, compression, and other details. Refer to the targetSpec
properties provided in the destination’s connection spec to understand the supported properties for each destination type. Reference the tabs below for the targetSpec
properties of all supported destinations.
Exports to JSON files are supported in a compressed mode only. Exports to Parquet files are supported in both compressed and uncompressed modes.
The format of the exported JSON file is NDJSON, which is the standard interchange format in the big data ecosystem. Adobe recommends using an NDJSON-compatible client to read the exported files.
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "4fce964d-3f37-408f-9778-e597338a21ee",
"name": "Amazon S3",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"bucketName": {
"title": "Bucket name",
"description": "Bucket name",
"type": "string",
"pattern": "(?=^.{3,63}$)(?!^(\\d+\\.)+\\d+$)(^(([a-z0-9]|[a-z0-9][a-z0-9\\-]*[a-z0-9])\\.)*([a-z0-9]|[a-z0-9][a-z0-9\\-]*[a-z0-9])$)",
"uiAttributes": {
"tooltip": {
"id": "platform_destinations_connect_s3_bucket",
"fallbackUrl": "http://www.adobe.com/go/destinations-amazon-s3-connection-parameters-en"
}
}
},
"path": {
"title": "Folder path",
"description": "Output path for copying files",
"type": "string",
"pattern": "^[0-9a-zA-Z\/\\!\\-_\\.\\*\\''\\(\\)]*((\\%SEGMENT_(NAME|ID)\\%)?\/?)+$",
"uiAttributes": {
"tooltip": {
"id": "platform_destinations_connect_s3_folderpath",
"fallbackUrl": "http://www.adobe.com/go/destinations-amazon-s3-connection-parameters-en"
}
}
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"bucketName",
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2",
"name": "Azure Blob Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"path": {
"title": "Folder path",
"description": "Output path (relative) indicating where to upload the data",
"type": "string",
"pattern": "^[0-9a-zA-Z\/\\!\\-_\\.\\*\\'\\(\\)]+$"
},
"container": {
"title": "Container",
"description": "Container within the storage where to upload the data",
"type": "string",
"pattern": "^[a-z0-9](?!.*--)[a-z0-9-]{1,61}[a-z0-9]$"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"container",
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1",
"name": "Azure Data Lake Gen2",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"path": {
"title": "Folder path",
"description": "Enter the path to your Azure Data Lake Storage folder",
"type": "string"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions":{...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
"items": [
{
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8",
"name": "Data Landing Zone",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [],
"encryptionSpecs": [],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"path": {
"title": "Folder path",
"description": "Enter the path to your Azure Data Lake Storage folder",
"type": "string"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99",
"name": "Google Cloud Storage",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"bucketName": {
"title": "Bucket name",
"description": "Bucket name",
"type": "string",
"pattern": "(?!^goog.*$)(?!^.*g(o|0)(o|0)gle.*$)(((?=^.{3,63}$)(^([a-z0-9]|[a-z0-9][a-z0-9\\-_]*)[a-z0-9]$))|((?=^.{3,222}$)(?!^(\\d+\\.)+\\d+$)(^(([a-z0-9]{1,63}|[a-z0-9][a-z0-9\\-_]{1,61}[a-z0-9])\\.)*([a-z0-9]{1,63}|[a-z0-9][a-z0-9\\-_]{1,61}[a-z0-9])$)))"
},
"path": {
"title": "Folder path",
"description": "Output path for copying files",
"type": "string",
"pattern": "^[0-9a-zA-Z\/\\!\\-_\\.\\*\\''\\(\\)]*((\\%SEGMENT_(NAME|ID)\\%)?\/?)+$"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"NONE",
"GZIP"
]
}
},
"required": [
"bucketName",
"path",
"datasetFileType",
"compression",
"fileType"
]
}
//...
Note the highlighted lines with inline comments in the connection spec example below, which provide additional information about where to find the target spec parameters in the connection spec. You can see also in the example below which target parameters are not applicable to dataset export destinations.
{
"items": [
{
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26",
"name": "SFTP",
"providerId": "14e34fac-d307-11e9-bb65-2a2ae2dbcce4",
"version": "1.0",
"authSpec": [...],
"encryptionSpecs": [...],
"targetSpec": { // describes the target connection parameters
"name": "User based target",
"type": "UserNamespace",
"spec": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"remotePath": {
"title": "Folder path",
"description": "Enter your folder path",
"type": "string"
},
"fileType": {...}, // not applicable to dataset destinations
"datasetFileType": {
"conditional": {
"field": "flowSpec.attributes._workflow",
"operator": "CONTAINS",
"value": "DATASETS"
},
"title": "File Type",
"description": "Select file format",
"type": "string",
"enum": [
"JSON",
"PARQUET"
]
},
"csvOptions": {...}, // not applicable to dataset destinations
"compression": {
"title": "Compression format",
"description": "Select the desired file compression format.",
"type": "string",
"enum": [
"GZIP",
"NONE"
]
}
},
"required": [
"remotePath",
"datasetFileType",
"compression",
"fileType"
]
},
//...
By using the above spec, you can construct a target connection request specific to your desired cloud storage destination, as shown in the tabs below.
Request
For information on how to obtain the required target parameters, refer to the fill in destination details section of the Amazon S3 destination documentation page.
For other supported values of datasetFileType
, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Amazon S3 Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"bucketName": "your-bucket-name",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "4fce964d-3f37-408f-9778-e597338a21ee", // Amazon S3 connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required target parameters, refer to the fill in destination details section of the Azure Blob Storage destination documentation page.
For other supported values of datasetFileType
, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Azure Blob Storage Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"container": "your-container-name",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "6d6b59bf-fb58-4107-9064-4d246c0e5bb2", // Azure Blob Storage connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required target parameters, refer to the fill in destination details section of the Azure Data Lake Gen 2(ADLS Gen2) destination documentation page.
For other supported values of datasetFileType
, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Azure Data Lake Gen 2(ADLS Gen2) Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "be2c3209-53bc-47e7-ab25-145db8b873e1", // Azure Data Lake Gen 2(ADLS Gen2) connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required target parameters, refer to the fill in destination details section of the Data Landing Zone destination documentation page.
For other supported values of datasetFileType
, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Data Landing Zone Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "10440537-2a7b-4583-ac39-ed38d4b848e8", // Data Landing Zone connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required target parameters, refer to the fill in destination details section of the Google Cloud Storage destination documentation page.
For other supported values of datasetFileType
, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Google Cloud Storage Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"bucketName": "your-bucket-name",
"path": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "c5d93acb-ea8b-4b14-8f53-02138444ae99", // Google Cloud Storage connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
For information on how to obtain the required target parameters, refer to the fill in destination details section of the SFTP destination documentation page.
For other supported values of datasetFileType
, see the API reference documentation.
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/targetConnections' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "SFTP Target Connection",
"baseConnectionId": "<FROM_STEP_CREATE_TARGET_BASE_CONNECTION>",
"params": {
"mode": "Server-to-server",
"remotePath": "folder/subfolder",
"compression": "NONE",
"datasetFileType": "JSON"
},
"connectionSpec": {
"id": "36965a81-b1c6-401b-99f8-22508f1e6a26", // SFTP connection spec id
"version": "1.0"
}
}'
Response
{
"id": "12401496-2573-4ca7-8137-fef1aeb9dd4c",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Note the Target Connection ID from the response. This ID will be required in the next step when creating the dataflow to export datasets.
The final step in the destination configuration is to set up a dataflow. A dataflow ties together previously created entities and also provides options for configuring the dataset export schedule. To create the dataflow, use the payloads below, depending on your desired cloud storage destination, and replace the entity IDs from previous steps.
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an Amazon S3 cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an Amazon S3 cloud storage destination",
"flowSpec": {
"id": "269ba276-16fc-47db-92b0-c1049a3c131f", // Amazon S3 flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the scheduleParams
section, which allows you to customize export times, frequency, location, and more for your dataset exports.
Parameter | Description |
---|---|
exportMode |
Select "DAILY_FULL_EXPORT" or "FIRST_FULL_THEN_INCREMENTAL" . For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: Full file - Once: "DAILY_FULL_EXPORT" can only be used in combination with timeUnit :day and interval :0 for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. Incremental daily exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :day , and interval :1 for daily incremental exports. Incremental hourly exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :hour , and interval :3 ,6 ,9 , or 12 for hourly incremental exports. |
timeUnit |
Select day or hour depending on the frequency with which you want to export dataset files. |
interval |
Select 1 when the timeUnit is day and 3 ,6 ,9 ,12 when the time unit is hour . |
startTime |
The date and time in UNIX seconds when dataset exports should start. |
endTime |
The date and time in UNIX seconds when dataset exports should end. |
foldernameTemplate |
Specify the expected folder name structure in your storage location where the exported files will be deposited.
|
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an Azure Blob Storage cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an Azure Blob Storage cloud storage destination",
"flowSpec": {
"id": "95bd8965-fc8a-4119-b9c3-944c2c2df6d2", // Azure Blob Storage flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the scheduleParams
section, which allows you to customize export times, frequency, location, and more for your dataset exports.
Parameter | Description |
---|---|
exportMode |
Select "DAILY_FULL_EXPORT" or "FIRST_FULL_THEN_INCREMENTAL" . For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: Full file - Once: "DAILY_FULL_EXPORT" can only be used in combination with timeUnit :day and interval :0 for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. Incremental daily exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :day , and interval :1 for daily incremental exports. Incremental hourly exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :hour , and interval :3 ,6 ,9 , or 12 for hourly incremental exports. |
timeUnit |
Select day or hour depending on the frequency with which you want to export dataset files. |
interval |
Select 1 when the timeUnit is day and 3 ,6 ,9 ,12 when the time unit is hour . |
startTime |
The date and time in UNIX seconds when dataset exports should start. |
endTime |
The date and time in UNIX seconds when dataset exports should end. |
foldernameTemplate |
Specify the expected folder name structure in your storage location where the exported files will be deposited.
|
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an Azure Data Lake Gen 2(ADLS Gen2) cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an Azure Data Lake Gen 2(ADLS Gen2) cloud storage destination",
"flowSpec": {
"id": "17be2013-2549-41ce-96e7-a70363bec293", // Azure Data Lake Gen 2(ADLS Gen2) flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the scheduleParams
section, which allows you to customize export times, frequency, location, and more for your dataset exports.
Parameter | Description |
---|---|
exportMode |
Select "DAILY_FULL_EXPORT" or "FIRST_FULL_THEN_INCREMENTAL" . For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: Full file - Once: "DAILY_FULL_EXPORT" can only be used in combination with timeUnit :day and interval :0 for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. Incremental daily exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :day , and interval :1 for daily incremental exports. Incremental hourly exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :hour , and interval :3 ,6 ,9 , or 12 for hourly incremental exports. |
timeUnit |
Select day or hour depending on the frequency with which you want to export dataset files. |
interval |
Select 1 when the timeUnit is day and 3 ,6 ,9 ,12 when the time unit is hour . |
startTime |
The date and time in UNIX seconds when dataset exports should start. |
endTime |
The date and time in UNIX seconds when dataset exports should end. |
foldernameTemplate |
Specify the expected folder name structure in your storage location where the exported files will be deposited.
|
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to a Data Landing Zone cloud storage destination",
"description": "This operation creates a dataflow to export datasets to a Data Landing Zone cloud storage destination",
"flowSpec": {
"id": "cd2fc47e-e838-4f38-a581-8fff2f99b63a", // Data Landing Zone flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the scheduleParams
section, which allows you to customize export times, frequency, location, and more for your dataset exports.
Parameter | Description |
---|---|
exportMode |
Select "DAILY_FULL_EXPORT" or "FIRST_FULL_THEN_INCREMENTAL" . For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: Full file - Once: "DAILY_FULL_EXPORT" can only be used in combination with timeUnit :day and interval :0 for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. Incremental daily exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :day , and interval :1 for daily incremental exports. Incremental hourly exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :hour , and interval :3 ,6 ,9 , or 12 for hourly incremental exports. |
timeUnit |
Select day or hour depending on the frequency with which you want to export dataset files. |
interval |
Select 1 when the timeUnit is day and 3 ,6 ,9 ,12 when the time unit is hour . |
startTime |
The date and time in UNIX seconds when dataset exports should start. |
endTime |
The date and time in UNIX seconds when dataset exports should end. |
foldernameTemplate |
Specify the expected folder name structure in your storage location where the exported files will be deposited.
|
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to a Google Cloud Storage cloud storage destination",
"description": "This operation creates a dataflow to export datasets to a Google Cloud Storage destination",
"flowSpec": {
"id": "585c15c4-6cbf-4126-8f87-e26bff78b657", // Google Cloud Storage flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the scheduleParams
section, which allows you to customize export times, frequency, location, and more for your dataset exports.
Parameter | Description |
---|---|
exportMode |
Select "DAILY_FULL_EXPORT" or "FIRST_FULL_THEN_INCREMENTAL" . For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: Full file - Once: "DAILY_FULL_EXPORT" can only be used in combination with timeUnit :day and interval :0 for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. Incremental daily exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :day , and interval :1 for daily incremental exports. Incremental hourly exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :hour , and interval :3 ,6 ,9 , or 12 for hourly incremental exports. |
timeUnit |
Select day or hour depending on the frequency with which you want to export dataset files. |
interval |
Select 1 when the timeUnit is day and 3 ,6 ,9 ,12 when the time unit is hour . |
startTime |
The date and time in UNIX seconds when dataset exports should start. |
endTime |
The date and time in UNIX seconds when dataset exports should end. |
foldernameTemplate |
Specify the expected folder name structure in your storage location where the exported files will be deposited.
|
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Request
Note the highlighted lines with inline comments in the request example, which provide additional information. Remove the inline comments in the request when copy-pasting the request into your terminal of choice.
curl --location --request POST 'https://platform.adobe.io/data/foundation/flowservice/flows' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
--data-raw '{
"name": "Activate datasets to an SFTP cloud storage destination",
"description": "This operation creates a dataflow to export datasets to an SFTP cloud storage destination",
"flowSpec": {
"id": "354d6aad-4754-46e4-a576-1b384561c440", // SFTP flow spec ID
"version": "1.0"
},
"sourceConnectionIds": [
"<FROM_STEP_CREATE_SOURCE_CONNECTION>"
],
"targetConnectionIds": [
"<FROM_STEP_CREATE_TARGET_CONNECTION>"
],
"transformations": [],
"scheduleParams": { // specify the scheduling info
"exportMode": DAILY_FULL_EXPORT or FIRST_FULL_THEN_INCREMENTAL
"interval": 3, // also supports 6, 9, 12 hour increments
"timeUnit": "hour", // also supports "day" for daily increments.
"interval": 1, // when you select "timeUnit": "day"
"startTime": 1675901210, // UNIX timestamp start time (in seconds)
"endTime": 1975901210, // UNIX timestamp end time (in seconds)
"foldernameTemplate": "%DESTINATION%_%DATASET_ID%_%DATETIME(YYYYMMdd_HHmmss)%"
}
}'
The table below provides descriptions of all parameters in the scheduleParams
section, which allows you to customize export times, frequency, location, and more for your dataset exports.
Parameter | Description |
---|---|
exportMode |
Select "DAILY_FULL_EXPORT" or "FIRST_FULL_THEN_INCREMENTAL" . For more information about the two options, refer to export full files and export incremental files in the batch destinations activation tutorial. The three available export options are: Full file - Once: "DAILY_FULL_EXPORT" can only be used in combination with timeUnit :day and interval :0 for a one-time full export of the dataset. Daily full exports of datasets are not supported. If you need daily exports, use the incremental export option. Incremental daily exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :day , and interval :1 for daily incremental exports. Incremental hourly exports: Select "FIRST_FULL_THEN_INCREMENTAL" , timeUnit :hour , and interval :3 ,6 ,9 , or 12 for hourly incremental exports. |
timeUnit |
Select day or hour depending on the frequency with which you want to export dataset files. |
interval |
Select 1 when the timeUnit is day and 3 ,6 ,9 ,12 when the time unit is hour . |
startTime |
The date and time in UNIX seconds when dataset exports should start. |
endTime |
The date and time in UNIX seconds when dataset exports should end. |
foldernameTemplate |
Specify the expected folder name structure in your storage location where the exported files will be deposited.
|
Response
{
"id": "eb54b3b3-3949-4f12-89c8-64eafaba858f",
"etag": "\"0000d781-0000-0200-0000-63e29f420000\""
}
Note the Dataflow ID from the response. This ID will be required in the next step when retrieving the dataflow runs to validate the successful dateset exports.
To check the executions of a dataflow, use the Dataflow Runs API:
Request
In the request to retrieve dataflow runs, add as query parameter the dataflow ID that you obtained in the previous step, when creating the dataflow.
curl --location --request GET 'https://platform.adobe.io/data/foundation/flowservice/runs?property=flowId==eb54b3b3-3949-4f12-89c8-64eafaba858f' \
--header 'accept: application/json' \
--header 'x-api-key: {API_KEY}' \
--header 'x-gw-ims-org-id: {ORG_ID}' \
--header 'x-sandbox-name: {SANDBOX_NAME}' \
--header 'Authorization: Bearer {ACCESS_TOKEN}' \
Response
{
"items": [
{
"id": "4b7728dd-83c9-4c38-95a4-24ddab545404",
"createdAt": 1675807718296,
"updatedAt": 1675807731834,
"createdBy": "aep_activation_batch@AdobeID",
"updatedBy": "acp_foundation_connectors@AdobeID",
"createdClient": "aep_activation_batch",
"updatedClient": "acp_foundation_connectors",
"sandboxId": "7dfdcd30-0a09-11ea-8ea6-7bf93ce86c28",
"sandboxName": "sand-1",
"imsOrgId": "5555467B5D8013E50A494220@AdobeOrg",
"flowId": "aae5ec63-b0ac-4808-9a44-abf2ea67bd5a",
"flowSpec": {
"id": "615d3489-36d2-4671-9467-4ae1129facd3",
"version": "1.0"
},
"providerRefId": "ba56f98e0c49b572adb249980c39b1c7",
"etag": "\"08005e9e-0000-0200-0000-63e2cbf30000\"",
"metrics": {
"durationSummary": {
"startedAtUTC": 1675807719411,
"completedAtUTC": 1675807719416
},
"sizeSummary": {
"inputBytes": 0
},
"recordSummary": {
"inputRecordCount": 0,
"skippedRecordCount": 0,
"sourceSummaries": [
{
"id": "ea2b1205-4692-49de-b448-ebf75b1d188a",
"inputRecordCount": 0,
"skippedRecordCount": 0,
"entitySummaries": [
{
//...
You can find information about the various parameters returned by the Dataflow runs API in the API reference documentation.
When exporting datasets, Experience Platform creates a .json
or .parquet
file in the storage location that you provided. Expect a new file to be deposited in your storage location according to the export schedule you provided when creating a dataflow.
Experience Platform creates a folder structure in the storage location you specified, where it deposits the exported dataset files. A new folder is created for each export time, following the pattern below:
folder-name-you-provided/datasetID/exportTime=YYYYMMDDHHMM
The default file name is randomly generated and ensures that exported file names are unique.
The presence of these files in your storage location is confirmation of a successful export. To understand how the exported files are structured, you can download a sample .parquet file or .json file.
In the step to create a target connection, you can select the exported dataset files to be compressed.
Note the difference in file format between the two file types, when compressed:
json.gz
gz.parquet
The API endpoints in this tutorial follow the general Experience Platform API error message principles. Refer to API status codes and request header errors in the Platform troubleshooting guide for more information on interpreting error responses.
View known limitations about dataset exports.
View a list of frequently asked questions about dataset exports.
By following this tutorial, you have successfully connected Platform to one of your preferred batch cloud storage destinations and set up a dataflow to the respective destination to export datasets. See the following pages for more details, such as how to edit existing dataflows using the Flow Service API: