This article outlines how the Customer Journey Analytics Export datasets can be used to implement the following data export use case:
Exporting data using Experience Platform Export datasets allows you to export data from your Customer Journey Analytics data views to any cloud storage destination.
You can export raw datasets from the data lake in Experience Platform to cloud storage destinations. In Experience Platform Destinations terminology, this type of export is referred to as Dataset export destinations. See Export datasets to cloud storage destinations for an overview.
The following cloud storage destinations are supported:
You can export and schedule the export of your datasets through the Experience Platform UI. This section describes the steps involved.
Once you have determined the cloud storage destination to which you want to export the dataset, select the destination. If you have not yet configured a destination for your preferred cloud storage, you must create a new destination connection.
As part of configuring a destination, you can define:
When you have selected the destination, select your dataset from the list of datasets in the next step, Select datasets. If you have created multiple scheduled queries and you want the resulting datasets to be sent to the same cloud storage destination, you can select all of the corresponding datasets. See Select your datasets for more information.
Finally, schedule your dataset export in the Scheduling step. In that step, you can define the schedule and whether the dataset export should be incremental. See Schedule dataset export for more information.
Review your selection, and when correct, start exporting your dataset to the cloud storage destination.
First, you must verify a successful data export. When exporting datasets, Experience Platform creates one or multiple .json or .parquet files in the storage location defined in your destination. Expect new files to be deposited in your storage location according to the export schedule you set up. Experience Platform creates a folder structure in the storage location that you specified as part of the selected destination, where it deposits the exported files. A new folder is created for each export time, following the pattern: folder-name-you-provided/datasetID/exportTime=YYYYMMDDHHMM. The default file name is randomly generated and ensures that exported file names are unique.
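To check exported folders programmatically, the folder pattern above can be split into its parts. The following is a minimal sketch, not part of Experience Platform: the names `ExportFolder`, `parse_export_folder`, and `is_export_file` are hypothetical, and the time zone of the `exportTime` value is not stated here, so the parsed timestamp is left without one.

```python
from datetime import datetime
from typing import NamedTuple


class ExportFolder(NamedTuple):
    """Parts of 'folder-name-you-provided/datasetID/exportTime=YYYYMMDDHHMM'."""
    base_folder: str
    dataset_id: str
    export_time: datetime  # naive: the time zone is not specified in this article


def parse_export_folder(path: str) -> ExportFolder:
    """Split an export folder path into base folder, dataset ID, and export time."""
    parts = path.strip("/").split("/")
    prefix = "exportTime="
    if len(parts) < 3 or not parts[-1].startswith(prefix):
        raise ValueError(f"not an export folder: {path!r}")
    stamp = parts[-1][len(prefix):]
    return ExportFolder(
        base_folder="/".join(parts[:-2]),
        dataset_id=parts[-2],
        export_time=datetime.strptime(stamp, "%Y%m%d%H%M"),
    )


def is_export_file(name: str) -> bool:
    """True for the two file types named above (.json and .parquet)."""
    return name.lower().endswith((".json", ".parquet"))
```

For example, `parse_export_folder("exports/abc123/exportTime=202401151030")` yields the base folder `exports`, the dataset ID `abc123`, and an export time of 15 January 2024, 10:30.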
Alternatively, you can export and schedule the export of datasets using APIs. The steps involved are documented in Export datasets by using the Flow Service API.
To export datasets, ensure you have the required permissions. Also verify that the destination to which you want to send your dataset supports exporting datasets. You must then gather the values for the required and optional headers that you use in the API calls. You also need to identify the connection spec and flow spec IDs of the destination to which you intend to export datasets.
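As an illustration of gathering header values, the sketch below assembles them in one place. The header names shown (`Authorization`, `x-api-key`, `x-gw-ims-org-id`, `x-sandbox-name`, `Content-Type`) are the ones commonly used for Experience Platform API calls and are an assumption here, as is the helper name `build_headers`; confirm the required and optional headers in the Flow Service API documentation.

```python
def build_headers(access_token: str, api_key: str, org_id: str,
                  sandbox_name: str, with_body: bool = False) -> dict:
    """Assemble request headers (header names are assumptions, see lead-in)."""
    values = {
        "access_token": access_token,
        "api_key": api_key,
        "org_id": org_id,
        "sandbox_name": sandbox_name,
    }
    # Fail early on missing values instead of sending an incomplete request.
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise ValueError(f"missing header values: {', '.join(missing)}")

    headers = {
        "Authorization": f"Bearer {access_token}",
        "x-api-key": api_key,
        "x-gw-ims-org-id": org_id,
        "x-sandbox-name": sandbox_name,
    }
    if with_body:
        # Only requests that carry a JSON payload (POST) need a content type.
        headers["Content-Type"] = "application/json"
    return headers
```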
You can retrieve a list of eligible datasets for export and verify whether your dataset is part of that list using the GET /connectionSpecs/{id}/configs API.
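The eligibility check can be sketched as follows. The base URL constant, the helper names, and the assumption that each returned dataset record carries an `id` field are illustrative only and are not taken from this article; the actual response structure and any query parameters are described in the Flow Service API documentation.

```python
from typing import Iterable, Mapping
from urllib.parse import quote

# Assumption: commonly used Flow Service base URL; verify for your environment.
FLOW_SERVICE_BASE = "https://platform.adobe.io/data/foundation/flowservice"


def eligible_datasets_url(connection_spec_id: str,
                          base: str = FLOW_SERVICE_BASE) -> str:
    """Build the URL for GET /connectionSpecs/{id}/configs."""
    return f"{base}/connectionSpecs/{quote(connection_spec_id, safe='')}/configs"


def is_dataset_eligible(dataset_id: str,
                        datasets: Iterable[Mapping[str, object]]) -> bool:
    """Check whether dataset_id appears in already-extracted dataset records.

    Assumes each record is a mapping with an 'id' key (hypothetical shape).
    """
    return any(record.get("id") == dataset_id for record in datasets)
```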
Next, you must create a source connection for the dataset that you want to export to the cloud storage destination, using the dataset's unique ID. You use the POST /sourceConnections API.
You must now create a base connection to authenticate and securely store the credentials to your cloud storage destination using the POST /connections API.
Next, you must create a target connection that stores the export parameters for your dataset using the POST /targetConnections API. These export parameters include location, file format, compression, and more.
Finally, you set up the dataflow to ensure that your dataset is exported to your cloud storage destination using the POST /flows API. In this step, you can define the schedule for the export, using the scheduleParams parameter.
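A schedule definition could be assembled along the following lines. This is a sketch under stated assumptions: the field names (`exportMode`, `interval`, `timeUnit`, `startTime`), the two export mode values, and the use of epoch seconds are recalled from the Flow Service tutorial and are not confirmed by this article, and `build_schedule_params` is a hypothetical helper. Verify all of them against the API reference before use.

```python
from datetime import datetime, timezone


def build_schedule_params(start: datetime, incremental: bool,
                          interval: int = 24, time_unit: str = "hour") -> dict:
    """Build a scheduleParams-style dictionary (field names are assumptions)."""
    if start.tzinfo is None:
        # A naive datetime would be interpreted in local time; require clarity.
        raise ValueError("start must be timezone-aware")
    if interval <= 0:
        raise ValueError("interval must be positive")
    return {
        # Assumed values: one full export followed by incremental ones,
        # or a full export on every run.
        "exportMode": ("FIRST_FULL_THEN_INCREMENTAL" if incremental
                       else "DAILY_FULL_EXPORT"),
        "interval": interval,
        "timeUnit": time_unit,
        # Assumed format: Unix epoch seconds.
        "startTime": int(start.timestamp()),
    }
```

Requiring a timezone-aware start time avoids a schedule that silently shifts depending on the machine that builds the request.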
To check for successful executions of your dataflow, use the GET /runs API, specifying the dataflow ID as a query parameter. This dataflow ID is the identifier returned when you set up the dataflow.
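The request URL for this check might be built as shown below. The base URL and the `property=flowId==<ID>` filter syntax are assumptions that this article does not confirm, and `flow_runs_url` is a hypothetical helper; consult the Flow Service API documentation for the exact query format.

```python
from urllib.parse import urlencode

# Assumption: commonly used Flow Service base URL; verify for your environment.
FLOW_SERVICE_BASE = "https://platform.adobe.io/data/foundation/flowservice"


def flow_runs_url(flow_id: str, base: str = FLOW_SERVICE_BASE) -> str:
    """Build the URL for GET /runs filtered by dataflow ID (filter is assumed)."""
    if not flow_id:
        raise ValueError("flow_id is required")
    # urlencode percent-encodes the '==' in the filter expression as %3D%3D.
    query = urlencode({"property": f"flowId=={flow_id}"})
    return f"{base}/runs?{query}"
```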
Verify a successful data export. When exporting datasets, Experience Platform creates one or multiple .json or .parquet files in the storage location defined in your destination. Expect new files to be deposited in your storage location according to the export schedule you set up. Experience Platform creates a folder structure in the storage location that you specified as part of the selected destination, where it deposits the exported files. A new folder is created for each export time, following the pattern: folder-name-you-provided/datasetID/exportTime=YYYYMMDDHHMM. The default file name is randomly generated and ensures that exported file names are unique.