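Every day, Northern Trail Outfitters uploads a summary of the last 24 hours of store transactions to a new file in an Amazon S3 bucket, and files older than seven days are automatically deleted. Each file name contains a timestamp that follows a standardized naming convention.
Which two options should a consultant configure when ingesting this data stream?
Choose 2 answers
Correct answer: B, C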
When ingesting data from an Amazon S3 bucket, the consultant should configure the following options:
The refresh mode should be set to "Upsert": new records are inserted and changed records are updated in Data Cloud, while existing records that do not appear in the file are preserved. Because each daily file holds only the last 24 hours of transactions, this keeps previously ingested days intact and the data consistent with the source.
The filename should contain a wildcard to accommodate the timestamp: the file name pattern needs a variable part where the timestamp appears. For example, if the file name is store_transactions_2023-12-18.csv, the pattern could be store_transactions_*.csv. This lets the ingestion process pick up each day's new file even though its name changes daily.
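To see how the wildcard behaves, here is a minimal sketch in Python. It only illustrates glob-style matching with the standard library; it is not how Data Cloud evaluates file name patterns internally. The function name `matching_files` and the sample file names other than the one in the explanation are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern from the explanation above: "*" stands in for the timestamp.
PATTERN = "store_transactions_*.csv"


def matching_files(names, pattern=PATTERN):
    """Return the names that match the glob-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical bucket listing for illustration only.
files = [
    "store_transactions_2023-12-17.csv",
    "store_transactions_2023-12-18.csv",
    "inventory_2023-12-18.csv",
]

print(matching_files(files))
```

A fixed file name such as store_transactions_2023-12-18.csv would match on one day only, which is why the wildcard is needed for a daily upload.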
The other options are not necessary or relevant for this scenario:
Deletion of old files is a feature of the Amazon S3 bucket, not the Data Cloud ingestion process. Data Cloud does not delete any files from the source, nor does it require the source files to be deleted after ingestion.
Full Refresh is a refresh mode that deletes all existing records in Data Cloud and replaces them with the records from the source file. This is not suitable for this scenario, as it would result in data loss and inconsistency, especially if the source file only contains the summary of the last 24 hours of transactions.
Reference: Ingest Data from Amazon S3, Refresh Modes
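The difference between the two refresh modes can be sketched with plain Python dictionaries keyed by a record identifier. This is a conceptual model under stated assumptions, not Data Cloud's implementation: the functions `upsert` and `full_refresh`, the transaction IDs, and the amounts are all hypothetical.

```python
def upsert(existing, incoming):
    """Insert new keys and overwrite changed ones; keep everything else."""
    merged = dict(existing)
    merged.update(incoming)
    return merged


def full_refresh(existing, incoming):
    """Discard all existing records and keep only the incoming file's records."""
    return dict(incoming)


# Hypothetical daily files: transaction ID -> amount.
day1 = {"T1": 100, "T2": 250}
day2 = {"T2": 260, "T3": 75}  # T2 corrected, T3 new, T1 not in today's file

print(upsert(day1, day2))        # T1 survives from the previous day
print(full_refresh(day1, day2))  # T1 is lost
```

With a file that covers only the last 24 hours, the full-refresh model drops every earlier day's records on each run, which is the data loss the explanation describes.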