注:この質問は、同じシナリオを提示する一連の質問の一部です。 シリーズの各質問には、記載された目標を達成する可能性のある独自のソリューションが含まれています。 一部の質問セットには複数の正しい解決策がある場合もあれば、正しい解決策がない場合もあります。 このシナリオで質問に答えた後、その質問に戻ることはできません。 その結果、これらの質問はレビュー画面に表示されません。 階層構造を持つAzure Databricksワークスペースを作成する予定です。 ワークスペースには、次の3つのワークロードが含まれます。 * A workload for data engineers who will use Python and SQL * A workload for jobs that will run notebooks that use Python, Spark, Scala, and SQL * A workload that data scientists will use to perform ad hoc analysis in Scala and R The enterprise architecture team at your company identifies the following standards for Databricks environments: * The data engineers must share a cluster. * The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster. * All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists. You need to create the Databrick clusters for the workloads. Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs. Does this meet the goal?
正解:B
We would need a High Concurrency cluster for the jobs. Note: Standard clusters are recommended for a single user. Standard can run workloads developed in any language: Python, R, Scala, and SQL. A high concurrency cluster is a managed cloud resource. The key benefits of high concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies. References: https://docs.azuredatabricks.net/clusters/configure.html