A data engineer needs to apply custom logic to identify employees with more than 5 years of experience in the array column employees of the table stores. The custom logic should create a new column exp_employees that is an array of all the employees with more than 5 years of experience for each row. To apply this custom logic at scale, the data engineer wants to use the FILTER higher-order function.
Which of the following code blocks successfully completes this task?

A. Option E
B. Option C
C. Option A
D. Option D
E. Option B
Correct answer: C
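A minimal sketch of what the FILTER higher-order function computes for each row, modeled in plain Python rather than run on Spark. The field name years_exp, the sample rows, and the helper filter_array are assumptions made for illustration; the question does not give the schema of employees. The Spark SQL string is included only for comparison and is not executed.

```python
# Spark SQL form, shown for comparison only (not executed here).
# years_exp is an assumed field name; the real schema is not given.
SPARK_SQL = """
SELECT *,
       FILTER(employees, e -> e.years_exp > 5) AS exp_employees
FROM stores
"""


def filter_array(items, predicate):
    """Plain-Python model of FILTER: keep elements for which predicate is true."""
    return [item for item in items if predicate(item)]


# Hypothetical rows standing in for the stores table.
stores = [
    {
        "store_id": 1,
        "employees": [
            {"name": "Ana", "years_exp": 7},
            {"name": "Bo", "years_exp": 2},
        ],
    },
    {"store_id": 2, "employees": [{"name": "Cy", "years_exp": 5}]},
]

# Add exp_employees to every row, leaving the original employees array intact.
with_exp = [
    {
        **row,
        "exp_employees": filter_array(
            row["employees"], lambda e: e["years_exp"] > 5
        ),
    }
    for row in stores
]
```

Note that the comparison is strict: an employee with exactly 5 years is not included, matching "more than 5 years" in the question.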
Question 2:
A data engineering team has two tables. The first table march_transactions is a collection of all retail transactions in the month of March. The second table april_transactions is a collection of all retail transactions in the month of April. There are no duplicate records between the tables.
Which of the following commands should be run to create a new table all_transactions that contains all records from march_transactions and april_transactions without duplicate records?
A. CREATE TABLE all_transactions AS
SELECT * FROM march_transactions
INTERSECT SELECT * from april_transactions;
B. CREATE TABLE all_transactions AS
SELECT * FROM march_transactions
OUTER JOIN SELECT * FROM april_transactions;
C. CREATE TABLE all_transactions AS
SELECT * FROM march_transactions
MERGE SELECT * FROM april_transactions;
D. CREATE TABLE all_transactions AS
SELECT * FROM march_transactions
UNION SELECT * FROM april_transactions;
E. CREATE TABLE all_transactions AS
SELECT * FROM march_transactions
INNER JOIN SELECT * FROM april_transactions;
Correct answer: D
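A small sketch of why UNION fits: it stacks the rows of both queries and drops duplicates. The example uses Python's built-in sqlite3 module as a stand-in for Spark SQL, and the column names and sample rows are assumptions. One row is deliberately repeated across the two tables to make the deduplication visible, even though the question states that no such overlap exists.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Spark SQL.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and data; (2, 20.0) appears in both tables on purpose.
conn.executescript(
    """
    CREATE TABLE march_transactions (id INTEGER, amount REAL);
    CREATE TABLE april_transactions (id INTEGER, amount REAL);
    INSERT INTO march_transactions VALUES (1, 10.0), (2, 20.0);
    INSERT INTO april_transactions VALUES (2, 20.0), (3, 30.0);
    """
)

# UNION (unlike UNION ALL) removes duplicate rows from the combined result.
conn.execute(
    """
    CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    UNION SELECT * FROM april_transactions
    """
)

rows = sorted(conn.execute("SELECT * FROM all_transactions").fetchall())
```

INTERSECT would instead keep only rows present in both tables, and the JOIN and MERGE forms in the other options are not valid in this position.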
Question 3:
A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following commands could the data engineering team use to access sales in PySpark?
A. spark.delta.table("sales")
B. There is no way to share data between PySpark and SQL.
C. SELECT * FROM sales
D. spark.sql("sales")
E. spark.table("sales")
Correct answer: E
Question 4:
Which of the following is hosted completely in the control plane of the classic Databricks architecture?
A. Databricks Filesystem
B. Worker node
C. Databricks web application
D. JDBC data source
E. Driver node
Correct answer: C
Question 5:
A data engineer who is new to Python needs to create a Python function that adds two integers together and returns the sum.
Which of the following code blocks can the data engineer use to complete this task?
A.

B.

C.

D.

E.

Correct answer: D
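The option bodies for this question are not reproduced in the text, so the following is only a sketch of what a correct function looks like, not a transcription of option D. The name add_integers and its parameter names are hypothetical.

```python
def add_integers(x: int, y: int) -> int:
    """Return the sum of two integers."""
    # `def` declares the function and `return` hands the result back;
    # printing the sum instead would leave the caller with None.
    return x + y


total = add_integers(2, 3)
```

The two points such a question usually checks are the def keyword (not function) and an explicit return statement.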