A machine learning engineer wants to log feature importance data from a CSV file at the path importance_path with an MLflow run for the model model.
Which of the following code blocks will accomplish this task inside of an existing MLflow run block?
A)

B)

C) mlflow.log_data(importance_path, "feature-importance.csv")
D) mlflow.log_artifact(importance_path, "feature-importance.csv")
E) None of these code blocks can accomplish the task.
A. Option E
B. Option C
C. Option A
D. Option D
E. Option B
Correct answer: C
Question 2:
A data scientist wants to remove the star_rating column from the Delta table at the location path. To do this, they need to load in data and drop the star_rating column.
Which of the following code blocks accomplishes this task?
A. spark.read.table(path).drop("star_rating")
B. spark.read.format("delta").load(path).drop("star_rating")
C. spark.read.format("delta").table(path).drop("star_rating")
D. Delta tables cannot be modified
E. spark.sql("SELECT * EXCEPT star_rating FROM path")
Correct answer: B
Question 3:
A machine learning engineer needs to deliver predictions of a machine learning model in real-time. However, the feature values needed for computing the predictions are available one week before the query time.
Which of the following is a benefit of using a batch serving deployment in this scenario rather than a real-time serving deployment where predictions are computed at query time?
A. There is no advantage to using batch serving deployments over real-time serving deployments
B. Querying stored predictions can be faster than computing predictions in real-time
C. Batch serving has built-in capabilities in Databricks Machine Learning
D. Testing is not possible in real-time serving deployments
E. Computing predictions in real-time provides more up-to-date results
Correct answer: B
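The trade-off in this question can be illustrated with a minimal sketch. It assumes a hypothetical in-memory feature table and a stand-in model whose cost is simulated with a short sleep. None of the names below come from Databricks or MLflow. Because the features are known ahead of time, a batch job can compute every prediction once, and serving then reduces to a lookup.

```python
import time

def slow_model_predict(features):
    """Stand-in for an expensive model; the sleep simulates inference cost (assumption)."""
    time.sleep(0.01)
    return sum(features) / len(features)

# Hypothetical feature table: values are available well before query time.
features_by_id = {f"user_{i}": [i, i + 1, i + 2] for i in range(20)}

# Batch job: compute all predictions ahead of time and store them.
precomputed = {key: slow_model_predict(vals) for key, vals in features_by_id.items()}

def serve_batch(key):
    """Serve a stored prediction: a dictionary lookup, no model call."""
    return precomputed[key]

def serve_realtime(key):
    """Compute the prediction at query time: pays the model cost on every request."""
    return slow_model_predict(features_by_id[key])

def timed(fn, key):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(key)
    return result, time.perf_counter() - start

batch_result, batch_seconds = timed(serve_batch, "user_3")
realtime_result, realtime_seconds = timed(serve_realtime, "user_3")
print(batch_result == realtime_result, batch_seconds < realtime_seconds)
```

Both paths return the same value. Only the latency at query time differs, which is why stored predictions are attractive when the features do not change between the batch run and the query.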
Question 4:
A machine learning engineer has created a webhook with the following code block:

Which of the following code blocks will trigger this webhook to run the associated job?
A.

B.

C.

D.

E.

Correct answer: E
Question 5:
A data scientist has developed and logged a scikit-learn random forest model model, and then they ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object.
Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?
A. This can only be viewed in the MLflow Experiments UI
B. client.list_artifacts(run_id)["feature-importances.csv"]
C. mlflow.sklearn.load_model(model_uri)
D. mlflow.load_model(model_uri)
E. client.pyfunc.load_model(model_uri)
Correct answer: C
Question 6:
Which of the following is a simple statistic to monitor for categorical feature drift?
A. Mode
B. None of these
C. Percentage of missing values
D. Number of unique values
E. Mode, number of unique values, and percentage of missing values
Correct answer: E
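The three statistics named in the options can be computed in a few lines. The following is a minimal sketch using only the Python standard library. The helper names categorical_summary and changed_statistics and the sample data are hypothetical, and missing values are assumed to be represented as None.

```python
from collections import Counter

def categorical_summary(values):
    """Summarize a categorical feature: mode, unique count, percent missing."""
    total = len(values)
    present = [v for v in values if v is not None]  # None marks a missing value (assumption)
    counts = Counter(present)
    mode = counts.most_common(1)[0][0] if counts else None
    pct_missing = 100.0 * (total - len(present)) / total if total else 0.0
    return {"mode": mode, "n_unique": len(counts), "pct_missing": pct_missing}

def changed_statistics(baseline, current):
    """Return the names of the summary statistics that differ between two samples."""
    base, cur = categorical_summary(baseline), categorical_summary(current)
    return sorted(name for name in base if base[name] != cur[name])

# Hypothetical training-time sample and a later production sample.
baseline = ["red", "red", "blue", "green", None]
current = ["blue", "blue", "blue", "purple", None, None, "red", "red"]

print(categorical_summary(baseline))
print(categorical_summary(current))
print(changed_statistics(baseline, current))
```

An exact comparison like this is only a starting point. In practice a tolerance on the missing-value percentage, or a statistical test on the category frequencies, would usually be a more robust drift signal.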
Okawa -
I felt that if you keep working through past exam questions, you can pass the Databricks-Machine-Learning-Professional exam without any problems.