Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?
A. JS does not require any manual threshold or cutoff determinations
B. JS is not normalized or smoothed
C. None of these reasons
D. All of these reasons
E. JS is more robust when working with large datasets
Correct answer: E
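The contrast behind option E can be sketched in plain Python. This is an illustration only: the synthetic Gaussian data, the 20 shared histogram bins, and the JS cutoff of 0.1 are assumptions, not part of the question. With large samples the KS critical value shrinks roughly as 1/sqrt(n), so even a small shift is flagged, while the JS distance stays small and is compared against a manually chosen threshold.

```python
import math
import random


def js_distance(a, b, bins=20):
    """Jensen-Shannon distance (base 2, in [0, 1]) on shared equal-width bins."""
    lo, hi = min(min(a), min(b)), max(max(a), max(b))
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def hist(xs):
        counts = [0] * bins
        for x in xs:
            counts[min(int((x - lo) / width), bins - 1)] += 1
        return [c / len(xs) for c in counts]

    p, q = hist(a), hist(b)
    div = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)  # mixture; positive wherever pi or qi is positive
        if pi > 0:
            div += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * math.log2(qi / mi)
    return math.sqrt(max(div, 0.0))


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Assumed synthetic data: a shift of 0.1 standard deviations, 20,000 rows each.
rng = random.Random(42)
n = 20_000
baseline = [rng.gauss(0.0, 1.0) for _ in range(n)]
current = [rng.gauss(0.1, 1.0) for _ in range(n)]

JS_THRESHOLD = 0.1  # hypothetical, manually chosen cutoff
js = js_distance(baseline, current)
ks_d = ks_statistic(baseline, current)
# Large-sample KS critical value at alpha = 0.05; it shrinks as n grows.
ks_crit = 1.36 * math.sqrt((n + n) / (n * n))

print(f"JS distance  = {js:.4f} (drift flagged: {js > JS_THRESHOLD})")
print(f"KS statistic = {ks_d:.4f}, critical = {ks_crit:.4f} "
      f"(drift flagged: {ks_d > ks_crit})")
```

Note that the JS cutoff still has to be chosen by hand, which is why option A is not a valid reason.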
Question 2:
A machine learning engineer is migrating a machine learning pipeline to use Databricks Machine Learning. They have programmatically identified the best run from an MLflow Experiment and stored its URI in the model_uri variable and its Run ID in the run_id variable. They have also determined that the model was logged with the name "model". Now, the machine learning engineer wants to register that model in the MLflow Model Registry with the name "best_model".
Which of the following lines of code can they use to register the model to the MLflow Model Registry?
A. mlflow.register_model(run_id, "best_model")
B. mlflow.register_model(model_uri, "model")
C. mlflow.register_model(f"runs:/{run_id}/best_model", "model")
D. mlflow.register_model(model_uri, "best_model")
E. mlflow.register_model(f"runs:/{run_id}/model")
Correct answer: D
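The relationship between the run, the logged artifact name, and the registered name can be sketched as follows. The helper names `runs_uri` and `register_best_model` are hypothetical; `mlflow.register_model(model_uri, name)` is the MLflow call, where the first argument locates the logged model and the second is the name it receives in the Model Registry.

```python
def runs_uri(run_id, artifact_path="model"):
    """Build the runs:/ URI for a model logged under `artifact_path`."""
    return f"runs:/{run_id}/{artifact_path}"


def register_best_model(model_uri, registered_name="best_model"):
    """Register the logged model under `registered_name` (hypothetical helper)."""
    import mlflow  # imported lazily so the URI helper works without MLflow

    # First argument: where the model lives. Second: its registry name.
    return mlflow.register_model(model_uri, registered_name)


# Example with a made-up run ID; in the question, model_uri is already given.
example_uri = runs_uri("abc123")
print(example_uri)
```

With a real run, `register_best_model(model_uri)` would create a new version of the registered model "best_model".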
Question 3:
A machine learning engineer has developed a random forest model using scikit-learn, logged the model using MLflow as random_forest_model, and stored its run ID in the run_id Python variable. They now want to deploy that model by performing batch inference on a Spark DataFrame spark_df.
Which of the following code blocks can they use to create a function called predict that they can use to complete the task?
A.

B.

C.

D.

E. It is not possible to deploy a scikit-learn model on a Spark DataFrame.
Correct answer: A
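The code for options A through D did not survive on this page, so the following is only a sketch of one common way to run batch inference with an MLflow-logged model on a Spark DataFrame, using `mlflow.pyfunc.spark_udf`. It is not a reconstruction of any option. The helper `make_predict` and its `spark_udf` parameter (which lets the function be exercised without MLflow installed) are hypothetical.

```python
def make_predict(spark, run_id, artifact_path="random_forest_model", spark_udf=None):
    """Return a Spark UDF that applies the logged model (hypothetical helper)."""
    if spark_udf is None:
        import mlflow.pyfunc  # imported lazily; requires MLflow and PySpark

        spark_udf = mlflow.pyfunc.spark_udf
    # The model is located by run ID plus the name it was logged under.
    return spark_udf(spark, model_uri=f"runs:/{run_id}/{artifact_path}")
```

On a cluster, usage would look something like `predict = make_predict(spark, run_id)` followed by `spark_df.withColumn("prediction", predict(*spark_df.columns))`, assuming every column of `spark_df` is a model feature.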
Question 4:
A machine learning engineer is attempting to create a webhook that will trigger a Databricks Job job_id when a model version for model model transitions into any MLflow Model Registry stage.
They have the following incomplete code block:

Which of the following lines of code can be used to fill in the blank so that the code block accomplishes the task?
A. "MODEL_VERSION_TRANSITIONED_TO_STAGING"
B. "MODEL_VERSION_TRANSITIONED_STAGE"
C. "MODEL_VERSION_TRANSITIONED_TO_STAGING", "MODEL_VERSION_TRANSITIONED_TO_PRODUCTION"
D. "MODEL_VERSION_TRANSITIONED_TO_PRODUCTION"
E. "MODEL_VERSION_CREATED"
Correct answer: B
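The original code block is missing from this page, so the following is a sketch rather than a reconstruction. It assumes the webhook is created through the Databricks Model Registry webhooks REST endpoint (`POST /api/2.0/mlflow/registry-webhooks/create`); the helper name and the placeholder workspace URL and token are hypothetical. The point it illustrates is that the single event `MODEL_VERSION_TRANSITIONED_STAGE` covers a transition into any stage.

```python
import json


def build_webhook_payload(model_name, job_id, workspace_url, access_token):
    """Build the request body for a job-triggering registry webhook."""
    return {
        "model_name": model_name,
        # Fires on a transition to any stage, not only Staging or Production.
        "events": ["MODEL_VERSION_TRANSITIONED_STAGE"],
        "status": "ACTIVE",
        "job_spec": {
            "job_id": job_id,
            "workspace_url": workspace_url,
            "access_token": access_token,
        },
    }


# Placeholder values for illustration only.
payload = build_webhook_payload(
    model_name="model",
    job_id="123",
    workspace_url="https://<workspace-url>",
    access_token="<token>",
)
print(json.dumps(payload, indent=2))
```

The resulting JSON would be sent as the body of the create request, authenticated with a workspace token.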
Question 5:
A data scientist has created a Python function compute_features that returns a Spark DataFrame with the following schema:

The resulting DataFrame is assigned to the features_df variable. The data scientist wants to create a Feature Store table using features_df.
Which of the following code blocks can they use to create and populate the Feature Store table using the Feature Store Client fs?
A. features_df.write.mode("fs").path("new_table")
B. features_df.write.mode("feature").path("new_table")
C.

D.

E.

Correct answer: E
Question 6:
A data scientist would like to enable MLflow Autologging for all machine learning libraries used in a notebook. They want to ensure that MLflow Autologging is used no matter what version of the Databricks Runtime for Machine Learning is used to run the notebook and no matter what workspace-wide configurations are selected in the Admin Console.
Which of the following lines of code can they use to accomplish this task?
A. mlflow.spark.autolog()
B. spark.conf.set("autologging", True)
C. mlflow.autolog()
D. mlflow.sklearn.autolog()
E. It is not possible to automatically log MLflow runs.
Correct answer: C