Question 1:
Which statistic, calculated from a validation sample, can help decide which model to use for prediction of a binary target variable?
A. Chi Square
B. Average Squared Error
C. Adjusted R Square
D. Mallow's Cp
Correct answer: B
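Note: average squared error (ASE) is the mean squared difference between the 0/1 target and the predicted probability, so it can be computed for any binary-target model on held-out data. The sketch below is illustrative only. The validation targets and the two sets of predicted probabilities are made-up values, and `average_squared_error` is a hypothetical helper name, not a SAS function.

```python
def average_squared_error(y_true, p_hat):
    """Mean of (y - p)^2, where y is the 0/1 target and p the predicted probability."""
    if len(y_true) != len(p_hat) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


# Hypothetical validation sample and predictions from two candidate models.
y_valid = [1, 0, 0, 1, 0]
predictions = {
    "model_a": [0.8, 0.2, 0.1, 0.6, 0.3],
    "model_b": [0.6, 0.4, 0.5, 0.5, 0.5],
}

# Score every candidate on the same validation data; lower ASE is better.
ase = {name: average_squared_error(y_valid, p) for name, p in predictions.items()}
best_model = min(ase, key=ase.get)
```

With these made-up numbers the model with the smaller validation ASE would be preferred.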
Question 2:
The total modeling data has been split into training, validation, and test data.
What is the best data to use for model assessment?
A. Total data
B. Validation data
C. Training data
D. Test data
Correct answer: B
Question 3:
An analyst knows that the categorical predictor, store_id, is an important predictor of the target.
However, store_id has too many levels to be a feasible predictor in the model. The analyst wants to combine stores and treat them as members of the same class level.
What are the two most effective ways to address the problem? (Choose two.)
A. Cluster by using Greenacre's method to combine stores that are similar.
B. Use subject matter expertise to combine stores that are similar.
C. Eliminate store_id as a predictor in the model because it has too many levels to be feasible.
D. Randomly combine the stores into five groups to keep the stochastic variation among the observations intact.
Correct answer: A, B
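Note: Greenacre's method is usually described as clustering the levels of a categorical input so that each merge loses as little as possible of the chi-square association between the input and the target. The sketch below is a simplified greedy version of that idea in plain Python, not the SAS procedure itself. The store names and counts are made up, and `chi_square` and `collapse_levels` are hypothetical helper names.

```python
from itertools import combinations


def chi_square(rows):
    """Pearson chi-square for a levels-by-target table of (non_events, events) rows."""
    col0 = sum(r[0] for r in rows)
    col1 = sum(r[1] for r in rows)
    total = col0 + col1
    stat = 0.0
    for n0, n1 in rows:
        row_total = n0 + n1
        for observed, col_total in ((n0, col0), (n1, col1)):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


def collapse_levels(counts, n_groups):
    """Greedily merge the pair of levels whose merge keeps the most chi-square."""
    groups = {(name,): tuple(c) for name, c in counts.items()}
    while len(groups) > n_groups:
        best = None
        for a, b in combinations(list(groups), 2):
            merged = dict(groups)
            ca, cb = merged.pop(a), merged.pop(b)
            merged[a + b] = (ca[0] + cb[0], ca[1] + cb[1])
            stat = chi_square(list(merged.values()))
            if best is None or stat > best[0]:
                best = (stat, merged)
        groups = best[1]
    return groups


# Hypothetical (non_events, events) counts per store.
stores = {"S1": (90, 10), "S2": (88, 12), "S3": (50, 50), "S4": (52, 48)}
collapsed = collapse_levels(stores, n_groups=2)
```

In this toy data the stores with similar event rates end up in the same group, which is the behavior the question's option A relies on.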
Question 4:
The PROC LOGISTIC options SELECTION=SCORE and BEST=2 are used in a MODEL statement to generate a series of predictive models. The models are assigned numbers in order from 1 to 99 reflecting the fact that there are 50 candidate input variables. Results from the collection of derived models are used to generate the following plot of overall average profit by model number. Results are restricted to models with at least 9 inputs and at most 40 inputs.

The maximum value for the training data occurs for model number 46, and the maximum value for the validation data occurs for model number 43.
If you base model selection solely on overall average profit, what is the correct choice?
A. Select model 46
B. Select model 43
C. Select model 45
D. Select model 21
Correct answer: B
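Note: the selection rule is to take the model with the highest average profit on the validation data, because the training maximum tends to be optimistic. The sketch below illustrates that rule. The profit values are invented for illustration (the original plot is not available here); only the positions of the two maxima follow the question. The count of 99 models assumes that BEST=2 reports two models at each size from 1 to 49 inputs and that only one 50-input model exists.

```python
# Model count implied by the question, under the assumption stated above.
n_inputs = 50
best_per_size = 2
n_models = best_per_size * (n_inputs - 1) + 1

# Hypothetical average profit by model number (not read from the real plot).
train_profit = {21: 1.10, 43: 1.32, 45: 1.34, 46: 1.36}
valid_profit = {21: 1.05, 43: 1.28, 45: 1.25, 46: 1.22}

# Training profit keeps rising with complexity; validation profit peaks earlier.
best_by_train = max(train_profit, key=train_profit.get)
best_by_valid = max(valid_profit, key=valid_profit.get)

# The model to keep is the validation winner.
selected_model = best_by_valid
```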
Question 5:
Refer to the confusion matrix:

An analyst determines that loan defaults occur at the rate of 3% in the overall population. The above confusion matrix is from an oversampled test set (1 = default).
What is the sensitivity adjusted for the population event probability?
Enter your answer in the space below. Round to three decimals (example: n.nnn).
Correct answer: 0.617
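Note: one common way to adjust an oversampled confusion matrix is to rescale each actual class to its population prior. Under that approach, and assuming the oversampling was done separately within each target class, sensitivity is computed within the actual events and so is unchanged by the adjustment, while measures such as positive predictive value do change. The sketch below uses made-up counts, not the matrix from the question, and `adjust_confusion_matrix` is a hypothetical helper name.

```python
def adjust_confusion_matrix(tp, fn, fp, tn, pi1):
    """Rescale sample counts to population proportions given event prior pi1."""
    sensitivity = tp / (tp + fn)   # estimated within actual events
    specificity = tn / (tn + fp)   # estimated within actual non-events
    pi0 = 1.0 - pi1
    return {
        "tp": pi1 * sensitivity,
        "fn": pi1 * (1.0 - sensitivity),
        "fp": pi0 * (1.0 - specificity),
        "tn": pi0 * specificity,
    }


# Hypothetical oversampled counts (1 = default) and the 3% population rate.
tp, fn, fp, tn = 60, 40, 20, 80
adj = adjust_confusion_matrix(tp, fn, fp, tn, pi1=0.03)

sens_sample = tp / (tp + fn)
sens_adjusted = adj["tp"] / (adj["tp"] + adj["fn"])

ppv_sample = tp / (tp + fp)
ppv_adjusted = adj["tp"] / (adj["tp"] + adj["fp"])
```

Here `sens_adjusted` matches `sens_sample`, whereas `ppv_adjusted` is much lower than `ppv_sample` because events are rare in the population.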
Question 6:
What does the Pearson product moment correlation coefficient measure?
A. nonlinear and monotonic association between two variables
B. linear and nonmonotonic association between two variables
C. nonlinear and nonmonotonic association between two variables
D. linear and monotonic association between two variables
Correct answer: D
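Note: the Pearson coefficient is the covariance of the two variables divided by the product of their standard deviations, so it picks up linear association and can be near zero for a strong but nonmonotonic relationship. The sketch below is a small illustration with made-up data; `pearson_r` is a hypothetical helper written from the textbook formula.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation from the textbook formula."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]

# Perfectly linear, increasing relationship: y = 2x + 1.
r_linear = pearson_r(x, [2 * v + 1 for v in x])

# Perfectly determined but nonmonotonic relationship: y = x^2.
r_quadratic = pearson_r(x, [v ** 2 for v in x])
```

The linear case gives a coefficient of 1, while the symmetric quadratic case gives 0 even though y is fully determined by x.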