Question 1:
To process input key-value pairs, your mapper needs to load a 512 MB data file into memory. What is the best way to accomplish this?
A. Place the data file in the DistributedCache and read the data into memory in the configure method of the mapper.
B. Place the data file in the DistributedCache and read the data into memory in the map method of the mapper.
C. Place the data file in the DataCache and read the data into memory in the configure method of the mapper.
D. Serialize the data file, insert it into the JobConf object, and read the data into memory in the configure method of the mapper.
Correct answer: A
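The trade-off behind these options — loading side data once per task versus once per record — can be illustrated without Hadoop at all. The sketch below is a minimal, non-Hadoop Python analogue (all class and variable names are invented for illustration): the `configure` method here plays the role of the old mapred API's once-per-task `Mapper.configure(JobConf)` hook, while `map` runs once per record.

```python
# Conceptual sketch (not Hadoop code): why side data belongs in a
# once-per-task hook (configure/setup) rather than in map().
class SideDataMapper:
    def __init__(self):
        self.lookup = None
        self.loads = 0  # counts how many times the side file is parsed

    def configure(self, side_file_lines):
        # Runs once per task: parse the cached side file into memory.
        self.loads += 1
        self.lookup = dict(line.split(",") for line in side_file_lines)

    def map(self, key, value, emit):
        # Runs once per record: only a cheap in-memory lookup here.
        emit(key, self.lookup.get(value, "UNKNOWN"))

side_file = ["a,apple", "b,banana"]
m = SideDataMapper()
m.configure(side_file)

out = []
for k, v in [(1, "a"), (2, "b"), (3, "c")]:
    m.map(k, v, lambda key, val: out.append((key, val)))
```

Reading the file in `map` instead would repeat the 512 MB parse for every input record; here it happens exactly once per task.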
Question 2:
All keys used for intermediate output from mappers must:
A. Override isSplitable.
B. Implement a splittable compression algorithm.
C. Implement a comparator for speedy sorting.
D. Be a subclass of FileInputFormat.
E. Implement WritableComparable.
Correct answer: E
Explanation: (visible to Pass4Test members only)
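The reason keys must implement WritableComparable is that the shuffle phase sorts all intermediate keys, so every key type needs a total order (its `compareTo` method) in addition to serialization. As a minimal non-Hadoop sketch of the ordering half (the class name is invented; serialization is omitted):

```python
from functools import total_ordering

# Conceptual sketch: the framework sorts intermediate keys, so a key
# type must define a total order -- the role WritableComparable's
# compareTo() plays in Hadoop (serialization, its other duty, omitted).
@total_ordering
class TextKey:
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):          # compareTo == 0 analogue
        return self.value == other.value

    def __lt__(self, other):          # compareTo < 0 analogue
        return self.value < other.value

keys = [TextKey("banana"), TextKey("apple"), TextKey("cherry")]
ordered = [k.value for k in sorted(keys)]
```

A value type only needs to be serializable (Writable); it is the key's comparability that the sort phase depends on.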
Question 3:
You have just executed a MapReduce job. Where is intermediate data written to after being emitted from the Mapper's map method?
A. Intermediate data is streamed across the network from the Mapper to the Reducer and is never written to disk.
B. Into in-memory buffers that spill over to the local file system (outside HDFS) of the TaskTracker node running the Reducer
C. Into in-memory buffers on the TaskTracker node running the Mapper that spill over and are written into HDFS.
D. Into in-memory buffers on the TaskTracker node running the Reducer that spill over and are written into HDFS.
E. Into in-memory buffers that spill over to the local file system of the TaskTracker node running the Mapper.
Correct answer: E
Explanation: (visible to Pass4Test members only)
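Answer E describes the spill mechanism: map output accumulates in an in-memory buffer that, when full, spills sorted runs to the *local* file system of the node running the Mapper (not HDFS), and the runs are later merge-sorted. A simplified Python simulation of that behavior (buffer size, file names, and the record format are invented for illustration):

```python
import heapq
import os
import tempfile

# Conceptual sketch of the map-side spill: buffer in memory, spill sorted
# runs to local disk when full, then merge the sorted runs.
def run_mapper(records, buffer_limit=3):
    spill_dir = tempfile.mkdtemp()
    buffer, spills = [], []

    def spill():
        path = os.path.join(spill_dir, f"spill{len(spills)}.txt")
        with open(path, "w") as f:
            for key in sorted(buffer):      # each spill file is sorted
                f.write(key + "\n")
        spills.append(path)
        buffer.clear()

    for rec in records:
        buffer.append(rec)
        if len(buffer) >= buffer_limit:
            spill()                         # buffer full: spill to local disk
    if buffer:
        spill()                             # final spill for the remainder

    # Merge the sorted spill files into one sorted stream
    streams = [open(p) for p in spills]
    merged = [line.strip() for line in heapq.merge(*streams)]
    for s in streams:
        s.close()
    return len(spills), merged

n_spills, merged = run_mapper(["d", "b", "f", "a", "c", "e", "g"])
```

Seven records with a three-record buffer produce three spill files, which merge back into one fully sorted run — the file Reducers later fetch over HTTP.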
Question 4:
Which one of the following statements is false about HCatalog?
A. Provides a shared schema mechanism
B. Exists as a subproject of Hive
C. Stores HDFS data in a database for performing SQL-like ad-hoc queries
D. Designed to be used by other programs such as Pig, Hive and MapReduce
Correct answer: C
Question 5:
Can you use MapReduce to perform a relational join on two large tables sharing a key? Assume that the two tables are formatted as comma-separated files in HDFS.
A. No, but it can be done with either Pig or Hive.
B. Yes, so long as both tables fit into memory.
C. Yes, but only if one of the tables fits into memory.
D. No, MapReduce cannot perform relational operations.
E. Yes.
Correct answer: E
Explanation: (visible to Pass4Test members only)
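The standard technique that makes this possible without either table fitting in memory is the reduce-side join: each mapper tags a record with its source table and emits the join key, the shuffle groups both tables' records by key, and the reducer cross-products the two sides one key group at a time. A non-Hadoop Python sketch of that flow (table names and data are invented for illustration):

```python
from collections import defaultdict
from itertools import product

# Conceptual sketch of a reduce-side join: tag records by source table,
# group by join key (what the shuffle does), then cross-product each
# key's two sides in the "reducer". Only one key group is in memory
# at a time, so neither full table needs to fit in memory.
def reduce_side_join(table_a, table_b):
    groups = defaultdict(lambda: {"A": [], "B": []})
    for key, rest in table_a:          # map phase: tag with source "A"
        groups[key]["A"].append(rest)
    for key, rest in table_b:          # map phase: tag with source "B"
        groups[key]["B"].append(rest)

    joined = []
    for key in sorted(groups):         # reduce phase: one key group at a time
        for a, b in product(groups[key]["A"], groups[key]["B"]):
            joined.append((key, a, b))
    return joined

orders = [(1, "order-100"), (2, "order-200")]
users = [(1, "alice"), (1, "anna"), (3, "carol")]
rows = reduce_side_join(orders, users)
```

Keys present on only one side (2 and 3 here) produce no output, matching inner-join semantics.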
Question 6:
You need to move a file titled "weblogs" into HDFS. When you try to copy the file, you can't. You know you have ample space on your DataNodes. Which action should you take to relieve this situation and store more files in HDFS?
A. Increase the block size on all current files in HDFS.
B. Increase the block size on your remaining files.
C. Increase the number of disks (or size) for the NameNode.
D. Decrease the block size on all current files in HDFS.
E. Decrease the block size on your remaining files.
F. Increase the amount of memory for the NameNode.
Correct answer: E
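The trade-off these options probe is arithmetic: a file occupies ceil(file_size / block_size) HDFS blocks, and the NameNode tracks every block's metadata in memory. A back-of-the-envelope sketch (the ~150 bytes/block figure is a commonly cited rough estimate, used here only as an illustrative assumption):

```python
import math

# Number of HDFS blocks a file occupies, given sizes in MB.
def block_count(file_size_mb, block_size_mb):
    return math.ceil(file_size_mb / block_size_mb)

weblogs_mb = 512
blocks_at_128 = block_count(weblogs_mb, 128)  # larger blocks -> fewer blocks
blocks_at_64 = block_count(weblogs_mb, 64)    # smaller blocks -> more blocks

# Rough NameNode memory cost, assuming ~150 bytes of metadata per block
# (an illustrative ballpark, not an exact Hadoop constant).
metadata_bytes_at_64 = blocks_at_64 * 150
```

Halving the block size doubles the block count, and with it the NameNode metadata footprint — which is why block size, DataNode capacity, and NameNode memory are distinct constraints in this question.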
Question 7:
Consider the following two relations, A and B.

A Pig JOIN statement that combined relations A by its first field and B by its second field would produce what output?
A. 2 Jim Chris 2 3 Terry 3 4 Brian 4
B. 2 cherry Jim, Chris 3 orange Terry 4 peach Brian
C. 2 cherry Jim 2 2 cherry Chris 2 3 orange Terry 3 4 peach Brian 4
D. 2 cherry 2 cherry 3 orange 4 peach
Correct answer: C
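The relations A and B are not reproduced above, so the sketch below uses *hypothetical* data consistent with option C. The point it illustrates is real Pig behavior: `JOIN A BY $0, B BY $1` emits, for every matching pair, the full tuple from A followed by the full tuple from B — it neither projects fields away nor groups multiple matches into one row.

```python
# Hypothetical relations (invented to match option C's output).
A = [(2, "cherry"), (3, "orange"), (4, "peach")]            # joined on field 0
B = [("Jim", 2), ("Chris", 2), ("Terry", 3), ("Brian", 4)]  # joined on field 1

# Pig's JOIN concatenates the matching tuples from each relation.
joined = [a + b for a in A for b in B if a[0] == b[1]]
```

Key 2 matches two B tuples, so it yields two full output rows — the pattern option C shows and options A, B, and D do not.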
Question 8:
Which one of the following is NOT a valid Oozie action?
A. hive
B. pig
C. mrunit
D. mapreduce
Correct answer: C
Question 9:
Given the following Pig command:
logevents = LOAD 'input/my.log' AS (date:chararray, level:chararray, code:int, message:chararray);
Which one of the following statements is true?
A. The logevents relation represents the data from the my.log file, using a comma as the parsing delimiter
B. The statement is not a valid Pig command
C. The logevents relation represents the data from the my.log file, using a tab as the parsing delimiter
D. The first field of logevents must be a properly formatted date string, or the statement returns an error
Correct answer: C
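Answer C holds because a `LOAD` without a `USING` clause defaults to `PigStorage('\t')`, i.e. fields are split on tabs, not commas. A plain Python sketch of that default parsing (the log line is invented for illustration):

```python
# What PigStorage does by default: split each input line on tabs.
line = "2013-05-01\tERROR\t500\tdisk full"   # a hypothetical log record
fields = line.split("\t")
date, level, code, message = fields
```

A comma-delimited file would need an explicit `USING PigStorage(',')` to parse into four fields.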