Latest CCDH CCD-333 free sample questions:
1. Given a directory of files in which each line has the following structure: line number, tab character, string.
Example:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfslkj
You want to send each line as one record to your Mapper. Which InputFormat would you use to complete the line: setInputFormat(________.class);
A) SequenceFileInputFormat
B) SequenceFileAsTextInputFormat
C) KeyValueTextInputFormat
D) BDBInputFormat
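The tab-splitting behavior that makes KeyValueTextInputFormat a fit for this input can be sketched without Hadoop itself. `splitLine` below is a hypothetical helper that mirrors the rule: everything before the first tab becomes the key, everything after it the value.

```java
// Plain-Java sketch of how KeyValueTextInputFormat derives a (key, value)
// pair from one line of text. The real class lives in Hadoop's
// org.apache.hadoop.mapred package and is not used here.
public class KeyValueLineDemo {
    // Hypothetical helper mirroring the tab-splitting rule.
    static String[] splitLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            // No separator: the whole line becomes the key, value is empty.
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        String[] kv = splitLine("1\tabialkjfjkaoasdfjksdlkjhqweroij");
        System.out.println("key=" + kv[0] + " value=" + kv[1]);
    }
}
```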
2. Which of the following best describes the workings of TextInputFormat?
A) Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReaders of both splits containing the broken line.
B) Input file splits may cross line breaks. A line that crosses file splits is ignored.
C) Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the beginning of the broken line.
D) Input file splits may cross line breaks. A line that crosses file splits is read by the RecordReader of the split that contains the end of the broken line.
E) The input file is split exactly at the line breaks, so each RecordReader will read a series of complete lines.
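The rule being tested here can be simulated in plain Java. The sketch below assumes the standard LineRecordReader behavior: a reader whose split starts past byte 0 skips forward to the next line boundary (the previous split's reader finishes that line), and every reader completes the last line it starts even when it runs past the split's end. So a line crossing a split boundary is read by the reader of the split containing its beginning.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained simulation of the line-crossing-a-split rule;
// this is not Hadoop code, just a model of the behavior.
public class SplitReaderDemo {
    // Return the lines a reader for byte range [start, end) would emit.
    static List<String> readSplit(String data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // Back up one byte and find the next newline: the partial first
            // line belongs to the previous split's reader.
            int nl = data.indexOf('\n', start - 1);
            pos = (nl < 0) ? data.length() : nl + 1;
        }
        List<String> lines = new ArrayList<>();
        while (pos < end && pos < data.length()) {
            int nl = data.indexOf('\n', pos);
            if (nl < 0) nl = data.length();
            lines.add(data.substring(pos, nl)); // may read past 'end'
            pos = nl + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        String data = "aaaa\nbbbb\ncccc\n"; // 15 bytes
        // A split boundary at byte 7 falls inside "bbbb".
        System.out.println(readSplit(data, 0, 7));  // [aaaa, bbbb]
        System.out.println(readSplit(data, 7, 15)); // [cccc]
    }
}
```

The first reader emits "bbbb" in full even though the split ends mid-line; the second reader skips it entirely.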
3. Which of the following statements best describes how a large (100 GB) file is stored in HDFS?
A) The file is divided into fixed-size blocks, which are stored on multiple datanodes. Each block is replicated three times by default. Multiple blocks from the same file might reside on the same datanode.
B) The file is replicated three times by default. Each copy of the file is stored on a separate datanode.
C) The master copy of the file is stored on a single datanode. The replica copies are divided into fixed-size blocks, which are stored on multiple datanodes.
D) The file is divided into fixed-size blocks, which are stored on multiple datanodes. Each block is replicated three times by default. HDFS guarantees that different blocks from the same file are never on the same datanode.
E) The file is divided into variable size blocks, which are stored on multiple data nodes. Each block is replicated three times by default.
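The storage arithmetic behind this question is worth making concrete. Assuming a 128 MB block size (a common default; older clusters used 64 MB) and the default replication factor of 3:

```java
// Back-of-the-envelope sketch: a 100 GB file is cut into fixed-size
// blocks, each replicated three times. Block size is an assumption.
public class BlockMathDemo {
    static long blockCount(long fileBytes, long blockBytes) {
        return (fileBytes + blockBytes - 1) / blockBytes; // ceiling division
    }

    public static void main(String[] args) {
        long file = 100L * 1024 * 1024 * 1024; // 100 GB
        long block = 128L * 1024 * 1024;       // 128 MB
        long blocks = blockCount(file, block);
        System.out.println(blocks + " blocks, " + blocks * 3 + " replicas");
        // 800 blocks, 2400 replicas spread across the datanodes
    }
}
```

With 800 blocks and typically far fewer datanodes, multiple blocks of the same file inevitably land on the same node, which is why the "never on the same datanode" guarantee cannot hold.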
4. In a MapReduce job, you want each of your input files processed by a single map task. How do you configure a MapReduce job so that a single map task processes each input file, regardless of how many blocks the input file occupies?
A) Increase the parameter that controls minimum split size in the job configuration.
B) Write a custom FileInputFormat and override the method isSplitable to always return false.
C) Write a custom MapRunner that iterates over all key-value pairs in the entire file.
D) Set the number of mappers equal to the number of input files you want to process.
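A minimal simulation shows why making a format non-splittable yields one map task per file; `numSplits` below is a hypothetical stand-in for FileInputFormat's real split computation, not Hadoop code.

```java
// Sketch: when isSplitable(...) returns false, FileInputFormat emits a
// single split covering the whole file instead of one split per chunk,
// so exactly one mapper processes the file.
public class SplitCountDemo {
    static int numSplits(long fileBytes, long splitBytes, boolean splitable) {
        if (!splitable) return 1; // whole file goes to one mapper
        return (int) ((fileBytes + splitBytes - 1) / splitBytes);
    }

    public static void main(String[] args) {
        long file = 1024L * 1024 * 1024; // 1 GB file (assumed size)
        long split = 128L * 1024 * 1024; // 128 MB split size
        System.out.println(numSplits(file, split, true));  // 8
        System.out.println(numSplits(file, split, false)); // 1
    }
}
```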
5. The NameNode uses RAM for the following purpose:
A) To store the edits log that keeps track of changes in HDFS.
B) To store filenames, list of blocks and other meta information.
C) To manage distributed read and write locks on files in HDFS.
D) To store the contents of files in HDFS.
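To see why keeping filenames and block lists in RAM matters operationally, here is a rough sizing sketch. The 150-bytes-per-object figure is a commonly cited rule of thumb for NameNode heap, an approximation rather than an exact number.

```java
// Rough estimate of NameNode heap: the entire namespace (files,
// directories, blocks) is held in RAM, at roughly ~150 bytes per
// object (heuristic, not an exact figure).
public class NameNodeMemDemo {
    static long estimateBytes(long files, long blocks) {
        final long BYTES_PER_OBJECT = 150; // rule-of-thumb heuristic
        return (files + blocks) * BYTES_PER_OBJECT;
    }

    public static void main(String[] args) {
        // 10 million files averaging 2 blocks each (assumed workload).
        long bytes = estimateBytes(10_000_000L, 20_000_000L);
        System.out.println(bytes / (1024 * 1024) + " MB"); // about 4.2 GB
    }
}
```

This is also why HDFS handles a few large files far better than millions of tiny ones: heap usage scales with object count, not data volume.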
Questions and Answers:
Question #1 Answer: C | Question #2 Answer: C | Question #3 Answer: A | Question #4 Answer: B | Question #5 Answer: B |