Latest Databricks Certification Associate-Developer-Apache-Spark free sample questions:
1. Which of the following code blocks returns a new DataFrame with the same columns as DataFrame transactionsDf, except for columns predError and value which should be removed?
A) transactionsDf.drop(predError, value)
B) transactionsDf.drop("predError", "value")
C) transactionsDf.drop("predError & value")
D) transactionsDf.drop(col("predError"), col("value"))
E) transactionsDf.drop(["predError", "value"])
2. Which of the following code blocks selects all rows from DataFrame transactionsDf in which column productId is zero or smaller, or equal to 3?
A) transactionsDf.filter(productId==3 or productId<1)
B) transactionsDf.where("productId"=3).or("productId"<1))
C) transactionsDf.filter(col("productId")==3 | col("productId")<1)
D) transactionsDf.filter((col("productId")==3) or (col("productId")<1))
E) transactionsDf.filter((col("productId")==3) | (col("productId")<1))
3. Which of the following code blocks reads in the JSON file stored at filePath, enforcing the schema expressed in JSON format in variable json_schema, shown in the code block below?
Code block:
json_schema = """
{"type": "struct",
 "fields": [
   {
     "name": "itemId",
     "type": "integer",
     "nullable": true,
     "metadata": {}
   },
   {
     "name": "supplier",
     "type": "string",
     "nullable": true,
     "metadata": {}
   }
 ]
}
"""
A) spark.read.json(filePath, schema=spark.read.json(json_schema))
B) spark.read.schema(json_schema).json(filePath)
   schema = StructType.fromJson(json.loads(json_schema))
   spark.read.json(filePath, schema=schema)
C) spark.read.json(filePath, schema=json_schema)
D) spark.read.json(filePath, schema=schema_of_json(json_schema))
4. Which of the following statements about Spark's DataFrames is incorrect?
A) Spark's DataFrames are immutable.
B) RDDs are at the core of DataFrames.
C) The data in DataFrames may be split into multiple chunks.
D) Data in DataFrames is organized into named columns.
E) Spark's DataFrames are equal to Python's DataFrames.
5. Which of the following code blocks stores a part of the data in DataFrame itemsDf on executors?
A) itemsDf.cache(eager=True)
B) itemsDf.cache().count()
C) itemsDf.cache().filter()
D) cache(itemsDf)
E) itemsDf.rdd.storeCopy()
Questions and answers:
Question # 1 Answer: B | Question # 2 Answer: E | Question # 3 Answer: B | Question # 4 Answer: E | Question # 5 Answer: B