Bigdata Testblanc [PDF]

Question 1 Which type of cell can be used to document and comment on a process in a Jupyter notebook? Your answer

 

A.  Kernel

 

B.  Markdown

 

C.  Code

 

D.  Output

That's right!

Question 2 Where does the unstructured data of a project reside in Watson Studio? Your answer

 

A.  Object Storage

 

B.  Database

 

C.  Wrapper

 

D.  Tables

That's right!

Big Data Engineer v2 Explorer Award for Students (2018). Time taken: 42:01

Question 3 Which Watson Studio offering used to be available through something known as IBM Bluemix?

Your answer  

A.  Watson Studio Cloud

 

B.  Watson Studio Desktop

 

C.  Watson Studio Local

 

D.  Watson Studio Business

That's right!

Question 4 What is the architecture of Watson Studio centered on? Your answer

 

A.  Projects

 

B.  Collaborators

 

C.  Analytic Assets

 

D.  Data Assets

That's right!

Question 5 Before you create a Jupyter notebook in Watson Studio, which two items are necessary? Your answer

 

A.  URL

 

B.  Project

 

C.  File

 

D.  Scala

 

E.  Spark Instance

That's right!

Question 6 Which Spark Core function provides the main element of Spark API? Your answer

 

A.  Mesos

 

B.  MLlib

 

C.  YARN

 

D.  RDD

That's right!

Question 7 Which statement about Apache Spark is true? Your answer

 

A.  It runs on Hadoop clusters with RAM drives configured on each DataNode.

 

B.  It features APIs for C++ and .NET.

 

C.  It is much faster than MapReduce for complex applications on disk.

 

D.  It supports HDFS, MS-SQL, and Oracle.

That's right!

Question 8 Under the YARN/MRv2 framework, which daemon is tasked with negotiating with the NodeManager(s) to execute and monitor tasks? Your answer

 

A.  JobMaster

 

B.  ApplicationMaster

 

C.  ResourceManager

 

D.  TaskManager

That's right!

Question 9 What is the preferred replacement for Flume? Your answer

 

A.  NiFi

 

B.  Druid

 

C.  Hortonworks Data Flow

 

D.  Storm

That's right!

Question 10 What is an example of a Key-value type of NoSQL datastore? Your answer

 

A.  Neo4j

 

B.  REDIS

 

C.  Sesame

 

D.  MongoDB

That's right!

Question 11 Under the YARN/MRv2 framework, the Scheduler and ApplicationsManager are components of which daemon? Your answer

 

A.  ApplicationMaster

 

B.  TaskManager

 

C.  ScheduleManager

 

D.  ResourceManager

That's right!

Question 12 What are two security features Apache Ranger provides? Your answer

 

A.  Auditing

 

B.  Authorization

 

C.  Authentication

 

D.  Availability

That's right!

Question 13 Which is the Java class prefix for the MapReduce v1 APIs? Your answer

 

A.  org.apache.mr

 

B.  org.apache.mapreduce

 

C.  org.apache.hadoop.mapred

 

D.  org.apache.hadoop.mr

That's right!

Question 14 How can a Sqoop invocation be constrained to only run one mapper? Your answer

 

A.  Use the -m 1 parameter.

 

B.  Use the --single parameter.

 

C.  Use the -mapper 1 parameter.

 

D.  Use the --limit mapper=1 parameter.

That's right!

Question 15 Which data encoding format supports exact storage of all data in binary representations such as VARBINARY columns? Your answer

 

A.  RCFile

 

B.  Parquet

 

C.  SequenceFiles

 

D.  Flat

That's right!

Question 16 Under the MapReduce v1 programming model, which optional phase is executed simultaneously with the Shuffle phase? Your answer

 

A.  Split

 

B.  Reduce

 

C.  Combiner

 

D.  Map

That's right!

Question 17 Which two are valid watches for ZNodes in ZooKeeper? Your answer

 

A.  NodeChildrenChanged

 

B.  NodeExpired

 

C.  NodeDeleted

 

D.  NodeRefreshed

That's right!

Question 18 Which NoSQL datastore type began as an implementation of Google's BigTable that can store any type of data and scale to many petabytes? Your answer

 

A.  HBase

 

B.  Riak

 

C.  MemcacheD

 

D.  CouchDB

That's right!

Question 19 Which two factors in a Hadoop cluster increase performance most significantly? Your answer

 

A.  solid state disks

 

B.  parallel reading of large data files

 

C.  data redundancy on management nodes

 

D.  immediate failover of failed disks

 

E.  large number of small data files

 

F.  high-speed networking between nodes

That's right!

Question 20 Which statement is true about MapReduce v1 APIs? Your answer

 

A.  MapReduce v1 APIs are implemented by applications which are largely independent of the execution environment.

 

B.  MapReduce v1 APIs cannot be used with YARN.

 

C.  MapReduce v1 APIs provide a flexible execution environment to run MapReduce.

 

D.  MapReduce v1 APIs define how MapReduce jobs are executed.

That's right!

Question 21 Apache Spark provides a single, unifying platform for which three of the following types of operations? Your answer

 

A.  batch processing

 

B.  ACID transactions

 

C.  machine learning

 

D.  transaction processing

 

E.  record locking

 

F.  graph operations

That's right!

Question 22 Which Hadoop ecosystem tool can import data into a Hadoop cluster from a DB2, MySQL, or other databases? Your answer

 

A.  Sqoop

 

B.  Accumulo

 

C.  HBase

 

D.  Oozie

That's right!

Question 23 Under the YARN/MRv2 framework, the JobTracker functions are split into which two daemons? Your answer

 

A.  ApplicationMaster

 

B.  TaskManager

 

C.  JobMaster

 

D.  ScheduleManager

 

E.  ResourceManager

That's right!

Question 24 Which three programming languages are directly supported by Apache Spark?

Your answer

 

A.  Python

 

B.  Java

 

C.  .NET

 

D.  Scala

 

E.  C++

 

F.  C#

That's right!

Question 25 Which component of the Apache Ambari architecture integrates with an organization's LDAP or Active Directory service? Your answer

 

A.  Ambari Alert Framework

 

B.  REST API

 

C.  Authorization Provider

 

D.  Postgres RDBMS

That's right!

Question 26 Which statement accurately describes how ZooKeeper works?

Your answer

 

A.  There can be more than one leader server at a time.

 

B.  Clients connect to multiple servers at the same time.

 

C.  All servers keep a copy of the shared data in memory.

 

D.  Writes to a leader server will always succeed.

That's right!

Question 27 Which description characterizes a function provided by Apache Ambari? Your answer

 

A.  A wizard for installing Hadoop services on host servers.

 

B.  Moves information to/from structured databases.

 

C.  Moves large amounts of streaming event data.

 

D.  A messaging system for real-time data pipelines.

That's right!

Question 28 What are two services provided by ZooKeeper? Your answer

 

A.  Loading bulk data into a Hadoop cluster.

 

B.  Providing distributed synchronization.

 

C.  Authenticating and auditing user access.

 

D.  Maintaining configuration information.

That's right!

Question 29 Which three are a part of the Five Pillars of Security? Your answer

 

A.  Resiliency

 

B.  Audit

 

C.  Administration

 

D.  Data Protection

 

E.  Speed

That's right!

Question 30 If a Hadoop node goes down, which Ambari component will notify the Administrator? Your answer

 

A.  Ambari Metrics System

 

B.  Ambari Wizard

 

C.  REST API

 

D.  Ambari Alert Framework

That's right!

Question 31 Under the MapReduce v1 programming model, what happens in a "Reduce" step? Your answer

 

A.  Worker nodes process pieces in parallel.

 

B.  Data is aggregated by worker nodes.

 

C.  Worker nodes store results on their own local file systems.

 

D.  Input is split into pieces.

That's right!
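
For readers reviewing the MapReduce questions, the phases named in Questions 16 and 31 can be illustrated with a small word count. This is a minimal single-process sketch in plain Python, not Hadoop code: the function names (map_phase, combine, shuffle, reduce_phase) and the sample input are assumptions made for illustration, and a real job would distribute these phases across worker nodes.

```python
from collections import defaultdict

def map_phase(split):
    # Map: emit a (word, 1) pair for every word in one input split.
    return [(word, 1) for word in split.split()]

def combine(pairs):
    # Optional combiner: pre-aggregate a single mapper's output locally.
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def shuffle(mapper_outputs):
    # Shuffle: group the values from all mappers by key.
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: aggregate the grouped values for each key.
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical input, already divided into two splits.
splits = ["big data big", "data big sql"]
mapped = [combine(map_phase(s)) for s in splits]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

The combiner does not change the final counts; it only shrinks the amount of data that each mapper hands to the shuffle.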

Question 32 Which component of the Spark Unified Stack allows developers to intermix structured database queries with Spark's programming language? Your answer

 

A.  Java

 

B.  Spark SQL

 

C.  MLlib

 

D.  Mesos

That's right!

Question 33 What are three IBM value-add components to the Hortonworks Data Platform (HDP)? Your answer

 

A.  Big Index

 

B.  Big Data

 

C.  Big Match

 

D.  Big Replicate

 

E.  Big SQL

 

F.  Big YARN

That's right!

Question 34 Which component of a Hadoop system is the primary cause of poor performance? Your answer

 

A.  RAM

 

B.  disk latency

 

C.  network

 

D.  CPU

That's right!

Question 35 What are two ways the command-line parameters for a Sqoop invocation can be simplified? Your answer

 

A.  Include the --options-file command line argument.

 

B.  Run Sqoop using the vi editor.

 

C.  Use the --import-command line argument.

 

D.  Place the commands in a file.

That's right!
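
As a study aid for the Sqoop questions (14 and 35), the sketch below shows one way the arguments of an import could be moved into an options file, where the tool name and each option or value sit on their own line. It is a hedged illustration only: the helper write_options_file, the JDBC URL, and the table name are hypothetical, and the script merely builds the file and the argument list without running Sqoop.

```python
from pathlib import Path
import tempfile

def write_options_file(path, tool, options):
    # Write the tool name, then each option and its value on separate lines.
    lines = [tool]
    for flag, value in options:
        lines.append(flag)
        if value is not None:
            lines.append(str(value))
    Path(path).write_text("\n".join(lines) + "\n")
    # The command line then only needs to point at the file.
    return ["sqoop", "--options-file", str(path)]

options = [
    ("--connect", "jdbc:mysql://dbhost.example.com/sales"),  # hypothetical JDBC URL
    ("--table", "ORDERS"),                                    # hypothetical table name
    ("-m", 1),                                                # restrict the import to one mapper
]

with tempfile.TemporaryDirectory() as tmp:
    options_path = Path(tmp) / "import.txt"
    argv = write_options_file(options_path, "import", options)
    written = options_path.read_text().splitlines()

print(argv[:2])
print(written)
```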

Question 36 Which component of the Hortonworks Data Platform (HDP) is the architectural center of Hadoop and provides resource management and a central platform for Hadoop applications? Your answer

 

A.  HDFS

 

B.  MapReduce

 

C.  HBase

 

D.  YARN

That's right!

Question 37 Which hardware feature on a Hadoop datanode is recommended for cost efficient performance? Your answer

 

A.  RAID

 

B.  LVM

 

C.  SSD

 

D.  JBOD

That's right!

Question 38 Hadoop uses which two Google technologies as its foundation?

Your answer

 

A.  HBase

 

B.  Ambari

 

C.  MapReduce

 

D.  Google File System

 

E.  YARN

That's right!

Question 39 What are two primary limitations of MapReduce v1? Your answer

 

A.  TaskTrackers can be a bottleneck to MapReduce jobs

 

B.  Number of TaskTrackers limited to 1,000

 

C.  Scalability

 

D.  Resource utilization

 

E.  Workloads limited to MapReduce

That's right!

Question 40 Which computing technology provides Hadoop's high performance? Your answer

 

A.  Online Analytical Processing

 

B.  RAID-0

 

C.  Parallel Processing

 

D.  Online Transactional Processing

That's right!

Question 41 What command is used to list the "magic" commands in Jupyter? Your answer

 

A.  %list-all-magic

 

B.  %lsmagic

 

C.  %dirmagic

 

D.  %list-magic

That's right!

Question 42 What is the first step in a data science pipeline? Your answer

 

A.  Exploration

 

B.  Manipulation

 

C.  Acquisition

 

D.  Analytics

That's right!

Question 43 Why might a data scientist need a particular kind of GPU (graphics processing unit)?

Your answer

 

A.  To perform certain data transformation quickly.

 

B.  To display a simple bar chart of data on the screen.

 

C.  To collect video for use in streaming data applications.

 

D.  To input commands to a data science notebook.

That's right!

Question 44 Which is an advantage that Zeppelin holds over Jupyter? Your answer

 

A.  Notebooks can be used by multiple people at the same time.

 

B.  Notebooks can be connected to big data engines such as Spark.

 

C.  Zeppelin is able to use the R language.

 

D.  Users must authenticate before using a notebook.

That's right!

Question 45 What is a "magic" command used for in Jupyter? Your answer

 

A.  Extending the core language with shortcuts.

 

B.  Parsing and loading data into a notebook.

 

C.  Running common statistical analyses.

 

D.  Autoconfiguring data connections using a registry.

That's right!

Question 46 Which directory permissions need to be set to allow all users to create their own schema? Your answer

 

A.  700

 

B.  755

 

C.  666

 

D.  777

That's right!

Question 47 You are creating a new table and need to format it with parquet. Which partial SQL statement would create the table in parquet format? Your answer

 

A.  STORED AS parquetfile

 

B.  CREATE AS parquetfile

 

C.  STORED AS parquet

 

D.  CREATE AS parquet

That's right!

Question 48 What is an advantage of the ORC file format? Your answer

 

A.  Big SQL can exploit advanced features

 

B.  Efficient compression

 

C.  Data interchange outside Hadoop

 

D.  Supported by multiple I/O engines

That's right!

Question 49 You need to enable impersonation. Which two properties in the bigsql-conf.xml file need to be marked true? Your answer

 

A.  bigsql.alltables.io.doAs

 

B.  $BIGSQL_HOME/conf

 

C.  DB2COMPOPT

 

D.  bigsql.impersonation.create.table.grant.public

 

E.  DB2_ATS_ENABLE

That's right!

Question 50 Using the Java SQL Shell, which command will connect to a database called mybigdata? Your answer

 

A.  ./jsqsh mybigdata

 

B.  ./java tables

 

C.  ./jsqsh go mybigdata

 

D.  ./java mybigdata

That's right!

Question 51 Which two commands would you use to give or remove certain privileges to/from a user? Your answer

 

A.  GRANT

 

B.  SELECT

 

C.  LOAD

 

D.  REVOKE

 

E.  INSERT

That's right!

Question 52 You need to determine the permission setting for a new schema directory. Which tool would you use? Your answer

 

A.  umask

 

B.  Kerberos

 

C.  HDFS

 

D.  GRANT

That's right!
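
To make the permission arithmetic behind Question 52 concrete, here is a small sketch of how a umask clears bits from a requested mode. It assumes the POSIX-style convention that a new directory requests mode 777 before the mask is applied; the helper name mode_after_umask and the sample masks are illustrative, not taken from the test.

```python
def mode_after_umask(requested, umask):
    # Every permission bit that is set in the umask is cleared from the requested mode.
    return requested & ~umask & 0o777

# Assumed starting point: a new directory requests full permissions (777).
for mask in (0o022, 0o077, 0o000):
    resulting = mode_after_umask(0o777, mask)
    print(f"umask {mask:03o} -> directory mode {resulting:03o}")
```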

Question 53 What is the default directory in HDFS where tables are stored? Your answer

 

A.  /apps/hive/warehouse/

 

B.  /apps/hive/warehouse/data

 

C.  /apps/hive/warehouse/schema

 

D.  /apps/hive/warehouse/bigsql

That's right!

Question 54 Which statement best describes a Big SQL database table? Your answer

 

A.  A directory with zero or more data files.

 

B.  A data type of a column describing its value.

 

C.  A container for any record format.

 

D.  The defined format and rules around a delimited file.

That's right!

Question 55 Which command creates a user-defined schema function? Your answer

 

A.  CREATE FUNCTION

 

B.  ALTER MODULE ADD FUNCTION

 

C.  TRANSLATE FUNCTION

 

D.  ALTER MODULE PUBLISH FUNCTION

That's right!

Question 56 Which definition best describes RCAC? Your answer

 

A.  It grants or revokes certain directory privileges.

 

B.  It limits access by using views and stored procedures.

 

C.  It grants or revokes certain user privileges.

 

D.  It limits the rows or columns returned based on certain criteria.

That's right!
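
The idea behind row and column access control (RCAC) in Question 56 can be pictured as a row filter plus per-column masks applied before results reach the user. The following is a conceptual sketch in plain Python, not Big SQL syntax: apply_rcac, the sample rows, the branch rule, and the SSN mask are all hypothetical.

```python
def apply_rcac(rows, user, row_permission, column_masks):
    # Keep only the rows the user may see, then mask restricted column values.
    visible = []
    for row in rows:
        if not row_permission(user, row):
            continue
        masked = {}
        for column, value in row.items():
            mask = column_masks.get(column)
            masked[column] = mask(user, value) if mask else value
        visible.append(masked)
    return visible

# Hypothetical data and rules, for illustration only.
rows = [
    {"name": "Ana", "branch": "east", "ssn": "123-45-6789"},
    {"name": "Raj", "branch": "west", "ssn": "987-65-4321"},
]

def same_branch(user, row):
    # Row rule: users see only rows from their own branch.
    return row["branch"] == user["branch"]

def mask_ssn(user, value):
    # Column rule: only auditors see the full value; others see the last four digits.
    return value if user["role"] == "auditor" else "XXX-XX-" + value[-4:]

teller = {"branch": "east", "role": "teller"}
result = apply_rcac(rows, teller, same_branch, {"ssn": mask_ssn})
print(result)
```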

Question 57 What are Big SQL database tables organized into? Your answer

 

A.  Hives

 

B.  Files

 

C.  Directories

 

D.  Schemas

That's right!

Question 58 You have a distributed file system (DFS) and need to set permissions on the /hive/warehouse directory to allow access to ONLY the bigsql user. Which command would you run? Your answer

 

A.  hdfs dfs -chmod 770 /hive/warehouse

 

B.  hdfs dfs -chmod 700 /hive/warehouse

 

C.  hdfs dfs -chmod 755 /hive/warehouse

 

D.  hdfs dfs -chmod 666 /hive/warehouse

That's right!
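
The octal modes listed in Questions 46 and 58 are easier to compare when expanded into owner, group, and other permissions. The sketch below uses Python's standard stat module to render each mode in ls-style notation; it assumes the target is a directory and that the POSIX-style reading of the digits applies, and the helper name describe is illustrative.

```python
import stat

def describe(mode):
    # Render octal permission bits as an ls-style string for a directory.
    return stat.filemode(stat.S_IFDIR | mode)

# The modes that appear in the answer options of Questions 46 and 58.
for mode in (0o700, 0o755, 0o770, 0o666, 0o777):
    print(f"{mode:03o} {describe(mode)}")
```

Each group of three characters after the leading d covers the owner, the group, and all other users, in that order.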

Question 59 When connecting to an external database in a federation, you need to use the correct database driver and protocol. What is this federation component called in Big SQL? Your answer

 

A.  User mapping

 

B.  Data source

 

C.  Wrapper

 

D.  Nickname

That's right!

Question 60 Which Big SQL feature allows users to join a Hadoop data set to data in external databases?

Your answer

 

A.  Impersonation

 

B.  Grant/Revoke privileges

 

C.  Fluid query

 

D.  Integration

That's right!