Big Data QCM 1


1- Select all the components of HDP which provide data access capabilities.
  Pig
  Sqoop
  Flume
  MapReduce
  Hive

2- Select the components that provide the capability to move data from a relational database into Hadoop.
  SQL
  Sqoop
  Hive
  Kafka
  Flume

3- Managing Hadoop clusters can be accomplished using which component?
  Ambari
  HBase
  Phoenix
  Hive
  Sqoop

4- True or False: The following components are value-add from IBM: Big Replicate, Big SQL, BigIntegrate, BigQuality, Big Match
  TRUE
  FALSE

5- True or False: Data Science capabilities can be achieved using only HDP.
  TRUE
  FALSE

This study source was downloaded by 100000813657135 from CourseHero.com on 01-09-2022 16:03:20 GMT -06:00

6- True or False: Ambari is backed by RESTful APIs for developers to easily integrate with their own applications.
  TRUE
  FALSE

7- Which Hadoop functionalities does Ambari provide?
  None of the above
  All of the above
  Monitor
  Manage
  Provision
  Integrate

8- Which page from the Ambari UI allows you to check the versions of the software installed on your cluster?
  Monitor page
  Integrate page
  The Admin > Manage Ambari page
  The Admin > Provision page

9- True or False: Creating users through the Ambari UI will also create the user on HDFS.
  TRUE
  FALSE

10- True or False: You can use cURL commands to issue commands to Ambari.
  TRUE
  FALSE

11- True or False: Hadoop systems are designed for transaction processing.
  TRUE
  FALSE

12- What is the default number of replicas in a Hadoop system?
  5
  4
  3
  2

13- True or False: One of the driving principles of Hadoop is that the data is brought to the program.
  TRUE
  FALSE

14- True or False: At least 2 Name Nodes are required for a standalone Hadoop cluster.
  TRUE
  FALSE

15- True or False: The phases in an MR job are Map, Shuffle, Reduce and Combiner.
  TRUE
  FALSE

16- Centralized handling of job control flow is one of the limitations of MR v1.
  TRUE
  FALSE

17- The Job Tracker in MR1 is replaced by which component(s) in YARN?
  ResourceMaster
  ApplicationMaster
  ApplicationManager
  ResourceManager


18- What are the benefits of using Spark? (Please select the THREE that apply)
  Generality
  Versatility
  Speed
  Ease of use

19- What are the languages supported by Spark? (Please select the THREE that apply)
  JavaScript
  HTML
  Python
  Java
  Scala

20- True or False: Resilient Distributed Dataset (RDD) is the primary abstraction of Spark.
  TRUE
  FALSE

21- What would you need to do in a Spark application that you would not need to do in a Spark shell to start using Spark?
  Extract the necessary libraries to load the SparkContext
  Export the necessary libraries to load the SparkContext
  Delete the necessary libraries to load the SparkContext
  Import the necessary libraries to load the SparkContext

22- True or False: A NoSQL database is designed for those who do not want to use SQL.
  TRUE
  FALSE

23- Which database is a columnar storage database?
  SQL
  Hive
  HBase

24- Which database provides a SQL for Hadoop interface?
  Hive
  Hadoop
  HBase

25- Which Apache project provides coordination of resources?
  Streams
  Spark
  Zeppelin
  ZooKeeper

26- What is ZooKeeper's role in the Hadoop infrastructure?
  Manage the coordination between HBase servers
  None of the above
  Hadoop and MapReduce use ZooKeeper to aid in high availability of Resource Manager
  All of the above
  Flume uses ZooKeeper for configuration purposes in recent releases

27- True or False: Slider provides an intuitive UI which allows you to dynamically allocate YARN resources.
  TRUE
  FALSE

28- True or False: Knox can provide all the security you need within your Hadoop infrastructure.
  TRUE
  FALSE

29- True or False: Sqoop is used to transfer data between Hadoop and relational databases.
  TRUE
  FALSE

30- True or False: For Sqoop to connect to a relational database, the JDBC JAR files for that database must be located in $SQOOP_HOME/bin.
  TRUE
  FALSE


31- True or False: Each Flume node receives data as "source", stores it in a "channel", and sends it via a "sink".
  TRUE
  FALSE

32- Through what HDP component are Kerberos, Knox, and Ranger managed?
  ZooKeeper
  Ambari
  Apache Knox

33- Which security component is used to provide peripheral security?
  Apache Ranger
  Apache Camel
  Apache Knox

34- One of the governance issues that Hortonworks DataPlane Service (DPS) addresses is visibility over all of an organization's data across all of their environments (on-prem, cloud, hybrid) while making it easy to maintain consistent security and governance.
  TRUE
  FALSE

35- True or False: The typical sources of streaming data are Sensors, "Data exhaust" and high-rate transaction data.
  TRUE
  FALSE

36- What are the components of Hortonworks DataFlow (HDF)?
  Flow management
  Stream processing
  All of the above
  None of the above
  Enterprise services

37- True or False: NiFi is a disk-based, microbatch ETL tool that provides flow management.
  TRUE
  FALSE

38- True or False: MiNiFi is a complementary data collection tool that feeds collected data to NiFi.
  TRUE
  FALSE

39- What main features does IBM Streams provide as a Streaming Data Platform? (Please select the THREE that apply)
  Flow management
  Analysis and visualization
  Sensors

40- What are the three types of Big Data? (Please select the THREE that apply)
  Natural Language
  Semi-structured
  Graph-based
  Structured
  Machine-Generated
  Unstructured

41- What are the 4Vs of Big Data? (Please select the FOUR that apply)
  Veracity
  Velocity
  Variety
  Value
  Volume
  Visualization

42- What are the most important computer languages for Data Analytics? (Please select the THREE that apply)
  Scala
  HTML
  R
  SQL
  Python

43- True or False: GPUs are special-purpose processors that traditionally can be used to power graphical displays, but for Data Analytics lend themselves to faster algorithm execution because of the large number of independent processing cores.
  TRUE
  FALSE

44- True or False: Jupyter stores its workbooks in files with the .ipynb suffix. These files cannot be stored locally or on a hub server.
  TRUE
  FALSE

45- The $BIGSQL_HOME/bin/bigsql start command is used to start Big SQL from the command line.
  TRUE
  FALSE

46- What are the two ways you can work with Big SQL? (Please select the TWO that apply)
  JQuery
  R
  JSqsh
  Web tooling from DSM

47- What is one of the reasons to use Big SQL?
  Want to access your Hadoop data without using MapReduce
  You want to learn new languages like MapReduce
  Has deep learning curve because Big SQL uses standard 2011 query structure

48- Should you use the default STRING data type?
  Yes
  No

49- The BOOLEAN type is defined as SMALLINT SQL type in Big SQL.
  TRUE
  FALSE

50- Using the LOAD operation is the recommended method for getting data into your Big SQL table for best performance.
  TRUE
  FALSE

51- Which file storage format has the highest performance?
  Delimited
  Sequence
  RC
  Parquet
  Avro

52- What are the two ways to classify functions?
  Built-in functions
  Scalar functions
  User-defined functions
  None of the above

53- True or False: UMASK is used to determine permissions on directories and files.
  TRUE
  FALSE

54- True or False: You can only Kerberize a Big SQL server before it is installed.
  TRUE
  FALSE

55- True or False: Authentication with Big SQL only occurs at the Big SQL layer or the client's application layer.
  TRUE
  FALSE

56- True or False: Ranger and impersonation work well together.
  TRUE
  FALSE

57- True or False: RCAC can hide rows and columns.
  TRUE
  FALSE

58- True or False: Nicknames can be used for wrappers and servers.
  TRUE
  FALSE

59- True or False: Server objects define the properties and values of the connection.
  TRUE
  FALSE

60- True or False: The purpose of a wrapper is to provide a library of routines that doesn't communicate with the data source.
  TRUE
  FALSE

61- True or False: User mappings are used to authenticate to the remote data source.
  TRUE
  FALSE

62- True or False: Collaboration with Watson Studio is an optional add-on component that must be purchased.
  TRUE
  FALSE

63- True or False: Watson Studio is designed only for Data Scientists, other personas would not know how to use it.
  TRUE
  FALSE

64- True or False: Community provides access to articles, tutorials, and even data sets that you can use.
  TRUE
  FALSE


65- True or False: You can import visualization libraries into Watson Studio.
  TRUE
  FALSE

66- True or False: Collaborators can be given certain access levels.
  TRUE
  FALSE

67- True or False: Watson Studio contains Zeppelin as a notebook interface.
  TRUE
  FALSE

68- Spark is developed in which language?
  Java
  Scala
  Python
  R

69- In Spark Streaming, the data can come from which sources?
  Kafka
  Flume
  Kinesis
  All of the above

70- Apache Spark has APIs in:
  Java
  Scala
  Python
  All of the above

71- Which of the following is not a component of the Spark ecosystem?
  Sqoop
  GraphX
  MLlib
  BlinkDB
