QUESTION BANK 2018
SIDDHARTH GROUP OF INSTITUTIONS :: PUTTUR
(Autonomous)
Siddharth Nagar, Narayanavanam Road – 517583
QUESTION BANK (DESCRIPTIVE)
Subject with Code : Big Data (16CS520)
Course & Branch: B.Tech - CSE
Year & Sem: III-B.Tech & I-Sem
Regulation: R16
UNIT –I
1. Explain the classification of digital data.
2. Write about the challenges with Big Data.
3. Explain the 5 V's of Big Data.
4. What is Big Data Analytics?
5. What isn't Big Data Analytics?
6. Explain the classification of analytics.
7. Write about the top challenges facing Big Data.
8. Discuss why Big Data analytics is important.
9. Write about Data Science.
10. Explain the difference between parallel and distributed systems.
11. Explain the CAP theorem.
UNIT 2
1. Compare Reporting and Analysis with its process.
2. Explain the following
a. Advanced analytics
b. Operationalized analytics
c. Monetized analytics
3. How do you develop an analytical team, and what skills are required of an analyst?
4. Distinguish statistical significance and business importance.
5. What are the roles of analytical team and IT team with a detailed note on text analysis?
6. Explain in detail the commonly used analytical approaches?
7. Discuss in detail the history of analytical tools.
8. How have analytical tools evolved from graphical user interfaces to point solutions to data visualization tools?
9. Give a detailed note on features and limitations of R programming and IBM SPSS.
10. Explain in detail the following
a. SAS
b. Compare various analytical tools.
UNIT 3
1. List the main features of MapReduce.
2. Describe the working of MapReduce with a relevant example.
3. Discuss the techniques used to optimize MapReduce jobs.
4. Discuss the points to be considered while designing a file system in MapReduce.
5. What is HBase? Give a detailed note on the features of HBase.
6. Write a short note on the Hadoop ecosystem and HDFS architecture.
7. How does HDFS ensure data integrity in a Hadoop cluster?
8. Discuss the following terms
a. Streaming information access
b. Low latency information access
c. REST and Thrift
d. org.apache.hadoop.io package
9. What is metadata? What information does it provide? Explain the role of the NameNode in an HDFS cluster.
10. Define the command line interface using HDFS files and give a brief note on Hadoop-specific file system types and HDFS commands.
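Question 2 of this unit asks for a relevant example of how MapReduce works; word count is the customary one. The sketch below is a minimal single-process illustration in plain Python, not Hadoop code. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by key between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over a list of text records."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample records, standing in for lines of a large input file.
    print(word_count(["big data big cluster", "data node"]))
```

In a real cluster the map and reduce calls run in parallel on different nodes and the framework performs the shuffle; here all three steps run in sequence only to make the data flow visible.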
UNIT 4
1. What is NoSQL? What are the advantages of NoSQL? Explain the types of NoSQL databases.
2. Differentiate between SQL vs NoSQL?
3. What is NewSQL? Differentiate between NewSQL and NoSQL?
4. With Neat sketch explain in detail Hadoop architecture and its components?
5. a) List hadoop distributions
b) Compare Hadoop vs SQL
6. With neat sketch explain HDFS?
7. With neat sketch explain processing data with Hadoop?
8. Explain in detail interacting with Hadoop Ecosystem?
9. List and explain HDFS commands.
10. What are the limitations of Hadoop 1.0? Explain Hadoop 2: HDFS and Hadoop 2: YARN?
UNIT 5
1. List some key elements of social media.
2. Describe the steps to perform text mining.
3. Discuss some commonly used text mining software.
4. List some common online tools used to perform sentiment analysis.
5. What do you understand by sentiment analysis?
6. Discuss some application areas of mobile analytics.
7. Briefly explain some popular mobile analytics tools available in the market.
8. What is the importance of location-based tracking tools?
9. Discuss the necessity of keeping data secure while conducting analytics.
10. Discuss some fields where mobile analytics can be used.
QUESTION BANK (OBJECTIVE)
UNIT – I
1. Data is present in a _____________ source [ ]
A) Homogenous B) Heterogeneous C) Both A & B D) Insight
2. 80-90% of an organization's data is available in the format of ________________ [ ]
A) Structured data B) Unstructured data C) Semi-structured data D) None
3. XML is an example of [ ]
A) Structured data B) Unstructured data C) Semi-structured data D) All
4. The variable whose value needs to be predicted is called [ ]
A) Dependent variable B) Independent variable C) Spatial variable D) Cross variable
5. The 1970s was the era of [ ]
A) Structured data B) Internet of things C) Relational database D) Mainframes
6. The 3V concept was proposed by [ ]
A) Davis B) Gartner Analyst Doug C) Roger Mougalas D) None
7. Unstructured data is dealt with using [ ]
A) Data mining B) Natural language C) Text analytics D) All
8. Association rule mining is also called [ ]
A) Data mining B) Market basket analysis C) Affinity analysis D) Both B & C
9. Into what category would you place CCTV footage? [ ]
A) Unstructured B) Structured C) Text D) Both A & B
10. Into what category would you place consumer complaints and feedback? [ ]
A) Unstructured B) Structured C) Semi-structured D) All
11. A yottabyte is equal to [ ]
A) 1024^8 bytes B) 1024^6 bytes C) 1024^7 bytes D) 1024^4 bytes
12. The human and technical infrastructure needed to support storage, processing and _______ [ ]
A) Analysis B) Insight C) Retrieving D) Result
13. RDBMS is an example of [ ]
A) Unstructured B) Structured C) Semi-structured D) None
14. _________ is the science of extracting knowledge from data [ ]
A) Data Science B) Data filtering C) Data regression D) Data market
15. In SMP, there is a single common main memory that is shared by _____ processors [ ]
A) Identical B) Heterogeneous C) Both A & B D) I/O
16. In shared-nothing architecture, memory and disk are ___________ [ ]
A) Shared memory B) Non-shareable C) Shared disk D) Shareable
17. ________ implies that every read fetches the last write. [ ]
A) Consistency B) Availability C) Partition tolerance D) All
18. ________ implies that reads and writes always succeed [ ]
A) Consistency B) Availability C) Partition tolerance D) None
19. ________ deals with a wide range of data types and sources of data. [ ]
A) Velocity B) Variety C) Volume D) Volatility
20. Which of the following is not an open source analytical tool? [ ]
A) R-Analytical B) Weka C) SAS D) None
21. The CAP theorem is also called [ ]
A) Brewer's theorem B) Davis theorem C) Consistency theorem D) Brooklyn theorem
22. In the CAP theorem, at best you can have ______ out of 3 guarantees. [ ]
A) 2 B) 1 C) 3 D) 0
23. _________ is an important advantage of shared-nothing architecture [ ]
A) Shared disk B) Shared memory C) Scalability D) All
24. Data is growing at a 40% compound annual rate, reaching nearly 45 ZB by [ ]
A) 2019 B) 2020 C) 2022 D) 2025
25. A parallel database system is a [ ]
A) Loosely coupled system B) Tightly coupled system C) Both A & B D) None
26. _______ implies that the system will continue to function when a network partition occurs. [ ]
A) Consistency B) Partition tolerance C) Availability D) Variability
27. _________ is the characteristic of data dealing with its retention [ ]
A) Volatility B) Velocity C) Variety D) Variability
28. Real-time processing deals with the _________ characteristic of data. [ ]
A) Variety B) Variability C) Velocity D) Volatility
29. _______ refers to the accuracy and correctness of the data. [ ]
A) Validity B) Variability C) Velocity D) Volatility
30. Into what category would you place a weather forecast report? [ ]
A) Unstructured B) Structured C) Semi-structured D) All
31. A zettabyte is equal to [ ]
A) 1024^8 bytes B) 1024^6 bytes C) 1024^7 bytes D) 1024^4 bytes
32. What are the five Vs of Big Data? [ ]
A) Volume B) Velocity C) Variety D) All
33. What are the different features of Big Data analytics? [ ]
A) Open source B) Scalability C) Data recovery D) All
34. Which of the four characteristics of Big Data indicates that many data formats can be stored and analyzed? [ ]
A) Variability B) Variety C) Velocity D) Volume
35. Basically Available, Soft state, Eventual consistency (BASE) is used in [ ]
A) Distributed computing B) Cloud computing C) Parallel computing D) All
36. A system that has achieved eventual consistency is said to have converged or achieved [ ]
A) Read repair B) Write repair C) Replica convergence D) None
37. Big data analytics is about a tight handshake between three communities: IT, business users and ____ [ ]
A) Analytical tool B) Data scientists C) Data management D) Business
38. Which of the following is not an analytics tool? [ ]
A) SAS B) IBM SPSS Modeler C) R analytics D) Spark
39. A coordinated processing of a program by multiple processors, each working on different parts of the program and using its own operating system and memory, is called [ ]
A) Distributed system B) Massively parallel processing C) Memory analytics D) Availability
40. A collection of independent computers that appears to its users as a single coherent system is [ ]
A) Distributed system B) Parallel computing C) Cloud computing D) Coherent
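Questions 11 and 31 of this unit rest on the binary unit ladder, in which each step multiplies the previous unit by 1024. The short sketch below works out the arithmetic; the names UNITS and bytes_in are hypothetical helpers introduced only for this illustration.

```python
# Binary storage units in ascending order; each is 1024 times the previous one.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def bytes_in(unit):
    """Return the number of bytes in one of the given unit (binary convention)."""
    return 1024 ** (UNITS.index(unit) + 1)


if __name__ == "__main__":
    for unit in ("ZB", "YB"):
        exponent = UNITS.index(unit) + 1
        print(f"1 {unit} = 1024^{exponent} bytes = {bytes_in(unit)} bytes")
```

The same ladder gives the other powers in the option lists: 1024^4 bytes is a terabyte and 1024^6 bytes is an exabyte.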
Unit-II
1. Which among the following is not a characteristic of reporting? [ ]
A] Provides data B] Provides answers C] Is fairly inflexible D] Provides what is asked
2. Exploratory data analysis using graphs is nothing but: [ ]
A] Data cleaning B] Basic reporting C] Predictive modelling D] Model implementation
3. Reporting does not involve: [ ]
A] Predictive models B] Graphs C] Charts D] Tables
4. In a data analysis report, you will find: [ ]
A] Descriptive statistics B] Optimization C] Formatted text D] White papers or journals
5. Data collection is one of the steps in statistical data analysis. This step is performed after which of the following steps? [ ]
A] Model building B] Model implementation C] Business objective D] Evaluation
6. When we don't have access to a population, we tend to consider: [ ]
A] A simulated population B] A survey C] A random sample D] Judgemental insights
7. Missing value treatment of the data is: [ ]
A] Necessary to get the correct results
B] Often leads to wrong results
C] Should never be practiced
D] Simply dropping all the missing records
8. Which of the following is not a task of an analytics team? [ ]
A] Heavily utilize resources
B] Work with limited rules and restrictions
C] Tightly manage resource usage
D] Run complex ad-hoc queries
9. Text analytics refers to the process of analysing ____________ text [ ]
A] Structured B] Semi-structured C] Un-structured D] All of the above
10. Reporting environment is also known as a _________________ environment [ ]
A] Artificial Intelligence B] Business Intelligence C] Data Science D] None of the above
11. ________ is a process in which data and reports are examined to get insights from them. [ ]
A] Reporting B] Analysis C] Both A & B D] None
12. Big data analytics refers to the process of _______ large sets of data to identify patterns and other important information [ ]
A] Collecting B] Organizing C] Analyzing D] All
13. _____ is a process in which data is organized and summarized in an easy-to-understand format [ ]
A] Analysis B] Reporting C] Data science D] None
15. The process of ________ data is an important task in executing a project plan accurately. [ ]
A] Summarized B] Organized C] Collection D] None
16. A ________ test is performed to ensure that the right inferences are derived from the data. [ ]
A] Statistical B] Random C] Physical D] All
17. A ____ need not be analysed further if it provides the required information. [ ]
A] Report B] Analytics C] Both A & B D] All
18. ______ does not involve a person as it is generated automatically. [ ]
A] Analysis B] Reporting C] Collecting D] None
19. ______ analytics helps in determining what is happening at the present time. [ ]
A] Predictive B] Diagnostic C] Prescriptive D] Descriptive
20. In _____ analytics, the future is predicted on the basis of past patterns. [ ]
A] Predictive B] Diagnostic C] Prescriptive D] Descriptive
21. Which of the following statements is correct for R? [ ]
A] R cannot connect with other languages
B] R does not support extensions
C] R is more object-oriented than many other analytical tool sets
D] R is not compatible with multiple formats
22. Which of the following is a commercial text analysis tool? [ ]
A] Tableau B] Clarabridge C] JMP D] Spotfire
23. Which of the following analytical tools is mostly used in social sciences, marketing and healthcare? [ ]
A] R B] SPSS C] JMP D] Spotfire
24. Which of the following is a data visualization tool? [ ]
A] Attensity B] Advizor C] Clarabridge D] SAS
25. Bagging stands for [ ]
A] Bootstrap aggregating B] Bootstrap average C] Bootstrap leverage D] All
26. Which of the following is not an ensemble method or algorithm? [ ]
A] Bagging B] Boosting C] Random forest D] Booting
27. __________ method refers to a process of generating multiple models and combining them to solve a specific problem [ ]
A] Ensemble B] Analytics C] Text data analysis D] None
28. The processing and modelling of textual data to gain useful business insights is called _ [ ]
A] Ensemble method B] Text data analysis C] Big data analytics D] None
29. ____ is a scripting language used to run batch jobs on mainframes. [ ]
A] JCL B] JML C] JDL D] All
30. The following is/are analytical approaches. [ ]
A] Ensemble method B] Text data analysis C] Big data analytics D] Both A & B
31. _____ analytics means making analytics an important part of the business process. [ ]
A] Monetized B] Operational C] Basic D] None
32. ______ analytics helps businesses to take important and better decisions and helps earn revenues [ ]
A] Monetized B] Operational C] Basic D] None
33. _____ is usually used to calculate averages, percentages, and parameter estimates arising out of statistical models. [ ]
A] Statistical significance B] Statistical modelling C] Big data analytics D] None
34. ______ is used to explore the data whose value is not known. [ ]
A] Monetized B] Operational C] Basic D] None
35. ____ uses system resources heavily. [ ]
A] IT team B] Analytics team C] Both A & B D] None
36. __________ professionals follow approved approaches. [ ]
A] IT team B] Analytics team C] Both A & B D] None
37. _______ analysts must use code to handle raw data. [ ]
A] Big data B] Data science C] IT D] None
38. ___ analytics determines what has happened in the past and why. [ ]
A] Predictive B] Diagnostic C] Prescriptive D] Descriptive
39. ____ analytics finds the best course of action for a given situation. [ ]
A] Predictive B] Diagnostic C] Prescriptive D] Descriptive
40. Analytic point solutions refer to ____ packages that solve a specific group of problems. [ ]
A] Hardware B] Software C] TCL D] JCL
UNIT III
1. Which of the following options most aptly explains the reason behind the creation of MapReduce? [ ]
A) Need to increase the processing of new h/w B) Need to perform complex analysis of structured data
C) Need to increase the number of web users D) Need to spread distributed computing
2. In designing the MapReduce framework, which of the following needs did the engineers consider? [ ]
A) Cheap and distributed
B) Processing should expand and contract automatically
C) Network failure
D) Creation of new language
3. Which of the following describes the map function? [ ]
A) Key pairs B) Indexing C) Relational database D) Clusters
4. Which of the following describes the reduce function? [ ]
A) Frequently occurring values B) Combine map function
C) Columnar database D) New key value pair to answer the query
5. Which techniques are used to optimize MapReduce jobs? [ ]
A) H/W network topology B) Synchronization C) File system D) All the above
6. The input is provided from large data files in the form of [ ]
A) kvp B) HDFS C) tracker D) none
7. HBase can run only on [ ]
A) clusters B) persistent clusters C) servers D) all the above
8. Logs of HBase are present on [ ]
A) Master node B) Slave node C) Both a and b D) None
9. In a big data environment, a file size of less than ------ is not preferred. [ ]
A) 50 MB B) 100 MB C) 150 MB D) No restrictions
10. Streaming is used for expressing input and output in ---- [ ]
A) csv B) image format C) text format D) both a and b
11. Which of the following terms is used to denote the small subsets of a large file created by HDFS? [ ]
A) Name node B) Data node C) Blocks D) Namespace
12. What message is generated by a datanode to indicate its connectivity with the name node? [ ]
A) Beep B) Heartbeat C) Analog pulse D) map
13. Which of the following defines metadata? [ ]
A) data about data B) data from web logs C) data from govt sources D) data from market
14. Which of the following is managed by the MapReduce environment? [ ]
A) weblogs B) images C) structured data D) unstructured data
15. Which of the following services is provided by YARN? [ ]
A) Global resource management B) Images C) MapReduce engine D) data mining
16. In an HDFS cluster, ----- manages cluster data [ ]
A) Name node B) Data node C) Inode D) Namespace
17. Which of the following commands of HDFS can issue directives to blocks? [ ]
A) fcsk B) fkcs C) fsck D) fkcs
18. Which of the file systems provides read-only access to HDFS over HTTPS? [ ]
A) HAR B) HDFS C) HFTP D) HSFTP
19. ------ is a tool used to transfer data between Hadoop and relational databases [ ]
A) sqoop B) hive C) pig latin D) oozie
20. ------ is used to transfer large amounts of data from distributed resources to a single repository. [ ]
A) sqoop B) flume C) zookeeper D) hive
21. ________ systems are scale-out file-based (HDD) systems moving to more uses of memory in the nodes. [ ]
A) NoSQL B) NewSQL C) SQL D) All of the mentioned
22. Point out the correct statement: [ ]
A) Hadoop is ideal for the analytical, post-operational, data-warehouse-ish type of workload
B) HDFS runs on a small cluster of commodity-class nodes
C) NEWSQL is frequently the collection point for big data
D) None of the mentioned
23. Which of the following commands sets the value of a particular configuration variable (key)? [ ]
A) set -v B) set <key>=<value> C) set D) reset
24. Hive also supports custom extensions written in: [ ]
A) C# B) Java C) C D) C++
25. The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to: [ ]
A) SQL B) JSON C) XML D) All of the mentioned
26. The Pig Latin scripting language is not only a higher-level data flow language but also has operators similar to: [ ]
A) SQL B) JSON C) XML D) All of the mentioned
27. Which of the following is used for the MapReduce job Tracker node? [ ]
A) mradmin B) tasktracker C) jobtracker D) none of the mentioned
28. How many formats of SequenceFile are present in Hadoop I/O? [ ]
A) 2 B) 3 C) 4 D) 5
29. Input to the _______ is the sorted output of the mappers. [ ]
A) Reducer B) Mapper C) Shuffle D) All of the mentioned
30. What was Hadoop written in? [ ]
A) Java (software platform) B) Perl
C) Java (programming language) D) Lua (programming language)
31. ________ is the slave/worker node and holds the user data in the form of Data Blocks. [ ]
A) DataNode B) NameNode C) Data block D) Replication
32. HDFS provides a command line interface called __________ used to interact with HDFS. [ ]
A) “HDFS Shell” B) “FS Shell” C) “DFS Shell” D) hbase
33. The __________ is responsible for allocating resources to the various running applications subject to familiar constraints of capacities, queues etc. [ ]
A) Manager B) Master C) Scheduler D) manager
34. Which of the following platforms does Hadoop run on? [ ]
A) Bare metal B) Debian C) Cross-platform D) Unix-like
35. The Hadoop list includes the HBase database, the Apache Mahout ________ system, and matrix operations. [ ]
A) Machine learning B) Pattern recognition C) Statistical classification D) Artificial intelligence
36. InputFormat class calls the ________ function and computes splits for each file and then sends them to the jobtracker. [ ]
A) puts B) gets C) getSplits D) all of the mentioned
37. _________ is the primary interface for a user to describe a MapReduce job to the Hadoop framework for execution. [ ]
A) Map Parameters B) JobConf C) MemoryConf D) None of the mentioned
38. Which of the following classes is provided by the Aggregate package? [ ]
A) Map B) Reducer C) Reduce D) None
39. ___________ is an open source SQL query engine for Apache HBase [ ]
A) Pig B) Phoenix C) Pivot D) None
40. Which of the following has methods to deal with metadata? [ ]
A) LoadPushDown B) LoadMetadata C) LoadCaster D) All the above
Unit-IV
1. The expansion of CAP is Consistency, Availability, and _______ [ ]
A] Portability B] Partition Tolerance C] Reliability D] None
2. MongoDB offers Consistency and __________ [ ]
A] Availability B] Partition Tolerance C] Reliability D] None
3. Cassandra offers Availability and _____________ [ ]
A] Consistency B] Partition Tolerance C] Reliability D] None
4. _____ has no support for ACID properties of transactions. [ ]
A] NoSQL B] SQL C] NewSQL D] All
5. ________ is a robust database that supports ACID properties of transactions and has the scalability of NoSQL. [ ]
A] NoSQL B] SQL C] NewSQL D] All
6. ________ is a non-relational, open-source, distributed database. [ ]
A] NoSQL B] SQL C] NewSQL D] Cassandra
7. _________ contains a relational/pre-defined schema. [ ]
A] NoSQL B] SQL C] NewSQL D] All
8. ________ uses a dynamic schema for unstructured data. [ ]
A] NoSQL B] SQL C] NewSQL D] All
9. ________ is preferred for large datasets. [ ]
A] NoSQL B] SQL C] NewSQL D] All
10. _________ is not the best fit for hierarchical data. [ ]
A] NoSQL B] SQL C] NewSQL D] All
11. __________ uses off-line processing [ ]
A] Hadoop B] SQL C] NewSQL D] All
12. ___________ uses on-line processing [ ]
A] Hadoop B] SQL C] NewSQL D] All
13. Hadoop 1.0 has _________ and _________ as its main parts. [ ]
A] Data storage framework B] Data processing framework C] Both A & B D] Hadoop MapReduce
14. ________ is a coordination service for distributed applications. [ ]
A] Oozie B] ZooKeeper C] Mahout D] Sqoop
15. ________ is a workflow scheduler system to manage Apache Hadoop jobs [ ]
A] Oozie B] ZooKeeper C] Mahout D] Chukwa
16. ________ is a scalable machine learning and data mining library. [ ]
A] Mahout B] ZooKeeper C] Oozie D] Sqoop
17. ________ is a data collection system for managing large distributed systems. [ ]
A] Chukwa B] Oozie C] Mahout D] Ambari
18. ________ is a web-based tool for provisioning, managing and monitoring Apache Hadoop clusters. [ ]
A] Ambari B] Chukwa C] Mahout D] ZooKeeper
19. The core aspects of Hadoop include _________ [ ]
A] Hadoop Common B] HDFS C] Hadoop YARN & Reduce D] All
20. __________ is an open source project of the Apache Foundation. [ ]
A] SQL B] Hadoop C] NoSQL D] NewSQL
21. _________ is used to transfer bulk data between Hadoop and structured data stores such as relational databases [ ]
A] Chukwa B] Sqoop C] Oozie D] Zookeeper
22. _________ are the two components of Hadoop. [ ]
A] HDFS B] Sqoop C] MapReduce D] Both A & C
23. Pig is a [ ]
A] Data flow language B] Import export tool C] Scheduling engine D] Shuffler
24. What does the Job Tracker do? [ ]
A] Stores block of data
B] Coordinates and schedules the job
C] Stores metadata
D] Acts as a mini reducer
25. _______ file is used for updating MapReduce settings. [ ]
A] core site B] hdfs-site C] mapred-site D] hadoop-env.sh
26. One ___________ gigabytes are there in one exabyte. [ ]
A] Billion B] Million C] Crore D] None
27. _____________ gave Hadoop its name. [ ]
A] Toy Elephant B] Hunk C] IBM D] Doug Cutting
28. _________ are Hadoop's configuration files. [ ]
A] hdfs-site.xml B] mapred-site.xml C] core-site.xml D] All
29. The MapReduce programming model widely used in analytics was developed at [ ]
A] Google B] IBM C] Yahoo D] BigData
30. ________ traditional IT company is the largest Big Data vendor in the world [ ]
A] Google B] IBM C] Yahoo D] BigData
31. _______ is Splunk's new product to search, access and report on Hadoop data sets [ ]
A] Toy Elephant B] Hunk C] IBM D] Doug Cutting
32. Hadoop 2.x is based on __________ architecture [ ]
A] Resources B] Containers C] YARN D] Fault
33. There is a single _________ per slave node. [ ]
A] Job Tracker B] NameNode C] TaskTracker D] DataNode
34. ________ is a single point of failure of a Hadoop cluster. [ ]
A] DataNode B] NameNode C] TaskTracker D] JobTracker
35. Hadoop runs on large clusters of __________ [ ]
A] Data B] Commodity machines C] Code D] Job
36. __________ is a data warehousing layer on top of Hadoop. [ ]
A] Pig B] Hive C] Sqoop D] Hbase
37. HDFS is built using the _________ language. [ ]
A] Java B] C C] C++ D] Basic
38. YARN is responsible for __________ [ ]
A] NameNode B] DataNode C] Hadoop D] Cluster Management
39. HDFS has a ___________ / _______ architecture. [ ]
A] Master/Slave B] Data/NameNode C] Job/TaskTracker D] Primary/Secondary
40. A typical block size used by HDFS is __________. [ ]
A] 32MB B] 24MB C] 64MB D] 12MB
Unit-V
1. Which of the following collectively represent a social network bound via specific sets of social relationships? [ ]
(A) Websites (B) Big Data (C) People (D) Analytics Tools
2. Which of the following is not performed by social media? [ ]
(A) Participation (B) Online shopping (C) Content Sharing (D) Conversation
3. Which of the following text mining tools is used to extract who, what, where, when and why facts? [ ]
(A) Active-point (B) Attensity (C) Cross minder (D) Compare suite
4. Which of the following is a popular blogging website? [ ]
(A) Facebook (B) LinkedIn (C) Twitter (D) WordPress
5. Which of the following terms represents passive observation of social media activities? [ ]
(A) Participation (B) Interpretation (C) Both A and B (D) Engaging
6. Social media denotes a group of Internet-based applications built over the foundation of ______ that support the creation and exchange of user-generated content. [ ]
(A) Web 2.0 (B) Web 3.0 (C) Web 2.1 (D) Web 4.0
7. Which blogs allow people to share and showcase small posts and are suitable for quick sharing of content in a few lines of text or an individual photo or video? [ ]
(A) Blogs (B) Micro blogs (C) Wiki (D) Facebook
8. _____ pattern learning is applied to create patterns from the extracted text. [ ]
(A) Statistical (B) ICT (C) Text mining (D) Life Science
9. Popular stemmers include [ ]
(A) Brute force algorithm (B) Suffix stripping algorithm (C) Both A and B (D) Pattern algorithm
10. _______ is used for statistical data analysis, text processing and sentiment analysis. [ ]
(A) R (B) Active point (C) Monarch (D) Text alyzer
11. Which tool is applied for cross-lingual text analytics? [ ]
(A) Cross minder (B) Compare suite (C) SAS Text miner (D) Attensity
12. ______ is one of the most important components of text mining. [ ]
(A) Sentiment Analysis (B) One pass Clustering (C) Buckshot Clustering (D) Monarch
13. Which tool is used to measure the success of a website on Twitter? [ ]
(A) Topsy (B) Tweet beep (C) Reachli (D) None
14. Which software is used for analysis and transformation of reports into live data? [ ]
(A) Monarch (B) Text alyzer (C) SAS Text Miner (D) Compare suite
15. Which software is used for extraction of facts? [ ]
(A) Attensity (B) R (C) Both A and B (D) Text alyzer
16. Which tool is used to improve the search engine ranking of a website? [ ]
(A) Back tweets (B) Reachli (C) Twitterfall (D) Topsy
17. Which tool helps in tracking data and scheduling and organizing pins in advance? [ ]
(A) Reachli (B) Tweet Beep (C) Topsy (D) Twitter fall
18. Which tool is used for online text analysis? [ ]
(A) Text alyzer (B) Monarch (C) Active point (D) Compare suite
19. First Generation (1G) mobile devices provided only a ________. [ ]
(A) Digital quality (B) Multimedia application (C) Mobile voice (D) Both A and B
20. LTE stands for [ ]
(A) Long inter telecom (B) Leverage Information transform (C) Long Term Evolution (D) None
21. ____ can be used by service providers to help them monitor and improve their service. [ ]
(A) Session (B) Bounce rate (C) Track sales (D) Customers engaged
22. The use of a mobile phone or other device like a tablet to view online content via a light web browser is referred to as [ ]
(A) Mobile Web (B) Screen (C) Page (D) None
23. Which is a big marketing and analytics platform for mobile and web applications? [ ]
(A) Appsee (B) Mixpanel (C) Both A and B (D) Localytics
24. Which tool is a platform for location-based services? [ ]
(A) Placed (B) Geoloqi (C) R (D) SAS
25. ___________ is the data collection service designed for experts. [ ]
(A) Data Winner (B) Statviz (C) Test flight (D) Both A and C
26. Which of the following technologies supports LTE and WiMAX techniques? [ ]
(A) 1G (B) 2G (C) 3G (D) 4G
27. Which of the following mobile analytics tools will you use to work on all platforms for the measurement of user acquisition, engagement, and outcomes in native mobile apps? [ ]
(A) AdMob (B) Bango (C) Google Analytics (D) Localytics
28. Which of the following location-based mobile analytics tracking tools will you use to incorporate advanced geolocation functionality to mobile devices running on iOS, Windows, as well as Android? [ ]
(A) Geoloqi (B) Placed (C) Geckboard (D) Mixpanel
29. Which of the following mobile analytics tools is used to test an app? [ ]
(A) Test Flight (B) Mobile App Tracking (C) Apsalar (D) Mixpanel
30. Which of the following mobile analytics data collection tools provides data collection services and reduces decision-making time by interpreting data efficiently? [ ]
(A) Open Data Kit (B) Data Winners (C) Command Mobile (D) Enterprise Server
31. Which of the following types of mobile app analytics reports will you use to understand the demographics of the people using a particular mobile application? [ ]
(A) Audience (B) Acquisition (C) Behaviour (D) Conversion
32. Which of the following reports will you use to display the details about the actual sign-ups and sale of mobile applications? [ ]
(A) Audience (B) Acquisition (C) Behaviour (D) Conversion
33. Which of the following is an example of mobile application? [ ]
(A) Android (B) HTML (C) Blackberry (D) Java Script
34. Localytics is an example of a [ ]
(A) Mobile device (B) Mobile application (C) Mobile platform (D) Mobile Analytics tool
35. Some of the popular mobile analytics tools available in the market are: [ ]
(A) AdMob (B) Bango (C) Google Analytics (D) All
36. Hand Base is a ____ [ ]
(A) RDBMS (B) DBMS (C) Behavior (D) Conversion
37. An application that can perform mobile data collection and workforce management services. [ ]
(A) COMMAND mobile (B) Data winners (C) Stat Viz (D) Play store
38. _____ is used to describe the process in which the system automatically opens another page. [ ]
(A) AdMob (B) Redirect (C) Geoloqi (D) Both A and B
39. Appsee was founded by [ ]
(A) Zahi Boussiba (B) Yoni Douek (C) Both A and B (D) IBM
40. The session timeout in mobile app analytics is around [ ]
(A) 20 seconds (B) 30 seconds (C) 15 seconds (D) 150 seconds
OBJECTIVE - ANSWERS
Q.No   Unit 1   Unit 2   Unit 3   Unit 4   Unit 5
1      -        B        A        B        C
2      -        D        B        A        B
3      -        A        A        A        B
4      -        B        D        A        D
5      -        C        D        C        C
6      -        C        A        A        A
7      -        A        B        B        B
8      -        C        A        A        A
9      -        C        B        A        C
10     -        B        C        B        A
11     -        B        C        A        A
12     -        D        B        B        A
13     -        B        A        C        A
14     -        -        D        B        A
15     -        C        D        A        A
16     -        A        A        A        A
17     -        A        C        A        A
18     -        B        D        A        A
19     -        D        A        D        C
20     -        A        B        B        C
21     -        C        A        B        B
22     -        B        A        D        A
23     -        B        B        A        D
24     -        B        B        B        B
25     -        A        A        C        D
26     -        D        B        A        D
27     -        A        C        A        C
28     -        B        B        D        A
29     -        A        C        A        A
30     -        D        C        B        B
31     -        B        A        B        A
32     -        A        B        C        D
33     -        A        C        C        D
34     -        C        C        B        A,D
35     -        B        A        B        D
36     -        A        C        B        A
37     -        A        B        A        A
38     -        B        B        D        B
39     -        C        B        A        C
40     -        B        B        C        B
Prepared by: Mr. P.Balaji, Dr.P.Kavitha Rani & Mr.Sayed Waris.