Uploaded by polishettyanusha

Big data

Big data software is installed on high-end machines.
We can use the open-source version of Hadoop. Three vendors in the market provide paid support for
Hadoop, but the open-source version can be used without paying them.
Alternatively, we can subscribe to a public cloud that offers a Big Data Appliance.
We need a username and password to access the stream of tweets.
Sentiment analysis is done based on words. There is a dictionary available in Hive.
Oozie is used to schedule jobs.
Schedules are written as XML templates. Standard templates are available, and there are tools with
drag-and-drop options.
Subjective Lexicon dictionary: it has an accuracy of about 50%.
Each word is assigned a score of -1, 0, or 1. Each sentence is analyzed by summing the sentiment scores of all its words.
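
A minimal sketch of the word-summing approach in Python. The tiny LEXICON dictionary and the function names are made up for illustration; they are not the dictionary or code used in the session, and a real lexicon would have thousands of entries.

```python
import re

# Hypothetical mini-lexicon: each word maps to -1 (negative) or 1 (positive).
# Words that are not listed are treated as neutral (0).
LEXICON = {
    "good": 1, "great": 1, "love": 1,
    "bad": -1, "terrible": -1, "hate": -1,
}

def sentence_score(sentence, lexicon=LEXICON):
    """Sum the sentiment scores of all words in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(word, 0) for word in words)

def sentence_label(sentence, lexicon=LEXICON):
    """Turn the summed score into a positive/negative/neutral label."""
    score = sentence_score(sentence, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

if __name__ == "__main__":
    for tweet in ["I love this great phone", "Terrible battery, I hate it"]:
        print(tweet, "->", sentence_score(tweet), sentence_label(tweet))
```

Because words are scored one at a time, a phrase such as "not good" still counts as positive, which is one reason the accuracy of this kind of dictionary lookup is limited.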
Twitter and FB have their own APIs. You can leverage them as data sources.
They used a VM, Big Data Lite. It comes with all the software bundled, and you can leverage all of it.
You can download Big Data Lite and start working on your own.
MapR version of certification