Make Your Data Dance
Demystifying Data Analytics & Visualization

Today’s Agenda
• This guy?
• Definition & Discussion: “Big Data Hype”
• What is an analytic?
• How do we visualize?
• Demo: Data Analytics and Visualization
• Questions/Discussion

This Guy?
[Photo captions: My Wife! / Creepy Kids My Wife Made]

Big Data or Big Hype?
• It’s everywhere
• We all hear it, but what does it mean?
• Does it really mean anything or is it just more marketing hype?
• Is bigger really better?

Logs, Logs Everywhere
• How many logs do we have now?
• Too many to count
• Not just on your file system, but in traffic too!
• Human – Human
• Machine – Human
• Machine – Machine
• Linux/Unix/Mac (BSD)
• Microsoft
• Bro logs – or plain NetFlow generation
• Snort or other IDS
• Switches/Routers

What do you do with all this?

Get Them In Your Database
• How do you decide which logs you want?
  – Compliance
  – Policy
  – Curiosity
  – Just because
• Normalization
  – On the fly (streams)
  – On the remote/local file system (batch)

Some Free Tools To Help
• Tools for Transport:
  – Flume, Fluentd, rsyslog, syslog-ng, Sqoop, Logstash
• Tools for Storage:
  – Note: Relational/Non-relational is important
  – MySQL, Cassandra, Hadoop (HDFS), Elasticsearch
• Degrees of Wholeness
  – ELSA, Graylog2, Snare

Data is Big... But So What?
• Not all data is gold
• You need a strategy that gets you the right data at the right time

Defining: Analytics
• Wikipedia Definition
  – “the discovery and communication of meaningful patterns in data”

Simply a Question
• Simple!
• What!
• A question?!
• I can understand that!
• These questions can be used to create
  – Metrics
  – Statistics
  – Network behaviors
  – These all help the application of analytics, just as analytics are used to create them.

Ask Questions of Your Data
• I received an IDS alert. Is there other similar behavior on my network that I did not receive an alert for?
• I have an IP blacklist. What hosts on my network connected to those IP addresses?
• Better yet, is there other similar behavior on my network to non-blacklisted IP addresses?

What Other Kinds of Insight?
• Unpatched Systems
• Misconfigured Devices
• File access
  – Rates
  – Personnel
• Visibility
  – Of your network
  – Of your hosts

Visualization
• So you normalized and stored the data
• You’ve asked good questions of your data with analytics
• Now what?
• We visualize
• But how?

Demo Time!

Questions?
Source links in the notes on this slide
jlawler@21ct.com
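
Backup sketch: the blacklist question from "Ask Questions of Your Data" ("what hosts on my network connected to those IP addresses?") can be written as a very small analytic. This is only an illustration, not the method used in the demo. The connection records, the blacklist entries, the 10.0.0.0/8 internal range, and the function name `hosts_contacting_blacklist` are all assumptions made up for the example; in practice the records would come from whatever store holds your normalized Bro or NetFlow data.

```python
import ipaddress
from collections import defaultdict

# Hypothetical (source, destination) pairs, standing in for normalized
# connection logs. Addresses come from documentation-only ranges.
CONNECTIONS = [
    ("10.0.0.5", "203.0.113.7"),
    ("10.0.0.8", "198.51.100.20"),
    ("10.0.0.5", "192.0.2.44"),
    ("10.0.0.9", "203.0.113.7"),
]

# Hypothetical blacklist and internal network range (assumptions).
BLACKLIST = {"203.0.113.7", "192.0.2.44"}
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def hosts_contacting_blacklist(connections, blacklist, internal_net):
    """Map each internal host to the blacklisted IPs it connected to."""
    hits = defaultdict(set)
    for src, dst in connections:
        # Only count traffic that starts inside our network
        # and ends at a blacklisted address.
        if ipaddress.ip_address(src) in internal_net and dst in blacklist:
            hits[src].add(dst)
    return {host: sorted(dsts) for host, dsts in hits.items()}


result = hosts_contacting_blacklist(CONNECTIONS, BLACKLIST, INTERNAL_NET)
for host, dsts in sorted(result.items()):
    print(host, "->", ", ".join(dsts))
```

The per-host result is also a natural input for the visualization step, for example as a bar chart of blacklist hits per host.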