Week 09 – Splunk Data

MIS 5208
Week 9: Big Data & Splunk
Ed Ferrara, MSIA, CISSP
eferrara@temple.edu
Agenda

Chapter 1 Introduction / Splunk & Big Data
• What is Big Data?
• Alternate Data Processing Techniques
• Machine Data
• What is Splunk?

Chapter 2
• Variety of Data
• Dealing with Data
• Files & Directories
Fox School of Business
What is Big Data? The Three Vs

Big Data are:
• High volume
• High velocity
• High variety
information assets that require new forms of processing to enable:
• Enhanced decision making
• Insight discovery
• Process optimization

Volume – Data measured in petabytes
• Highway sensors
• Data processing logs
• Amazon purchase data

Velocity – Speed of data generation and frequency of delivery

Variety – The number of different data types
BIG DATA

• Facebook had more than 1B users with more than 618M active on a daily basis
• LinkedIn had more than 200M members, with the service adding 2 new members every second
• Instagram members upload 40M photos per day
• Twitter has 500M users, with the service adding 150K per day
• WordPress has more than 40M new posts per day
• Pandora music streaming service has more than 13,700 years of music
• Etc.
Splunk and the Kill Chain

There are four classes of data that security teams need to leverage for a complete view:
• Log data
• Binary data (flow and PCAP)
• Threat intelligence data
• Contextual data

If any of these data types are missing, there’s a higher risk that an attack will go unnoticed.

These data types are the building blocks for knowing what’s normal and what’s not in your environment. This single question lies at the intersection of both system availability (IT operations and application) and security use cases.
Splunk and the Kill Chain

Effective data-driven security decisions require:
• Tens of terabytes of data per day without normalization
• Access to data anywhere in the environment, including:
  • Traditional security data sources
  • Personnel time management systems
  • HR databases
  • Industrial control systems
  • Hadoop data stores and custom enterprise applications that run the business
• Fast time-to-answer for forensic analysis that can be quickly operationalized for security operations teams
• Data that is more available for analysis, helping staff view events in context

https://www.splunk.com/web_assets/pdfs/secure/Splunk_for_Security.pdf
Machine Data

Machine data contains a definitive record of all the activity and behavior of your customers, users, transactions, applications, servers, networks and mobile devices.

Machine data includes:
• Configurations
• API data
• Message queues
• Change events
• Diagnostic command output
• Call detail records
• Sensor data from industrial systems

Machine data comes in an array of unpredictable formats, and the traditional set of monitoring and analysis tools was not designed for the variety, velocity, volume or variability of this data.

A new approach, one specifically architected for this unique class of data, is required to quickly diagnose service problems, detect sophisticated security threats, understand the health and performance of remote equipment and demonstrate compliance.
Splunk Data Sources
Machine Data
Data Type: Application Logs
Where: Local log files, log4j, log4net, Weblogic, WebSphere, JBoss, .NET, PHP
What: User activity, fraud detection, application performance

Data Type: Business Process Logs
Where: Business process management logs
What: Customer activity across channels, purchases, account changes, trouble reports

Data Type: Call Detail Records
Where: Call detail records (CDRs), charging data records, event data records logged by telecoms and network switches
What: Billing, revenue assurance, customer assurance, partner settlements, marketing intelligence

Data Type: Clickstream Data
Where: Web server, routers, proxy servers, ad servers
What: Usability analysis, digital marketing and general research

Data Type: Configuration Files
Where: System configuration files
What: How an infrastructure has been set up, debugging failures, backdoor attacks, time bombs

Data Type: Database Audit Logs
Where: Database log files, audit tables
What: How database data was modified over time and who made the changes

Data Type: Filesystem Audit Logs
Where: Sensitive data stored in shared filesystems
What: Monitoring and auditing read access to sensitive data

Data Type: Management and Logging APIs
Where: Checkpoint firewalls log via the OPSEC Log Export API (OPSEC LEA) and other vendor specific APIs from VMware and Citrix
What: Management data and log events
Machine Data
Data Type: Message Queues
Where: JMS, RabbitMQ, and AquaLogic
What: Debug problems in complex applications and as the backbone of logging architectures for applications

Data Type: SCADA Data
Where: Supervisory Control and Data Acquisition (SCADA)
What: Identify trends, patterns, anomalies in the SCADA infrastructure and used to drive customer value

Data Type: Operating System Metrics, Status and Diagnostic Commands
Where: CPU and memory utilization and status information using command-line utilities like ps and iostat on Unix and Linux and performance monitor on Windows
What: Troubleshooting, analyzing trends to discover latent issues and investigating security incidents

Data Type: Packet/Flow Data
Where: tcpdump and tcpflow, which generate pcap or flow data and other useful packet-level and session-level information
What: Performance degradation, timeouts, bottlenecks or suspicious activity that indicates that the network may be compromised or the object of a remote attack
Module Quiz

Machine data is always structured?
• True
• False
Answer: False

Machine data makes up more than ___% of the data accumulated by organizations.
• 10%
• 25%
• 50%
• 90%
Answer: 90%
Module Quiz

Machine data can give you insights into:
• Application performance
• Security
• Hardware monitoring
• Sales
• User behavior
Answer: All of the above

Machine data is only log files on web servers.
• True
• False
Answer: False
Splunk Components
Index Data

Collects data from any source
• Data enters
• Inspectors decide how to process the data into a consistent format
• When the indexer finds a match, Splunk tags the data type for future use
• Events are then stored in the Splunk index
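
The indexing flow above can be pictured with a short Python sketch. This is an illustration only, not how Splunk is implemented: the two patterns, the sourcetype labels, the assumption that every timestamp is in UTC, and the in-memory `index` list are all hypothetical.

```python
import re
from datetime import datetime, timezone

# Hypothetical "inspectors": a sourcetype label, a pattern that recognizes
# the format, and the strptime layout of the timestamp it captures.
INSPECTORS = [
    ("access_log",
     re.compile(r"\[(?P<ts>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) \+0000\]"),
     "%d/%b/%Y:%H:%M:%S"),
    ("iso_app_log",
     re.compile(r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z"),
     "%Y-%m-%dT%H:%M:%S"),
]

index = []  # stand-in for an index; a real index lives on disk


def index_line(raw, host):
    """Tag a raw line with a sourcetype, normalize its timestamp, store it."""
    for sourcetype, pattern, layout in INSPECTORS:
        match = pattern.search(raw)
        if match:
            # Assumption: timestamps are UTC, so they convert to one epoch scale.
            ts = datetime.strptime(match.group("ts"), layout)
            epoch = ts.replace(tzinfo=timezone.utc).timestamp()
            event = {"_time": epoch, "host": host,
                     "sourcetype": sourcetype, "_raw": raw}
            index.append(event)
            return event
    # No inspector matched: keep the event, but mark it as unrecognized.
    event = {"_time": None, "host": host, "sourcetype": "unknown", "_raw": raw}
    index.append(event)
    return event
```

Two lines in different formats that describe the same instant end up with the same `_time` value, which is the point of processing data into a consistent format before it is stored.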
Splunk Event Processing
Search & Investigate

• Enter a query into the Splunk search bar
• Run statistics using the Splunk search language
• Collects and indexes log and machine data from any source
• Powerful search, analysis and visualization capabilities
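
As a rough picture of "search, then run statistics", the Python sketch below filters a handful of made-up events and counts the matches by host. The event dictionaries and the helper names `search` and `count_by` are assumptions for illustration; in the Splunk search language the comparable query would be along the lines of `status=404 | stats count by host`.

```python
from collections import Counter

# Made-up events standing in for indexed data.
events = [
    {"host": "web01", "status": "404", "_raw": "GET /missing 404"},
    {"host": "web01", "status": "200", "_raw": "GET /index.html 200"},
    {"host": "web02", "status": "404", "_raw": "GET /old 404"},
    {"host": "web02", "status": "500", "_raw": "GET /cart error 500"},
]


def search(events, keyword=None, **field_filters):
    """Return events that contain the keyword and match every field=value filter."""
    results = []
    for event in events:
        if keyword is not None and keyword not in event["_raw"]:
            continue
        if all(event.get(name) == value for name, value in field_filters.items()):
            results.append(event)
    return results


def count_by(events, field):
    """Count events per distinct value of one field."""
    return Counter(event.get(field) for event in events)


not_found = search(events, status="404")
print(count_by(not_found, "host"))
```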
Add Knowledge

Knowledge Objects
• Event Types
• Transactions
• Tags
• Saved Searches
• Lookups

Data interpretation: Fields and field extractions
• Fields and field extractions make up the first order of Splunk Enterprise knowledge.
• The fields that Splunk Enterprise automatically extracts from your IT data help bring meaning to your raw data, clarifying what can at first glance seem incomprehensible.
• The fields that you extract manually expand and improve upon this layer of meaning.
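
A manual field extraction can be thought of as a pattern whose named groups become fields. The sketch below uses Python's `re` module on a made-up web access line; the pattern and the field names (`clientip`, `method`, `uri`, `status`) are assumptions chosen for the example, not a Splunk configuration.

```python
import re

# Hypothetical extraction pattern: each named group becomes a field.
ACCESS_PATTERN = re.compile(
    r'(?P<clientip>\d{1,3}(?:\.\d{1,3}){3}) .* '
    r'"(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" (?P<status>\d{3})'
)


def extract_fields(raw):
    """Return the extracted fields, or an empty dict if the line does not match."""
    match = ACCESS_PATTERN.search(raw)
    return match.groupdict() if match else {}


raw_event = '192.168.1.10 - - [10/Oct/2014:13:55:36 +0000] "GET /cart.do HTTP/1.1" 200 512'
fields = extract_fields(raw_event)
print(fields)
```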
Data classification: Event types and transactions
• Event types and transactions group together interesting sets of similar events.
• Event types group together sets of events discovered through searches, while transactions are collections of conceptually related events that span time.

Data models
• Data models are representations of one or more datasets, and they drive the Pivot tool, enabling quick generation of useful tables, complex visualizations, and reports without needing to interact with the Splunk Enterprise search language.
• Data models are designed by knowledge managers who fully understand the format and semantics of their indexed data.
Add Knowledge

Data Models cont.
• A typical data model makes use of other knowledge object types discussed in this manual, including lookups, transactions, search-time field extractions, and calculated fields.

Data enrichment: Lookups and workflow actions
• Lookups and workflow actions are categories of knowledge objects that extend the usefulness of your data in various ways.
• Field lookups enable you to add fields to your data from external data sources such as static tables (CSV files) or Python-based commands.
• Workflow actions enable interactions between fields in your data and other applications or web resources, such as a WHOIS lookup on a field containing an IP address.
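
The idea behind a CSV-based field lookup can be sketched in a few lines of Python. The lookup table contents, the `status_description` field, and the helper names are hypothetical; the sketch only shows how a key field in an event is matched against a static table to add new fields.

```python
import csv
import io

# Hypothetical lookup table; in practice this would be a CSV file on disk.
LOOKUP_CSV = """status,status_description
200,OK
404,Not Found
500,Internal Server Error
"""


def load_lookup(csv_text, key_field):
    """Read the CSV into a dict keyed by the lookup field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def apply_lookup(event, table, key_field):
    """Return a copy of the event with the matching row's other columns added."""
    enriched = dict(event)
    row = table.get(event.get(key_field))
    if row:
        for name, value in row.items():
            if name != key_field:
                enriched[name] = value
    return enriched


status_table = load_lookup(LOOKUP_CSV, "status")
event = {"clientip": "192.168.1.10", "status": "404"}
enriched = apply_lookup(event, status_table, "status")
print(enriched)
```

Events whose key has no row in the table pass through unchanged, which keeps the enrichment step safe to apply to every event.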

Data normalization: Tags and aliases
• Tags and aliases are used to manage and normalize sets of field information.
• You can use tags and aliases to group sets of related field values together, and to give extracted fields tags that reflect different aspects of their identity.
• For example, you can group events from a set of hosts in a particular location (such as a building or city) together; just give each host the same tag.
• Or maybe you have two different sources using different field names to refer to the same data; you can normalize your data by using aliases (by aliasing clientip to ipaddress, for example).
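
The two examples above, tagging hosts by location and aliasing differently named fields, can be sketched as follows. The alias map, the host-to-tag map, and the field name `src_ip` are assumptions made for illustration, and the sketch adds the alias alongside the original field rather than renaming it.

```python
# Hypothetical configuration: source field name -> common alias.
FIELD_ALIASES = {"clientip": "ipaddress", "src_ip": "ipaddress"}

# Hypothetical configuration: host -> location tags.
HOST_TAGS = {
    "web01": ["philadelphia"],
    "web02": ["philadelphia"],
    "db01": ["newyork"],
}


def normalize(event):
    """Return a copy of the event with aliases and host tags applied."""
    normalized = dict(event)
    for source_name, alias in FIELD_ALIASES.items():
        if source_name in event:
            normalized[alias] = event[source_name]
    normalized["tags"] = list(HOST_TAGS.get(event.get("host"), []))
    return normalized


web_event = normalize({"host": "web01", "clientip": "10.0.0.1"})
db_event = normalize({"host": "db01", "src_ip": "10.0.0.2"})
print(web_event["ipaddress"], web_event["tags"])
print(db_event["ipaddress"], db_event["tags"])
```

After normalization, both sources can be searched on the single field `ipaddress`, and events can be grouped by location tag regardless of which host produced them.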
Monitor & Alert

Type of alert: Alerts based on real-time searches that trigger every time the base search returns a result.

Base search is a...: Real-time search (runs over all time)

Description: Use this alert type if you need to know the moment a matching result comes in. This type is also useful if you need to design an alert for machine consumption (such as a workflow-oriented application). You can throttle these alerts to ensure that they don't trigger too frequently. Referred to as a "per-result alert."

Alert examples:
• Trigger an alert for every failed login attempt, but alert at most once an hour for any given username.
• Trigger an alert when a "file system full" error occurs on any host, but only send notifications for any given host once per 30 minutes.
• Trigger an alert when a CPU on a host sustains 100% utilization for an extended period of time, but only alert once every 5 minutes.
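
Throttling a per-result alert amounts to remembering when the alert last fired for each key. A minimal Python sketch, assuming events arrive in time order as dictionaries with an epoch `_time` field; the class name and field names are hypothetical and do not reflect Splunk's internals.

```python
class ThrottledAlert:
    """Fire on every matching result, but at most once per key per throttle period."""

    def __init__(self, throttle_seconds, key_field):
        self.throttle_seconds = throttle_seconds
        self.key_field = key_field
        self.last_fired = {}  # key value -> time the alert last fired

    def on_result(self, event):
        """Return True if a notification should be sent for this result."""
        key = event[self.key_field]
        now = event["_time"]
        last = self.last_fired.get(key)
        if last is not None and now - last < self.throttle_seconds:
            return False  # still inside the throttle period for this key
        self.last_fired[key] = now
        return True


# First example above: failed logins, at most one alert per username per hour.
failed_login_alert = ThrottledAlert(throttle_seconds=3600, key_field="user")
```

Because the throttle state is kept per key, a burst of failures for one username does not suppress the alert for a different username.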
Monitor & Alert

Type of alert: Alerts based on historical searches that run on a regular schedule.

Base search is a...: Historical search

Description: This alert type triggers whenever a scheduled run of a historical search returns results that meet a particular condition that you have configured in the alert definition. Best for cases where immediate reaction to an alert is not a priority. You can use throttling to reduce the frequency of redundant alerts. Referred to as a "scheduled alert."

Alert examples:
• Trigger an alert whenever the number of items sold in the previous day is less than 500.
• Trigger an alert when the number of 404 errors in any 1 hour interval exceeds 100.
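
The second example can be sketched as a function that a scheduler would run over already-indexed events. This is a simplification under stated assumptions: events are dictionaries with an epoch `_time` and a `status` field, and "any 1 hour interval" is approximated by fixed clock-hour buckets rather than every possible sliding hour.

```python
from collections import Counter


def hours_exceeding(events, status="404", threshold=100):
    """Return the start times (epoch seconds) of clock hours whose count
    of events with the given status is above the threshold."""
    per_hour = Counter(
        int(event["_time"] // 3600) * 3600
        for event in events
        if event["status"] == status
    )
    return sorted(hour for hour, count in per_hour.items() if count > threshold)


def scheduled_run(events):
    """One scheduled run: the alert triggers only if some hour meets the condition."""
    return len(hours_exceeding(events)) > 0
```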
Monitor & Alert

Type of alert: Alerts based on real-time searches that monitor events within a rolling time "window".

Base search is a...: Real-time search

Description: Use this alert type to monitor events in real time within a rolling time window of a width that you define, such as a minute, 10 minutes, or an hour. The alert triggers when its conditions are met by events as they pass through this window in real time. You can throttle these alerts to ensure that they don't trigger too frequently. Referred to as a "rolling-window alert."

Alert examples:
• Trigger an alert whenever there are three consecutive failed logins for a user between now and 10 minutes ago, but don't alert for any given user more than once an hour.
• Trigger an alert when a host is unable to complete an hourly file transfer to another host within the last hour, but don't alert more than once an hour for any particular host.
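
The first example can be sketched with a per-user queue of recent failure times. The sketch assumes events arrive in time order as dictionaries with `user`, `action` ("failure" or "success") and an epoch `_time`; these names are hypothetical, and the once-an-hour throttling is left out to keep the window logic visible.

```python
from collections import defaultdict, deque


class RollingWindowAlert:
    """Detect N consecutive failed logins per user inside a rolling time window."""

    def __init__(self, window_seconds=600, required_failures=3):
        self.window_seconds = window_seconds
        self.required_failures = required_failures
        self.failures = defaultdict(deque)  # user -> times of recent failures

    def on_event(self, event):
        """Return True when the condition is met by the events in the window."""
        user, now = event["user"], event["_time"]
        recent = self.failures[user]
        if event["action"] == "success":
            recent.clear()  # a success breaks the run of consecutive failures
            return False
        recent.append(now)
        # Drop failures that have rolled out of the window.
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.required_failures
```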
Report & Analyze

When you create a search or a pivot that you would like to run again or share with others, you can save it as a report. This means that you can create reports from both the Search and the Pivot sides of Splunk Enterprise. After you create a report you can:
• Run the report on an ad hoc basis to review the results it returns on the report viewing page. You can get to the viewing page for a report by clicking the report's name on the Reports listing page.
• Open the report and edit it so that it returns different data or displays its data in a different manner. Your report will open in either Pivot or Search, depending on how it was created.

This topic explains how you can create and edit reports. In addition, if your permissions enable you to do so, you can:
• Change the report permissions to share it with other Splunk Enterprise users.
• Schedule the report so that it runs on a regular interval. Scheduled reports can be set up to perform actions each time they're run, such as sending the results of each report run to a set of stakeholders.
• Accelerate slow-completing reports built in Search.
• Add the report to a dashboard as a dashboard panel.

For more information about scheduling reports, see "Schedule reports," in this manual.

http://docs.splunk.com/Documentation/Splunk/6.0.2/Report/Createandeditreports
Splunk User Roles
Module 2 Quiz

Which of these is not a main component of Splunk?
• Collect and Index the data
• Search and Investigate
• Add knowledge
• Compress and Archive
Answer: Compress and Archive

The index does not play a major role in Splunk
• True
• False
Answer: False
Module 2 Quiz

Data is broken into single events by:
• Sourcetype
• Host
• Number of files
• The “-” character
Answer: Sourcetype

Time stamps are stored _____.
• In a consistent format
• Differently for each indexed item
• Differently for each year
• As image files
Answer: In a consistent format
Module 2 Quiz

Which role defines what apps a user will see by default:
• Admin
• Power
• User
Answer: Admin

Which two apps ship with Splunk Enterprise?
• DB Connect
• Search & Reporting
• Sideview Utils
• Home App
Answer: Search & Reporting, Home App
Installing Splunk
Demonstration
Thank you