Using Logstash for PI Mediation

Robert McKeown
Dec 12, 2014
© 2014 IBM Corporation
ETL context
Use Logstash as an ETL (Extract, Transform, Load) tool to transform data into the required PI format
Supporting information
• Main logstash web site
• http://logstash.net/
• The Logstash Book
• http://www.logstashbook.com/
• Logstash Forum – lots of good Q&A
• https://groups.google.com/forum/#!forum/logstash-users
• Doug McClure's 'Logstash is your friend' doc – log-centric but has a good
end-to-end example and advice.
• This doc is a quick skim to get you started. To become proficient, refer to
the sites above!
Key logstash functions
• Logstash is an event pipeline
– Inputs → codecs/filters → outputs
• Inputs generate events, codecs and filters modify them, outputs ship
them
• Types are set and used mainly for filter activation. They always persist
with the event
• Tags can be used to specify an order for event processing (apply filter A,
then filter D then filter F) as well as event routing to specific filters and
outputs
• Conditionals give you if, else if etc, as well as comparison tests and
boolean logic for sophisticated analysis, processing and routing
Chart from Doug McClure
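The points above can be sketched as a single pipeline; all field, tag, and file names here are illustrative, not taken from the deck:

```conf
input {
  file { path => "/tmp/metrics.csv" type => "metric" }   # input generates events
}
filter {
  if [type] == "metric" {                # type drives filter activation
    csv { }                              # filters modify the event
    mutate { add_tag => ["parsed"] }     # tags drive later routing
  }
}
output {
  if "parsed" in [tags] {                # conditional routing to an output
    stdout { codec => rubydebug }
  }
}
```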
Key logstash functions
[Chart (original from Doug McClure): a pipeline routing events of type 'myType' through the scacsv output to PI files and metric files]
Standard set of plug-ins plus two PI specific ones
[Diagram: the standard plug-in set plus the custom scacsv and scapivot plug-ins (scabmcfile also shown)]
Installation
• Download Logstash 1.4.2 from
https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz
• Unpack in a dir of your choice
• Add logstash to your $PATH for convenience (if desired)
• Install additional standard plug-ins aka 'contribs'
– cd /path/to/your/logstash
– bin/plugin install contrib
• Obtain the SCAPI Plugin package
• Currently available here in the CSI Predict, Search and Event
Analytics technical sales forum
• Note: Logstash is already installed on the current 'standard' SoftLayer 1.3
images
Running Logstash
• Only additional item beyond standard Logstash invocation is to ensure
that you reference the custom SCA plugins on the command line (if you
are using them)
• e.g. in my setup:
– Logstash is installed at /home/rmckeown/dev/logstash-1.4.2
– Plugins installed in /home/rmckeown/dev/logstashDev
• Running Logstash would be
/home/rmckeown/dev/logstash-1.4.2/bin/logstash -f myConf.conf --pluginpath
/home/rmckeown/dev/logstashDev/scaLogstash
• Use of $PATH can make this a bit shorter
Example 1
[Screenshot: sample input record, annotated with the 'group' name, date (ok?), 'metric' name, metric value, and device number]
• Skinny-format
• Multiple 'groups' implied
• No header
Example 1
See http://logstash.net/docs/1.4.2/inputs/stdin and
http://logstash.net/docs/1.4.2/outputs/stdout
[Screenshot: stdout output showing the host which processed the record, the timestamp when the message/record was processed, and the actual record]
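A minimal configuration matching the stdin/stdout docs referenced above might look like this (a sketch, not the deck's exact file):

```conf
input  { stdin  { } }   # each line typed (or piped in) becomes an event
output { stdout { } }   # prints the timestamp, host, and the actual record
```

Piping a sample record into `bin/logstash -f simple.conf` is a quick way to see the @timestamp, host, and message fields Logstash attaches to each event.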
Example 1
[Screenshots: the rubydebug codec outputs data using ruby 'awesome_print'; the json codec outputs data as json (shown formatted by jsonlint)]
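The two output variants described above correspond to two stdout codecs (a sketch):

```conf
output {
  stdout { codec => rubydebug }  # pretty-printed via ruby 'awesome_print'
  stdout { codec => json }       # raw JSON; pipe through jsonlint to format it
}
```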
Example 1 - filter
The CSV filter takes an event field containing CSV data, parses it, and stores it as individual fields (you can optionally specify the names). This filter can also parse data with any separator, not just commas.
[Screenshot: csv filter creating the desired columns and removing arbitrary fields; the output shows the columns added, the field 'interval' removed, and two timestamps]
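A csv filter of the kind described might look like this; the removed 'interval' field follows the slide's annotation, but the column names are assumptions:

```conf
filter {
  csv {
    columns => ["group", "timestamp", "metric", "value", "device", "interval"]
  }
  mutate {
    remove_field => ["interval"]   # drop an arbitrary unwanted field
  }
}
```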
Example 1 – first csv
[Screenshot: the desired data output, but with no header and data not separated by group]
Example 1 – Conditional & CSV
[Screenshot: example of a conditional routing the output per group; still not a standard PI name, and still no header]
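A conditional that routes each 'group' to its own csv file could be sketched like this; the field values and paths are hypothetical, and it assumes the contrib csv output is installed:

```conf
output {
  if [group] == "cpu" {
    csv { path => "/tmp/cpu.csv" fields => ["timestamp", "metric", "value"] }
  } else if [group] == "net" {
    csv { path => "/tmp/net.csv" fields => ["timestamp", "metric", "value"] }
  }
}
```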
Example 1 – using scaCSV
[Screenshot: the custom scacsv output operator and its output files – still 'skinny'! Need to 'pivot']
Example 2 – scapivot
[Screenshot: the custom scapivot operator; values (metric identities, e.g. cpu, net) become column names, and metric values are mapped to the correct column]
Example 2
• Meta-data in header
• Selection of header and data lines
• Simple format clean up of individual fields (e.g. 'G', '%', '-')
Example 2 – basic classification by type
[Screenshot: a conditional with a regular expression matches any line that starts with '20' – this will be our date – and classifies these as DATA_Line for output later; matching lines get tags added, others get none]
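The classification step can be sketched as a conditional with a regular expression; DATA_Line is the tag name from the slide, the rest is illustrative:

```conf
filter {
  if [message] =~ /^20/ {                # lines starting with '20' carry our date
    mutate { add_tag => ["DATA_Line"] }  # tag for selection at output time
  }
}
```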
Example 2 – capture timestamp via Grok
• Grok is one of the most important plug-ins for use with PI (see
http://logstash.net/docs/1.4.2/filters/grok )
– Grok: Parse arbitrary text and structure it
• The patterns at https://github.com/elasticsearch/logstash/tree/v1.4.2/patterns are used to convert matched
strings, e.g. TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T
]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?
• For developing Grok patterns http://grokdebug.herokuapp.com is very useful
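A grok filter using the TIMESTAMP_ISO8601 pattern above might be written as follows; the captured field name 'logTimestamp' is an assumption:

```conf
filter {
  grok {
    # capture an ISO8601 timestamp from the start of the line
    match => [ "message", "^%{TIMESTAMP_ISO8601:logTimestamp}" ]
  }
}
```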
Example 2 – Grok / grokdebug
[Screenshot: testing the pattern against a sample line in grokdebug]
Example 2 – capture timestamp via Grok
[Screenshot: the grok filter configuration and the resulting event]
Example 2 – cleaning up fields
Convert a string field by applying a regular expression and replacement.
Here we are replacing '-' or '%' with ""
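In a Logstash config this clean-up is done with mutate's gsub option; the field name 'value' is an assumption:

```conf
filter {
  mutate {
    gsub => [ "value", "[-%]", "" ]   # strip '-' and '%' from the field
  }
}
```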
Example 2 – splitting into Logstash 'CSV'
[Screenshot: csv filter whose columns align with the input file, plus reformatting of the timestamp – watch this spot!]
Example 2 – Outputting
[Screenshot: output of a subset of the fields; one field still needs to be determined]
Example 2 – Determining Server Name
(associative behavior)
• Events are generally **independent**
– 'Multiline Events' are an exception
• Cannot easily carry information across events
• In our NAB example, the server identity is in a separate 'event' in the header
• Processing information 'across' events is more challenging
• Think outside the box (or outside single instances of Logstash)
– Two-step approach (there may be others)
[Diagram: a first Logstash pass extracts the server name (e.g. ServerName : serverX) from the original file into a serverMap, which a second Logstash 'main processing' pass uses as a replacement to produce the final output. Of course, it doesn't have to be Logstash either.]
Example 2 – Determining Server Name
[Screenshot: first-pass configuration capturing the server name]
Example 2 – Replacing server name (translate)
[Screenshot: translate filter configuration replacing the server name]
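The translate filter (a contrib plug-in) replaces a field value via a dictionary lookup; the field name and dictionary path here are assumptions:

```conf
filter {
  translate {
    field           => "server"                 # field holding the placeholder
    dictionary_path => "/tmp/serverMap.yaml"    # produced by the first pass
    destination     => "server"
    override        => true                     # overwrite with the looked-up name
  }
}
```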
Extending Logstash aka Building custom plug-ins
• Plug-ins are written primarily in Ruby
• Can call out to Java easily (since Logstash runs on JRuby)
• Chapter 8 of The Logstash Book – 'Extending Logstash' – has all the details
• Also, look at the source code for existing plug-ins for lots of good examples on how to
proceed
Location of plug-ins
• Can also specify a directory outside the Logstash installation and work out of that
– mkdir -p /etc/logstash/{inputs,filters,outputs}
– Specify this path when running logstash, e.g.
..../logstash/bin/logstash --pluginpath /etc/ .........
Extending Logstash aka Building custom plug-ins – scaJDBC
(new plug-in & Java interaction)
[Screenshot: example config showing the plug-in name, the new config options, and standard CSV output]
Extending Logstash aka Building custom plug-ins - scaJDBC
[Screenshot: plug-in source – inherit from Base, declare the plug-in name and config options, register at runtime]
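A minimal input plug-in skeleton following the Logstash 1.4.x API looks roughly like this; the class name, config option, and event fields are illustrative, not the actual scaJDBC source:

```ruby
require "logstash/inputs/base"
require "logstash/namespace"

class LogStash::Inputs::ScaJDBC < LogStash::Inputs::Base
  config_name "scajdbc"     # plug-in name used in the config file
  milestone 1

  # a new config option (illustrative)
  config :query, :validate => :string, :required => true

  def register
    # runs once at startup; e.g. load the JDBC driver via JRuby/Java interop
  end

  def run(queue)
    # build an event per DB row and dispatch it onto the pipeline queue
    event = LogStash::Event.new("message" => "row data")
    decorate(event)
    queue << event
  end
end
```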
Almost Java!
[Screenshot: the JRuby/Java interop code – determine how many columns, create a brand new event, assign attribute/values for each data item returned from the DB, then finalize and dispatch!]