
Search Cheat Sheet
This is the "cook book" cheat sheet containing tips for using Splunk's powerful search language. If you want a manual of search commands, check out the
Splunk Docs for Search Commands.
Filtering Results
keep only those results that have the required src or dst values
* | search src="10.9.165.*" OR dst="10.9.165.8"
keep only results whose _raw field contains ip addresses in the non-routable
class A (10.0.0.0/8)
* | regex _raw="(?<!\d)10\.\d{1,3}\.\d{1,3}\.\d{1,3}(?!\d)"
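a related sketch, assuming the client IP is already extracted into a field such as 'clientip' (hypothetical field name): filter to the 10.0.0.0/8 block with the cidrmatch eval function.
* | where cidrmatch("10.0.0.0/8", clientip)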
remove duplicates of results with the same host
* | dedup host
Ordering Results
sort by IP ascending and then URL descending.
* | sort ip, -url
reverse the order of the results.
* | reverse
return the first 20 results.
* | head 20
return the last 20 results, in reverse order.
* | tail 20
Filtering and Ordering Results
keep only the host and ip attributes, setting the order of attributes: host first,
ip second.
* | fields host, ip
same as above, but also removes all internal attributes (e.g. _time), which
may cause UI problems.
* | fields + host, ip
remove the host and ip attributes, leaving all others untouched.
* | fields - host, ip
Extracting or Adding Attributes
extract attribute/value pairs while reloading settings from disk.
* | extract reload=true
extract attribute/value pairs that are delimited by "|;", and attribute/values that
are delimited by "=:".
* | extract pairdelim="|;", kvdelim="=:", auto=f
extract attribute/value pairs from XML, with the attribute set to the XML tag and
the value set to the text between the tags.
* | xmlkv
extract the COMMAND field only when it occurs in rows that contain "splunkd".
* | multikv fields COMMAND filter splunkd
extract 'from' and 'to' fields using regular expressions. If _raw were "From: Susan
To: Bob", then from=Susan and to=Bob.
* | rex field=_raw "From: (?<from>.*) To: (?<to>.*)"
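another rex sketch, assuming raw events of the hypothetical form "user=bob ip=10.1.2.3": extract 'user' and 'ip' fields with named capture groups.
* | rex field=_raw "user=(?<user>\w+) ip=(?<ip>\d{1,3}(?:\.\d{1,3}){3})"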
add a new attribute 'comboIP', which is sourceIP + "/" + destIP
* | strcat sourceIP "/" destIP comboIP
add a new attribute 'velocity', computed by evaluating distance/time.
* | eval velocity=distance/time
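a further eval sketch, assuming a numeric 'bytes' field (hypothetical): compute a 'kb' field rounded to two decimal places, then flag large events with if().
* | eval kb=round(bytes/1024, 2) | eval size_class=if(kb>1024, "large", "small")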
add ip="10.10.10.10" and foo="foo bar" to each result
* | setfields ip="10.10.10.10", foo="foo bar"
add location information to the first twenty 404 errors on the host webserver1,
based on the IP addresses.
404 host=webserver1 | head 20 | iplocation
Converting or Changing the Names, Units, or Datatypes of Attributes
convert every field that doesn't start with an underscore ('_'), except for the field 'foo'.
'none' tells convert to ignore a field.
* | convert auto(*) none(foo)
change all memory values in the virt field into kilobytes. A number by itself specifies KB;
numbers with 'm' specify MB, and numbers with 'g' specify GB.
* | convert memk(virt)
change the sendmail syslog duration format (D+HH:MM:SS) to seconds for the delay
field, e.g. '00:10:15' -> '615'.
* | convert dur2sec(delay)
convert the duration field into a number by removing its trailing unit string, e.g. '212 sec' -> '212'.
* | convert rmunit(duration)
rename the _ip field as IPAddress.
* | rename _ip as IPAddress
replace any host value ending in "localhost" with just "localhost".
* | replace *localhost with localhost in host
Reporting and Statistical Graphing Functions
return the least common values of the url field.
* | rare url
return the 20 most common URLs.
* | top limit=20 url
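top also accepts a by-clause; as a sketch, assuming 'url' and 'host' fields exist, return the 5 most common URLs for each host.
* | top limit=5 url by host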
return the count of distinct host values (i.e. remove duplicate hosts and count the
remaining results).
* | stats dc(host)
for each hour, return the average of any unique field that ends with the string 'lay'
(e.g. delay, xdelay, relay, etc).
* | stats avg(*lay) BY date_hour
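stats can compute several aggregations at once; a sketch assuming a numeric 'delay' field: return the event count plus the average and maximum delay per host.
* | stats count avg(delay) max(delay) by host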
search the access logs, and return the number of hits from the top 100 referer domains.
sourcetype=access_combined | top limit=100 referer_domain | stats sum(count)
search the access logs, and return results that are associated with each other
(having at least 3 references to each other).
sourcetype=access_combined | associate supcnt=3
get the average (mean) size for each distinct host.
* | chart avg(size) by host
get the max delay by size, where size is broken down into up to 10 equal-sized
buckets.
* | chart max(delay) by size bins=10
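chart also supports an explicit 'over ... by ...' form; as a sketch, count events per hour of the day, split by host.
* | chart count over date_hour by host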
graph the average thruput of each host over time, grouped into 5-minute
spans.
* | timechart span=5m avg(thruput) by host
create a timechart of the average cpu_seconds by host, then truncate outlying
values to remove data that may distort the timechart's axis.
* | timechart avg(cpu_seconds) by host | outlier action=TR
get ps events, extract values from ps's tabular output, and calculate the average
CPU for each minute, for each host.
sourcetype=ps | multikv | timechart span=1m avg(CPU) by host
search for events with the sourcetype "web", produce a timechart of count by
host, and then fill all null values with "NULL".
sourcetype=web | timechart count by host | fillnull value=NULL
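fillnull can also be limited to named fields; a sketch assuming numeric 'delay' and 'xdelay' fields: replace nulls in only those fields with 0.
* | fillnull value=0 delay xdelay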
search all events and build a contingency table for two data fields.
* | contingency datafield1 datafield2 maxrows=5 maxcols=5 usetotal=F
search all events, and calculate the co-occurrence correlation between all fields.
* | correlate type=cocur
sum the numeric fields of each result, putting the sums in the field "sum".
* | addtotals fieldname=sum
return only events with uncommon values.
* | anomalousvalue action=filter pthresh=0.02
bucket search results into 10 bins by size, and count the number of raw events
in each bucket.
* | bucket size bins=10 | stats count(_raw) by size
return the average thruput for each host for each 5 minute time span.
* | bucket _time span=5m | stats avg(thruput) by _time host
Classifying Events
apply eventtypes to search results (automatically called from the UI when viewing
the eventtype field)
* | typer
discover types of events that have errors
error | typelearner
Finding Transactions or Grouping Related Events/Results Together
group into a transaction all events that have the same host and cookie, that occur
within 30 seconds, and do not have a pause of more than 5 seconds between the
events.
* | transaction fields="host,cookie" maxspan=30s maxpause=5s
group into a transaction all events that share the same 'from' value, occur within a maximum
span of 30 seconds, and have a pause between events of no more than 5 seconds.
* | transaction fields=from maxspan=30s maxpause=5s
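transactions get automatic 'duration' and 'eventcount' fields; as a sketch building on the host/cookie example above, report the average transaction duration per host.
* | transaction fields="host,cookie" maxspan=30s maxpause=5s | stats avg(duration) by host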
group search results into 4 clusters, based on the values of the date_hour and
date_minute fields.
* | kmeans k=4 date_hour date_minute
cluster events, sort them by cluster_count, and then return the first 20 results, which are
the 20 largest clusters (by number of events).
* | cluster t=0.9 showcount=true | sort - cluster_count | head 20
Generating Results with Non-Search Commands
read results from the csv file $SPLUNK_HOME/var/run/splunk/all.csv, keep any that
contain errors, and then save them out to errors.csv.
| inputcsv all.csv | search error | outputcsv errors.csv
return the events in the file messages.1, as if it were indexed in Splunk.
| file /var/log/messages.1
run the mysecurityquery saved search, and email any results to user@domain.com.
| savedsearch mysecurityquery AND _count > 0 | sendemail to=user@domain.com
User Interface Commands
highlight the terms "login" and "logout".
* | highlight login,logout
show a summary of up to 5 lines for each search result.
* | abstract maxlines=5
compare the 'ip' values of the first and third search results.
* | diff pos1=1 pos2=3 attribute=ip
search for "xml_escaped", then unescape XML characters.
source="xml_escaped" | xmlunescape
Commands Related to Administration
crawl the root and home directories and add all inputs found to inputs.conf
| crawl root="/;/Users/bob" | input add
return processing properties (time zones, breaking characters, etc.) contained in
props.conf.
| admin props
view audit trail information, stored in the local audit index, and decrypt signed audit events,
checking for gaps and tampering.
index=audit | audit
Using Subsearches
return URLs that appear in 404-error events or 303-error events, but not both.
| set diff [search 404 | fields url] [search 303 | fields url]
find 5 minute time regions around root logins, and then search each of those time ranges
for "failure"
login root | localize maxspan=5m maxpause=5m | map search="search failure starttimeu::$starttime$ endtimeu::$endtime$"
get all hourly time ranges from 5 days ago until today, and search each of those time ranges
for "failure"
| gentimes start=-5 increment=1h | map search="search failure starttimeu::$starttime$ endtimeu::$endtime$"
create a search string from the host, source and sourcetype values.
[* | fields + source, sourcetype, host | format ]
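a subsearch sketch, assuming a 'user' field exists in the events: use the most common user from failed logins as a filter for the outer search.
error [ search login failure | top limit=1 user | fields + user ]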