Monitoring pfSense logs using ELK (ElasticSearch 1.7, Logstash 1.5, Kibana 4.1) - PART 1
This post is essentially an updated guide to my previous post on monitoring pfSense logs using
the ELK stack. Part 1 will cover the installation and configuration of ELK and Part 2 will cover
configuring Kibana 4 to visualize pfSense logs.
So what's new?

- Full guide to installing & setting up ELK on Linux
- Short tutorial on creating visualizations and dashboards using collected pfSense logs
OK. So the goal is to use ELK to gather and visualize firewall logs from one (or more) pfSense
servers.
This Logstash / Kibana setup has three main components:

- Logstash: Processes the incoming logs sent from pfSense
- Elasticsearch: Stores all of the logs
- Kibana 4: Web interface for searching and visualizing logs (proxied through Nginx)

(It is possible to manually install the logstash-forwarder on pfSense, however this tutorial will
only cover forwarding logs via the default settings in pfSense.)
For this tutorial all three components (ElasticSearch, Logstash & Kibana + Nginx) will be
installed on a single server.
Prerequisites:
1. For CentOS 7, enable the EPEL repository
# rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
2. Make sure wget is installed
CentOS 7
# sudo yum -y install wget
Ubuntu 14.xx
$ sudo apt-get -y install wget
3. Install the latest JDK on your server:
CentOS 7
Download Java SDK:
# wget --no-check-certificate -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u60-b27/jdk-8u60-linux-x64.tar.gz
# tar -xzf jdk-8u60-linux-x64.tar.gz
# mv jdk1.8.0_60/ /usr/
Install Java:
# /usr/sbin/alternatives --install /usr/bin/java java /usr/jdk1.8.0_60/bin/java 2
# /usr/sbin/alternatives --config java
There is 1 program that provides 'java'.

  Selection    Command
-----------------------------------------------
*+ 1           /usr/jdk1.8.0_60/bin/java

Enter to keep the current selection[+], or type selection number:

Press ENTER
Verify Java Version:
# java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
Setup Environment Variables:
# export JAVA_HOME=/usr/jdk1.8.0_60/
# export JRE_HOME=/usr/jdk1.8.0_60/jre/
Set PATH variable:
# export PATH=$JAVA_HOME/bin:$PATH
To make these settings permanent, place the above three commands in /etc/profile (all users) or
~/.bash_profile (single user).
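For example, a minimal snippet you could append to /etc/profile (assuming the JDK was unpacked to /usr/jdk1.8.0_60 as above):

export JAVA_HOME=/usr/jdk1.8.0_60/
export JRE_HOME=/usr/jdk1.8.0_60/jre/
export PATH=$JAVA_HOME/bin:$PATH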
Ubuntu 14.xx
Remove the OpenJDK from the system, if you have it already installed.
$ sudo apt-get remove --purge openjdk*
Add repository.
$ sudo add-apt-repository -y ppa:webupd8team/java
Run the apt-get update command to pull the packages information from the newly added
repository.
$ sudo apt-get update
Issue the following command to install Java jdk 1.8.
$ sudo apt-get -y install oracle-java8-installer
While installing, you will be required to accept the Oracle binary licenses.
Verify java version
$ java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
Configure java Environment
$ sudo apt-get install oracle-java8-set-default
Install ElasticSearch
Download and install the public GPG signing key:
CentOS 7
# rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
Ubuntu 14.xx
$ wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
Add and enable ElasticSearch repo:
CentOS 7
# cat <<EOF >> /etc/yum.repos.d/elasticsearch.repo
[elasticsearch-1.7]
name=Elasticsearch repository for 1.7.x packages
baseurl=http://packages.elastic.co/elasticsearch/1.7/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
EOF
Ubuntu 14.xx
$ echo "deb http://packages.elastic.co/elasticsearch/1.7/debian stable main" | sudo tee -a
/etc/apt/sources.list.d/elasticsearch-1.7.list
Install ElasticSearch
CentOS 7
# yum -y install elasticsearch
Ubuntu 14.xx
$ sudo apt-get update && sudo apt-get install elasticsearch
Configure Elasticsearch to auto-start during system startup:
CentOS 7
# /bin/systemctl daemon-reload
# /bin/systemctl enable elasticsearch.service
# /bin/systemctl start elasticsearch.service
Ubuntu 14.xx
$ sudo update-rc.d elasticsearch defaults 95 10
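The Ubuntu package does not start the service on install, so start it once as well (assuming the init script shipped with the .deb package):
$ sudo service elasticsearch start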
Now wait at least a minute to let Elasticsearch start fully, otherwise the test below will fail.
Elasticsearch should now be listening on port 9200 for HTTP requests, which we can verify with
curl.
# curl -X GET http://localhost:9200
{
"status" : 200,
"name" : "Alex",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "1.7.1",
"build_hash" : "b88f43fc40b0bcd7f173a1f9ee2e97816de80b19",
"build_timestamp" : "2015-07-29T09:54:16Z",
"build_snapshot" : false,
"lucene_version" : "4.10.4"
},
"tagline" : "You Know, for Search"
}
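Optionally, you can also check cluster health; this endpoint exists in Elasticsearch 1.x, and a yellow status is normal for a single-node install:

# curl -X GET 'http://localhost:9200/_cluster/health?pretty'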
Install Logstash
Add the Logstash repo, enable & install
CentOS 7
# cat <<EOF >> /etc/yum.repos.d/logstash.repo
[logstash-1.5]
name=Logstash repository for 1.5.x packages
baseurl=http://packages.elasticsearch.org/logstash/1.5/centos
gpgcheck=1
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1
EOF
# yum install logstash -y
Ubuntu 14.xx
$ echo "deb http://packages.elasticsearch.org/logstash/1.5/debian stable main" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update && sudo apt-get install logstash
Create SSL Certificate (Optional)*
*You can skip this step if you don't intend to use your ELK install to monitor logs for anything other than
pfSense (or any TCP/UDP forwarded logs).
(pfSense forwards its logs via UDP, therefore there's no requirement to set up a Logstash-Forwarder
environment. Having said that, it's still good practice to set it up, since you'll most likely be using your ELK stack
for more than collecting and parsing only pfSense logs.)
Logstash-Forwarder (formerly LumberJack) utilizes an SSL certificate and key pair to verify the
identity of your Logstash server.
You have two options when generating this SSL certificate.
1. Hostname/FQDN (DNS) Setup
2. IP Address Setup
Option 1
If you have DNS set up within your private/internal network, add a DNS A record pointing to the
private IP address of your ELK/Logstash server. Alternatively, add a DNS A record with your
DNS provider pointing to your ELK/Logstash server's public IP address. As long as each server
you're gathering logs from can resolve the Logstash server's hostname/domain name, either is
fine.
Alternatively you can edit the /etc/hosts file of the servers you're collecting logs from by adding an
IP address (public or private) and hostname entry pointing to your Logstash server (private IP
192.168.0.77 in my case).
# nano /etc/hosts
192.168.0.77 elk.mydomain.com elk
Now generate the SSL certificate and key pair. Go to the OpenSSL directory.
CentOS 7
# cd /etc/pki/tls
Ubuntu 14.xx
Use the following commands to create the directories that will store your certificate and private
key.
$ sudo mkdir -p /etc/pki/tls/certs
$ sudo mkdir /etc/pki/tls/private
Execute the following command to create an SSL certificate, replacing "elk" with the real hostname of
your Logstash server.
# cd /etc/pki/tls
# openssl req -x509 -nodes -newkey rsa:2048 -days 3650 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt -subj /CN=elk
The generated logstash-forwarder.crt should be copied to all client servers whose logs you intend
to send to your Logstash server.
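For example, to push the certificate to a client (a sketch; the user, hostname and destination path are placeholders to adjust for your environment):

# scp /etc/pki/tls/certs/logstash-forwarder.crt user@client.example.com:/etc/pki/tls/certs/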
N.B. If you've decided to go with the above described Option 1, please ignore Option 2
below, and skip straight to the 'Configure Logstash' section.
Option 2
If for some reason you don't have DNS setup and/or can't resolve the hostname of your Logstash
server, you can add the IP address of your Logstash server to the subjectAltName (SAN) of the
certificate we're about to generate.
Start by editing the OpenSSL configuration file:
$ nano /etc/pki/tls/openssl.cnf
Find the section starting with [ v3_ca ] and add the following line, substituting the IP address for
that of your own Logstash server.
subjectAltName = IP: 192.168.0.77
Save & exit
Execute the following command to create the SSL certificate and private key.
# cd /etc/pki/tls
# openssl req -config /etc/pki/tls/openssl.cnf -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt
The generated logstash-forwarder.crt should be copied to all client servers you intend to collect
logs from.
Configure Logstash
Logstash configuration files live in the /etc/logstash/conf.d/ directory and use Logstash's own
JSON-like configuration syntax. A Logstash server configuration consists of three sections: input,
filter and output, all of which can be placed in a single configuration file. In practice, however,
it's much more practical to place these sections in separate config files.
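Once all of the files below have been created, the directory should look roughly like this (the file names are simply the ones used in this guide):

# ls /etc/logstash/conf.d/
01-inputs.conf  10-syslog.conf  11-pfsense.conf  30-outputs.conf  patterns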
Create an input configuration:
# nano /etc/logstash/conf.d/01-inputs.conf
Paste the following:
#logstash-forwarder [Not utilized by pfSense by default]
#input {
#  lumberjack {
#    port => 5000
#    type => "logs"
#    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
#    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
#  }
#}

#tcp syslog stream via 5140
input {
  tcp {
    type => "syslog"
    port => 5140
  }
}

#udp syslog stream via 5140
input {
  udp {
    type => "syslog"
    port => 5140
  }
}
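If the ELK server is running a host firewall (firewalld is enabled by default on CentOS 7), remember to open port 5140 as well, for example:

# firewall-cmd --permanent --add-port=5140/udp
# firewall-cmd --permanent --add-port=5140/tcp
# firewall-cmd --reload

(On Ubuntu with ufw the equivalent would be sudo ufw allow 5140.)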
Create a syslog configuration:
# nano /etc/logstash/conf.d/10-syslog.conf
Paste the following:
filter {
  if [type] == "syslog" {
    #change to pfSense ip address
    if [host] =~ /192\.168\.0\.2/ {
      mutate {
        add_tag => ["PFSense", "Ready"]
      }
    }
    if "Ready" not in [tags] {
      mutate {
        add_tag => [ "syslog" ]
      }
    }
  }
}
filter {
  if [type] == "syslog" {
    mutate {
      remove_tag => "Ready"
    }
  }
}
filter {
  if "syslog" in [tags] {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      locale => "en"
    }
    if !("_grokparsefailure" in [tags]) {
      mutate {
        replace => [ "@source_host", "%{syslog_hostname}" ]
        replace => [ "@message", "%{syslog_message}" ]
      }
    }
    mutate {
      remove_field => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
    }
    #if "_grokparsefailure" in [tags] {
    #  drop { }
    #}
  }
}
Create an outputs configuration:
# nano /etc/logstash/conf.d/30-outputs.conf
Paste the following:
output {
  elasticsearch { host => "localhost" index => "logstash-%{+YYYY.MM.dd}" }
  stdout { codec => rubydebug }
}
Create your pfSense configuration:
# nano /etc/logstash/conf.d/11-pfsense.conf
Paste the following:
filter {
  if "PFSense" in [tags] {
    grok {
      add_tag => [ "firewall" ]
      match => [ "message", "<(?<evtid>.*)>(?<datetime>(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?:2[0123]|[01]?[0-9]):(?:[0-5][0-9]):(?:[0-5][0-9])) (?<prog>.*?): (?<msg>.*)" ]
    }
    mutate {
      gsub => ["datetime","  "," "]
    }
    date {
      match => [ "datetime", "MMM dd HH:mm:ss" ]
      timezone => "UTC"
    }
    mutate {
      replace => [ "message", "%{msg}" ]
    }
    mutate {
      remove_field => [ "msg", "datetime" ]
    }
  }
  if [prog] =~ /^filterlog$/ {
    mutate {
      remove_field => [ "msg", "datetime" ]
    }
    grok {
      patterns_dir => "/etc/logstash/conf.d/patterns"
      match => [ "message", "%{PFSENSE_LOG_DATA}%{PFSENSE_IP_SPECIFIC_DATA}%{PFSENSE_IP_DATA}%{PFSENSE_PROTOCOL_DATA}",
                 "message", "%{PFSENSE_LOG_DATA}%{PFSENSE_IPv4_SPECIFIC_DATA_ECN}%{PFSENSE_IP_DATA}%{PFSENSE_PROTOCOL_DATA}" ]
    }
    mutate {
      lowercase => [ 'proto' ]
    }
    geoip {
      add_tag => [ "GeoIP" ]
      source => "src_ip"
      # Optional GeoIP database
      database => "/etc/logstash/GeoLiteCity.dat"
    }
  }
}
The above configuration uses a pattern file. Create a patterns directory:
# mkdir /etc/logstash/conf.d/patterns
And download the following pattern file to it:
# cd /etc/logstash/conf.d/patterns
# wget https://gist.githubusercontent.com/elijahpaul/3d80030ac3e8138848b5/raw/abba6aa8398ba601389457284f7c34bbdbbef4c7/pfsense2-2.grok
(Optional) Download and install the MaxMind GeoIP database:
$ cd /etc/logstash
$ sudo curl -O "http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz"
$ sudo gunzip GeoLiteCity.dat.gz
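Optionally, before restarting you can sanity-check the combined configuration; Logstash 1.5 supports a --configtest flag, and the packages install the binary under /opt/logstash (adjust the path if your install differs):

# /opt/logstash/bin/logstash --configtest -f /etc/logstash/conf.d/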
Now restart the logstash service.
CentOS 7
# systemctl restart logstash.service
Ubuntu 14.xx
$ sudo service logstash restart
Logstash server logs are stored in the following file:
# cat /var/log/logstash/logstash.log
Logs retrieved from pfSense (once setup is complete) can be viewed via:
# tail -f /var/log/logstash/logstash.stdout
These will help you troubleshoot any issues you encounter.
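Once pfSense is forwarding logs (next section), you can also confirm that daily indices matching the output configuration are being created, e.g.:

# curl 'http://localhost:9200/_cat/indices?v'

You should see one or more indices named logstash-YYYY.MM.dd.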
Configuring pfSense for remote logging to ELK
Log in to pfSense and check the dashboard to ensure you're running pfSense 2.2.x.
Now go to the settings tab via Status > System Logs. Check 'Send log messages to remote
syslog server', enter your ELK server's IP address and custom port (port 5140 in this case), and
check 'Firewall events' (or 'Everything' if you wish to send all pfSense logs to ELK).
That's it for pfSense!
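If nothing shows up later on, a quick way to confirm that syslog packets are actually reaching the ELK server (assuming tcpdump is installed) is:

# tcpdump -i any -nn udp port 5140

You should see datagrams arriving from your pfSense IP within a few seconds.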
Configure Kibana4
Kibana 4 provides visualization of your pfSense logs. Use the following commands in a terminal
to download and install it.
# wget https://download.elastic.co/kibana/kibana/kibana-4.1.2-linux-x64.tar.gz
# tar -zxf kibana-4.1.2-linux-x64.tar.gz
# mv kibana-4.1.2-linux-x64 /opt/kibana4
Enable the PID file for Kibana; this is required by the systemd unit file we are about to create.
# sed -i 's/#pid_file/pid_file/g' /opt/kibana4/config/kibana.yml
Kibana can be started by running /opt/kibana4/bin/kibana. To run Kibana as a service, we will
create a systemd unit file.
# nano /etc/systemd/system/kibana4.service
[Unit]
Description=Kibana 4 Web Interface
After=elasticsearch.service
After=logstash.service
[Service]
ExecStartPre=/bin/rm -rf /var/run/kibana.pid
ExecStart=/opt/kibana4/bin/kibana
ExecReload=/bin/sh -c '/bin/kill -9 $$(cat /var/run/kibana.pid) && /bin/rm -rf /var/run/kibana.pid && /opt/kibana4/bin/kibana'
ExecStop=/bin/sh -c '/bin/kill -9 $$(cat /var/run/kibana.pid)'
[Install]
WantedBy=multi-user.target
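Reload systemd so it picks up the new unit file:
# systemctl daemon-reload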
Start and enable kibana to start automatically at system startup.
# systemctl start kibana4.service
# systemctl enable kibana4.service
Check to see if Kibana is working properly by going to http://your-ELK-IP:5601/ in your
browser.
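If the page doesn't load and firewalld is active on the server, open the Kibana port as well:

# firewall-cmd --permanent --add-port=5601/tcp
# firewall-cmd --reload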
You should see a page where you have to configure the Logstash index pattern before Kibana can
be used. From the Time-field name dropdown menu select @timestamp.
Spot any mistakes/errors? Or have any suggestions? Please make a comment below.
Part 2: Configuring Kibana 4 visualizations and dashboards. (Coming soon)...
pfSense
Collect logs from pfSense and OPNsense with Elastic Agent.
What is an Elastic integration?
This integration is powered by Elastic Agent. Elastic Agent is a single, unified way to add monitoring for
logs, metrics, and other types of data to a host. It can also protect hosts from security threats, query data
from operating systems, forward data from remote services or hardware, and more. Refer to our
documentation for a detailed comparison between Beats and Elastic Agent.
Prefer to use Beats for this use case? See Filebeat modules for logs or Metricbeat modules for metrics.
This is an integration to parse certain logs from pfSense and OPNsense firewalls. It parses logs received
over the network via syslog (UDP/TCP/TLS). pfSense natively only supports UDP. OPNsense supports all
3 transports.
Currently the integration supports parsing the Firewall, Unbound, DHCP Daemon, OpenVPN, IPsec,
HAProxy, Squid, and PHP-FPM (Authentication) logs.
All other events will be dropped. The HAProxy logs are set up to be compatible with the dashboards from
the HAProxy integration. Install the HAProxy integration assets to use them.
pfSense Setup
1. Navigate to Status -> System Logs, then click on Settings
2. At the bottom check Enable Remote Logging
3. (Optional) Select a specific interface to use for forwarding
4. Input the agent IP address and port as set via the integration config into the field Remote log
servers (e.g. 192.168.100.50:5140)
5. Under Remote Syslog Contents select what logs to forward to the agent
   - Select Everything to forward all logs to the agent, or select the individual services to
     forward. Any log entry not in the list above will be dropped. This will cause additional
     data to be sent to the agent and Elasticsearch. The firewall, VPN, DHCP, DNS, and
     Authentication (PHP-FPM) logs are able to be individually selected. In order to collect
     HAProxy and Squid or other "package" logs, the Everything option must be selected.
OPNsense Setup
1. Navigate to System -> Settings -> Logging/Targets
2. Add a new Logging/Target (Click the plus icon)
   - Transport = UDP or TCP or TLS
   - Applications = Select a list of applications to send to remote syslog. Leave empty for all.
   - Levels = Nothing Selected
   - Facilities = Nothing Selected
   - Hostname = IP of Elastic agent as configured in the integration config
   - Port = Port of Elastic agent as configured in the integration config
   - Certificate = Client certificate to use (when selecting a TLS transport type)
   - Description = Syslog to Elasticsearch
   - Click Save
The module is by default configured to run with the udp input on port 9001.
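As a quick sanity check you can send a test syslog message to the agent from another host; this sketch uses the util-linux logger utility and assumes the default UDP input on port 9001 (adjust the address and port to match your integration config):

$ logger --server 192.168.100.50 --udp --port 9001 "pfsense integration test message"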
Important
The pfSense integration supports both the BSD logging format (used by pfSense by default and
OPNsense) and the Syslog format (optional for pfSense). However the syslog format is recommended. It
will provide the firewall hostname and timestamps with timezone information. When using the BSD
format, the Timezone Offset config must be set when deploying the agent or else the timezone will
default to the timezone of the agent. See https://<pfsense url>/status_logs_settings.php and
https://docs.netgate.com/pfsense/en/latest/monitoring/logs/settings.html for more information.
A huge thanks to a3ilson for the https://github.com/pfelk/pfelk repo, which is the foundation for the
majority of the grok patterns and dashboards in this integration.
Logs
pfSense log
This is the pfSense log dataset.
An example event for log looks as follows:
{
"@timestamp": "2021-07-04T00:10:14.578Z",
"agent": {
"ephemeral_id": "6b82ecb8-3739-4d1c-aeca-3a62c5340c7f",
"id": "c5c06c39-0b86-45ec-9ae3-c773f4562eaa",
"name": "docker-fleet-agent",
"type": "filebeat",
"version": "8.3.2"
},
"data_stream": {
"dataset": "pfsense.log",
"namespace": "ep",
"type": "logs"
},
"destination": {
"address": "175.16.199.1",
"geo": {
"city_name": "Changchun",
"continent_name": "Asia",
"country_iso_code": "CN",
"country_name": "China",
"location": {
"lat": 43.88,
"lon": 125.3228
},
"region_iso_code": "CN-22",
"region_name": "Jilin Sheng"
},
"ip": "175.16.199.1",
"port": 853
},
"ecs": {
"version": "8.5.0"
},
"elastic_agent": {
"id": "c5c06c39-0b86-45ec-9ae3-c773f4562eaa",
"snapshot": false,
"version": "8.3.2"
},
"event": {
"action": "block",
"agent_id_status": "verified",
"category": [
"network"
],
"dataset": "pfsense.log",
"ingested": "2022-07-30T02:57:35Z",
"kind": "event",
"original": "\u003c134\u003e1 2021-07-03T19:10:14.578288-05:00
pfSense.example.com filterlog 72237 - 146,,,1535324496,igb1.12,match,block,in,4,0x0,,63,32989,0,DF,6,tcp,60,10.170.12.50,17
5.16.199.1,49652,853,0,S,1818117648,,64240,,mss;sackOK;TS;nop;wscale",
"provider": "filterlog",
"reason": "match",
"timezone": "-05:00",
"type": [
"connection",
"denied"
]
},
"input": {
"type": "tcp"
},
"log": {
"source": {
"address": "172.19.0.6:50688"
},
"syslog": {
"priority": 134
}
},
"message":
"146,,,1535324496,igb1.12,match,block,in,4,0x0,,63,32989,0,DF,6,tcp,60,10.170.12.50,1
75.16.199.1,49652,853,0,S,1818117648,,64240,,mss;sackOK;TS;nop;wscale",
"network": {
"bytes": 60,
"community_id": "1:pOXVyPJTFJI5seusI/UD6SwvBjg=",
"direction": "inbound",
"iana_number": "6",
"transport": "tcp",
"type": "ipv4"
},
"observer": {
"ingress": {
"interface": {
"name": "igb1.12"
},
"vlan": {
"id": "12"
}
},
"name": "pfSense.example.com",
"type": "firewall",
"vendor": "netgate"
},
"pfsense": {
"ip": {
"flags": "DF",
"id": 32989,
"offset": 0,
"tos": "0x0",
"ttl": 63
},
"tcp": {
"flags": "S",
"length": 0,
"options": [
"mss",
"sackOK",
"TS",
"nop",
"wscale"
],
"window": 64240
}
},
"process": {
"name": "filterlog",
"pid": 72237
},
"related": {
"ip": [
"175.16.199.1",
"10.170.12.50"
]
},
"rule": {
"id": "1535324496"
},
"source": {
"address": "10.170.12.50",
"ip": "10.170.12.50",
"port": 49652
},
"tags": [
"preserve_original_event",
"pfsense",
"forwarded"
]
}
Exported fields

Field | Description | Type
----- | ----------- | ----
@timestamp | Date/time when the event originated, typically when the event was generated by the source. If the source has no original timestamp, this is populated with the first time the event was received by the pipeline. Required field for all events. | date
client.address | Raw client address, which may be an IP, a domain or a unix socket; always store the raw value here and duplicate it to .ip or .domain as appropriate. | keyword
client.as.number | Unique number allocated to the autonomous system (ASN), which uniquely identifies each network on the Internet. | long
client.as.organization.name | Organization name. | keyword
client.as.organization.name.text | Multi-field of client.as.organization.name. | match_only_text
client.bytes | Bytes sent from the client to the server. | long
client.domain | Domain name of the client system (host name, fully qualified domain name, or other host naming format). | keyword
client.geo.city_name | City name. | keyword
client.geo.continent_name | Name of the continent. | keyword
client.geo.country_iso_code | Country ISO code. | keyword
client.geo.country_name | Country name. | keyword
client.geo.location | Longitude and latitude. | geo_point
client.geo.region_iso_code | Region ISO code. | keyword
client.geo.region_name | Region name. | keyword
client.ip | IP address of the client (IPv4 or IPv6). | ip
client.mac | MAC address of the client (RFC 7042 notation: uppercase hexadecimal octets separated by hyphens). | keyword
client.port | Port of the client. | long
cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment (e.g. AWS account id, Google Cloud ORG id). | keyword
cloud.availability_zone | Availability zone in which this host is running. | keyword
cloud.image.id | Image ID for the cloud instance. | keyword
cloud.instance.id | Instance ID of the host machine. | keyword
cloud.instance.name | Instance name of the host machine. | keyword
cloud.machine.type | Machine type of the host machine. | keyword
cloud.project.id | Name of the project in Google Cloud. | keyword
cloud.provider | Name of the cloud provider (e.g. aws, azure, gcp, digitalocean). | keyword
cloud.region | Region in which this host is running. | keyword
container.id | Unique container id. | keyword
container.image.name | Name of the image the container was built on. | keyword
container.labels | Image labels. | object
container.name | Container name. | keyword
data_stream.dataset | Data stream dataset. | constant_keyword
data_stream.namespace | Data stream namespace. | constant_keyword
data_stream.type | Data stream type. | constant_keyword
destination.address | Raw destination address (IP, domain or unix socket); duplicate to .ip or .domain as appropriate. | keyword
destination.as.number | Unique number allocated to the autonomous system (ASN). | long
destination.as.organization.name | Organization name. | keyword
destination.as.organization.name.text | Multi-field of destination.as.organization.name. | match_only_text
destination.bytes | Bytes sent from the destination to the source. | long
destination.geo.city_name | City name. | keyword
destination.geo.continent_name | Name of the continent. | keyword
destination.geo.country_iso_code | Country ISO code. | keyword
destination.geo.country_name | Country name. | keyword
destination.geo.location | Longitude and latitude. | geo_point
destination.geo.name | User-defined description of a location (e.g. data center name, floor number, city); not typically used in automated geolocation. | keyword
destination.geo.region_iso_code | Region ISO code. | keyword
destination.geo.region_name | Region name. | keyword
destination.ip | IP address of the destination (IPv4 or IPv6). | ip
destination.mac | MAC address of the destination (RFC 7042 notation). | keyword
destination.port | Port of the destination. | long
dns.question.class | The class of records being queried. | keyword
dns.question.name | The name being queried. Non-printable characters should be represented as escaped base 10 integers (\DDD); back slashes and quotes escaped; tabs, carriage returns and line feeds converted to \t, \r and \n. | keyword
dns.question.registered_domain | The highest registered domain, stripped of the subdomain (e.g. "example.com" for "foo.example.com"); best determined with a list such as the public suffix list (http://publicsuffix.org). | keyword
dns.question.subdomain | All of the labels under the registered_domain (e.g. "sub2.sub1" for "sub2.sub1.example.com"), with no trailing period. | keyword
dns.question.top_level_domain | The effective top level domain (eTLD), the last part of the domain name (e.g. "com" for example.com); best determined with a list such as the public suffix list, since simply taking the last label fails for eTLDs such as "co.uk". | keyword
dns.question.type | The type of record being queried. | keyword
dns.type | The type of DNS event captured, query or answer. Create one event per query, and a second event containing the query details plus an array of answers if answers are available. | keyword
ecs.version | ECS version this event conforms to; required in all events and used to adjust to the schema version when querying across indices. | keyword
error.message | Error message. | match_only_text
event.action | The action captured by the event, more specific than event.category (e.g. group-add, process-started, file-created); normally defined by the implementer. | keyword
event.category | One of four ECS Categorization Fields; the second level in the ECS category hierarchy, representing the "big buckets" of ECS categories (e.g. event.category:process). This field is an array, allowing events that fall in multiple categories. | keyword
event.dataset | Event dataset. | constant_keyword
event.duration | Duration of the event in nanoseconds; if event.start and event.end are known, the difference between them. | long
event.id | Unique ID to describe the event. | keyword
event.ingested | Timestamp when an event arrived in the central data store; different from @timestamp and event.created. Normally @timestamp < event.created < event.ingested. | date
event.kind | One of four ECS Categorization Fields; the highest level in the hierarchy, giving high-level information about what type of information the event contains (e.g. distinguishing alert events from metric events) without being specific to its contents. | keyword
event.module | Event module. | constant_keyword
event.original | Raw text message of the entire event, used to demonstrate log integrity or where the full message may be required (e.g. for reindex). Not indexed and doc_values are disabled; it cannot be searched but can be retrieved from _source. | keyword
event.outcome | One of four ECS Categorization Fields; the lowest level, denoting whether the event represents a success or a failure from the perspective of the entity that produced it. Not all events have an associated outcome (e.g. metric events or event.type:info). | keyword
event.provider | Source of the event, e.g. the name of the software that generated it (Sysmon, httpd) or an operating system subsystem (kernel, Microsoft-Windows-Security-Auditing). | keyword
event.reason | Reason why this event happened, according to the source; describes why the action captured in event.action was taken (e.g. blocked site). | keyword
event.timezone | Should be populated when the event's timestamp does not already include timezone information (e.g. default Syslog timestamps). Acceptable formats: canonical ID (e.g. "Europe/Amsterdam"), abbreviated (e.g. "EST") or an HH:mm differential (e.g. "-05:00"). | keyword
event.type | One of four ECS Categorization Fields; the third level, a categorization "sub-bucket" used together with event.category to filter events down to a level appropriate for a single visualization. This field is an array. | keyword
haproxy.backend_name | Name of the backend (or listener) which was selected to manage the connection to the server. | keyword
haproxy.backend_queue | Total number of requests which were processed before this one in the backend's global queue. | long
haproxy.bind_name | Name of the listening address which received the connection. | keyword
haproxy.bytes_read | Total number of bytes transmitted to the client when the log is emitted. | long
haproxy.connection_wait_time_ms | Total time in milliseconds spent waiting for the connection to establish to the final server. | long
haproxy.connections.active | Total number of concurrent connections on the process when the session was logged. | long
haproxy.connections.backend | Total number of concurrent connections handled by the backend when the session was logged. | long
haproxy.connections.frontend | Total number of concurrent connections on the frontend when the session was logged. | long
haproxy.connections.retries | Number of connection retries experienced by this session when trying to connect to the server. | long
haproxy.connections.server | Total number of concurrent connections still active on the server when the session was logged. | long
haproxy.error_message | Error message logged by HAProxy in case of error. | text
haproxy.frontend_name | Name of the frontend (or listener) which received and processed the connection. | keyword
haproxy.http.request.captured_cookie | Optional "name=value" entry indicating that the server has returned a cookie with its request. | keyword
haproxy.http.request.captured_headers | List of headers captured in the request due to the presence of the "capture request header" statement in the frontend. | keyword
haproxy.http.request.raw_request_line | Complete HTTP request line, including the method, request and HTTP version string. | keyword
haproxy.http.request.time_wait_ms | Total time in milliseconds spent waiting for a full HTTP request from the client (not counting body) after the first byte was received. | long
haproxy.http.request.time_wait_without_data_ms | Total time in milliseconds spent waiting for the server to send a full HTTP response, not counting data. | long
haproxy.http.response.captured_cookie | Optional "name=value" entry indicating that the client had this cookie in the response. | keyword
haproxy.http.response.captured_headers | List of headers captured in the response due to the presence of the "capture response header" statement in the frontend. | keyword
haproxy.mode | Mode that the frontend is operating (TCP or HTTP). | keyword
haproxy.server_name | Name of the last server to which the connection was sent. | keyword
haproxy.server_queue | Total number of requests which were processed before this one in the server queue. | long
haproxy.source | The HAProxy source of the log. | keyword
haproxy.tcp.connection_waiting_time_ms | Total time in milliseconds elapsed between the accept and the last close. | long
haproxy.termination_state | Condition the session was in when the session ended. | keyword
haproxy.time_backend_connect | Total time in milliseconds spent waiting for the connection to establish to the final server, including retries. | long
haproxy.time_queue | Total time in milliseconds spent waiting in the various queues. | long
haproxy.total_waiting_time_ms | Total time in milliseconds spent waiting in the various queues. | long
host.architecture | Operating system architecture. | keyword
host.containerized | If the host is a container. | boolean
host.domain | Name of the domain of which the host is a member (e.g. the host's Active Directory or NetBIOS domain on Windows, or the domain of the host's LDAP provider on Linux). | keyword
host.hostname | Hostname of the host; normally what the hostname command returns on the host machine. | keyword
host.id | Unique host id. As hostname is not always unique, use values that are meaningful in your environment (e.g. the current usage of beat.name). | keyword
host.ip | Host ip addresses. | ip
host.mac | Host mac addresses. | keyword
host.name | Name of the host; can be what hostname returns on Unix systems, the fully qualified domain name, or a name specified by the user. | keyword
host.os.build | OS build information. | keyword
host.os.codename | OS codename, if any. | keyword
host.os.family | OS family (such as redhat, debian, freebsd, windows). | keyword
host.os.kernel | Operating system kernel version as a raw string. | keyword
host.os.name | Operating system name, without the version. | keyword
host.os.name.text | Multi-field of host.os.name. | text
host.os.platform | Operating system platform (such as centos, ubuntu, windows). | keyword
host.os.version | Operating system version as a raw string. | keyword
host.type | Type of host. For cloud providers this can be the machine type (e.g. t2.medium); for VMs this could be the container, or other information meaningful in your environment. | keyword
hostname | Hostname from syslog header. | keyword
http.request.body.bytes | Size in bytes of the request body. | long
http.request.method | HTTP request method; the value should retain its casing from the original event (GET, get and GeT are all valid). | keyword
http.request.referrer | Referrer for this HTTP request. | keyword
http.response.body.bytes | Size in bytes of the response body. | long
http.response.bytes | Total size in bytes of the response (body and headers). | long
http.response.mime_type | Mime type of the body of the response, populated based on the content of the response body rather than the Content-Type header; comparing the two can help detect misconfigured servers. | keyword
http.response.status_code | HTTP response status code. | long
http.version | HTTP version. | keyword
input.type | Type of Filebeat input. | keyword
log.level | Original log level of the log event. If the source provides a level or textual severity it goes here; otherwise the event transport's severity (e.g. Syslog severity) may be used. Examples: warn, err, i, informational. | keyword
log.source.address | Source address of the syslog message. | keyword
log.syslog.priority | Syslog numeric priority of the event, if available. Per RFCs 5424 and 3164 the priority is 8 * facility + severity, so a value between 0 and 191. | long
message | For log events, the log message, optimized for viewing in a log viewer. For structured logs without an original message field, other fields can be concatenated into a human-readable summary of the event. | match_only_text
network.bytes | Total bytes transferred in both directions; if source.bytes and destination.bytes are known, their sum. | long
network.community_id | A hash of source and destination IPs and ports plus the protocol; a tool-agnostic standard to identify flows (see https://github.com/corelight/community-id-spec). | keyword
network.direction | Direction of the network traffic. From a host-based monitoring context use "ingress" or "egress"; from a network or perimeter-based context use "inbound", "outbound", "internal" or "external". | keyword
network.iana_number | IANA Protocol Number (https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml); aligns well with NetFlow and sFlow logs which use the IANA Protocol Number. | keyword
network.packets | Total packets transferred in both directions; if source.packets and destination.packets are known, their sum. | long
network.protocol | Application layer protocol in the OSI model (e.g. http, dns, ssh); must be normalized to lowercase for querying. | keyword
network.transport | Same as network.iana_number, but using the keyword name of the transport layer (udp, tcp, ipv6-icmp, etc.); must be normalized to lowercase for querying. | keyword
network.type | Network layer in the OSI model (ipv4, ipv6, ipsec, pim, etc.); must be normalized to lowercase for querying. | keyword
network.vlan.id | VLAN ID as reported by the observer. | keyword
observer.ingress.interface.name | Interface name as reported by the system. | keyword
observer.ingress.vlan.id | VLAN ID as reported by the observer. | keyword
observer.ip | IP addresses of the observer. | ip
observer.name | Custom name of the observer; helpful, for example, if multiple firewalls of the same model are used in an organization. May be left empty. | keyword
observer.type | The type of the observer the data is coming from; no predefined list (examples: forwarder, firewall, ids, ips, proxy, poller, sensor, APM server). | keyword
observer.vendor | Vendor name of the observer. | keyword
pfsense.dhcp.age | Age of DHCP lease in seconds. | long
pfsense.dhcp.duid | The DHCP unique identifier (DUID), used by a client to get an IP address from a DHCPv6 server. | keyword
pfsense.dhcp.hostname | Hostname of DHCP client. | keyword
pfsense.dhcp.iaid | Identity Association Identifier used alongside the DUID to uniquely identify a DHCP client. | keyword
pfsense.dhcp.lease_time | The DHCP lease time in seconds. | long
pfsense.dhcp.subnet | The subnet for which the DHCP server is issuing IPs. | keyword
pfsense.dhcp.transaction_id | The DHCP transaction ID. | keyword
pfsense.icmp.code | ICMP code. | long
pfsense.icmp.destination.ip | Original destination address of the connection that caused this notification. | ip
pfsense.icmp.id | ID of the echo request/reply. | long
pfsense.icmp.mtu | MTU to use for subsequent data to this destination. | long
pfsense.icmp.otime | Originate Timestamp. | date
pfsense.icmp.parameter | ICMP parameter. | long
pfsense.icmp.redirect | ICMP redirect address. | ip
pfsense.icmp.rtime | Receive Timestamp. | date
pfsense.icmp.seq | ICMP sequence number. | long
pfsense.icmp.ttime | Transmit Timestamp. | date
pfsense.icmp.type | ICMP type. | keyword
pfsense.icmp.unreachable.iana_number | Protocol ID number that was unreachable. | long
pfsense.icmp.unreachable.other | Other unreachable information. | keyword
pfsense.icmp.unreachable.port | Port number that was unreachable. | long
pfsense.ip.ecn | Explicit Congestion Notification. | keyword
pfsense.ip.flags | IP flags. | keyword
pfsense.ip.flow_label | Flow label. | keyword
pfsense.ip.id | ID of the packet. | long
pfsense.ip.offset | Fragment offset. | long
pfsense.ip.tos | IP Type of Service identification. | keyword
pfsense.ip.ttl | Time To Live (TTL) of the packet. | long
pfsense.openvpn.peer_info | Information about the OpenVPN client. | keyword
pfsense.tcp.ack | TCP Acknowledgment number. | long
pfsense.tcp.flags | TCP flags. | keyword
pfsense.tcp.length | Length of the TCP header and payload. | long
pfsense.tcp.options | TCP Options. | array
pfsense.tcp.seq | TCP sequence number. | long
pfsense.tcp.urg | Urgent pointer data. | keyword
pfsense.tcp.window | Advertised TCP window size. | long
pfsense.udp.length | Length of the UDP header and payload. | long
process.name | Process name; sometimes called program name or similar. | keyword
process.name.text | Multi-field of process.name. | match_only_text
process.pid | Process id. | long
process.program | Process from syslog header. | keyword
related.ip | All of the IPs seen on your event. | ip
related.user | All the user names or other user identifiers seen on the event. | keyword
rule.id | A rule ID that is unique within the scope of an agent, observer, or other entity using the rule for detection of this event. | keyword
server.address | Raw server address (IP, domain or unix socket); duplicate to .ip or .domain as appropriate. | keyword
server.bytes | Bytes sent from the server to the client. | long
server.ip | IP address of the server (IPv4 or IPv6). | ip
server.mac | MAC address of the server (RFC 7042 notation). | keyword
server.port | Port of the server. | long
source.address | Raw source address (IP, domain or unix socket); duplicate to .ip or .domain as appropriate. | keyword
source.as.number | Unique number allocated to the autonomous system (ASN). | long
source.as.organization.name | Organization name. | keyword
source.as.organization.name.text | Multi-field of source.as.organization.name. | match_only_text
source.bytes | Bytes sent from the source to the destination. | long
source.domain | Domain name of the source system (host name, fully qualified domain name, or other host naming format). | keyword
source.geo.city_name | City name. | keyword
source.geo.continent_name | Name of the continent. | keyword
source.geo.country_iso_code | Country ISO code. | keyword
source.geo.country_name | Country name. | keyword
source.geo.location | Longitude and latitude. | geo_point
source.geo.name | User-defined description of a location (e.g. data center name, floor number, city); not typically used in automated geolocation. | keyword
source.geo.region_iso_code | Region ISO code. | keyword
source.geo.region_name | Region name. | keyword
source.ip | IP address of the source (IPv4 or IPv6). | ip
source.mac | MAC address of the source (RFC 7042 notation). | keyword
source.nat.ip | Translated ip of source-based NAT sessions (e.g. internal client to internet); typically connections traversing load balancers, firewalls, or routers. | ip
source.port | Port of the source. | long
source.user.full_name | User's full name, if available. | keyword
source.user.full_name.text | Multi-field of source.user.full_name. | match_only_text
source.user.id | Unique identifier of the user. | keyword
squid.hierarchy_status | The proxy hierarchy route; the route Content Gateway used to retrieve the object. | keyword
squid.request_status | The cache result code; how the cache responded to the request: HIT, MISS, and so on. | keyword
tags | List of keywords used to tag each event. | keyword
tls.cipher | String indicating the cipher used during the current connection. | keyword
tls.version | Numeric part of the version parsed from the original string. | keyword
tls.version_protocol | Normalized lowercase protocol name parsed from original string. | keyword
url.domain | Domain of the url, such as "www.elastic.co". If the URL refers to an IP directly, the IP goes in the domain field; a literal IPv6 address keeps its enclosing [ and ] (IETF RFC 2732). | keyword
url.extension | File extension from the original request url, excluding the leading dot (e.g. "png", not ".png"); for multiple extensions (example.tar.gz), only the last one is captured ("gz", not "tar.gz"). | keyword
url.full | Full URL, whether reconstructed or present in the event source, if full URLs are important to your use case. | wildcard
url.full.text | Multi-field of url.full. | match_only_text
url.original | Unmodified original url as seen in the event source; may be a full URL or just a path, complete or not. | wildcard
url.original.text | Multi-field of url.original. | match_only_text
url.password | Password of the request. | keyword
url.path | Path of the request, such as "/search". | wildcard
url.port | Port of the request, such as 443. | long
url.query | The query string of the request, such as "q=elasticsearch", excluding the "?". If there is a "?" but no query, the field exists with an empty string; the exists query can differentiate the two cases. | keyword
url.scheme | Scheme of the request, such as "https" (the ":" is not part of the scheme). | keyword
url.username | Username of the request. | keyword
user.domain | Name of the directory the user is a member of (e.g. an LDAP or Active Directory domain name). | keyword
user.email | User email address. | keyword
user.full_name | User's full name, if available. | keyword
user.full_name.text | Multi-field of user.full_name. | match_only_text
user.id | Unique identifier of the user. | keyword
user.name | Short name or login of the user. | keyword
user.name.text | Multi-field of user.name. | match_only_text
user_agent.device.name | Name of the device. | keyword
user_agent.name | Name of the user agent. | keyword
user_agent.original | Unparsed user_agent string. | keyword
user_agent.original.text | Multi-field of user_agent.original. | match_only_text
user_agent.os.full | Operating system name, including the version or code name. | keyword
user_agent.os.full.text | Multi-field of user_agent.os.full. | match_only_text
user_agent.os.name | Operating system name, without the version. | keyword
user_agent.os.name.text | Multi-field of user_agent.os.name. | match_only_text
user_agent.os.version | Operating system version as a raw string. | keyword
user_agent.version | Version of the user agent. | keyword