Monitoring pfSense logs using ELK (ElasticSearch 1.7, Logstash 1.5, Kibana 4.1) - PART 1

This post is essentially an updated guide to my previous post on monitoring pfSense logs using the ELK stack. Part 1 covers the installation and configuration of ELK, and Part 2 will cover configuring Kibana 4 to visualize pfSense logs.

So what's new?

- Full guide to installing & setting up ELK on Linux
- Short tutorial on creating visualizations and dashboards using collected pfSense logs

OK. So the goal is to use ELK to gather and visualize firewall logs from one (or more) pfSense servers.

This Logstash / Kibana setup has three main components:

- Logstash: Processes the incoming logs sent from pfSense
- Elasticsearch: Stores all of the logs
- Kibana 4: Web interface for searching and visualizing logs (proxied through Nginx)

(It is possible to manually install the logstash-forwarder on pfSense, however this tutorial only covers forwarding logs via the default settings in pfSense.)

For this tutorial all three components (ElasticSearch, Logstash & Kibana + Nginx) will be installed on a single server.

Prerequisites:

1. For CentOS 7, enable the EPEL repository

    # rpm -Uvh

2. Make sure wget is installed

CentOS 7

    # yum -y install wget

Ubuntu 14.xx

    $ sudo apt-get -y install wget

3. Install the latest JDK on your server:

CentOS 7

Download Java SDK:

    # wget --no-check-certificate -c --header "Cookie: oraclelicense=accept-securebackup-cookie"
    # tar -xzf jdk-8u60-linux-x64.tar.gz
    # mv jdk1.8.0_60/ /usr/

Install Java:

    # /usr/sbin/alternatives --install /usr/bin/java java /usr/jdk1.8.0_60/bin/java 2
    # /usr/sbin/alternatives --config java
    There is 1 program that provides 'java'.
      Selection    Command
    -----------------------------------------------
    *+ 1           /usr/jdk1.8.0_60/bin/java
    Enter to keep the current selection[+], or type selection number:

Press ENTER.

Verify Java version:

    # java -version
    java version "1.8.0_60"
    Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
    Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Set up environment variables:

    # export JAVA_HOME=/usr/jdk1.8.0_60/
    # export JRE_HOME=/usr/jdk1.8.0_60/jre/

Set PATH variable:

    # export PATH=$JAVA_HOME/bin:$PATH

To make these settings permanent, place the above three commands in /etc/profile (all users) or ~/.bash_profile (single user).

Ubuntu 14.xx

Remove OpenJDK from the system if it is already installed.

    $ sudo apt-get remove --purge openjdk*

Add the repository.

    $ sudo add-apt-repository -y ppa:webupd8team/java

Run apt-get update to pull the package information from the newly added repository.

    $ sudo apt-get update

Issue the following command to install Java JDK 1.8.

    $ sudo apt-get -y install oracle-java8-installer

While installing, you will be required to accept the Oracle binary licenses.
Verify Java version:

    $ java -version
    java version "1.8.0_60"
    Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
    Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)

Configure the Java environment:

    $ sudo apt-get install oracle-java8-set-default

Install ElasticSearch

Download and install the public GPG signing key:

CentOS 7

    # rpm --import

Ubuntu 14.xx

    $ wget -qO - | sudo apt-key add -

Add and enable the ElasticSearch repo:

CentOS 7

    # cat <<EOF >> /etc/yum.repos.d/elasticsearch.repo
    [elasticsearch-1.7]
    name=Elasticsearch repository for 1.7.x packages
    baseurl=
    gpgcheck=1
    gpgkey=
    enabled=1
    EOF

Ubuntu 14.xx

    $ echo "deb stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-1.7.list

Install ElasticSearch:

CentOS 7

    # yum -y install elasticsearch

Ubuntu 14.xx

    $ sudo apt-get update && sudo apt-get install elasticsearch

Configure Elasticsearch to auto-start during system startup:

CentOS 7

    # /bin/systemctl daemon-reload
    # /bin/systemctl enable elasticsearch.service
    # /bin/systemctl start elasticsearch.service

Ubuntu 14.xx

    $ sudo update-rc.d elasticsearch defaults 95 10

Now wait at least a minute for Elasticsearch to start fully, otherwise testing will fail. Elasticsearch should now be listening on port 9200 for HTTP requests; we can use curl to check the response.
    # curl -X GET http://localhost:9200
    {
      "status" : 200,
      "name" : "Alex",
      "cluster_name" : "elasticsearch",
      "version" : {
        "number" : "1.7.1",
        "build_hash" : "b88f43fc40b0bcd7f173a1f9ee2e97816de80b19",
        "build_timestamp" : "2015-07-29T09:54:16Z",
        "build_snapshot" : false,
        "lucene_version" : "4.10.4"
      },
      "tagline" : "You Know, for Search"
    }

Install Logstash

Add the Logstash repo, enable & install:

CentOS 7

    # cat <<EOF >> /etc/yum.repos.d/logstash.repo
    [logstash-1.5]
    name=Logstash repository for 1.5.x packages
    baseurl=
    gpgcheck=1
    gpgkey=
    enabled=1
    EOF
    # yum install logstash -y

Ubuntu 14.xx

    $ echo "deb stable main" | sudo tee -a /etc/apt/sources.list
    $ sudo apt-get update && sudo apt-get install logstash

Create SSL Certificate (Optional)*

*You can skip this step if you don't intend to use your ELK install to monitor logs for anything other than pfSense (or any TCP/UDP forwarded logs).

(pfSense forwards its logs via UDP, therefore there's no requirement to set up a Logstash-Forwarder environment. Having said that, it's still good practice to set it up, since you'll most likely be using your ELK stack for more than collecting and parsing only pfSense logs.)

Logstash-Forwarder (formerly LumberJack) uses an SSL certificate and key pair to verify the identity of your Logstash server. You have two options when generating this SSL certificate.

1. Hostname/FQDN (DNS) Setup
2. IP Address Setup

Option 1

If you have DNS set up within your private/internal network, add a DNS A record pointing to the private IP address of your ELK/Logstash server. Alternatively, add a DNS A record with your DNS provider pointing to your ELK/Logstash server's public IP address. As long as each server you're gathering logs from can resolve the Logstash server's hostname/domain name, either is fine.

Alternatively, you can edit the /etc/hosts file of the servers you're collecting logs from by adding an IP address (public or private) and hostname entry pointing to your Logstash server (private IP in my case).
    # nano /etc/hosts
    elk

Now generate the SSL certificate and key pair. Go to the OpenSSL directory.

CentOS 7

    # cd /etc/pki/tls

Ubuntu 14.xx

Use the following commands to create the directories that will store your certificate and private key.

    $ sudo mkdir -p /etc/pki/tls/certs
    $ sudo mkdir /etc/pki/tls/private

Execute the following command to create an SSL certificate, replacing "elk" with the hostname of your real Logstash server.

    # cd /etc/pki/tls
    # openssl req -x509 -nodes -newkey rsa:2048 -days 3650 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt -subj /CN=elk

The generated logstash-forwarder.crt should be copied to all client servers whose logs you intend to send to your Logstash server.

N.B. If you've decided to go with Option 1 as described above, please ignore Option 2 below and skip straight to the 'Configure Logstash' section.

Option 2

If for some reason you don't have DNS set up and/or can't resolve the hostname of your Logstash server, you can add the IP address of your Logstash server to the subjectAltName (SAN) of the certificate we're about to generate.

Start by editing the OpenSSL configuration file:

    $ nano /etc/pki/tls/openssl.cnf

Find the section starting with [ v3_ca ] and add the following line, substituting the IP address for that of your own Logstash server.

    subjectAltName = IP:

Save & exit.

Execute the following command to create the SSL certificate and private key.

    # cd /etc/pki/tls
    # openssl req -config /etc/pki/tls/openssl.cnf -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout private/logstash-forwarder.key -out certs/logstash-forwarder.crt

The generated logstash-forwarder.crt should be copied to all client servers you intend to collect logs from.

Configure Logstash

Logstash configuration files use a JSON-like syntax and are located in the /etc/logstash/conf.d/ directory. A Logstash server configuration consists of three sections: input, filter and output, all of which can be placed in a single configuration file.
However, in practice it's much more practical to place these sections into separate config files.

Create an input configuration:

    # nano /etc/logstash/conf.d/01-inputs.conf

Paste the following:

    #logstash-forwarder [Not utilized by pfSense by default]
    #input {
    #  lumberjack {
    #    port => 5000
    #    type => "logs"
    #    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"
    #    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"
    #  }
    #}

    #tcp syslog stream via 5140
    input {
      tcp {
        type => "syslog"
        port => 5140
      }
    }

    #udp syslog stream via 5140
    input {
      udp {
        type => "syslog"
        port => 5140
      }
    }

Create a syslog configuration:

    # nano /etc/logstash/conf.d/10-syslog.conf

Paste the following:

    filter {
      if [type] == "syslog" {
        #change to pfSense ip address
        if [host] =~ /192\.168\.0\.2/ {
          mutate {
            add_tag => ["PFSense", "Ready"]
          }
        }
        if "Ready" not in [tags] {
          mutate {
            add_tag => [ "syslog" ]
          }
        }
      }
    }

    filter {
      if [type] == "syslog" {
        mutate {
          remove_tag => "Ready"
        }
      }
    }

    filter {
      if "syslog" in [tags] {
        grok {
          match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
          add_field => [ "received_at", "%{@timestamp}" ]
          add_field => [ "received_from", "%{host}" ]
        }
        syslog_pri { }
        date {
          match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
          locale => "en"
        }
        if !("_grokparsefailure" in [tags]) {
          mutate {
            replace => [ "@source_host", "%{syslog_hostname}" ]
            replace => [ "@message", "%{syslog_message}" ]
          }
        }
        mutate {
          remove_field => [ "syslog_hostname", "syslog_message", "syslog_timestamp" ]
        }
        # if "_grokparsefailure" in [tags] {
        #   drop { }
        # }
      }
    }

Create an outputs configuration:

    # nano /etc/logstash/conf.d/30-outputs.conf

Paste the following:

    output {
      elasticsearch {
        hosts => localhost
        index => "logstash-%{+YYYY.MM.dd}"
      }
      stdout {
        codec => rubydebug
      }
    }

Create your pfSense configuration:

    # nano /etc/logstash/conf.d/11-pfsense.conf

Paste the following:

    filter {
      if "PFSense" in [tags] {
        grok {
          add_tag => [ "firewall" ]
          match => [ "message", "<(?<evtid>.*)>(?<datetime>(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]) (?:2[0123]|[01]?[0-9]):(?:[0-5][0-9]):(?:[0-5][0-9])) (?<prog>.*?): (?<msg>.*)" ]
        }
        mutate {
          gsub => ["datetime","  "," "]
        }
        date {
          match => [ "datetime", "MMM dd HH:mm:ss" ]
          timezone => "UTC"
        }
        mutate {
          replace => [ "message", "%{msg}" ]
        }
        mutate {
          remove_field => [ "msg", "datetime" ]
        }
      }
      if [prog] =~ /^filterlog$/ {
        mutate {
          remove_field => [ "msg", "datetime" ]
        }
        grok {
          patterns_dir => "/etc/logstash/conf.d/patterns"
          match => [ "message", "%{PFSENSE_LOG_DATA}%{PFSENSE_IP_SPECIFIC_DATA}%{PFSENSE_IP_DATA}%{PFSENSE_PROTOCOL_DATA}",
                     "message", "%{PFSENSE_LOG_DATA}%{PFSENSE_IPv4_SPECIFIC_DATA_ECN}%{PFSENSE_IP_DATA}%{PFSENSE_PROTOCOL_DATA}" ]
        }
        mutate {
          lowercase => [ 'proto' ]
        }
        geoip {
          add_tag => [ "GeoIP" ]
          source => "src_ip"
          # Optional GeoIP database
          database => "/etc/logstash/GeoLiteCity.dat"
        }
      }
    }

The above configuration uses a pattern file. Create a patterns directory:

    # mkdir /etc/logstash/conf.d/patterns

And download the following pattern file to it:

    # cd /etc/logstash/conf.d/patterns
    # wget bef4c7/pfsense2-2.grok

(Optional) Download and install the MaxMind GeoIP database:

    $ cd /etc/logstash
    $ sudo curl -O ""
    $ sudo gunzip GeoLiteCity.dat.gz

Now restart the Logstash service.

CentOS 7

    # systemctl restart logstash.service

Ubuntu 14.xx

    $ sudo service logstash restart

Logstash server logs are stored in the following file:

    # cat /var/log/logstash/logstash.log

Logs retrieved from pfSense (once setup is complete) can be viewed via:

    # tail -f /var/log/logstash/logstash.stdout

These will help you troubleshoot any issues you encounter.
Configuring pfSense for remote logging to ELK

Log in to pfSense and check the dashboard to ensure you're running pfSense 2.2.x.

Now go to the Settings tab via Status > System Logs. Check 'Send log messages to remote syslog server', enter your ELK server's IP address and custom port (port 5140 in this case), and check 'Firewall events' (or 'Everything' if you wish to send everything pfSense logs to ELK).

That's it for pfSense!

Configure Kibana4

Kibana 4 provides visualization of your pfSense logs. Use the following commands to download and unpack it in a terminal.

    # wget
    # tar -zxf kibana-4.1.2-linux-x64.tar.gz
    # mv kibana-4.1.2-linux-x64 /opt/kibana4

Enable the PID file for Kibana; this is required to create a systemd unit file.

    # sed -i 's/#pid_file/pid_file/g' /opt/kibana4/config/kibana.yml

Kibana can be started by running /opt/kibana4/bin/kibana; to run Kibana as a service we will create a systemd unit file.

    # nano /etc/systemd/system/kibana4.service

    [Unit]
    Description=Kibana 4 Web Interface
    After=elasticsearch.service
    After=logstash.service

    [Service]
    ExecStartPre=rm -rf /var/run/
    ExecStart=/opt/kibana4/bin/kibana
    ExecReload=kill -9 $(cat /var/run/ && rm -rf /var/run/ && /opt/kibana4/bin/kibana
    ExecStop=kill -9 $(cat /var/run/

    [Install]

Start Kibana and enable it to start automatically at system startup.

    # systemctl start kibana4.service
    # systemctl enable kibana4.service

Check that Kibana is working properly by going to http://your-ELK-IP:5601/ in your browser.

You should see a page where you map the Logstash index for use in Kibana. From the Time-field name dropdown menu select @timestamp.

Spot any mistakes/errors? Or have any suggestions? Please make a comment below.

Part 2: Configuring Kibana 4 visualizations and dashboards. (Coming soon)...

pfSense

Collect logs from pfSense and OPNsense with Elastic Agent.

What is an Elastic integration?

This integration is powered by Elastic Agent.
Elastic Agent is a single, unified way to add monitoring for logs, metrics, and other types of data to a host. It can also protect hosts from security threats, query data from operating systems, forward data from remote services or hardware, and more. Refer to our documentation for a detailed comparison between Beats and Elastic Agent.

Prefer to use Beats for this use case? See Filebeat modules for logs or Metricbeat modules for metrics.

This is an integration to parse certain logs from pfSense and OPNsense firewalls. It parses logs received over the network via syslog (UDP/TCP/TLS). pfSense natively only supports UDP. OPNsense supports all 3 transports.

Currently the integration supports parsing the Firewall, Unbound, DHCP Daemon, OpenVPN, IPsec, HAProxy, Squid, and PHP-FPM (Authentication) logs. All other events will be dropped. The HAProxy logs are set up to be compatible with the dashboards from the HAProxy integration. Install the HAProxy integration assets to use them.

pfSense Setup

1. Navigate to Status -> System Logs, then click on Settings
2. At the bottom check Enable Remote Logging
3. (Optional) Select a specific interface to use for forwarding
4. Input the agent IP address and port as set via the integration config into the field Remote log servers
5. Under Remote Syslog Contents select what logs to forward to the agent

Select Everything to forward all logs to the agent, or select the individual services to forward. Selecting Everything causes additional data to be sent to the agent and Elasticsearch; any log entry not in the supported list above will be dropped.

The firewall, VPN, DHCP, DNS, and Authentication (PHP-FPM) logs can be selected individually. In order to collect HAProxy and Squid or other "package" logs, the Everything option must be selected.

OPNsense Setup

1. Navigate to System -> Settings -> Logging/Targets
2. Add a new Logging/Target (click the plus icon)
   - Transport = UDP or TCP or TLS
   - Applications = Select a list of applications to send to remote syslog. Leave empty for all.
   - Levels = Nothing Selected
   - Facilities = Nothing Selected
   - Hostname = IP of Elastic agent as configured in the integration config
   - Port = Port of Elastic agent as configured in the integration config
   - Certificate = Client certificate to use (when selecting a TLS transport type)
   - Description = Syslog to Elasticsearch
   - Click Save

The module is by default configured to run with the udp input on port 9001.

Important

The pfSense integration supports both the BSD logging format (used by pfSense by default and OPNsense) and the Syslog format (optional for pfSense). However, the Syslog format is recommended: it provides the firewall hostname and timestamps with timezone information. When using the BSD format, the Timezone Offset config must be set when deploying the agent, or else the timezone will default to the timezone of the agent. See https://<pfsense url>/status_logs_settings.php and ttings.html for more information.

A huge thanks to a3ilson for the repo, which is the foundation for the majority of the grok patterns and dashboards in this integration.

Logs

pfSense log

This is the pfSense log dataset.
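Firewall entries in this dataset come from pfSense's filterlog process, which writes each packet decision as one comma-separated record. To make the mapping from that record to named fields concrete, here is a minimal sketch that splits an IPv4/TCP filterlog message by position. The field order is an assumption based on pfSense's raw filter log format, the field names are illustrative rather than the integration's own, and the two IP addresses in the sample are documentation placeholders, not values from a real log.

```python
# Positional field names for an IPv4 + TCP filterlog record.
# Assumed from pfSense's raw filter log format; other IP versions and
# protocols use different trailing fields and are not handled here.
COMMON = ["rule_number", "sub_rule", "anchor", "tracker", "interface",
          "reason", "action", "direction", "ip_version"]
IPV4 = ["tos", "ecn", "ttl", "id", "offset", "flags", "protocol_id",
        "protocol", "length", "src_ip", "dst_ip"]
TCP = ["src_port", "dst_port", "data_length", "tcp_flags", "sequence",
       "ack", "window", "urg", "options"]

INT_FIELDS = ("ttl", "id", "offset", "length", "src_port", "dst_port",
              "data_length", "window")

# Sample record; 192.0.2.10 and 198.51.100.20 are placeholder addresses.
SAMPLE = ("146,,,1535324496,igb1.12,match,block,in,4,0x0,,63,32989,0,DF,6,tcp,60,"
          "192.0.2.10,198.51.100.20,49652,853,0,S,1818117648,,64240,,"
          "mss;sackOK;TS;nop;wscale")


def parse_filterlog_ipv4_tcp(message):
    """Split one filterlog message into a dict of named fields."""
    fields = COMMON + IPV4 + TCP
    values = message.split(",")
    if len(values) < len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    record = dict(zip(fields, values))
    if record["ip_version"] != "4" or record["protocol"] != "tcp":
        raise ValueError("this sketch only handles IPv4/TCP entries")
    # Convert the numeric columns; empty columns such as 'ack' stay as strings.
    for key in INT_FIELDS:
        record[key] = int(record[key])
    # TCP options arrive as a semicolon-separated list.
    record["options"] = record["options"].split(";") if record["options"] else []
    return record


if __name__ == "__main__":
    parsed = parse_filterlog_ipv4_tcp(SAMPLE)
    print(parsed["action"], parsed["interface"], parsed["dst_port"])
```

The integration itself does this work with grok patterns in its ingest pipeline and additionally handles IPv6, UDP and ICMP; the positional split here only illustrates how values such as the interface, action, TTL and TCP window are recovered from the raw line.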
An example event for log looks as following: { "@timestamp": "2021-07-04T00:10:14.578Z", "agent": { "ephemeral_id": "6b82ecb8-3739-4d1c-aeca-3a62c5340c7f", "id": "c5c06c39-0b86-45ec-9ae3-c773f4562eaa", "name": "docker-fleet-agent", "type": "filebeat", "version": "8.3.2" }, "data_stream": { "dataset": "pfsense.log", "namespace": "ep", "type": "logs" }, "destination": { "address": "", "geo": { "city_name": "Changchun", "continent_name": "Asia", "country_iso_code": "CN", "country_name": "China", "location": { "lat": 43.88, "lon": 125.3228 }, "region_iso_code": "CN-22", "region_name": "Jilin Sheng" }, "ip": "", "port": 853 }, "ecs": { "version": "8.5.0" }, "elastic_agent": { "id": "c5c06c39-0b86-45ec-9ae3-c773f4562eaa", "snapshot": false, "version": "8.3.2" }, "event": { "action": "block", "agent_id_status": "verified", "category": [ "network" ], "dataset": "pfsense.log", "ingested": "2022-07-30T02:57:35Z", "kind": "event", "original": "\u003c134\u003e1 2021-07-03T19:10:14.578288-05:00 filterlog 72237 - 146,,,1535324496,igb1.12,match,block,in,4,0x0,,63,32989,0,DF,6,tcp,60,,17,49652,853,0,S,1818117648,,64240,,mss;sackOK;TS;nop;wscale", "provider": "filterlog", "reason": "match", "timezone": "-05:00", "type": [ "connection", "denied" ] }, "input": { "type": "tcp" }, "log": { "source": { "address": "" }, "syslog": { "priority": 134 } }, "message": "146,,,1535324496,igb1.12,match,block,in,4,0x0,,63,32989,0,DF,6,tcp,60,,1,49652,853,0,S,1818117648,,64240,,mss;sackOK;TS;nop;wscale", "network": { "bytes": 60, "community_id": "1:pOXVyPJTFJI5seusI/UD6SwvBjg=", "direction": "inbound", "iana_number": "6", "transport": "tcp", "type": "ipv4" }, "observer": { "ingress": { "interface": { "name": "igb1.12" }, "vlan": { "id": "12" } }, "name": "", "type": "firewall", "vendor": "netgate" }, "pfsense": { "ip": { "flags": "DF", "id": 32989, "offset": 0, "tos": "0x0", "ttl": 63 }, "tcp": { "flags": "S", "length": 0, "options": [ "mss", "sackOK", "TS", "nop", "wscale" ], "window": 64240 } }, 
"process": { "name": "filterlog", "pid": 72237 }, "related": { "ip": [ "", "" ] }, "rule": { "id": "1535324496" }, "source": { "address": "", "ip": "", "port": 49652 }, "tags": [ "preserve_original_event", "pfsense", "forwarded" ] } Exported fields Field @timestamp client.address Description Type Date/time when the event originated. This is the date/time extracted from the event, typically representing when the event was generated by the source. If the event source has no original date timestamp, this value is typically populated by the first time the event was received by the pipeline. Required field for all events. Some event client addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in keyword the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is. Unique number allocated to the autonomous system. The autonomous system number (ASN) long uniquely identifies each network on the Internet. Organization name. keyword Field client.bytes client.domain client.geo.city_name client.geo.continent_name client.geo.country_iso_code client.geo.country_name client.geo.location client.geo.region_iso_code client.geo.region_name client.ip client.mac client.port cloud.availability_zone cloud.machine.type cloud.provider cloud.region container.labels data_stream.dataset Description Type Multi-field match_only_ text of Bytes sent from the client to the server. long The domain name of the client system. This value may be a host name, a fully qualified domain name, or another host naming format. keyword The value may derive from the original event or be added from enrichment. City name. keyword Name of the continent. keyword Country ISO code. keyword Country name. keyword Longitude and latitude. geo_point Region ISO code. keyword Region name. keyword IP address of the client (IPv4 or IPv6). ip MAC address of the client. 
The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] keyword hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen. Port of the client. long The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, keyword Google Cloud ORG Id, or other unique identifier. Availability zone in which this host is running. keyword Image ID for the cloud instance. keyword Instance ID of the host machine. keyword Instance name of the host machine. keyword Machine type of the host machine. keyword Name of the project in Google Cloud. keyword Name of the cloud provider. Example values keyword are aws, azure, gcp, or digitalocean. Region in which this host is running. keyword Unique container id. keyword Name of the image the container was built on. keyword Image labels. object Container name. keyword constant_key Data stream dataset. word Field Description data_stream.namespace Data stream namespace. data_stream.type Data stream type. Type constant_key word constant_key word Some event destination addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in destination.address keyword the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is. Unique number allocated to the autonomous system. The autonomous system number (ASN) long uniquely identifies each network on the Internet. Organization name. keyword Multi-field match_only_ xt text of . destination.bytes Bytes sent from the destination to the source. long destination.geo.city_name City name. keyword destination.geo.continent_name Name of the continent. keyword destination.geo.country_iso_code Country ISO code. keyword destination.geo.country_name Country name. keyword destination.geo.location Longitude and latitude. 
geo_point User-defined description of a location, at the level of granularity they care about. Could be the name of their data centers, the floor number, keyword if this describes a local physical entity, city names. Not typically used in automated geolocation. destination.geo.region_iso_code Region ISO code. keyword destination.geo.region_name Region name. keyword destination.ip IP address of the destination (IPv4 or IPv6). ip MAC address of the destination. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two destination.mac keyword [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen. destination.port Port of the destination. long dns.question.class The class of records being queried. keyword The name being queried. If the name field contains non-printable characters (below 32 or keyword above 126), those characters should be Field dns.question.registered_domain dns.question.subdomain dns.question.top_level_domain dns.question.type dns.type ecs.version Description Type represented as escaped base 10 integers (\DDD). Back slashes and quotes should be escaped. Tabs, carriage returns, and line feeds should be converted to \t, \r, and \n respectively. The highest registered domain, stripped of the subdomain. For example, the registered domain for "" is "". This value can be determined precisely with a list keyword like the public suffix list ( Trying to approximate this by simply taking the last two labels will not work well for TLDs such as "". The subdomain is all of the labels under the registered_domain. If the domain has multiple levels of subdomain, such as keyword "", the subdomain field should contain "sub2.sub1", with no trailing period. The effective top level domain (eTLD), also known as the domain suffix, is the last part of the domain name. For example, the top level domain for is "com". 
This value can be determined precisely with a list like the keyword public suffix list ( Trying to approximate this by simply taking the last label will not work well for effective TLDs such as "". The type of record being queried. keyword The type of DNS event captured, query or answer. If your source of DNS events only gives you DNS queries, you should only create dns events of type dns.type:query. If your keyword source of DNS events gives you answers as well, you should create one event per query (optionally as soon as the query is seen). And a second event containing all query details as well as an array of answers. ECS version this event conforms to. ecs.version is a required field and must exist in all events. When querying across keyword multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. Field Description error.message Error message. event.action event.category event.dataset event.duration event.ingested event.kind Type match_only_ text The action captured by the event. This describes the information in the event. It is more specific than event.category. Examples keyword are group-add, process-started, filecreated. The value is normally defined by the implementer. This is one of four ECS Categorization Fields, and indicates the second level in the ECS category hierarchy. event.category represents the "big buckets" of ECS categories. For example, filtering on event.category:process yields keyword all events relating to process activity. This field is closely related to event.type, which is used as a subcategory. This field is an array. This will allow proper categorization of some events that fall in multiple categories. constant_key Event dataset word Duration of the event in nanoseconds. If event.start and event.end are known this value long should be the difference between the end and start time. Unique ID to describe the event. 
keyword Timestamp when an event arrived in the central data store. This is different from @timestamp, which is when the event originally occurred. It's also different from event.created, which is meant to capture the first time an agent saw the date event. In normal conditions, assuming no tampering, the timestamps should chronologically look like this: @timestamp < event.created < event.i ngested. This is one of four ECS Categorization Fields, and indicates the highest level in the ECS category hierarchy. event.kind gives highlevel information about what type of information the event contains, without being keyword specific to the contents of the event. For example, values of this field distinguish alert events from metric events. The value of this field can be used to inform how these kinds of Field Description events should be handled. They may warrant different retention, different access control, it may also help understand whether the data coming in at a regular interval or not. event.module Event module event.original event.outcome event.provider event.reason Type constant_key word Raw text message of entire event. Used to demonstrate log integrity or where the full log message (before splitting it up in multiple parts) may be required, e.g. for reindex. This field is not indexed and doc_values are disabled. It keyword cannot be searched, but it can be retrieved from _source. If users wish to override this and index this field, please see Field data types in the Elasticsearch Reference. This is one of four ECS Categorization Fields, and indicates the lowest level in the ECS category hierarchy. event.outcome simply denotes whether the event represents a success or a failure from the perspective of the entity that produced the event. Note that when a single transaction is described in multiple events, each event may populate different values of event.outcome, according to their perspective. 
Also note that in the case of a keyword compound event (a single event that contains multiple logical events), this field should be populated with the value that best captures the overall success or failure from the perspective of the event producer. Further note that not all events will have an associated outcome. For example, this field is generally not populated for metric events, events with event.type:info, or any events for which an outcome does not make logical sense. Source of the event. Event transports such as Syslog or the Windows Event Log typically mention the source of an event. It can be the name of the software that generated the event keyword (e.g. Sysmon, httpd), or of a subsystem of the operating system (kernel, Microsoft-WindowsSecurity-Auditing). Reason why this event happened, according to the source. This describes the why of a keyword particular action or outcome captured in the Field Description Type event. Where event.action captures the action from the event, event.reason describes why that action was taken. For example, a web proxy with an event.action which denied the request may also populate event.reason with the reason why (e.g. blocked site). This field should be populated when the event's timestamp does not include timezone information already (e.g. default Syslog timestamps). It's optional otherwise. Acceptable event.timezone keyword timezone formats are: a canonical ID (e.g. "Europe/Amsterdam"), abbreviated (e.g. "EST") or an HH:mm differential (e.g. "05:00"). This is one of four ECS Categorization Fields, and indicates the third level in the ECS category hierarchy. event.type represents a categorization "sub-bucket" that, when used event.type along with the event.category field values, keyword enables filtering events down to a level appropriate for single visualization. This field is an array. This will allow proper categorization of some events that fall in multiple event types. 
Name of the backend (or listener) which was haproxy.backend_name keyword selected to manage the connection to the server. Total number of requests which were processed haproxy.backend_queue long before this one in the backend's global queue. Name of the listening address which received haproxy.bind_name keyword the connection. Total number of bytes transmitted to the client haproxy.bytes_read long when the log is emitted. Total time in milliseconds spent waiting for the haproxy.connection_wait_time_ms long connection to establish to the final server Total number of concurrent connections on the long process when the session was logged. Total number of concurrent connections haproxy.connections.backend handled by the backend when the session was long logged. Total number of concurrent connections on the haproxy.connections.frontend long frontend when the session was logged. Number of connection retries experienced by haproxy.connections.retries this session when trying to connect to the long server. Field Description Type Total number of concurrent connections still haproxy.connections.server active on the server when the session was long logged. Error message logged by HAProxy in case of haproxy.error_message text error. Name of the frontend (or listener) which haproxy.frontend_name keyword received and processed the connection. haproxy.http.request.captured_cook Optional "name=value" entry indicating that the keyword ie server has returned a cookie with its request. List of headers captured in the request due to haproxy.http.request.captured_head the presence of the "capture request header" keyword ers statement in the frontend. haproxy.http.request.raw_request_l Complete HTTP request line, including the keyword ine method, request and HTTP version string. Total time in milliseconds spent waiting for a haproxy.http.request.time_wait_ms full HTTP request from the client (not counting long body) after the first byte was received. 
haproxy.http.request.time_wait_without_data_ms | Total time in milliseconds spent waiting for the server to send a full HTTP response, not counting data. | long
haproxy.http.response.captured_cookie | Optional "name=value" entry indicating that the client had this cookie in the response. | keyword
haproxy.http.response.captured_headers | List of headers captured in the response due to the presence of the "capture response header" statement in the frontend. | keyword
haproxy.mode | Mode in which the frontend is operating (TCP or HTTP). | keyword
haproxy.server_name | Name of the last server to which the connection was sent. | keyword
haproxy.server_queue | Total number of requests which were processed before this one in the server queue. | long
haproxy.source | The HAProxy source of the log. | keyword
haproxy.tcp.connection_waiting_time_ms | Total time in milliseconds elapsed between the accept and the last close. | long
haproxy.termination_state | Condition the session was in when the session ended. | keyword
haproxy.time_backend_connect | Total time in milliseconds spent waiting for the connection to establish to the final server, including retries. | long
haproxy.time_queue | Total time in milliseconds spent waiting in the various queues. | long
haproxy.total_waiting_time_ms | Total time in milliseconds spent waiting in the various queues. | long
host.architecture | Operating system architecture. | keyword
host.containerized | If the host is a container. | boolean
host.domain | Name of the domain of which the host is a member. For example, on Windows this could be the host's Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host's LDAP provider. | keyword
host.hostname | Hostname of the host. It normally contains what the hostname command returns on the host machine. | keyword
(field name missing) | Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of | keyword
host.ip | Host IP addresses. | ip
host.mac | Host MAC addresses. | keyword
(field name missing) | Name of the host. It can contain what hostname returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use. | keyword
(field name missing) | OS build information. | keyword
host.os.codename | OS codename, if any. | keyword
(field name missing) | OS family (such as redhat, debian, freebsd, windows). | keyword
host.os.kernel | Operating system kernel version as a raw string. | keyword
(field name missing) | Operating system name, without the version. | keyword
(field name missing) | Multi-field of (field name missing). | text
host.os.platform | Operating system platform (such as centos, ubuntu, windows). | keyword
host.os.version | Operating system version as a raw string. | keyword
host.type | Type of host. For Cloud providers this can be the machine type like t2.medium. If vm, this could be the container, for example, or other information meaningful in your environment. | keyword
hostname | Hostname from syslog header. | keyword
http.request.body.bytes | Size in bytes of the request body. | long
http.request.method | HTTP request method. The value should retain its casing from the original event. For example, GET, get, and GeT are all considered valid values for this field. | keyword
http.request.referrer | Referrer for this HTTP request. | keyword
http.response.body.bytes | Size in bytes of the response body. | long
http.response.bytes | Total size in bytes of the response (body and headers). | long
http.response.mime_type | Mime type of the body of the response. This value must only be populated based on the content of the response body, not on the Content-Type header. Comparing the mime type of a response with the response's Content-Type header can be helpful in detecting misconfigured servers. | keyword
http.response.status_code | HTTP response status code. | long
http.version | HTTP version. | keyword
input.type | Type of Filebeat input. | keyword
log.level | Original log level of the log event. If the source of the event provides a log level or textual severity, this is the one that goes in log.level. If your source doesn't specify one, you may put your event transport's severity here (e.g. Syslog severity). Some examples are warn, err, i, informational. | keyword
log.source.address | Source address of the syslog message. | keyword
log.syslog.priority | Syslog numeric priority of the event, if available. According to RFCs 5424 and 3164, the priority is 8 * facility + severity. This number is therefore expected to contain a value between 0 and 191. | long
message | For log events the message field contains the log message, optimized for viewing in a log viewer. For structured logs without an original message field, other fields can be concatenated to form a human-readable summary of the event. If multiple messages exist, they can be combined into one message. | match_only_text
network.bytes | Total bytes transferred in both directions. If source.bytes and destination.bytes are known, network.bytes is their sum. | long
network.community_id | A hash of source and destination IPs and ports, as well as the protocol used in a communication. This is a tool-agnostic standard to identify flows. Learn more at | keyword
network.direction | Direction of the network traffic. When mapping events from a host-based monitoring context, populate this field from the host's point of view, using the values "ingress" or "egress". When mapping events from a network or perimeter-based monitoring context, populate this field from the point of view of the network perimeter, using the values "inbound", "outbound", "internal" or "external". Note that "internal" is not crossing perimeter boundaries, and is meant to describe communication between two hosts within the perimeter. Note also that "external" is meant to describe traffic between two hosts that are external to the perimeter. This could for example be useful for ISPs or VPN service providers. | keyword
network.iana_number | IANA Protocol Number. Standardized list of protocols. This aligns well with NetFlow and sFlow related logs which use the IANA Protocol Number. | keyword
network.packets | Total packets transferred in both directions. If source.packets and destination.packets are known, network.packets is their sum. | long
network.protocol | In the OSI Model this would be the Application Layer protocol. For example, http, dns, or ssh. The field value must be normalized to lowercase for querying. | keyword
network.transport | Same as network.iana_number, but instead using the Keyword name of the transport layer (udp, tcp, ipv6-icmp, etc.). The field value must be normalized to lowercase for querying. | keyword
network.type | In the OSI Model this would be the Network Layer. ipv4, ipv6, ipsec, pim, etc. The field value must be normalized to lowercase for querying. | keyword
(field name missing) | VLAN ID as reported by the observer. | keyword
(field name missing) | Interface name as reported by the system. | keyword
(field name missing) | VLAN ID as reported by the observer. | keyword
observer.ip | IP addresses of the observer. | ip
(field name missing) | Custom name of the observer. This is a name that can be given to an observer. This can be helpful for example if multiple firewalls of the same model are used in an organization. If no custom name is needed, the field can be left empty. | keyword
observer.type | The type of the observer the data is coming from. There is no predefined list of observer types. Some examples are forwarder, firewall, ids, ips, proxy, poller, sensor, APM server. | keyword
observer.vendor | Vendor name of the observer. | keyword
pfsense.dhcp.age | Age of DHCP lease in seconds. | long
pfsense.dhcp.duid | The DHCP unique identifier (DUID) is used by a client to get an IP address from a DHCPv6 server. | keyword
pfsense.dhcp.hostname | Hostname of DHCP client. | keyword
pfsense.dhcp.iaid | Identity Association Identifier used alongside the DUID to uniquely identify a DHCP client. | keyword
pfsense.dhcp.lease_time | The DHCP lease time in seconds. | long
pfsense.dhcp.subnet | The subnet for which the DHCP server is issuing IPs. | keyword
pfsense.dhcp.transaction_id | The DHCP transaction ID. | keyword
pfsense.icmp.code | ICMP code. | long
pfsense.icmp.destination.ip | Original destination address of the connection that caused this notification. | ip
(field name missing) | ID of the echo request/reply. | long
pfsense.icmp.mtu | MTU to use for subsequent data to this destination. | long
pfsense.icmp.otime | Originate Timestamp. | date
pfsense.icmp.parameter | ICMP parameter. | long
pfsense.icmp.redirect | ICMP redirect address. | ip
pfsense.icmp.rtime | Receive Timestamp. | date
pfsense.icmp.seq | ICMP sequence number. | long
pfsense.icmp.ttime | Transmit Timestamp. | date
pfsense.icmp.type | ICMP type. | keyword
pfsense.icmp.unreachable.iana_number | Protocol ID number that was unreachable. | long
pfsense.icmp.unreachable.other | Other unreachable information. | keyword
pfsense.icmp.unreachable.port | Port number that was unreachable. | long
pfsense.ip.ecn | Explicit Congestion Notification. | keyword
pfsense.ip.flags | IP flags. | keyword
pfsense.ip.flow_label | Flow label. | keyword
(field name missing) | ID of the packet. | long
pfsense.ip.offset | Fragment offset. | long
pfsense.ip.tos | IP Type of Service identification. | keyword
pfsense.ip.ttl | Time To Live (TTL) of the packet. | long
pfsense.openvpn.peer_info | Information about the Open VPN client. | keyword
pfsense.tcp.ack | TCP Acknowledgment number. | long
pfsense.tcp.flags | TCP flags. | keyword
pfsense.tcp.length | Length of the TCP header and payload. | long
pfsense.tcp.options | TCP Options. | array
pfsense.tcp.seq | TCP sequence number. | long
pfsense.tcp.urg | Urgent pointer data. | keyword
pfsense.tcp.window | Advertised TCP window size. | long
pfsense.udp.length | Length of the UDP header and payload. | long
(field name missing) | Process name. Sometimes called program name or similar. | keyword
(field name missing) | Multi-field of (field name missing). | match_only_text
(field name missing) | Process id. | long
process.program | Process from syslog header. | keyword
related.ip | All of the IPs seen on your event. | ip
related.user | All the user names or other user identifiers seen on the event. | keyword
(field name missing) | A rule ID that is unique within the scope of an agent, observer, or other entity using the rule for detection of this event. | keyword
server.address | Some event server addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is. | keyword
server.bytes | Bytes sent from the server to the client. | long
server.ip | IP address of the server (IPv4 or IPv6). | ip
server.mac | MAC address of the server. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen. | keyword
server.port | Port of the server. | long
source.address | Some event source addresses are defined ambiguously. The event will sometimes list an IP, a domain or a unix socket. You should always store the raw address in the .address field. Then it should be duplicated to .ip or .domain, depending on which one it is. | keyword
(field name missing) | Unique number allocated to the autonomous system. The autonomous system number (ASN) uniquely identifies each network on the Internet. | long
(field name missing) | Organization name. | keyword
(field name missing) | Multi-field of (field name missing). | match_only_text
source.bytes | Bytes sent from the source to the destination. | long
source.domain | The domain name of the source system. This value may be a host name, a fully qualified domain name, or another host naming format. The value may derive from the original event or be added from enrichment. | keyword
source.geo.city_name | City name. | keyword
source.geo.continent_name | Name of the continent. | keyword
source.geo.country_iso_code | Country ISO code. | keyword
source.geo.country_name | Country name. | keyword
source.geo.location | Longitude and latitude. | geo_point
(field name missing) | User-defined description of a location, at the level of granularity they care about. Could be the name of their data centers, the floor number, if this describes a local physical entity, city names. Not typically used in automated geolocation. | keyword
source.geo.region_iso_code | Region ISO code. | keyword
source.geo.region_name | Region name. | keyword
source.ip | IP address of the source (IPv4 or IPv6). | ip
source.mac | MAC address of the source. The notation format from RFC 7042 is suggested: Each octet (that is, 8-bit byte) is represented by two [uppercase] hexadecimal digits giving the value of the octet as an unsigned integer. Successive octets are separated by a hyphen. | keyword
source.nat.ip | Translated IP of source-based NAT sessions (e.g. internal client to internet). Typically connections traversing load balancers, firewalls, or routers. | ip
source.port | Port of the source. | long
source.user.full_name | User's full name, if available. | keyword
source.user.full_name.text | Multi-field of source.user.full_name. | match_only_text
(field name missing) | Unique identifier of the user. | keyword
squid.hierarchy_status | The proxy hierarchy route; the route Content Gateway used to retrieve the object. | keyword
squid.request_status | The cache result code; how the cache responded to the request: HIT, MISS, and so on. Cache result codes are described here. | keyword
tags | List of keywords used to tag each event. | keyword
tls.cipher | String indicating the cipher used during the current connection. | keyword
tls.version | Numeric part of the version parsed from the original string. | keyword
tls.version_protocol | Normalized lowercase protocol name parsed from original string. | keyword
url.domain | Domain of the url, such as "". In some cases a URL may refer to an IP and/or port directly, without a domain name. In this case, the IP address would go to the domain field. If the URL contains a literal IPv6 address enclosed by [ and ] (IETF RFC 2732), the [ and ] characters should also be captured in the domain field. | keyword
url.extension | The field contains the file extension from the original request url, excluding the leading dot. The file extension is only set if it exists, as not every url has a file extension. The leading period must not be included. For example, the value must be "png", not ".png". Note that when the file name has multiple extensions (example.tar.gz), only the last one should be captured ("gz", not "tar.gz"). | keyword
url.full | If full URLs are important to your use case, they should be stored in url.full, whether this field is reconstructed or present in the event source. | wildcard
url.full.text | Multi-field of url.full. | match_only_text
url.original | Unmodified original url as seen in the event source. Note that in network monitoring, the observed URL may be a full URL, whereas in access logs, the URL is often just represented as a path. This field is meant to represent the URL as it was observed, complete or not. | wildcard
url.original.text | Multi-field of url.original. | match_only_text
url.password | Password of the request. | keyword
url.path | Path of the request, such as "/search". | wildcard
url.port | Port of the request, such as 443. | long
url.query | The query field describes the query string of the request, such as "q=elasticsearch". The ? is excluded from the query string. If a URL contains no ?, there is no query field. If there is a ? but no query, the query field exists with an empty string. The exists query can be used to differentiate between the two cases. | keyword
url.scheme | Scheme of the request, such as "https". Note: The : is not part of the scheme. | keyword
url.username | Username of the request. | keyword
user.domain | Name of the directory the user is a member of. For example, an LDAP or Active Directory domain name. | keyword
(field name missing) | User email address. | keyword
user.full_name | User's full name, if available. | keyword
user.full_name.text | Multi-field of user.full_name. | match_only_text
(field name missing) | Unique identifier of the user. | keyword
(field name missing) | Short name or login of the user. | keyword
(field name missing) | Multi-field of (field name missing). | match_only_text
(field name missing) | Name of the device. | keyword
(field name missing) | Name of the user agent. | keyword
user_agent.original | Unparsed user_agent string. | keyword
user_agent.original.text | Multi-field of user_agent.original. | match_only_text
user_agent.os.full | Operating system name, including the version or code name. | keyword
user_agent.os.full.text | Multi-field of user_agent.os.full. | match_only_text
(field name missing) | Operating system name, without the version. | keyword
(field name missing) | Multi-field of (field name missing). | match_only_text
user_agent.os.version | Operating system version as a raw string. | keyword
user_agent.version | Version of the user agent. | keyword
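The log.syslog.priority entry in the table gives the relation priority = 8 * facility + severity, with values from 0 to 191. A minimal Python sketch of splitting a priority back into its two parts might look like this; the function name split_syslog_priority is hypothetical and is not part of Logstash, Filebeat or pfSense.

```python
from typing import Tuple


def split_syslog_priority(priority: int) -> Tuple[int, int]:
    """Return (facility, severity) for a syslog priority value.

    Inverts priority = 8 * facility + severity, so facility is the
    integer quotient by 8 and severity is the remainder.
    """
    if not 0 <= priority <= 191:
        raise ValueError("syslog priority must be between 0 and 191")
    return divmod(priority, 8)


if __name__ == "__main__":
    facility, severity = split_syslog_priority(134)
    print(facility, severity)
```

For example, a priority of 134 splits into facility 16 and severity 6, since 8 * 16 + 6 = 134.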
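The network.direction entry describes four perimeter-based values: "inbound", "outbound", "internal" and "external". The sketch below shows one way those values could be derived from a source and destination address. It assumes the perimeter is defined by a list of internal networks; INTERNAL_NETWORKS and network_direction are hypothetical names with example ranges, not something pfSense or the ELK stack provides.

```python
import ipaddress

# Assumption: example ranges only. Replace with the networks that sit
# behind your own firewall.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address: str) -> bool:
    """True if the address belongs to one of the internal networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in INTERNAL_NETWORKS)


def network_direction(source_ip: str, destination_ip: str) -> str:
    """Classify traffic from the point of view of the network perimeter."""
    source_inside = is_internal(source_ip)
    destination_inside = is_internal(destination_ip)
    if source_inside and destination_inside:
        return "internal"
    if not source_inside and not destination_inside:
        return "external"
    return "outbound" if source_inside else "inbound"


if __name__ == "__main__":
    print(network_direction("192.168.1.10", "8.8.8.8"))
```

A host-based collector would use "ingress" and "egress" instead, so this mapping only fits the firewall's view of the traffic.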
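The server.mac and source.mac entries suggest the RFC 7042 notation: two uppercase hexadecimal digits per octet, with octets separated by hyphens. As a rough illustration, a helper like the following could convert the common colon or dot notations into that form. The name normalize_mac is hypothetical, and the helper assumes a 48-bit address.

```python
import re


def normalize_mac(mac: str) -> str:
    """Return a 48-bit MAC address as uppercase, hyphen-separated octets."""
    # Drop separators such as ":", "-" or "." and keep only the hex digits.
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError("expected 12 hexadecimal digits in %r" % mac)
    digits = digits.upper()
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    print(normalize_mac("00:1b:44:11:3a:b7"))
```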
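The url.extension and url.query entries carry a few rules that are easy to get wrong: no leading dot, only the last extension, and a distinction between a missing query and an empty one. The following sketch illustrates those rules using Python's standard library. The function names are hypothetical, and returning None stands in for "the field is not set".

```python
import posixpath
from typing import Optional
from urllib.parse import urlsplit


def url_extension(url: str) -> Optional[str]:
    """Last file extension of the URL path without the leading dot."""
    file_name = posixpath.basename(urlsplit(url).path)
    extension = posixpath.splitext(file_name)[1]
    # splitext keeps the dot (".gz"); strip it, and treat "" as unset.
    return extension[1:] or None


def url_query(url: str) -> Optional[str]:
    """Query string without the "?", "" for a bare "?", None if absent."""
    without_fragment = url.split("#", 1)[0]
    if "?" not in without_fragment:
        return None
    return urlsplit(url).query


if __name__ == "__main__":
    print(url_extension("https://example.com/dl/example.tar.gz"))
    print(url_query("https://example.com/search?q=elasticsearch"))
```

The example.com URLs are placeholders chosen for illustration.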