Understanding Apache 2.2 Configuration
Brad Nicholes
Senior Software Engineer, Novell Inc.
Member, Apache Software Foundation
bnicholes@novell.com
(with revisions)

A PAtCHy server: developed by the Apache Group, formed 2/95 around a number of people who provided patch files for NCSA httpd 1.3 by Rob McCool.
History: http://www.apache.org/ABOUT_APACHE.html
First official public release (0.6.2) in April 1995
Add adaptive pre-fork child processes (very important!)
Modular structure and API for extensibility (Bob Thau)
Port to multiple platforms. Add documentation.
Apache 1.0 was released on 12/1/95.
Passed NCSA httpd to become the #1 server on the Internet.

Apache is the current market share leader in web servers.
You can download it from www.apache.org
See survey statistics at http://news.netcraft.com/archives/web_server_survey.html

Shipping:
    Apache 1.3.37 - Maintenance mode, no new development
    Apache 2.0.59 - Maintenance mode, no new development
    Apache 2.2.9 - Current release
Development:
    Apache 2.3.x-dev - Unstable, all new development happens here first

Download httpd-2.2.0.tar.bz2 from http://www.apache.org/dist or a closer mirror site
    $ tar xjf httpd-2.2.0.tar.bz2
    $ cd httpd-2.2.0
    $ ./configure --prefix=PREFIX
    $ make
    $ make install
    $ PREFIX/bin/apachectl start
Here PREFIX is the directory under which the distribution is installed; typically it is /usr/local/apache.
To build Apache with specific features, pass the corresponding options to the configure command. You can list the available features with “./configure --help”.
Here is a command used to compile the httpd with proxy and cache modules needed.

File Locations
Modules - /usr/lib/apache2
Configuration - /etc/apache2
Logs - /var/log/apache2
Cgi-bin - /srv/www/cgi-bin
DocumentRoot - /srv/www/htdocs
Binary - /usr/sbin/httpd2 (symlink to actual binary)
    /usr/sbin/httpd2-worker
    /usr/sbin/httpd2-prefork
Other support binaries - /usr/sbin
Startup script - /usr/sbin/rcapache2
    Symlink to /etc/init.d/apache2

Multi-Processing Modules (MPMs) accommodate a wide variety of operating environments on different platforms
Responsible for:
    Binding to network ports
    Accepting requests
    Dispatching worker threads to handle requests
Allow customization for particular sites
    Scalability in a threaded environment - Worker MPM
    Compatibility with older modules - Prefork MPM
    Platform custom - NetWare MPM, WinNT MPM

The Worker MPM combines the multi-process and multi-threaded models
    Variable number of processes
    Fixed number of threads per process
    Each child process handles many concurrent connections
Stability of multiple processes
Performance of multiple threads
Reduces the memory footprint

Worker MPM - Multi-Processing Module implementing a hybrid multi-threaded / multi-process web server
StartServers - Number of child server processes created at startup
MinSpareThreads - Minimum number of idle threads allowed before additional worker threads are created
MaxSpareThreads - Maximum number of idle threads allowed before excess worker threads are destroyed
MaxClients - Maximum number of worker threads allowed
MaxMemFree - Maximum amount of memory that the main allocator is allowed to hold without calling free()
ThreadsPerChild - Number of threads created by each child process
http://httpd.apache.org/docs/2.2/mod/worker.html

The Prefork MPM is stable but slower (based on documentation)
    One parent (master server), many children (workers)
    Each child server is a process itself
    Each child handles one connection at a time
    Uses more memory
    Similar to the NetWare MPM but using processes instead of threads

Prefork MPM - Implements a non-threaded, preforking web server
StartServers - Number of child server
processes created at startup
MinSpareServers - Minimum number of idle child server processes
MaxSpareServers - Maximum number of idle child server processes
MaxClients - Maximum number of child processes that will be created to serve requests
MaxMemFree - Maximum amount of memory that the main allocator is allowed to hold without calling free()
http://httpd.apache.org/docs/2.2/mod/prefork.html

Reading the Documentation
Online: http://httpd.apache.org/docs/2.2/
Also installed with every instance of Apache
Most directives consist of a name and a single value
Some directives may have multiple, optional or boolean values

The default httpd.conf file contains a very good explanation of each directive that is used and why
The directives are not ordered
The configuration file contains one directive per line, but “\” may be used at the end of a line to indicate that the directive continues on the next line
Configuration directives are case-insensitive, but some arguments may be case-sensitive
Lines that begin with “#” are considered to be comments
<IfDefine> can be used to block out sections of the configuration file that are only used if a specific parameter has been defined on the httpd command line (with -D)

Container directives limit the application of other directives.
They are specified as a group, like a tag section in HTML:
    <VirtualHost host[:port]> ... </VirtualHost>
<VirtualHost ...>, <Directory dir>, <Files file>, <Location URL> are in ascending order of authority. <Location> can override the others.
dir, file and URL can be specified using wildcards, or using full regular expressions preceded by “~”

KeepAlive [on|off](on): keep the connection alive for n requests before terminating it, provided they come in before the timeout. n is defined by the MaxKeepAliveRequests <n>(100) directive
KeepAliveTimeout <n>(15): wait n seconds for the next request before terminating the connection.
Timeout <n>(300): maximum time in seconds to wait for a block of data.
HostNameLookups [on|off|double](off): do a reverse DNS lookup so that the domain name of the requesting client can be logged.
MaxClients <n>(256): the limit on the number of simultaneous requests (hence the number of child processes).
MaxRequestsPerChild <n>(0): a child server dies after <n> requests, which guards against memory leaks. 0 means unlimited requests.
Min/MaxSpareServers <n>(5/10): number of idle child servers
StartServers <n>(5): sets the number of child server processes created on startup.

ServerRoot - Base directory for the server installation
    All relative paths are derived from the ServerRoot
    If you have multiple installations of the web server, make sure that each ServerRoot points to the respective install location
PidFile - File where the server records the process ID of the daemon
    If an error message occurs when starting Apache on Linux indicating that httpd is already running, it may be that an old httpd.pid file was orphaned after an abnormal shutdown (e.g. kill -9)

Timeout - Amount of time the server will wait for send or receive events before failing a request (default 300 seconds, or 5 minutes)
    If Apache appears to hang while shutting down on NetWare, it may be that a worker thread is waiting for data from the client. After the timeout period has expired, Apache will shut down normally.
KeepAlive - Enable persistent connections (i.e. avoids having to reconnect with the same client on subsequent requests)
    If the connection is not properly terminated by the client, the connection will be held for the duration of the KeepAliveTimeout value.
    This could cause unnecessary latency when responding to new requests on a busy server

Listen - Binds Apache to a specific IP address and/or port
    If only a port is specified, Apache will listen on that port on all IP addresses assigned to the box
LoadModule - Loads an external Apache module
    <IfModule> - Should surround module-specific directives to prevent an invalid configuration if a module has not been loaded
UseCanonicalName - Determines how Apache constructs self-referencing URLs (e.g. redirects)
ServerName - Used to construct a self-referencing URL when UseCanonicalName is set to On. Otherwise Apache uses the host name supplied by the client

DocumentRoot - Default location from which all documents are served
    If an alias for a URI is not found, Apache will attempt to serve the page from the DocumentRoot
Options - Configures the features that are available in a specific directory
    Indexes - Allows a directory listing
        AddIcon - Specifies the location and file name of the icon that should be displayed for a given file type
    MultiViews - Allows language negotiation
    ExecCGI - Allows CGI binaries or scripts to be executed
    Includes - Enables server-side includes or parsed HTML

Order/Allow/Deny - Specify access control restrictions
    The Order directive determines whether Apache should be inclusive or exclusive when applying access control
    Both Allow and Deny can be used to restrict access based on full or partial IP addresses, network masks or environment variables
DirectoryIndex - Specifies the default file name(s) to serve when no page is specified in the request
    The file index.html.var can be used to specify additional language negotiation rules rather than an actual web page

CustomLog - Defines the location and format of a custom log file
    When used with the LogFormat directive, the contents of the log file as well as the format can be specified
    Multiple log files can be defined containing different information or layouts (Warning: specifying additional log
    files may hurt performance)
Alias - Associates a URI prefix with a physical directory location
    <Directory>/<Location>/<Files> - Should accompany the Alias directive to indicate how files are accessed from the aliased location

ErrorDocument - Defines a custom or user-friendly response to an HTTP error
    The response can be plain text, a local redirect or an external redirect
    If the response is a redirect, the language can be negotiated so that it is appropriate for the request
BrowserMatch - Customizes the request handling for particular browsers
    Can be used to force an HTTP 1.0 response rather than 1.1, or to turn off keepalive connections for older browsers

Functional blocks of directives can be put into a separate configuration file
Use the “Include” directive to instruct Apache to read additional configuration files
If the “Include” directive specifies a directory, all files within the directory will be read as additional configuration files
Wildcards can be used to specify a certain set of additional configuration files (Include conf/*.conf)

Apache supports two types of virtual hosts
Name-based virtual host
    Selects a virtual host configuration based on the domain name of the request
    Allows more than one virtual host per IP address
IP-based virtual host
    Selects a virtual host configuration based on the IP address of the request
    Each IP address belongs to a specific virtual host
Each virtual host can be configured independently
    ServerName, DocumentRoot, Aliases, log files, etc.

There are a few ways we can host a web site: name-based virtual hosting and IP-based virtual hosting.
Name-based virtual hosting:
    A set of hostnames shares the same IP address (similar to an alias)
    Uses the Host: header in the HTTP request (the browser fills in the hostname) to distinguish the different web sites
    Each hostname has its own site configuration and document root.
    Requires either that the set of hostnames are registered DNS names, or that the client machines map the hostnames to the IP address in a hosts file such as /etc/hosts (Unix) or C:\WINDOWS\system32\drivers\etc\hosts (Windows)
IP-based virtual hosting:
    Requires a unique IP address for each virtual hosting site
    Use IP aliases to configure the same Network Interface Card (NIC) to listen on different IP addresses, e.g., ifconfig eth0:1 128.198.160.33
    Some Unix systems set a limit on how many IP aliases can be supported.
    Use <VirtualHost hostname[:port]> block directives
    Specify ServerAdmin, DocumentRoot, ServerName, ErrorLog, TransferLog for each individual virtual host

With a virtual machine (VMWare/VPC), we can configure a virtual machine for each web site.
    This gives each site total control of the OS of its virtual machine.
    We can gracefully shut down or restart an individual web site (for maintenance, configuration or software updates). This cannot be done with name-based or IP-based virtual hosting.
    We can configure different software packages and a different OS for each individual web site.
    Allows total control for the admin of the web site (root privilege, user creation, etc.)
    Disadvantage: requires more resources (CPU, memory, disk).
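As a rough illustration of the IP-based approach described above, the sketch below binds two sites to two addresses on the same machine. The first address is the one from the ifconfig example; the second address, the host names and all paths are placeholders assumed for illustration only, not values from the original slides.

```apache
# Hypothetical IP-based virtual hosts (Apache 2.2 syntax).
# 128.198.160.33 comes from the ifconfig example; 128.198.160.34,
# the example.com names and the /www/... paths are assumed placeholders.
Listen 80

<VirtualHost 128.198.160.33:80>
    ServerAdmin webmaster@site-one.example.com
    ServerName site-one.example.com
    DocumentRoot /www/site-one
    ErrorLog /www/logs/site-one-error_log
    TransferLog /www/logs/site-one-access_log
</VirtualHost>

<VirtualHost 128.198.160.34:80>
    ServerAdmin webmaster@site-two.example.com
    ServerName site-two.example.com
    DocumentRoot /www/site-two
    ErrorLog /www/logs/site-two-error_log
    TransferLog /www/logs/site-two-access_log
</VirtualHost>
```

Using literal addresses in the <VirtualHost> line, rather than host names, means the server does not depend on a DNS lookup at startup to work out which address each block belongs to.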

    NameVirtualHost *:80

    <VirtualHost *:80>
        ServerName www.domain.com
        ServerAlias domain.com *.domain.com
        DocumentRoot /www/domain
    </VirtualHost>

    <VirtualHost _default_>
        ServerName www.otherdomain.com
        DocumentRoot /www/otherdomain
    </VirtualHost>

• NameVirtualHost specifies the IP address that will be shared
• The ServerAlias directive allows access to a specific virtual host by different domain names
• Apache uses the ServerName directive to decide which virtual host configuration applies, based upon the Host: header of the request

    <VirtualHost www.smallco.com>
        ServerAdmin webmaster@mail.smallco.com
        DocumentRoot /groups/smallco/www
        ServerName www.smallco.com
        ErrorLog /groups/smallco/logs/error_log
        CustomLog /groups/smallco/logs/access_log combined
    </VirtualHost>

    <VirtualHost www.baygroup.org>
        ServerAdmin webmaster@mail.baygroup.org
        DocumentRoot /groups/baygroup/www
        ServerName www.baygroup.org
        ErrorLog /groups/baygroup/logs/error_log
        CustomLog /groups/baygroup/logs/access_log combined
    </VirtualHost>

• Apache determines which virtual host to use based on the IP address resolved from the host name
• Almost any configuration directive can be put in a virtual host block, with the exception of some of the process creation directives

A single instance of the Apache Web server can be used to serve page content in multiple languages
Language negotiation does not depend on the server's installed language
The <Directory> or <Location> block must contain one of the following:
    “Options MultiViews” to enable language file matching
    “AddHandler type-map var” to specify a type-map file that contains language definitions
Each HTML file encoded for a different language must append the corresponding language extension
    Example: index.html.en - English, index.html.fr - French

The following directives are used by the language negotiation functionality:
- AddLanguage
- LanguagePriority
- AddDefaultCharset
- DefaultLanguage
- ForceLanguagePriority
- AddCharset
Each browser request contains an “Accept-Language” header that indicates the language(s) that the client will accept
The languages are usually specified by either 2 or 4 character keys (en, en-us, fr, de, es, ...)

Apache matches the “Accept-Language” key to a file extension through the “AddLanguage” directives in the httpd.conf file
MultiViews enabled negotiation
    Apache first searches for an exact match of the specified file
    Apache next searches for the specified file with the 2 or 4 character language extension appended
Type-map enabled negotiation
    Apache searches for the specified file with the type-map extension (usually .var)
    Apache reads the .var file and selects the file name that is associated with the appropriate language
If a language file is not found, Apache will fall back to the LanguagePriority and ForceLanguagePriority directives to determine how to handle the request
More info: http://httpd.apache.org/docs/2.2/content-negotiation.html

Directives enclosed in a <Directory> block apply to the specified file system directory and its sub-directories
Directives enclosed in a <Location> block apply to the specified web space container
<Location /private> would apply to any URL-path that begins with “/private”
    http://your.domain.com/private
    http://your.domain.com/private123
    http://your.domain.com/private/mydocs/index.html
<Location> can also apply directives to locations that don't physically exist, such as a module handler
    <Location /server-status>
        SetHandler server-status
    </Location>

Default SSL port for an HTTP server is 443
All SSL requests and responses are handled through the mod_ssl module (NetWare handles SSL natively)
SSL configuration is done by creating a virtual host that listens on the designated SSL port
An example SSL configuration is found in conf/extra/httpd-ssl.conf of the Apache HTTPD distribution
Additional documentation can be found at:
    http://httpd.apache.org/docs/2.2/ssl
    http://httpd.apache.org/docs/2.2/mod/mod_ssl.html

Terms / Authentication Elements:
    Authentication Type - Type of encryption used during transport of the authentication credentials (Basic or Digest)
    Authentication Method/Provider - Process by which a user is verified to be who they say they are
    Authorization - Process by which authenticated users are granted or denied access based on specific criteria
Previous to Apache 2.2, every authentication module had to implement all three elements
    Choosing an AuthType limited which authentication and authorization methods could be used
    Potential for inconsistencies across authentication modules
Note: Pay close attention to the words Authentication vs. Authorization

The functionality of each Apache 2.0 authentication module has been split out into the three authentication elements for Apache 2.2
Overlapping functionality among the modules was simply eliminated in favor of a base implementation
The module name indicates which element of the authentication functionality it performs
    Mod_auth_xxx - Implements an Authentication Type
    Mod_authn_xxx - Implements an Authentication Method or Provider
    Mod_authz_xxx - Implements an Authorization Method

New Modules - Authentication Type
Mod_Auth_Basic - Basic authentication: user credentials are received by the server as unencrypted data
    Directives: AuthBasicAuthoritative, AuthBasicProvider
Mod_Auth_Digest - MD5 Digest authentication: user credentials are received by the server in encrypted format
    Directives: AuthDigestAlgorithm, AuthDigestDomain, AuthDigestNcCheck, AuthDigestNonceFormat, AuthDigestNonceLifetime, AuthDigestProvider, AuthDigestQop, AuthDigestShmemSize

New Modules - Authentication Providers
Mod_Authn_Anon - Allows “anonymous” user access to authenticated areas
    Directives: Anonymous, Anonymous_LogEmail, Anonymous_MustGiveEmail, Anonymous_NoUserID, Anonymous_VerifyEmail
Mod_Authn_DBM - DBM file based user authentication
    Directives: AuthDBMType, AuthDBMUserFile
Mod_Authn_Default - Authentication fallback module
    Directives: AuthDefaultAuthoritative

New Modules - Authentication Providers
Mod_Authn_File - File based user authentication
    Directives: AuthUserFile
Mod_Authnz_LDAP - LDAP directory based authentication
    Directives: AuthLDAPBindDN, AuthLDAPBindPassword, AuthLDAPCharsetConfig, AuthLDAPDereferenceAliases, AuthLDAPRemoteUserIsDN, AuthLDAPUrl

New Modules - Authorization
Mod_Authnz_LDAP - LDAP directory based authorization
    Directives: Require ldap-user, Require ldap-group, Require ldap-dn, Require ldap-attribute, Require ldap-filter, AuthLDAPCompareDNOnServer, AuthLDAPGroupAttribute, AuthLDAPGroupAttributeIsDN, AuthzLDAPAuthoritative
Mod_Authz_Default - Authorization fallback module
    Directives: AuthzDefaultAuthoritative

New Modules - Authorization
Mod_Authz_DBM - DBM file based group authorization
    Directives: Require file-group*, Require group, AuthDBMGroupFile, AuthzDBMAuthoritative, AuthzDBMType
Mod_Authz_GroupFile - File based group authorization
    Directives: Require file-group*, Require group, AuthGroupFile, AuthzGroupFileAuthoritative
Mod_Authz_Host - Group authorization based on host (name or IP address)
    Directives: Allow, Deny, Order

New Modules - Authorization
Mod_Authz_Owner - Authorization based on file ownership
    Directives: Require file-owner, AuthzOwnerAuthoritative
Mod_Authz_User - User authorization
    Directives: Require valid-user, Require user, AuthzUserAuthoritative

New Directives
    AuthBasicProvider On|Off|provider-name [provider-name]…
    AuthDigestProvider On|Off|provider-name [provider-name]…
Renamed Directives
    AuthzXXXAuthoritative On|Off
    AuthBasicAuthoritative On|Off
Multiple modules must be loaded (auth, authn, authz) rather than a single mod_auth_xxx module

The Require directive
Apache 2.0:
    Require Valid-User
    Require User user-id [user-id] …
    Require Group group-name [group-name] …
Apache 2.2:
    Same as Apache 2.0
    LDAP - ldap-user, ldap-group, ldap-dn, ldap-filter, ldap-attribute
    GroupFile - file-group*
    DBM - file-group*
    Owner -
file-owner
Since multiple authorization methods can be used, in most cases the type names should be unique

    LoadModule auth_basic_module modules/mod_auth_basic.so
    LoadModule authn_file_module modules/mod_authn_file.so
    LoadModule authz_user_module modules/mod_authz_user.so
    LoadModule authz_host_module modules/mod_authz_host.so

    <Directory /www/docs>
        Order deny,allow
        Allow from all
        AuthType Basic
        AuthName Authentication_Test
        AuthBasicProvider file
        AuthUserFile /www/users/users.dat
        require valid-user
    </Directory>

The authentication provider is file based and the authorization method is any valid-user

    LoadModule auth_basic_module modules/mod_auth_basic.so
    LoadModule authn_file_module modules/mod_authn_file.so
    LoadModule authz_user_module modules/mod_authz_user.so
    LoadModule authz_host_module modules/mod_authz_host.so
    LoadModule authnz_ldap_module modules/mod_authnz_ldap.so
    LoadModule ldap_module modules/mod_ldap.so

    <Directory /www/docs>
        Order deny,allow
        Allow from all
        AuthType Basic
        AuthName Authentication_Test
        AuthBasicProvider file ldap
        AuthUserFile /www/users/users.dat
        AuthLDAPURL ldap://ldap.server.com/o=my-context
        AuthzLDAPAuthoritative off
        require valid-user
    </Directory>

The authentication includes both file and LDAP providers, with the file provider taking precedence, followed by LDAP

Moving from hook-based to provider-based authorization
“AND/OR/NOT” logic in authorization
Host Access Control as an authorization type
    Require IP …, Require Host …, Require Env …
    Require All Granted, Require All Denied
“Order Allow/Deny”, “Satisfy”: where did they go?
    For backward compatibility with the 2.0/2.2 Host Access Control, use the Mod_Access_Compat module

Allows authorization to be granted or denied based on a complex set of “Require…” statements
New Directives
    <SatisfyAll> … </SatisfyAll> - Must satisfy all of the encapsulated statements
    <SatisfyOne> … </SatisfyOne> - Must satisfy at least one of the encapsulated statements
    <RequireAlias> … </RequireAlias> - Defines a ‘Require’ alias
    Reject - Reject all matching elements
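
A minimal sketch of how the container directives listed above might be nested. It assumes the directive names exactly as given on this slide, which appear to describe the development branch that follows 2.2 and not 2.2 itself; the address ranges are placeholders. Released versions after 2.2 spell the containers <RequireAll> and <RequireAny>, so the spelling below should be read as specific to the snapshot being described.

```apache
# Sketch only: container names as given on the slide, not Apache 2.2 syntax.
# The address ranges are hypothetical placeholders.
<Directory /www/docs>
    AuthType Basic
    AuthName Authentication_Test
    AuthBasicProvider file
    AuthUserFile /www/users/users.dat

    # "AND": every enclosed statement must be satisfied
    <SatisfyAll>
        Require valid-user

        # "OR": at least one enclosed statement must be satisfied
        <SatisfyOne>
            Require ip 10.0.0.0/8
            Require ip 192.168.1.0/24
        </SatisfyOne>
    </SatisfyAll>
</Directory>
```

Read from the outside in, this grants access only to an authenticated user who also connects from one of the two address ranges, which is the kind of combination that “Order”, “Allow”, “Deny” and “Satisfy” could only express indirectly.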