LSASS Offline Operation
Likewise Software
July 24, 2008

1 Introduction
2 Caching Requirements
3 Offline State Manager (OM)
  3.1 Header File for Offline State Manager
  3.2 Offline Transitions
4 Operations
  4.1 LsaInitializeProvider
  4.2 LsaShutdownProvider
  4.3 OpenHandle
  4.4 CloseHandle
  4.5 ServicesDomain
  4.6 AuthenticateUser
  4.7 ValidateUser
  4.8 FindUserByName
  4.9 FindUserById
  4.10 BeginEnumUsers
  4.11 EnumUsers
  4.12 EndEnumUsers
  4.13 FindGroupByName
  4.14 FindGroupById
  4.15 GetUserGroupMembership
  4.16 BeginEnumGroups
  4.17 EnumGroups
  4.18 EndEnumGroups
  4.19 ChangePassword
  4.20 AddUser
  4.21 ModifyUser
  4.22 DeleteUser
  4.23 AddGroup
  4.24 DeleteGroup
  4.25 OpenSession
  4.26 CloseSession
  4.27 GetNamesBySidList

1 Introduction

This document describes offline operation in LSASS. The most critical scenario is a user on a disconnected notebook computer being able to log on and work normally. Additional scenarios include cases where the computer’s domain cannot be contacted and cases where the computer’s domain is reachable but a user’s domain is not.

Any call into LSASS that is supposed to work offline should work if we cannot contact the domain at the start of the call. Currently, it is not a goal for an LSASS call to return successfully with cached data if the call starts out able to talk to the domain but hits a connection error partway through.

Note that not all online functionality needs to be present while offline. For example, offline user and group enumeration is a non-goal.

2 Caching Requirements

The current LSASS cache caches the results of operations at the top-most level. It contains user, group, and group membership information. When a user or group is in the cache, its complete Unix and AD information is present. (The group membership information, however, may be partial.)

The caching in LSASS will need the following changes:

- Add caching of the cell information for the computer. This is necessary for LSASS to be able to start up and initialize when it cannot contact the computer’s domain. The information cached needs to include the domain, DN, schema vs. non-schema mode, any linked cells, and any trusted domains.
- Add a one-way hash of the user’s password [1] to the user information stored in the cache. The hash is present only after the user has logged into the machine, so user information in the database may or may not have a password hash. Given the existing cache database design, the hash would be stored in a separate table in the database with a relation to the user entry.
- Make sure that forest information is available somehow via the cache (i.e., what forest a domain is in, or what forest a cache entry came from).
- Entries will not be reaped after expiration so that they can be used while offline as “best available information”. We can implement reaping policies that take into account whether the user has logged in (i.e., has a password hash in the cache), whether and how long to keep such entries, whether and how long to keep “expired” entries without passwords, etc. [Note that the reaper thread currently does nothing.]
- We should not need negative caching for this design, as the offline state of a domain will prevent us from trying to contact it.

[1] See http://technet.microsoft.com/en-us/library/cc512606(TechNet.10).aspx

3 Offline State Manager (OM)

For each domain, we will track whether it is online (potentially reachable) or offline (considered unreachable). A domain is considered offline if we cannot talk to a DC (or GC) for that domain. (Note that the “domain” when talking about the GC here is the forest root domain. This is different from some of the APIs used in LSASS, where we talk to “the GC for a domain”, which is internally translated as the GC for that domain’s forest root.)

There is an “online detection thread” which will periodically attempt to contact a DC for each offline domain. If this thread determines that the domain is reachable, it will mark that domain as no longer offline.

Note that we really only need to track offline domains. Any domain that is not offline is treated as online.

In addition, to help in testing (and perhaps even some customer scenarios), we will be able to set specific domains or all domains to be “forced offline”. When this flag is set, the affected domains will be treated as offline and will not be transitioned online by the “online detection thread”. Note that supporting “force offline” comes at minimal cost.
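
The tracking rules above (only offline domains are listed, forced domains are skipped by detection) can be illustrated with a minimal sketch. Everything named Sketch* or sketch_* is hypothetical and is not part of the LsaOm API defined in this document; the fixed-size table stands in for the real list, the probe is a stub where real code would try to contact a DC, and names are compared with strcmp() for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_DOMAINS 8

/* Hypothetical stand-in for one entry of the offline domain list. */
typedef struct {
    const char *name;   /* domain name, not owned by the table */
    bool forced;        /* forced offline: detection must skip it */
} SketchOfflineEntry;

typedef struct {
    SketchOfflineEntry entries[SKETCH_MAX_DOMAINS];
    size_t count;
} SketchOfflineTable;

/* Probe callback: returns true if a DC for the domain could be reached. */
typedef bool (*SketchProbeFn)(const char *name);

/* Returns the index of the domain in the table, or -1 if it is not listed. */
static int sketch_find(const SketchOfflineTable *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* A domain is offline only if it is listed; everything else is online. */
static bool sketch_is_offline(const SketchOfflineTable *t, const char *name)
{
    return sketch_find(t, name) >= 0;
}

/* Lists a domain as offline. Returns false only if the table is full. */
static bool sketch_mark_offline(SketchOfflineTable *t, const char *name,
                                bool forced)
{
    int i = sketch_find(t, name);
    if (i >= 0) {
        t->entries[i].forced = t->entries[i].forced || forced;
        return true;
    }
    if (t->count == SKETCH_MAX_DOMAINS) {
        return false;
    }
    t->entries[t->count].name = name;
    t->entries[t->count].forced = forced;
    t->count++;
    return true;
}

/* One pass of the detection thread: probe each offline domain that is not
 * forced and drop the reachable ones. Returns how many came back online. */
static size_t sketch_detect_pass(SketchOfflineTable *t, SketchProbeFn probe)
{
    size_t brought_online = 0;
    size_t i = 0;

    assert(probe != NULL);
    while (i < t->count) {
        if (!t->entries[i].forced && probe(t->entries[i].name)) {
            t->entries[i] = t->entries[t->count - 1];  /* swap-remove */
            t->count--;
            brought_online++;
        } else {
            i++;
        }
    }
    return brought_online;
}

/* Stub probe for illustration: pretends every domain is reachable. */
static bool sketch_probe_always_reachable(const char *name)
{
    (void)name;
    return true;
}
```

With one ordinary and one forced entry in the table, a pass using the always-reachable stub removes only the ordinary entry; the forced one stays offline until its flag is cleared.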

3.1 Header File for Offline State Manager

/// @file offline.h
///
/// @author Danilo Almeida (dalmeida@likewisesoftware.com)
///

//////////////////////////////////////////////////////////////////////
///
/// @defgroup lsa_om LSASS Offline State Manager (LsaOm)
///
/// @details These functions control the LSASS offline state.
/// @{

///
/// Keeps track of all offline state.
///
typedef struct _LSA_OM_OFFLINE_STATE {
    /// Whether to force all domains offline.
    BOOLEAN bIsForceOffline;
    /// @brief List of domains that are offline (LSA_OM_DOMAIN_OFFLINE_STATE).
    /// @details If bIsForceOffline is set, this list will be empty except for
    ///          domains that are marked as forced offline.
    LIST_ENTRY DomainOfflineList;
    // Additional internal fields related to threading, etc. go here.
    // pthread_t OnlineDetectionThread
    // pthread_t* pOnlineDetectionThread
    // ...
} LSA_OM_OFFLINE_STATE, *PLSA_OM_OFFLINE_STATE;

///
/// Keeps track of offline state of a single domain.
///
typedef struct _LSA_OM_DOMAIN_OFFLINE_STATE {
    /// Name of domain that is offline
    PSTR pszDomainName;
    /// Whether the domain is forced offline. If the domain is forced
    /// offline, the online detection thread will not attempt to
    /// transition it online.
    BOOLEAN bIsForceOffline;
    /// List entry in the DomainOfflineList of LSA_OM_OFFLINE_STATE.
    LIST_ENTRY DomainOfflineListEntry;
} LSA_OM_DOMAIN_OFFLINE_STATE, *PLSA_OM_DOMAIN_OFFLINE_STATE;

///
/// LSASS offline state.
///
/// @note To get rid of this global variable, we would pass it to the
///       offline manager functions.
///
LSA_OM_OFFLINE_STATE gLsaOmState;

DWORD
LsaOmInitialize(
    );
///<
/// Initialize state for offline manager.
///
/// This includes starting up the online detection thread.
///
/// @return LSA status code.
///  @arg SUCCESS on success
///  @arg !SUCCESS on failure

VOID
LsaOmCleanup(
    );
///<
/// Clean up state for offline manager.
///
/// This includes terminating the online detection thread.
///

DWORD
LsaOmSetForceOfflineState(
    IN OPTIONAL PCSTR pszDomainName,
    IN BOOLEAN bIsSet
    );
///<
/// Set/unset force offline state of a specific domain or of all domains.
///
/// This sets/unsets whether all or a specific domain should be forced
/// offline. This condition is triggered manually. We could also do
/// the global trigger via an external (media sense) event. Setting
/// force offline will transition a domain offline and will prevent it
/// from being transitioned online. Unsetting the bit allows a domain
/// to be subsequently transitioned online. If the force offline
/// setting is already in the desired state, this function succeeds.
///
/// @param[in] pszDomainName - Optional name of domain to set/unset force
///     offline. If NULL, operates on the global force offline state.
///
/// @param[in] bIsSet - Whether to set/unset.
///
/// @return LSA status code.
///  @arg SUCCESS on success
///  @arg !SUCCESS on failure
///

DWORD
LsaOmTransitionOffline(
    IN PCSTR pszDomainName
    );
///<
/// Transition a domain to offline mode.
///
/// This adds the domain to the offline domains list, if it is not
/// already there.
///
/// @param[in] pszDomainName Name of domain to transition.
///
/// @return LSA status code.
///  @arg SUCCESS on success
///  @arg !SUCCESS on failure
///

DWORD
LsaOmTransitionOnline(
    IN PCSTR pszDomainName
    );
///<
/// Transition a domain to online mode.
///
/// This removes a domain from the offline domains list, if it
/// is in the list and not forced offline.
///
/// @param[in] pszDomainName Name of domain to transition.
///
/// @return LSA status code.
///  @arg SUCCESS on success
///  @arg NOT_FOUND if not in list
///  @arg FORCED_OFFLINE if forced offline
///

BOOLEAN
LsaOmIsDomainOffline(
    IN PCSTR pszDomainName
    );
///<
/// @brief Checks whether a domain is in offline mode.
///
/// @param[in] pszDomainName
///     Name of the domain to check.
///
/// @return Whether the domain is offline (forced or otherwise).
///

DWORD
LsaOmDetectTransitionOnline(
    IN OPTIONAL PCSTR pszDomainName
    );
///<
/// Transition a domain online if it is reachable.
///
/// @param[in] pszDomainName
///     Optional name of domain to try to transition online. If NULL,
///     tries to transition all domains.
///
/// @return LSA status code.
///  @arg SUCCESS on success
///  @arg !SUCCESS on failure
///

//////////////////////////////////////////////////////////////////////
///
/// @defgroup lsa_om_external External Diagnostics
///
/// @details These functions are purely for external diagnostics.
/// @{

typedef struct _LSA_OM_OFFLINE_STATE_BUFFER {
    BOOLEAN bIsForceOffline;
    DWORD dwCount;
    PSTR* ppszDomainName;
    PBOOLEAN pDomainIsForceOffline;
} LSA_OM_OFFLINE_STATE_BUFFER, *PLSA_OM_OFFLINE_STATE_BUFFER;

DWORD
LsaOmQueryOfflineState(
    OUT PLSA_OM_OFFLINE_STATE_BUFFER* ppState
    );

VOID
LsaOmFreeOfflineState(
    IN OUT PLSA_OM_OFFLINE_STATE_BUFFER pState
    );

/// @} lsa_om_external
/// @} lsa_om

3.2 Offline Transitions

Offline transitions (calls to LsaOmTransitionOffline()) will happen from the locations where we currently do re-affinitizing [2] and also when we try to do authentication using Kerberos [3]. The idea is that, if we try to re-affinitize but fail again, we transition the corresponding domain to offline.

[2] LsaLdapOpenDirectoryEx(), AD_InitializeOperatingMode(), AD_LookupObjectSidByNameWithReaffinity() (which does the LsaLookupNames2() RPC), AD_AddDomainTrustsInfoForDomain() (which does the DsEnumerateDomainTrusts RPC).
[3] Via LsaSetupUserLoginSession() in AD_UserAuthenticate().
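
The rule in section 3.2 (try, re-affinitize on a connection error, try once more, go offline if that also fails) can be sketched as below. This is only an illustration under assumed names: the Sketch* types, the callback table, and the fake_* stubs are hypothetical, and the transition_offline callback merely stands in for a call to LsaOmTransitionOffline().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    SKETCH_OK = 0,
    SKETCH_CONNECTION_ERROR = 1
} SketchStatus;

/* Hypothetical callbacks for one domain. */
typedef struct {
    SketchStatus (*attempt)(void *ctx);     /* e.g. an LDAP bind or an RPC */
    void (*reaffinitize)(void *ctx);        /* pick another DC */
    void (*transition_offline)(void *ctx);  /* stand-in for LsaOmTransitionOffline() */
    void *ctx;
} SketchDomainOps;

/* Runs the operation with one re-affinitize retry. The domain is marked
 * offline only when the retry also ends in a connection error. */
static SketchStatus sketch_call_with_reaffinity(const SketchDomainOps *ops)
{
    SketchStatus status;

    assert(ops != NULL && ops->attempt != NULL);
    assert(ops->reaffinitize != NULL && ops->transition_offline != NULL);

    status = ops->attempt(ops->ctx);
    if (status != SKETCH_CONNECTION_ERROR) {
        return status;
    }
    ops->reaffinitize(ops->ctx);
    status = ops->attempt(ops->ctx);
    if (status == SKETCH_CONNECTION_ERROR) {
        ops->transition_offline(ops->ctx);
    }
    return status;
}

/* Fake domain used only to exercise the sketch. */
typedef struct {
    int failures_left;  /* number of attempts that will still fail */
    int attempts;       /* number of attempts made so far */
    bool offline;       /* set when the domain is transitioned offline */
} SketchFakeDomain;

static SketchStatus fake_attempt(void *ctx)
{
    SketchFakeDomain *d = ctx;
    d->attempts++;
    if (d->failures_left > 0) {
        d->failures_left--;
        return SKETCH_CONNECTION_ERROR;
    }
    return SKETCH_OK;
}

static void fake_reaffinitize(void *ctx)
{
    (void)ctx;  /* a real implementation would select a different DC */
}

static void fake_transition_offline(void *ctx)
{
    ((SketchFakeDomain *)ctx)->offline = true;
}
```

A domain that fails once and then succeeds stays online after two attempts, while a domain that fails twice is transitioned offline and the connection error is returned to the caller.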

4 Operations

The following table lists the provider operations, whether each one is supported offline, and whether it needs changes:

Function                 Offline Support  Changes Needed
LsaInitializeProvider    Yes              Yes
LsaShutdownProvider      No               Yes
OpenHandle               Yes              No
CloseHandle              Yes              No
ServicesDomain           Yes              No
AuthenticateUser         Yes              Yes
ValidateUser             Yes              No
FindUserByName           Yes              Yes
FindUserById             Yes              Yes
BeginEnumUsers           No               No
EnumUsers                No               Yes
EndEnumUsers             No               No
FindGroupByName          Yes              Yes
FindGroupById            Yes              Yes
GetUserGroupMembership   Yes              Yes
BeginEnumGroups          No               No
EnumGroups               No               Yes
EndEnumGroups            No               No
ChangePassword           No               Yes
AddUser                  N/A              No
ModifyUser               N/A              No
DeleteUser               N/A              No
AddGroup                 N/A              No
DeleteGroup              N/A              No
OpenSession              Yes              Yes
CloseSession             Yes              No
GetNamesBySidList        Yes              Yes

[When discussing Init/Shutdown, discuss impact on background threads: machine password sync and reaper. Will also discuss online transition thread.]

[Will need to add an entrypoint for sending the provider control operations, such as “go all offline” or “go all online”. This can also be useful for things like “flush the cache”.]

4.1 LsaInitializeProvider

This function will initialize the OM, including the online detection thread, by calling LsaOmInitialize(). In addition, it will be modified to cache the LSASS cell information gathered at startup and to be able to initialize using this cached information when the machine’s domain is offline.

4.2 LsaShutdownProvider

This function will be modified to call LsaOmCleanup().

4.3 OpenHandle

This function does not need to be modified as it only sets up authentication context for calls into the provider.

4.4 CloseHandle

This function does not need to be modified.

4.5 ServicesDomain

This function currently just walks the trusted domain list. Since this list will be initialized from the cached information when offline, this function does not need to be modified.

4.6 AuthenticateUser

This function calls FindUserByName() to resolve the user and then LsaSetupUserLoginSession() to log in a user. FindUserByName() will work fine offline.
When calling LsaSetupUserLoginSession(), we will need to detect a connection failure and attempt an offline logon for the user in that case. Note that, if we get a PAC from LsaSetupUserLoginSession(), the function calls AD_CacheGroupMembershipFromPac(), which calls ADLdap_GetUserGroupMembership(). However, if we are offline, we will not have a PAC. [Note that if we go offline after getting the PAC while doing ADLdap_GetUserGroupMembership(), we can still fail. This is OK given our current design goals.]

Finally, any successful user logon will need to cache a one-way hash of the user’s password for offline logons. [Note: Do we want to throw away the cached information if the user fails to log on due to a password error but the password is what we have cached for the user?]

4.7 ValidateUser

This function, in effect, just calls FindUserByName(). Since the latter will be modified to work offline, we do not need to modify ValidateUser().

4.8 FindUserByName

This function currently does the following:

First, it cracks the username. There are two cases:

1. The name is DOMAIN\samAccountName or UPN, so the user’s domain is known.
2. The name is an alias, so the user’s domain is not known.

If the user’s domain is not known, the function searches GCs for the alias to find the user’s domain and a samAccountName that is resolvable via LsaLookupNames2().

Next, we get the SID from the samAccountName or UPN. Then, the user’s domain is used to find the client domain’s trust relationship to the user domain (along with the FQDN of the user’s domain). Finally, the SID is used to get the full user information (AD and Unix). In a one-way trust (which must be cell mode), we look up the AD and Unix information in the client domain (which is where the cell and any linked cells reside). In a two-way trust, we look up the information in the cell, local forest GC, or another GC as needed.
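
The name-cracking step at the start of this flow can be illustrated with a short sketch. It assumes, purely for illustration, that a backslash marks the DOMAIN\samAccountName form and an at sign marks a UPN; the enum and function names are hypothetical and are not taken from the LSASS sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a login name. */
typedef enum {
    SKETCH_NAME_NT4,    /* DOMAIN\samAccountName */
    SKETCH_NAME_UPN,    /* user@domain */
    SKETCH_NAME_ALIAS   /* bare alias: domain not known from the name */
} SketchNameType;

/* Classifies a name by its separator (assumed convention, see lead-in). */
static SketchNameType sketch_classify_name(const char *name)
{
    assert(name != NULL);
    if (strchr(name, '\\') != NULL) {
        return SKETCH_NAME_NT4;
    }
    if (strchr(name, '@') != NULL) {
        return SKETCH_NAME_UPN;
    }
    return SKETCH_NAME_ALIAS;
}

/* Only an alias leaves the user's domain unknown after cracking. */
static bool sketch_domain_is_known(SketchNameType type)
{
    return type != SKETCH_NAME_ALIAS;
}
```

Only the alias case requires the extra search to discover the user’s domain, which is why it is the case that matters most when deciding which domains must be reachable.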

For offline functionality:

If the client domain is offline, we return a cached entry, if one exists (even if it is expired). We do this because we will not be able to look up any entry’s SID or resolve aliases.

Next, we need to determine the domain for the user in order to see whether that domain is offline. If the name is in the cache, we use the domain information from the cache to determine whether the domain is offline. If the domain is offline, we return the cached (and potentially expired) entry. Otherwise, we need to continue resolving the name.

If the cracked name is not an alias, we know the domain for the user and can determine the offline state of the domain/forest.

If the cracked name is an alias, we need to resolve its domain. In order to do that, we need to search the relevant cell(s) or domain(s). Insofar as we need to search other domains, we check the domain/forest’s offline state before issuing a query to that domain. If the domain is offline, we fail right away.

Note that, at this point, we have already checked the client domain and the user’s domain and forest for offline state. So we can continue resolving the trusts and then the SID as usual. Finally, we look up the user information as usual.

4.9 FindUserById

This function is the same as FindUserByName() except that it converts from id to samAccountName first. In converting the id, the idea is to search for the id in the relevant domain(s).

For offline operation, this function needs to look in the cache first. If the user’s domain/forest is offline, return the cached data. If the client domain is offline, also return the cached data. Otherwise, convert the id to a samAccountName as usual. In converting the id to a samAccountName, we need to check whether the domain/forest with which we need to communicate is offline. If so, we fail the operation.

4.10 BeginEnumUsers

This function just allocates and initializes an enumeration context. No changes are needed.
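
The offline decision described for FindUserByName() in section 4.8 (and reused by FindUserById() in 4.9) can be summarized as a small decision function. This is a sketch under stated assumptions: the type and function names are hypothetical, and failing when a needed domain is offline and no cached entry exists is an inference from the text rather than something the text states outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the offline checks for one lookup. */
typedef enum {
    SKETCH_USE_CACHE,       /* return the cached, possibly expired, entry */
    SKETCH_FAIL_OFFLINE,    /* needed domain is offline and nothing is cached */
    SKETCH_RESOLVE_ONLINE   /* continue with the normal online resolution */
} SketchLookupAction;

typedef struct {
    bool client_domain_offline;
    bool have_cache_entry;     /* entry exists in the cache, expired or not */
    bool user_domain_known;    /* from the cache entry or a non-alias name */
    bool user_domain_offline;  /* meaningful only if user_domain_known */
} SketchLookupInput;

static SketchLookupAction sketch_decide_lookup(const SketchLookupInput *in)
{
    assert(in != NULL);

    /* Client domain offline: SIDs and aliases cannot be resolved at all. */
    if (in->client_domain_offline) {
        return in->have_cache_entry ? SKETCH_USE_CACHE : SKETCH_FAIL_OFFLINE;
    }
    /* User's domain offline: fall back to the cache, or fail right away. */
    if (in->user_domain_known && in->user_domain_offline) {
        return in->have_cache_entry ? SKETCH_USE_CACHE : SKETCH_FAIL_OFFLINE;
    }
    /* Otherwise keep resolving; domains searched for an alias are checked
     * for offline state individually before each query (not shown here). */
    return SKETCH_RESOLVE_ONLINE;
}
```

The function covers only the offline checks; the existing cache logic for fresh entries and the per-domain checks made while searching for an alias would sit around it.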

4.11 EnumUsers

Since user enumeration only works in the current domain, we just need to check for offline state of the current domain to fail fast in the offline case.

4.12 EndEnumUsers

This function just frees an enumeration context. No changes are needed.

4.13 FindGroupByName

This works almost exactly like FindUserByName(). The offline modifications are equivalent. However, after finding the group, group expansion is done by calling AD_GetExpandedGroupUsers(), which calls AD_GetGroupMembers() to get membership information for a group. The latter handles caching of group membership information, going to AD if the group membership is not cached. When going to AD, it basically converts SIDs to names and grabs info (including membership) for each named group. If the domain is offline, we should just use cached results.

4.14 FindGroupById

This works just like FindUserById(). The offline modifications are equivalent.

4.15 GetUserGroupMembership

This function first calls GetUserById(). Then it gets the groups of which the user is a member. Then it does group expansion (see FindGroupByName()).

For offline operation, first check whether the user’s domain is offline. If so, we stop there and return the cached results. When expanding groups, use the same logic as FindGroupByName().

4.16 BeginEnumGroups

This function just allocates and initializes an enumeration context. No changes are needed.

4.17 EnumGroups

This works just like EnumUsers(). The offline modifications are equivalent.

4.18 EndEnumGroups

This function just frees an enumeration context. No changes are needed.

4.19 ChangePassword

This function calls AD_FindCachedUserByName(), which is basically a copy of FindUserByName() minus a data conversion step, and then, using the user’s domain, calls AD_NetUserChangePassword() to actually change the password.

We will need to combine AD_FindCachedUserByName() and FindUserByName(). Then, if the domain of the user we found is offline, we will fail the password change without calling AD_NetUserChangePassword().
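
The fail-fast rule for ChangePassword can be sketched as follows. The names are hypothetical: the callback only stands in for the AD_NetUserChangePassword() call, whose real signature is not given in this document, and the counting stub exists solely to show that the network call is skipped when the user’s domain is offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    SKETCH_CHPW_OK,
    SKETCH_CHPW_DOMAIN_OFFLINE,
    SKETCH_CHPW_FAILED
} SketchChpwStatus;

/* Hypothetical stand-in for the network password change; true on success. */
typedef bool (*SketchNetChangeFn)(const char *domain, const char *user,
                                  void *ctx);

/* Fails immediately when the user's domain is offline; otherwise forwards
 * the request to the network call. */
static SketchChpwStatus sketch_change_password(const char *domain,
                                               const char *user,
                                               bool domain_offline,
                                               SketchNetChangeFn net_change,
                                               void *ctx)
{
    assert(domain != NULL && user != NULL && net_change != NULL);

    if (domain_offline) {
        return SKETCH_CHPW_DOMAIN_OFFLINE;  /* never reaches the network */
    }
    return net_change(domain, user, ctx) ? SKETCH_CHPW_OK : SKETCH_CHPW_FAILED;
}

/* Counting stub: records how often the network call was made. */
static bool sketch_counting_change(const char *domain, const char *user,
                                   void *ctx)
{
    (void)domain;
    (void)user;
    ++*(int *)ctx;
    return true;
}
```

The same shape applies to EnumUsers() and EnumGroups(): check the offline state of the one domain involved and return an error before any network work is attempted.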

4.20 AddUser

This function returns an error indicating that it is not supported. No change is necessary for offline operation.

4.21 ModifyUser

This function returns an error indicating that it is not supported. No change is necessary for offline operation.

4.22 DeleteUser

This function returns an error indicating that it is not supported. No change is necessary for offline operation.

4.23 AddGroup

This function returns an error indicating that it is not supported. No change is necessary for offline operation.

4.24 DeleteGroup

This function returns an error indicating that it is not supported. No change is necessary for offline operation.

4.25 OpenSession

This function calls FindUserByName() to get user information that it uses to set up the user’s home directory and .k5login file, and to call LsaKrb5GetServiceTicketForUser(). Since FindUserByName() will be modified to work offline, all that needs to be handled here in the offline case is a failure in LsaKrb5GetServiceTicketForUser(). If that returns a connection error, this function will simply ignore the error, logging an appropriate message.

4.26 CloseSession

This function just calls FindUserByName() and therefore does not need to be modified.

4.27 GetNamesBySidList

This function effectively searches for the object with the specified SID using LDAP. Once the object name and type are found, this function calls the equivalent of FindUserByName() or FindGroupByName() (except that it uses the re-implementations AD_FindCachedUserByName() or AD_FindCachedGroupByNT4Name()) to get the full object information. Then the full object information is discarded and just the name and type information is returned. Note that this function only queries the client’s domain.

Why don’t we just use LsaLookupSids instead of sequential LDAP? Also, as far as caching goes, we could have an additional SID-to-name cache that is also updated whenever cached user information is updated. Caching wraps the entire inner workings of this function.

[Note that the function does extra work fetching full object information that it does not need. That is probably a result of how caching works, i.e., the cache stores entire objects. However, this may cause performance trouble as compared to winbind.]

For offline operation, we would check the state of the relevant domain(s), which, in the current implementation, is just the client domain. If the client domain is offline, we would just return the data from the cache.

Need to test the LsaLookupNames2()/LsaLookupSids() RPCs when the domain in question is offline. Hopefully the client domain does the right thing.
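
The offline path for GetNamesBySidList, together with the SID-to-name cache suggested above, can be sketched as a cache-only lookup. The structure, the function names, and the flat array used as the cache are all hypothetical; the real cache is a database, and the sketch only shows that misses are reported rather than sent to LDAP while the client domain is offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical SID-to-name cache entry: only name and type are kept. */
typedef struct {
    const char *sid;
    const char *name;
    bool is_group;
} SketchSidEntry;

/* Looks up one SID in the cache; returns NULL when it is not cached. */
static const SketchSidEntry *sketch_sid_cache_find(const SketchSidEntry *cache,
                                                   size_t cache_count,
                                                   const char *sid)
{
    for (size_t i = 0; i < cache_count; i++) {
        if (strcmp(cache[i].sid, sid) == 0) {
            return &cache[i];
        }
    }
    return NULL;
}

/* Resolves a SID list from the cache only, as when the client domain is
 * offline. out[i] is set to the entry for sids[i], or NULL on a miss.
 * Returns the number of SIDs that were resolved. */
static size_t sketch_names_by_sid_offline(const SketchSidEntry *cache,
                                          size_t cache_count,
                                          const char *const *sids,
                                          size_t sid_count,
                                          const SketchSidEntry **out)
{
    size_t resolved = 0;

    assert(sids != NULL && out != NULL);
    assert(cache != NULL || cache_count == 0);
    for (size_t i = 0; i < sid_count; i++) {
        out[i] = sketch_sid_cache_find(cache, cache_count, sids[i]);
        if (out[i] != NULL) {
            resolved++;
        }
    }
    return resolved;
}
```

Returning per-SID misses instead of failing the whole call is a design choice made for this sketch; the document does not say how partial results should be reported.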