Inside SharePoint
Building Your SharePoint Infrastructure

Just as e-mail messaging transformed business communication, Web-based collaboration is changing how people work together and share information. As a good example, take a look at what SharePoint technology offers. With Microsoft Windows SharePoint Services (WSS) 3.0 and Microsoft Office SharePoint Server (MOSS) 2007, you can create team sites, portals, Web content management solutions, document libraries, and search centers, not to mention the 2007 Microsoft® Office system integration, XML-based forms, workflows, mobility support, and so on.

It's not always simple to get started with SharePoint®, however. The terminology can be confusing. The system architecture can be complex, and SharePoint requires you to deal with multiple components, including IIS, the Microsoft .NET Framework, SQL Server®, and possibly other technologies, such as Business Intelligence, InfoPath® Forms Services, Rights Management Services (RMS), Exchange Server, and Forefront™ Security. You can quickly lose your way with integrations and customizations given the many approaches you can take to create SharePoint solutions, whether through the built-in user interface or programmatically. Moreover, when a SharePoint application does not work, troubleshooting can be complicated. Oftentimes, you must have the mindset of an application developer to understand the components involved and how they interact.

With all these challenges, where do you begin in order to build a robust, scalable, and manageable SharePoint infrastructure? In this column, I'll show you how to get started first by building a foundation with a high-level architecture discussion and then by diving into the deployment of WSS, including very basic branding customizations.
Using the Self-Service Site Management feature of WSS 3.0, you will see how to delegate permissions for creating and managing SharePoint sites to individual users while maintaining centralized administrative control over the SharePoint infrastructure. Looking first at the SharePoint architecture makes it easier to understand the deployment and configuration steps necessary to implement a flexible and scalable infrastructure. So let's take a glance at the dependencies and then go straight into the deployment of WSS 3.0. For detailed deployment instructions, see the companion material, which you can find in the downloads section of the TechNet Magazine Web site at technetmagazine.com.

SharePoint Architecture

With SharePoint, it is helpful to think about the technology from a system architect's point of view. You don't have to know all the nitty-gritty details, but if you are familiar with the overall dependencies that arise from the SharePoint architecture, you can arrive at solutions faster because you can anticipate what you need to configure and why.

SharePoint is a technology for provisioning Web applications and sites. It is an IIS-based Web site solution, integrating with IIS through ASP.NET and relying on a SQL Server database back end for storing configuration data and content. In short, SharePoint combines three different architectures at its core (IIS, .NET, and SQL Server), as illustrated in Figure 1.

Figure 1 WSS 3.0 architecture based on IIS 6.0 and ASP.NET 3.0

Don't let the diagram intimidate you. The architecture might look overwhelming at first, considering the sheer number of components. But all of these components fit into a logical framework that, when examined systematically, gives insight into the component dependencies.

As depicted, SharePoint relies on IIS and ASP.NET to handle HTTP requests and responses.
Standard IIS components, such as the HTTP kernel-mode driver (http.sys) and the Worker Process (w3wp.exe), perform the initial queuing and routing of requests until they arrive at the ASP.NET ISAPI filter (aspnet_isapi.dll). When you install the .NET Framework, the setup routine registers aspnet_isapi.dll in the IIS Metabase (C:\Windows\System32\Inetsrv\metabase.xml), as follows:

    InProcessIsapiApps="C:\WINDOWS\system32\inetsrv\httpext.dll
        C:\WINDOWS\system32\inetsrv\httpodbc.dll
        C:\WINDOWS\system32\inetsrv\ssinc.dll
        C:\WINDOWS\system32\msw3prt.dll
        C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll"

Once IIS loads the ASP.NET ISAPI filter, all incoming requests for a Web site can be passed to ASP.NET, which is important because SharePoint must eventually receive these requests through ASP.NET. To accomplish this, SharePoint extends the configuration of the selected Web site by adding a wildcard application map that routes all incoming requests, regardless of the file name extension, to the ASP.NET ISAPI filter. You can see this in IIS Manager after you install WSS 3.0 using the Basic installation option. The WSS setup routine deactivates the existing default IIS Web site on the server and creates a new default Web site on port 80 that has the ASP.NET wildcard application map defined, as shown in Figure 2.

Figure 2 Wildcard application map for ASP.NET ISAPI filter

For ASP.NET to pass requests in turn to SharePoint, SharePoint must also extend the HTTP Request Pipeline through a custom HttpApplication object, which is implemented by means of a class called SPHttpApplication in the Microsoft.SharePoint assembly. SharePoint defines this custom application object in the ASP.NET application file (global.asax), which you can find in the file system in the root folder of SharePoint-extended Web sites.
The following code lists the content of such a global.asax file:

    <%@ Assembly Name="Microsoft.SharePoint"%>
    <%@ Application Language="C#" inherits="Microsoft.SharePoint.ApplicationRuntime.SPHttpApplication" %>

ASP.NET dynamically parses and compiles this file to instantiate the SharePoint application object. For each received request, ASP.NET triggers a series of events that the Web application can process, such as BeginRequest, AuthenticateRequest, ProcessRequest, and EndRequest. The details of event handling are beyond the scope of deploying and managing SharePoint, yet it is important to know that, in addition to SPHttpApplication specified in global.asax, SharePoint implements custom HTTP handlers and modules defined in the web.config file for the site. For example, SharePoint uses an HTTP module based on the SPRequestModule class, registered as the first HTTP module prior to standard ASP.NET modules. SPRequestModule initializes the SharePoint runtime environment, such as by registering an SPVirtualPathProvider component with ASP.NET. SPRequestModule is for internal SharePoint use, but SharePoint solution developers can modify the web.config file to register additional components, such as custom HTTP handlers and modules. Through both custom and standard HTTP modules, SharePoint takes advantage of ASP.NET while maintaining tight control over all requests to SharePoint applications.

Note that when you create a Web application using the SharePoint 3.0 Central Administration site, WSS adds the ASP.NET wildcard application map to the selected IIS Web site and creates the global.asax and web.config files in the Web site's root folder. Each Web application uses its own set of top-level global.asax and web.config files.

To process requests and return meaningful information to browsers, WSS 3.0 and MOSS 2007 rely on the standard ASP.NET page parser, which compiles the requested ASP.NET pages or processes them in no compilation mode.
But the ASP.NET pages that SharePoint passes to the ASP.NET parser are not necessarily located where they appear to exist. For example, you will not be able to find a default.aspx file in the root folder of a SharePoint-extended Web site, such as the SharePoint 3.0 Central Administration site, yet you are opening default.aspx when displaying the home page of that Web site. It is the SPVirtualPathProvider component that virtualizes the environment by loading the page content from the local file system or a SQL Server content database and passing it as a virtual file to the ASP.NET page parser. For the Central Administration site, SharePoint loads the default.aspx file from the C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\SiteTemplates\CentralAdmin folder. The home page, as well as most other SharePoint site pages, is linked to an ASP.NET Master Page (default.master) that implements a common layout for menus and navigation controls. Default.master resides in the C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\Template\Global folder and includes named placeholders for further content pages that can also reside on the local file system or in a SQL Server content database. The key point is that when you open a SharePoint site in a Web browser, you are actually viewing information from a collection of content pages that are not necessarily located on the local Web server and that are arranged according to a layout defined in a Master Page. The general rule is that unmodified pages (or uncustomized pages in SharePoint terminology) exist as page templates on the local file system of every SharePoint server, and customized pages are written to the content database so that all SharePoint servers in a Web farm have access to the same set of pages (see Figure 3). It is assumed that uncustomized pages are identical across all servers and sites in the Web farm. 
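
To make the lookup rule concrete, here is a minimal sketch in Python of how a page request could be resolved. It is only a toy model of the behavior described above, not the actual SPVirtualPathProvider implementation, which is internal to SharePoint. The class name PageStore, its methods, and the sample page content are all assumptions made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class PageStore:
    """Toy model of the two places a page's content can come from."""

    # Uncustomized page templates, assumed identical on every Web server.
    file_system_templates: Dict[str, str] = field(default_factory=dict)
    # Customized pages, stored once and shared by all servers in the farm.
    content_database: Dict[str, str] = field(default_factory=dict)

    def customize(self, url: str, new_content: str) -> None:
        """Customizing a page writes the new version to the content database."""
        self.content_database[url] = new_content

    def resolve(self, url: str) -> Tuple[str, str]:
        """Return (source, content) for a requested page.

        A customized copy in the content database takes precedence;
        otherwise the page comes from the template on the file system.
        """
        if url in self.content_database:
            return ("content database", self.content_database[url])
        if url in self.file_system_templates:
            return ("file system", self.file_system_templates[url])
        raise FileNotFoundError(url)


if __name__ == "__main__":
    # Hypothetical page content, for illustration only.
    store = PageStore(file_system_templates={"/default.aspx": "standard home page"})
    print(store.resolve("/default.aspx"))
    store.customize("/default.aspx", "home page edited in SharePoint Designer")
    print(store.resolve("/default.aspx"))
```

In this model, the same URL is served from the file system until someone customizes the page, after which every server in the farm reads the customized copy from the shared content database.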
When you customize a content page or a Master Page in a SharePoint site, perhaps by using Office SharePoint Designer 2007, SharePoint automatically stores the customized page in the content database.

Figure 3 Uncustomized and customized ASP.NET pages in a SharePoint application

In addition to customized pages and other Web site content, SharePoint also stores configuration data in a SQL Server database. SharePoint keeps the configuration data separate from the content because the configuration data is global in nature while the content is specific to each individual Web application and site collection. Accordingly, a Web farm can only have a single configuration database but it can have multiple content databases. All WSS servers in the Web farm use the same configuration database to share metadata, configuration settings, and information about every single IIS Web site that has been SharePoint-extended in the Web farm. Individual Web applications, on the other hand, can be associated with one or more content databases (though each content database can only be associated with one Web application).

The relationship between IIS Web sites, Web applications, site collections, sites, and content databases can be confusing. In SharePoint terminology, the term Web application refers to a SharePoint-extended IIS Web site. A Web application can include multiple site collections, and each site collection in turn can include a top-level site and sub-level sites that share the same configuration settings. Among other things, creating multiple site collections enables you to delegate site collection administration to different users and groups.
A single site collection cannot span multiple content databases, but a Web application can use multiple content databases for multiple site collections. Doing so increases scalability and keeps a large site that generates a significant amount of database activity from degrading the performance of other SharePoint sites. However, it is not a good idea to place every SharePoint site in its own site collection because this form of deployment limits cross-site functionality. WSS 3.0 does not support content searches across multiple site collections. Such searches require MOSS 2007 or Microsoft Search Server 2008.

For example, you can create a Web application and top-level site for http://contoso, and a site collection administrator can then create sub-level sites using the SharePoint user interface, such as http://contoso/info and http://contoso/events. All these sites exist in the same content database because they belong to the same site collection. As a farm administrator, you can also use a managed path, such as /sites, and then define additional site collections, such as http://contoso/sites/IT and http://contoso/sites/HR, in SharePoint 3.0 Central Administration. These three site collections (http://contoso, http://contoso/sites/IT, and http://contoso/sites/HR) can have different site collection administrators, configuration settings, and content databases, but they are still all accessed through the same IIS Web site (http://contoso) and use the same application pool identity of the Web application.

Of course, there are many more details, but this relationship among IIS, ASP.NET, and SQL Server is especially important to understand to get comfortable with SharePoint. If you are interested in reading more about the SharePoint architecture, I recommend Ted Pattison's MSDN® Magazine article "Discover Significant Developer Improvements in SharePoint Services," available at msdn.microsoft.com/msdnmag/issues/06/07/WSS30Preview.
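
The relationship among Web applications, site collections, and content databases can be sketched as a small data model. The following Python code is an illustration under stated assumptions, not the SharePoint object model: the class names, the database names (WSS_Content and so on), the application pool account, and the administrator names are all hypothetical. Only the http://contoso URLs come from the example above.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SiteCollection:
    url: str                # URL of the top-level site
    content_database: str   # a site collection lives in exactly one database
    administrator: str      # administration can be delegated per collection
    sub_sites: List[str] = field(default_factory=list)


@dataclass
class WebApplication:
    url: str                # the SharePoint-extended IIS Web site
    app_pool_identity: str  # shared by every site collection it hosts
    site_collections: List[SiteCollection] = field(default_factory=list)

    def content_databases(self) -> Set[str]:
        """All content databases used by this Web application."""
        return {sc.content_database for sc in self.site_collections}

    def site_collection_for(self, url: str) -> SiteCollection:
        """Find the site collection that owns a URL (longest matching prefix).

        The comparison is case-sensitive here to keep the sketch simple.
        """
        matches = [sc for sc in self.site_collections
                   if url == sc.url or url.startswith(sc.url + "/")]
        if not matches:
            raise LookupError(f"{url} is not part of {self.url}")
        return max(matches, key=lambda sc: len(sc.url))


def build_contoso() -> WebApplication:
    """Build the example from the text; names other than URLs are made up."""
    return WebApplication(
        url="http://contoso",
        app_pool_identity=r"CONTOSO\WssContosoPool",  # hypothetical account
        site_collections=[
            SiteCollection("http://contoso", "WSS_Content", "alice",
                           sub_sites=["http://contoso/info",
                                      "http://contoso/events"]),
            SiteCollection("http://contoso/sites/IT", "WSS_Content_IT", "bob"),
            SiteCollection("http://contoso/sites/HR", "WSS_Content_HR", "carol"),
        ],
    )


if __name__ == "__main__":
    app = build_contoso()
    for url in ("http://contoso/events", "http://contoso/sites/IT/helpdesk"):
        sc = app.site_collection_for(url)
        print(url, "->", sc.url, "in", sc.content_database)
```

The model makes the constraints visible: each site collection names a single content database, sub-level sites share their site collection's database, and all three site collections share one Web application and one application pool identity.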

SharePoint Infrastructure Elements

Now let's translate our brief architecture discussion into a flexible SharePoint infrastructure. As you certainly have noticed, we need Windows Server®, IIS, .NET Framework 3.0 (both for ASP.NET and the Windows® Workflow Foundation), WSS 3.0 or MOSS 2007, and SQL Server. Although you can look forward to IIS 7.0 on Windows Server 2008, we'll use IIS 6.0 on Windows Server 2003 for our purposes because at the time of this writing, it is the most commonly deployed version. We will stay with WSS 3.0 because we require no MOSS 2007-specific features for an initial SharePoint pilot.

For a minimalist approach, you can install WSS 3.0 with all required components on a single computer (as outlined in the file WSS 3.0 on a single computer.pdf, available in the companion material for this column), which is good enough for a lab server or a small workgroup environment. However, if you intend to focus on flexibility in your SharePoint infrastructure, you should not start with a standalone deployment in your production environment. For the sake of availability as well as future scalability, it is better to start with a multitier infrastructure and add more servers as needed.

Figure 4 shows the SharePoint infrastructure I recommend if you are looking for a straightforward and flexible system configuration. It includes a Web farm of two SharePoint servers and a separate computer running SQL Server. This configuration eliminates database processing overhead on the Web servers, increases availability and scalability, and facilitates system maintenance. Note that you do need Active Directory® because this is a software requirement of WSS 3.0 in a Web farm deployment. For step-by-step deployment instructions, see the Basic SharePoint Infrastructure.pdf file in the companion material.

Figure 4 Basic SharePoint infrastructure that can accommodate future growth

The domain account that you use to deploy SharePoint in this arrangement requires the permissions of a local administrator on the Web servers. It is also necessary to add this account to the SQL Server roles dbcreator and securityadmin as well as to the database role db_owner for the master database in SQL Server 2005, as shown in the Basic SharePoint Infrastructure.pdf. You can then use the SharePoint Products and Technologies Configuration Wizard during the WSS 3.0 installation to create the necessary configuration database for the Web server farm and a content database for the SharePoint 3.0 Central Administration site. Otherwise, a SQL Server administrator must provision these databases for you and add the WSS system accounts to the db_owner role.

It is important to keep in mind that the user account you use to install SharePoint is not the account SharePoint uses to access the configuration database or the content database for the Central Administration site. Instead, SharePoint uses the system account configured as the identity of the application pool for the SharePoint 3.0 Central Administration site. The SharePoint Products and Technologies Configuration Wizard will prompt you for the account information. It is a good idea to use a dedicated domain user account for this, such as CONTOSO\WssConfigAdmin. My general practice has been to use individual, dedicated user accounts for any additional Web applications that I create later. Using a separate application pool for each Web application helps to ensure process isolation, and using a different user account for each application pool helps to maintain security isolation. Note, though, that this is just one approach; you should evaluate its manageability and potential performance impact against your own environment and business requirements.

Another important system account that a domain administrator should create for you is the search service account. You can use the Central Administration account, but for added security it is better to use a dedicated search account, such as CONTOSO\WssSearch, that has no administrative permissions and cannot modify any content. Write permission to content databases is not necessary because the Search service only crawls content for indexing purposes and maintains the search data in a separate database. When you create a Web application in a server farm, you can associate its content database with a search server, which implicitly adds the corresponding search service account to the Web application's Full Read policy.

Search servers are SharePoint servers running the Windows SharePoint Services Search service. If you followed the step-by-step instructions in the Basic SharePoint Infrastructure.pdf file, you have configured both Web servers as search servers so that you can distribute the load of crawling and indexing multiple content databases. However, it is also possible to configure a dedicated search server in a Web farm, excluded from network load balancing and client connections so that client connections are not affected by crawling activities.

Self-Service Site Management

With a basic SharePoint infrastructure in place, we can delegate the administration of site collections and sites to individual departments and users without decentralizing administrative control over Active Directory, the Web server farm, or the SQL Server databases. As a farm administrator, you collaborate with Active Directory and SQL Server administrators to provision application pool accounts and content databases for your Web applications. Within these Web applications, you then create site collections and designate site collection administrators with the right to create sub-level sites.
In this way, site collection administrators within individual departments can manage their SharePoint resources with minimal involvement of the IT department, as illustrated in Figure 5.

Figure 5 Decentralized site administration in a centralized SharePoint infrastructure

It is also possible to give users the ability to create site collections under managed paths (such as the /sites path or other managed paths with wildcard inclusions that you create in SharePoint 3.0 Central Administration). If you enable the Self-Service Site Management feature within a Web application, users can create their own site collections and manage site groups and permissions within the SharePoint user interface. Unlike sub-level sites, site collections do not inherit permissions from a parent site.

Self-Service Site Management is not appropriate for every SharePoint environment and is disabled by default. If you enable it, you might end up with a large number of infrequently used site collections in your content databases. However, this feature demonstrates very vividly the flexibility of SharePoint administration, and I recommend you check it out in your pilot deployment. (Additionally, there are options within SharePoint to notify users and/or administrators about inactive sites so they can be removed if necessary.) You must enable Self-Service Site Management for a Web application explicitly, as outlined in the Enabling Self-Service Site Management.pdf file in the companion material.

SharePoint Resources
Windows SharePoint Services TechCenter
Windows SharePoint Services Developer Center
Microsoft SharePoint Products and Technologies Team Blog
Logical Architecture Components
Design Server Farms and Topologies

SharePoint Customizations and Branding

At this point, it is a natural desire to include the company's logo, name, and corporate colors in the SharePoint user interface.
Be aware, however, that you are about to take your SharePoint project to the ASP.NET developer level. At a minimum, you need a development system, such as a standalone WSS 3.0 server with Microsoft Office SharePoint Designer 2007 (see the SharePoint Designer Installation.pdf in the companion material) so that you can create and test your customizations without affecting the production environment. Also, you should visit the Windows SharePoint Services Developer Center at msdn2.microsoft.com/sharepoint to learn more about the abundance of customization options that SharePoint offers.

While SharePoint development is outside the scope of this column, let me point out a few aspects you should take into consideration. SharePoint stores customized pages in the content database of the corresponding site collection. In other words, any customizations that you apply to site pages, application pages, Master Pages, style sheets, and so forth, in a SharePoint site only apply at the site collection or site level. This is great for individual site collection administrators who want to adjust the look and feel of their sites using SharePoint Designer 2007, but it's not so great for the farm administrator who wants to enforce corporate identity across all Web applications, site collections, and sites in a Web farm.

You can create custom site themes or custom site definitions based on a copy of a standard SharePoint theme or site definition. You can also create custom Master Pages and add these to the Master Page Gallery. However, none of these options enforces global branding if Self-Service Site Management is enabled because a user with the permissions to create site collections or sites can still select a standard site template that does not show the corporate identity.

Global branding requires you to replace the default SharePoint components so that your custom components are used instead. Developers go to great lengths to accomplish this without modifying the original files.
One approach is to change the configuration of the virtual directories in IIS Manager and point them to new folders with customized files. Another method is to implement a custom HTTP module or ISAPI filter that rewrites URLs to redirect requests for specific default pages to customized versions.

Conclusion

I've focused on the essentials of establishing a SharePoint infrastructure with WSS 3.0. I did not cover other features, such as workflows, surveys, messaging integration, and antivirus, nor MOSS 2007-specific functionality such as portals, site directories, and business data catalogs. The customizations for site administration and company branding also only touched on the possibilities within SharePoint. You can perform further customizations with WSS 3.0 by programming custom applications in Visual Studio®.

The infrastructure is robust enough to handle future growth if you add more Web servers or database servers. And with the pilot rollout of a few customizations, users can create individual sites and generally become familiar with SharePoint. In this way, you establish the core components of user adoption as well as hardware and software, which are flexible enough to accommodate change and serve as the foundation for a full-fledged production rollout.