White Paper: Working with large lists in Office SharePoint® Server 2007

Author: Steve Peschka
Date published: August 2007

Summary: Microsoft performed performance testing against Microsoft® Office SharePoint® Server 2007 to determine the performance characteristics of large SharePoint lists under different loads and modes of operation. This white paper presents the findings.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2007 Microsoft Corporation. All rights reserved. Microsoft, SQL Server, Windows, SharePoint, and Active Directory are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Table of Contents

Goals
Test results and findings
  Test characteristics
Data access methods
  Browser
  SPList with For/Each
  SPList with SPQuery
  SPList with DataTable
  SPListItems with DataTable
  Lists Web service
  Search
  PortalSiteMapProvider
Test harness
  WinForm test application
  WebPart and JavaScript
  Web Part
Test results
  Browser-based viewing and page size
  The baseline test
  Testing with a very large list
  Comparing results with an indexed column
  Comparing an indexed column to an ID column
Analyzing the results
  Search
  PortalSiteMapProvider
  SPList
Data maintenance considerations
  Data locking
  Crawl times
Related content

Goals

The test results in this white paper demonstrate how the performance characteristics of SharePoint lists containing large numbers of items differ depending on the data access method used to present list contents. They also show how to optimize list performance by limiting the number of items that appear in a list, and by choosing the most appropriate method of retrieving list contents.

The tests upon which these results are based were conducted by using artificially created test data and simulated users. Real-world results may vary depending on hardware, the number of concurrent users, farm configuration, and the user operations being performed.
Test results and findings

There is documented guidance for Microsoft® Office SharePoint® Server 2007 regarding the maximum size of lists and list containers. For typical customer scenarios in which the standard Office SharePoint Server 2007 browser-based user interface is used, the recommendation is that a single list should not have more than 2,000 items per list container. A container in this case means the root of the list, as well as any folder in the list; a folder is a container because other list items are stored within it. A folder can contain items from the list as well as other folders, and each subfolder can contain more of each, and so on. For example, you could have a list with 1,990 items in its root, plus 10 folders that each contain 2,000 items, and so on. The maximum number of items supported in a list with recursive folders is 5 million items.

In Office SharePoint Server 2007, virtually all end-user data is stored in a list. A document library, for example, is just a specialized list. The same is true for calendars, contacts, and other interfaces; they are all just customized versions of the basic SharePoint list, also referred to as an SPList. The individual items in a list are generally referred to as list items, or as an SPListItem in an SPListItemCollection in the Office SharePoint Server 2007 object model. The findings in this white paper are therefore equally important across all of the ways in which you store and work with data in an Office SharePoint Server 2007 site.

There are some scenarios in which you want to take advantage of the features of Office SharePoint Server 2007 but need to exceed the limit of 2,000 items per container. If you write your own interface for managing and retrieving the data, it is quite possible that you can go past this limit without an adverse impact on farm performance. You may also be able to manage larger lists to some extent by using views within Office SharePoint Server 2007 that are filtered such that no more than 2,000 items are ever returned (see the sketch at the end of this section). Filtered views provide better performance than trying to view one large flat list, but they are not as efficient as breaking the list down into separate containers if you are using the predefined browser-based Office SharePoint Server 2007 interface.

If you develop your own interface, there are several different ways to retrieve list data, each with different performance characteristics. Some data access methods perform very well, but are useful only in a limited number of scenarios. Finally, there are also performance tradeoffs to be made with other data maintenance tasks in addition to data retrieval.
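As an illustration, the following sketch shows one way to create such a filtered, paged view through the object model so that the view never returns more than a fixed number of items. This is not code from the original tests; the list GUID, view name, column names, and filter value are placeholders.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'columns to display in the view
Dim viewFields As New System.Collections.Specialized.StringCollection()
viewFields.Add("Title")
viewFields.Add("Expense_x0020_Category")
viewFields.Add("Amount")
'CAML filter so the view returns only one category of items
Dim viewQry As String = "<Where><Eq><FieldRef Name='Expense_x0020_Category'/><Value Type='Text'>Hotel</Value></Eq></Where>"
'arguments: view name, fields, query, row limit, paged, make default
curList.Views.Add("Hotel expenses", viewFields, viewQry, 100UI, True, False)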
Test characteristics

The tests in this white paper were conducted on a relatively underpowered Microsoft Virtual Server 2005 R2 image to compare farm performance characteristics when different data access methods are used to manipulate list data. The goal of these tests was not to establish a new arbitrary limit, or to deliver a "requests per second" number of the kind typically used in a load-style test to show raw throughput capacity. The virtual server image was running Office SharePoint Server 2007 Enterprise Edition and had 1 gigabyte (GB) of allocated RAM. Virtual Server was running on a host machine with a 2 gigahertz (GHz) dual-core processor and 2 GB of RAM.

Baseline tests were done first with a list containing 1,500 items. The list schema looked like this:

Title: Single line of text
Expense Category: Choice (Meals, Travel, Hotel, Supplies)
Amount: Currency
Deductible: Yes/No
Created By: Person or Group
Modified By: Person or Group

In the baseline tests, no columns were indexed; measurements were taken simply to provide a relative value for comparison after the number of items in the list exceeded the recommended boundaries. In the tests against a very large list, one round was run with no columns indexed, and a second round was run after the Expense Category column was configured to be indexed.

The query that was executed in each of the tests used a WHERE clause against the Expense Category field, looking for the first 100 items that contained "Supplies."

To provide another point of comparison, the tests against the very large list also selected data based on ID value. The ID is a built-in, numeric, indexed field in all SharePoint lists that is well suited to queries. The query in this case was constructed with a WHERE clause that retrieved items whose ID ranged from 44,500 through 44,599.
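In CAML terms, such a range can be expressed by combining Geq and Leq conditions on the built-in ID field. The following is a minimal sketch of how that query can be written; it is illustrative only and is not the exact code used in the tests.

'create our query
Dim curQry As SPQuery = New SPQuery()
'retrieve items whose ID falls from 44,500 through 44,599
curQry.Query = "<Where><And>" & _
    "<Geq><FieldRef Name='ID'/><Value Type='Counter'>44500</Value></Geq>" & _
    "<Leq><FieldRef Name='ID'/><Value Type='Counter'>44599</Value></Leq>" & _
    "</And></Where>"
curQry.RowLimit = 100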
Some tests were also run with the site under load. To create the load during the testing process, a load test was created in the Microsoft Visual Studio® 2005 development system to stress the site. Instead of targeting a specific number of users, the test was configured as a goal-based test, that is, a test in which a target value is defined for a particular measurement and the test determines the number of requests required to achieve that target. In this case, the goal configured for the test was a consistent CPU utilization on the Office SharePoint Server 2007 computer of 60 through 80 percent.

Data access methods

Each test consisted of retrieving a subset of data from the list by using one of a number of different data access methods. This section describes the methods that were tested.

Note: The code samples included in the following sections are intended to show the process used to conduct the tests. The code may not comply with coding best practices, and should not be used in a production environment without careful review and testing.

Browser

The list was viewed by using a browser and the predefined Office SharePoint Server 2007 interface. A special tool, described in the Test harness section later in this white paper, was developed to accurately capture how long it takes to view that information and browse through pages of data.

SPList with For/Each

The Office SharePoint Server 2007 object model (OM) was used to retrieve the list into an SPList object. Each item in the list was then enumerated with a For/Each loop until items were found that matched the search criteria. The following sample code was used for this method.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'get the collection of items in the list
Dim curItems As SPListItemCollection = curList.Items
'enumerate the items in the list
For Each curItem As SPListItem In curItems
    'do some comparison in here to see if it's an item we need
Next

SPList with SPQuery

The OM was used to create an SPQuery object that contained the query criteria. That object was then used to query an instance of the list in an SPList object, and the results of the query were returned by calling the GetItems method on the SPList object. The following sample code was used for this method.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'create our query
Dim curQry As SPQuery = New SPQuery()
'configure the query
curQry.Query = "<Where><Eq><FieldRef Name='Expense_x0020_Category'/><Value Type='Text'>Hotel</Value></Eq></Where>"
curQry.RowLimit = 100
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'get the collection of items in the list
Dim curItems As SPListItemCollection = curList.GetItems(curQry)
'enumerate the items in the list
For Each curItem As SPListItem In curItems
    'do something with each match
Next

SPList with DataTable

This is one of two methods that use a Microsoft ADO.NET DataTable to work with the data. In this case an instance of the list is obtained with an SPList object. The data is then retrieved into a DataTable by calling the GetDataTable() method on the Items property (for example, SPList.Items.GetDataTable()). The DataTable's DefaultView has a RowFilter property that was then set to find the items. To keep the methodology consistent between data access methods, the DataTable was not cached between tests; it was filled each time by calling the GetDataTable() method. In a real-world scenario this test would have performed better had the DataTable been cached after the data was first retrieved, but as run it provides a valuable comparison point for the cost of this approach versus retrieving a DataTable from a selection of data that is already filtered. The following sample code was used for this method.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'get the items in a datatable
Dim dt As DataTable = curList.Items.GetDataTable()
'get a dataview for filtering
Dim dv As DataView = dt.DefaultView
dv.RowFilter = "Expense_x0020_Category='Hotel'"
'enumerate matches
For rowNum As Integer = 0 To dv.Count - 1
    'do something with each match
Next

SPListItems with DataTable

This method is similar to the SPList with DataTable method, but with a twist. An instance of the list is retrieved through an SPList object. An SPQuery object is created to build a query, and that query is executed against the SPList object, which returns an SPListItemCollection. The data from that collection is then retrieved into a DataTable by using the GetDataTable() method on the collection. The following sample code was used for this method.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'create our query
Dim curQry As SPQuery = New SPQuery()
'configure the query
curQry.Query = "<Where><Eq><FieldRef Name='Expense_x0020_Category'/><Value Type='Text'>Hotel</Value></Eq></Where>"
curQry.RowLimit = 100
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'get the collection of items in the list
Dim curItems As SPListItemCollection = curList.GetItems(curQry)
'get the items in a datatable
Dim dt As DataTable = curItems.GetDataTable()
'enumerate matches
For Each dr As DataRow In dt.Rows
    'do something with each match
Next

Lists Web service

The Lists Web service, which comes with Windows SharePoint Services 3.0 and Office SharePoint Server 2007, was used to retrieve the data.
A Collaborative Application Markup Language (CAML) query was created and submitted along with the list identifier, and an XML result set was returned from the Lists Web service. The following sample code was used for this method.

'create a new xml doc we can use to create query nodes
Dim xDoc As New XmlDocument
'create our query node
Dim xQry As XmlNode = xDoc.CreateNode(XmlNodeType.Element, "Query", "")
'set the query constraints
xQry.InnerXml = "<Where><Eq><FieldRef Name='Expense_x0020_Category'/><Value Type='Text'>Hotel</Value></Eq></Where>"
'create the Web service proxy that is mapped to Lists.asmx
Using ws As New wsLists.Lists()
    'configure it
    ws.Credentials = System.Net.CredentialCache.DefaultCredentials
    ws.Url = "http://myPortal/_vti_bin/lists.asmx"
    'create the optional elements
    Dim xView As XmlNode = xDoc.CreateNode(XmlNodeType.Element, "ViewFields", "")
    Dim xQryOpt As XmlNode = xDoc.CreateNode(XmlNodeType.Element, "QueryOptions", "")
    'query the server
    Dim xNode As XmlNode = ws.GetListItems("myListID", "", xQry, xView, "", xQryOpt, "")
    'enumerate returned items
    For nodeCount As Integer = 0 To xNode.ChildNodes.Count - 1
        'do something with each match
    Next
End Using

Search

The OM was used to execute a query against the Office SharePoint Server 2007 search engine and return the results as a ResultTableCollection. That collection was then further distilled into an ADO.NET DataTable via the ResultTable of ResultType.RelevantResults from the ResultTableCollection. The following sample code was used for this method.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'build the full-text SQL query
Dim qry As New FullTextSqlQuery(curSite)
Dim SQL As String = "SELECT Title, Rank, Size, Description, Write, Path, Deductible, ExpenseCategory, ID, Vendor, Amount FROM portal..scope() WHERE CONTAINS (""URL"",'""#SITEURL#Lists/#LISTURL#*""') #DEFAULT# ORDER BY ""Rank"""
'do token replacement
SQL = SQL.Replace("#SITEURL#", "http://myPortal/")
SQL = SQL.Replace("#LISTURL#", curList.Title)
SQL = SQL.Replace("#DEFAULT#", "AND FREETEXT (""ExpenseCategory"",'""Hotel""')")
qry.QueryText = SQL
qry.RowLimit = 100
qry.ResultTypes = ResultType.RelevantResults
'execute the query
Dim rtc As ResultTableCollection = qry.Execute()
Dim rt As ResultTable = rtc(ResultType.RelevantResults)
Dim dt As New DataTable()
dt.Load(rt, LoadOption.OverwriteChanges)
'enumerate matches
For Each dr As DataRow In dt.Rows
    'do something with each match
Next

PortalSiteMapProvider

One approach to retrieving list data in Office SharePoint Server 2007 that is not very well known is the use of the PortalSiteMapProvider class. The class was originally created to help cache content for navigation, but it also provides a nice automatic caching infrastructure for retrieving list data. The class includes a method named GetCachedListItemsByQuery that was used in this test. The method takes an SPQuery object as a parameter and first looks in its cache to see whether the requested items already exist. If they do, the method returns the cached results; if not, it queries the list, stores the results in the cache, and then returns them from the method call. The following sample code was used for this method. Note that it differs from all of the previous examples in that you cannot use the PortalSiteMapProvider class in Windows Forms applications.
'get the current web
Dim curWeb As SPWeb = SPControl.GetContextWeb(HttpContext.Current)
'create the query
Dim curQry As New SPQuery()
curQry.Query = "<Where><Eq><FieldRef Name='Expense_x0020_Category'/><Value Type='Text'>Hotel</Value></Eq></Where>"
'get the portal site map provider
Dim ps As PortalSiteMapProvider = PortalSiteMapProvider.WebSiteMapProvider
Dim pNode As PortalWebSiteMapNode = TryCast(ps.FindSiteMapNode(curWeb.ServerRelativeUrl), PortalWebSiteMapNode)
'get the items
Dim pItems As SiteMapNodeCollection = ps.GetCachedListItemsByQuery(pNode, "myListName_NotID", curQry, curWeb)
'enumerate all matches
For Each pItem As PortalListItemSiteMapNode In pItems
    'do something with each match
Next

Test harness

All of the tests were executed through one of three test harnesses. Each one is described in more detail below.

WinForm test application

The WinForm test application was used for the majority of the tests. It was written in Microsoft Visual Basic .NET and runs on the Office SharePoint Server 2007 computer itself so that it can use the OM to retrieve data from Office SharePoint Server 2007. It used the Stopwatch class, new in the Microsoft .NET Framework version 2.0, to capture the elapsed milliseconds each test took to complete both retrieving the data and enumerating the results. As the test results were enumerated, the values of two fields were retrieved from each item, so that if any data access method incurred additional processing time while retrieving those items, that time would be recorded along with the results. This was done to give a more realistic representation of how the data would be used in a real-world scenario.

WebPart and JavaScript

Monitoring the time it takes for the predefined Office SharePoint Server 2007 browser interface to render a page was more difficult. To capture that information, a custom ASP.NET server control was developed. In the OnInit event for the Web Part, the current time, down to the millisecond, is recorded. When Render is called, that time is output onto the page along with some JavaScript. The JavaScript forces a call, when the browser document's ReadyStateChange event fires, to a function that the Web Part creates. That function checks the document's readyState property, and if it is complete, the function gets the current time, subtracts the time that was captured during the Web Part's OnInit event, and displays the difference. The displayed value represents how long it took from when the Web Part was first initialized until the page completely finished loading.
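The source for that timing control was not published; the following is a minimal sketch of the technique just described. The class and element names are illustrative, and the sketch assumes the browser and server clocks are comparable (for example, when the page is requested from a browser on the server itself), since it compares a server-side timestamp to a client-side one.

Imports System
Imports System.Web.UI

Public Class RenderTimerWebPart
    Inherits System.Web.UI.WebControls.WebParts.WebPart

    'milliseconds since the Unix epoch, captured when the Web Part initializes;
    'using the epoch keeps the units comparable to JavaScript's Date.getTime()
    Private initMs As Long

    Protected Overrides Sub OnInit(ByVal e As EventArgs)
        MyBase.OnInit(e)
        initMs = CLng((DateTime.UtcNow - New DateTime(1970, 1, 1)).TotalMilliseconds)
    End Sub

    Protected Overrides Sub Render(ByVal writer As HtmlTextWriter)
        'emit a placeholder plus script that reports the elapsed time once
        'the document's readyState reaches "complete"
        writer.Write("<span id='renderTime'></span>")
        writer.Write("<script type='text/javascript'>")
        writer.Write("var startMs = " & initMs.ToString() & ";")
        writer.Write("document.onreadystatechange = function() {")
        writer.Write(" if (document.readyState == 'complete') {")
        writer.Write("  var elapsed = new Date().getTime() - startMs;")
        writer.Write("  document.getElementById('renderTime').innerHTML = elapsed + ' ms';")
        writer.Write(" }};")
        writer.Write("</script>")
    End Sub
End Class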
Web Part

A second Web Part was written to use the PortalSiteMapProvider application programming interface (API). This Web Part requires a valid HTTP context, so it would not work in the WinForm test harness. The process it used was very similar to the WinForm application, however: in the Render method it calls GetCachedListItemsByQuery on the PortalSiteMapProvider class instance and uses the Stopwatch class to track the elapsed milliseconds, which it outputs to the page.

Test results

Before reviewing the data points from the testing process, it is important to understand what each data point represents. Each reported data point is the average of a number of tests. For example, most of the test results consist of five data points, and each data point represents the average time for five tests, so all five data points together are the result of 25 tests. The only exception is the tests of browser-based rendering times, which used a smaller dataset than the other tests.

The following sections describe the individual test results. All timed results are measured in milliseconds, so smaller numbers are better.

Browser-based viewing and page size

One test was done to determine how the number of records displayed for a list on a page impacts the performance of rendering that page. The goal was to understand whether showing more items on a page caused linear growth in response times, or response times that got exponentially worse. The testing was done against a list with 1,500 items, varying the number of items displayed on a page among 100, 300, and 500. The results showed that increasing the number of items displayed per page produces a fairly linear increase in display time.

The baseline test

The goal of the next set of tests was to establish baseline numbers by running the different data access methods against a list with 1,500 items. Only the most common data access methods were included in the baseline testing, so test results for the PortalSiteMapProvider class were not included.

What stands out clearly in this set of results is that viewing the data through the predefined Office SharePoint Server 2007 browser interface is by far the slowest data access method. This is one of the reasons why guidance has been delivered to restrict list sizes to no more than 2,000 items per container. It is also why we recommend that you not consider going above 2,000 items per container unless you are developing an alternative interface to work with the data.

Testing with a very large list

The next test shows what happens when you dramatically increase the number of items in the list beyond the recommended guideline. In this case, the list contained 100,000 items, the list did not have an index on the Expense Category column, and the site was under load.

Using the For/Each enumeration to find items within the list is clearly not a good choice for working with large amounts of data. In addition, there was tremendous overhead in loading all of the list data into an ADO.NET DataTable and then using its filtering capabilities to find the desired data. However, as stated earlier, if the DataTable had been cached rather than filled from the list on each request, the results would probably have been significantly different, although there would still be a very significant cost the first time the list data was loaded into the DataTable.

Another point to note is just how well the PortalSiteMapProvider class performed. It was lightning fast in these tests, and it significantly outperformed the other data access methods. Because the PortalSiteMapProvider and other tested methods performed substantially better than the For/Each, SPList with DataTable, and Page Load in Browser methods, the latter methods were not included in any subsequent test results. Also, for the Page Load in Browser test, the page was configured to display 100 items per page.

Comparing results with an indexed column

The goal of this test was to determine how much of a performance gain is realized when the column used in the WHERE clause of the test query is configured to be indexed.
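A column can be marked as indexed on the list settings page in the browser, or through the object model. The following is a minimal sketch of the object model approach, using the same placeholder site and list as the earlier samples; it is illustrative only and is not the code used in the tests.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'mark the column as indexed and commit the change
Dim curField As SPField = curList.Fields("Expense Category")
curField.Indexed = True
curField.Update()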
The results of this test demonstrate that if you are using the SPList class as part of your data access strategy, you will benefit greatly from indexing the columns used in WHERE clauses. For the other data access methods, indexing will likely give you only nominal benefit, if any. Adding a column index actually reduced performance when using the PortalSiteMapProvider class.

Comparing an indexed column to an ID column

This test was conducted to compare the performance differences when the WHERE clause in the query relied on an item's ID rather than on the value of an indexed field. What is interesting about these results is that they are essentially the inverse of the previous test. That is, when using ID as the filter criterion, data access methods that do not use the SPList class perform much better. However, data access methods that rely on the SPList class still work much more quickly when they use an indexed column rather than item IDs.

Analyzing the results

The test results in this white paper validate that, with proper testing in your own environment, it is quite possible to use more than 2,000 items in a container without an adverse impact on performance. The best results will be obtained if you write your own user interface to work with the data in the list and make carefully considered choices about which data access method works best for your requirements.

The data access method you choose may very well impact other aspects of your site or list implementation. For example, data access methods that use the SPList class benefit greatly from indexing the columns used in a WHERE clause. However, the benefit of indexing those columns is marginal if the data is retrieved by using the Search service, the Lists Web service, or the PortalSiteMapProvider class. Conversely, if you are not using the SPList class for data retrieval, data access will likely be much faster if you are able to retrieve data based on the ID of items rather than on the value of a specific column in the list.

Search

Search performed well across all of the scenarios. One drawback to using Search is that it cannot retrieve data until indexing has completed, so if immediate data retrieval is a requirement, Search may not be the best choice. You will probably also need to configure Search further to support your query requirements. For example, these tests required the ability to use a structured query language (SQL) statement that retrieved a very specific set of fields from a list, and to use the ID and Expense Category fields in the WHERE clause. For this solution to work, managed properties must be configured in Search to retrieve the custom properties from the list and to use criteria against them. Implementing Search as it was used in this testing requires Office SharePoint Server 2007.

PortalSiteMapProvider

The PortalSiteMapProvider class was one of the best-performing data access methods in every scenario. However, there are a couple of limitations to using it. First, because of the way in which the data is cached, the PortalSiteMapProvider class is going to be most useful when the data you are retrieving is not significantly different from request to request. If you frequently retrieve different data sets, the PortalSiteMapProvider class incurs the overhead of constantly reading from the database, inserting data into the cache, and then returning it from the method call.
Clearly, the advantage of the PortalSiteMapProvider class comes when it can read data directly from the cache.

Also, the amount of memory the PortalSiteMapProvider class has available to use may be somewhat constrained. It uses the site collection object cache to store data, and by default the object cache is only 100 megabytes (MB). You can increase the size of the site collection object cache on the Object cache settings page for the site collection by changing the Max. Cache Size (MB) value. Remember, however, that whatever amount of memory you assign to the object cache comes out of the shared memory available to the application pool. If you are running the 32-bit version of Office SharePoint Server 2007, the most memory you can assign to a single application pool is 2 GB, and you immediately lose roughly 500 MB of that when the .NET Framework and the base Office SharePoint Server 2007 DLLs and assemblies are loaded. You therefore need to balance the object cache size against how much memory is available on your Web servers, in addition to the processor architecture, the other programs loaded by Office SharePoint Server 2007, and so on. The PortalSiteMapProvider class is available only in Office SharePoint Server 2007.

SPList

Using the SPList class gives you several options for retrieving data: a For/Each enumeration, the Items collection, the GetDataTable method of an SPListItemCollection, and using an SPQuery object to filter data. Some of those options, specifically calling GetItems with an SPQuery and calling GetDataTable on the results of GetItems, routinely performed well in most scenarios. However, there are some limitations. For example, the GetItems method won't return results from across the folders in a single list unless the ViewAttributes property of your SPQuery object includes Scope="Recursive" (see the sketch following this section). Nor will it work across lists if you want to query data from multiple lists or subsites. It also requires that all code runs directly on the Office SharePoint Server 2007 computer. Other options, such as the Lists Web service and the Search Web service (not the Search object model approach that was used in these tests), can retrieve the data while running on remote servers.
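As an illustration, the following sketch (not code from the original tests) shows how the earlier SPQuery sample can be adjusted so that GetItems searches every folder in the list rather than just its root.

'get the site
Dim curSite As SPSite = New SPSite("http://myPortal")
'get the web
Dim curWeb As SPWeb = curSite.OpenWeb()
'get our list
Dim curList As SPList = curWeb.Lists(New Guid("myListGUID"))
'create our query
Dim curQry As SPQuery = New SPQuery()
curQry.Query = "<Where><Eq><FieldRef Name='Expense_x0020_Category'/><Value Type='Text'>Hotel</Value></Eq></Where>"
curQry.RowLimit = 100
'without this, GetItems returns matches from the root container only
curQry.ViewAttributes = "Scope=""Recursive"""
'get the matching items from every folder in the list
Dim curItems As SPListItemCollection = curList.GetItems(curQry)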
Data maintenance considerations

There are a few other issues to consider when creating lists with more than 2,000 items per container. One is the cost of other common operations, such as adding or deleting items in the list. We ran some additional tests to measure the impact of those kinds of operations against our very large list, and the results show that as the list gets quite large, those operations begin to slow down considerably.

When the site is not under load, adding a single new item does not have a significant impact on performance. However, although indexing a column improves query performance, it may also negatively impact the performance of adding new records. And performance would obviously degrade further when multiple items are being added while the site is under load.

Performance for deleting items degrades significantly as a list becomes very large: deleting a single item from a very large list takes much more time than deleting an item from a smaller list, even when, as in the test case, a single item is deleted from a site that is not under load. As the data shows, with or without an indexed column, the performance of changing list items degrades as the size of the list grows. It is likely that a batch process would need to be built to delete items during off-peak periods. If that is not an option, the performance of the delete functionality alone could conceivably force you to abandon plans to use very large lists in Office SharePoint Server 2007.

Data locking

Another important consideration when using large lists is the locks that Microsoft SQL Server™ places on the data tables that contain list information. Virtually all data for all Office SharePoint Server 2007 lists is contained within a single table in SQL Server, and this table holds the data for all the lists in all the site collections whose data is stored in that content database. When you attempt to update data on a list item (whether you are adding, editing, or deleting it), SQL Server attempts to lock the other items (rows, to SQL Server) for that particular list. However, there is a limit to the number of individual rows that SQL Server will try to lock. If you select approximately 5,000 items or more simultaneously for reading or updating, SQL Server will typically lock the entire table for the duration of that operation. In that event, all other reads and writes for all lists in all site collections in that content database are queued until the transaction completes and the lock is released. If your query retrieves data across multiple folders within the list, this locking behavior occurs whether or not the list items are recursively nested such that no individual container holds more than 2,000 items.

To ensure that you do not encounter this locking behavior, make sure the number of items you retrieve in a single request stays well below this threshold. For example, you can control the number of records returned by setting the RowLimit property on the SPQuery class, as the earlier code samples do.

Crawl times

Another consideration with very large lists is crawl time and crawl time-outs. As a list gets larger, the chance of the indexer timing out while crawling the contents of that list increases. This is an issue that should be carefully monitored and tested in a lab environment before any large list is rolled out in production. If the indexer is timing out when crawling large lists, you can increase the time-out value with the following steps:

1. In Central Administration, on the Application Management tab, in the Search section, click Manage search service.
2. On the Manage Search Service page, in the Farm-Level Search Settings section, click Farm-level search settings.
3. In the Timeout Settings section, in the Connection time and Request acknowledgement time boxes, enter the desired number of seconds.

Related content

For more detailed information about the factors involved in performance and capacity planning for Office SharePoint Server 2007 lists, see the following resource:

Plan for software boundaries (Office SharePoint Server) (http://go.microsoft.com/fwlink/?LinkID=95115&clcid=0x409). This article provides a starting point for planning the performance and capacity of your system, including performance and capacity testing results and guidelines for acceptable performance.