High Throughput Screening Informatics

Running title: HTS data handling

Xuefeng Bruce Ling
Stanford Medical Center, Stanford University, CA
Email: xuefeng_ling@yahoo.com

Supplementary materials: http://hts.stanford.edu

Abbreviations:
Key words:

ABSTRACT

High throughput screening (HTS), an industrial effort to leverage developments in the areas of modern robotics, data analysis and control software, liquid handling devices, and sensitive detectors, has played a pivotal role in the current drug discovery process, allowing researchers to efficiently screen millions of compounds to identify tractable small molecule modulators of a given biological process or disease state and transform these into high content lead series. As HTS throughput has significantly increased the volume and complexity of data and the content level of the information output, discovery research demands a clear corporate strategy for scientific computing and the subsequent establishment of enterprise-wide (often globally accessible), robust informatics platforms that enable complicated HTS workflows, facilitate HTS data mining, and drive effective decision-making. The purpose of this review is, from the data analysis and handling perspective, to examine key elements in HTS operations and some essential data-related activities supporting or interfacing with the screening process, outline properties that various enabling software solutions should have, and offer some general advice for corporate managers with system procurement responsibilities.

INTRODUCTION

The completion of the Human Genome Project has significantly advanced our understanding of human biology and the nature of many diseases, resulting in a plethora of novel therapeutic targets. High throughput screening (HTS), an industrial effort to leverage developments in the areas of modern robotics, scientific computing and control software, liquid handling devices, and sensitive detectors, has played a crucial role in the current drug discovery process, allowing researchers to effectively and efficiently screen millions of compounds to identify tractable chemical series (HTS hits) with the requisite biological activity against the target of choice. HTS hits are subsequently followed up as starting points for drug design, transformed into high content lead series, and used as tool compounds to understand the interaction or role of a particular biochemical process in biology. As a brute-force approach and a complex industrial process, HTS "manufactures" an unprecedented amount of experimental data -- usually observations about how some biological entity, either proteins or cells, reacts to exposure to various chemical compounds -- in a relatively short time. Given the cost, technical specialization and operational sophistication involved, the application of small molecule screening and discovery to the development of chemical probe research tools has largely been limited to pharmaceutical/biotech companies. There is, however, a recent trend in academia to set up HTS facilities with the capacity to rapidly screen millions of compounds on a routine basis, a capacity that previously only industry had. The scale and complexity of any current HTS campaign quickly outstrip the capabilities of desktop spreadsheet-style data analysis software. As HTS technology moves from the industrial sector to the academic and small biotech sectors, strategies for delivering cost-effective and scalable computing solutions on a limited budget become important.
The purpose of this review is, from the informatics perspective, to examine the key elements in HTS operations and some essential data-related activities supporting or interfacing with the screening process, outline the requirements and properties that various enabling scientific computing systems should have, and offer some general advice for managers with system procurement responsibilities.

MAIN TEXT

High-throughput screening is a key link in the chain comprising the industrialized lead discovery paradigm. Figure 1 diagrams the key operational elements and the necessary supporting databases in HTS. To enable complex HTS workflows and strive for the best possible decisions across various HTS scenarios, the development, deployment and maintenance of enterprise data handling systems are essential. Data management requirements include effective tracking of various types of information transfer; robust storage of large data sets covering compound structures, biological samples, various types of containers, registered assay catalogs, primary/secondary screens, dose-response assays, specificity assays, and assay/automation workflow SOPs; and efficient interfaces to other operational and scientific data sets, including PKDM (pharmacokinetics and drug metabolism) data. The integration of various information resources and software applications can significantly promote multidisciplinary communication and teamwork between HTS and other corporate functions, including biology, laboratory automation, the sample bank, analytical/computational/medicinal chemistry, scientific computing, and project teams.

"Buy versus build" has been a constant dilemma for corporate managers implementing HTS and supporting HTS data analysis. Off-the-shelf software packages are currently available that can be acquired to "plug and play" in support of HTS. They are mass-produced and contain generic functionality not developed specifically for one organization or user population. Pieces of the HTS process may therefore be served well by a given vendor and its marketed packages, but a complete solution, as vendors usually claim to deliver, is unlikely. The various elements in this case will require front- and back-end application support via their programming interfaces, or APIs (Application Programmer Interfaces). While high throughput technologies have dramatically increased the pace of research, they have also dramatically increased the complexity of HTS and HTS-related processes. In reality, it is impossible to "build", that is, develop from the ground up, a complete custom enterprise solution. The dilemma therefore becomes "what to buy" and "what to build". Whatever the final build-versus-buy decision, the underlying business process must be thoroughly analyzed, and integration between that process and the retained software must be performed afterwards. As HTS processes have grown more complex, they have also grown more sensitive to even small process changes, making process optimization and process yield management essential to remain competitive. Commercial yield-management tools such as BergenShaw International's Focus product aim to help high throughput laboratories respond quickly to change by identifying the process factors, and combinations of factors, associated with yield loss and yield gain.
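Whatever mix of bought and built components is eventually chosen, the entities named in the data management requirements above have to be represented somewhere. The following is a minimal, hypothetical sketch (in Python) of such an object model; the class and field names are illustrative assumptions rather than a description of any particular vendor or in-house schema, and a production system would also model containers, sample lots, requests, SOPs and links to PKDM data.

    from dataclasses import dataclass, field
    from datetime import datetime
    from typing import List, Optional

    # Hypothetical, minimal object model for core HTS entities.

    @dataclass
    class Compound:
        compound_id: str                    # corporate registry number
        structure_smiles: str               # registered structure
        batch_id: Optional[str] = None

    @dataclass
    class Well:
        row: str                            # e.g. "A".."P" on a 384-well plate
        column: int
        compound_id: Optional[str] = None   # None for control or empty wells
        well_type: str = "sample"           # "sample", "pos_ctrl" or "neg_ctrl"
        raw_signal: Optional[float] = None  # reader output

    @dataclass
    class Plate:
        barcode: str
        plate_format: int                   # 96, 384 or 1536
        assay_id: str
        wells: List[Well] = field(default_factory=list)
        read_time: Optional[datetime] = None

    @dataclass
    class AssayResult:
        assay_id: str
        compound_id: str
        plate_barcode: str
        activity: float                     # e.g. percent inhibition
        is_hit: bool
        qc_passed: bool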
The "buy versus build" question is, of course, not unique to HTS informatics. In "Buy Versus Build: A Battle of Needs", Laura M. Francis and Randy Emelo examine the same dilemma as it confronts organizations procuring e-learning systems, and their decision framework transfers directly to HTS software procurement. In their analysis, "build" typically means creating a product from the ground up: determining the objectives, writing the content, creating the interface, and crafting the design, as well as testing the product, for example through pilot sessions, to ensure that it functions as anticipated. In addition to buying off-the-shelf products or building them from the ground up, one can also customize a purchased product by making cosmetic changes to its content and interface so that it appears to have been built in-house. That process falls under the build umbrella, since the product must be modified, but customizing typically requires fewer resources and a smaller degree of commitment than starting from scratch. Although the decision to buy or build may seem daunting, it can be boiled down to three factors: needs, resources, and uniqueness.

Needs. Begin by identifying your needs, which are critical to narrowing your focus and determining the features most important to you. To pinpoint your needs, ask yourself these questions:
* What organizational objectives must I meet?
* What capabilities do I want to build?
* What information do I want to capture and pass along in order to improve knowledge?
* What behaviors and working practices do I want to support and enhance?
You will rarely be able to meet all of the needs you discover; therefore, you must prioritize them and match your most pressing needs to the stated objectives and capabilities of the candidate product.

Resources. This is the factor most people wish they could avoid thinking about, but examining your resources is critical. If "resources" means just money to you, think again: although money plays an important role, two other pieces of the puzzle need to be considered as well, time and personnel.

Time. When considered in terms of buying or building, time takes into account:
* how long you have to make your decision
* how long you have to develop the product, including testing time if necessary
* how long you have to roll out or implement the product within the organization.
To understand the importance of time in the buy-versus-build decision, consider this scenario: you are given unlimited funds for the purchase, but only a three- to six-month timeframe to review, purchase, and roll out the product. In this case, time will be the deciding factor in your choice of buying or building. By analyzing your own situation against the three variables above, you can accurately determine the importance of time in your decision.

Personnel. Another often-overlooked resource factor is personnel. That includes the people needed for both implementation and support of the system; each group plays an integral role in your decision.
As you analyze your resources, you must determine not only whether you have the personnel to implement the product within the organization (you will need an advocate or champion to push the initiative within the company), but also whether you have the needed number of technical support staff available (for example, IT and administrative support people). If you do not have the personnel you anticipate needing, you must identify where you can obtain the required support as well as how much that support will cost, which brings us to the third resource.

Money. Both a limited and an unlimited budget can influence your choice of buying versus building. When considering your budget, you must look at the short-term and long-term benefits of the overall investment, which includes analyzing the effects of not implementing any system at all and, in turn, not developing the capabilities you identified as needs. When building, you may incur a large, up-front, one-time cost, but the investment may be lower over time, since the product becomes your exclusive property once it is paid for in full; in Francis and Emelo's e-learning context, a built product typically needs to serve at least 500 users to qualify as a good investment. When buying, you must calculate how much money you anticipate paying over the life of the product (typically 12 to 24 months for off-the-shelf e-learning products). For example, you may think you are spending less for an off-the-shelf package, but you need to take into account annual renewal and maintenance fees as well as costs for upgrades or add-ons that may become available. Those hidden expenses may end up costing you more over the life of the product than you had anticipated or budgeted. To combat rising costs, identify and document both your company's and the supplier's roles and responsibilities; that can help you pinpoint added fees up front. For example, identify who will provide technical support. If it is the supplier, determine whether support is included in the purchase price, how much it will cost if it is not (is it a flat monthly or yearly rate, or do you pay per request?), and how long the support will be available.

Uniqueness. The third factor to consider when deciding whether to buy or build is uniqueness. Are you supporting a proprietary business process? Do you need the product to cover generic capabilities or highly targeted ones? Can you find distinct connections between seemingly mismatched off-the-shelf products and your needs? Does either your industry or corporate culture (or perhaps both) preclude a generic product? Answering those questions can help you determine the degree of uniqueness you need. If you are encoding a proprietary business process (such as patented information in a pharmaceutical company), you will typically need to custom-build; you may consider customizing a product that already exists, but generally a proprietary business process is so unique that no existing product will meet your needs. If your needs are more generic, you can still customize an off-the-shelf product to give it the feel and flair of your organization.
That could include such simple changes as putting the company's logo in the header or slightly revising the wording to match organizational language. Ultimately, you must decide how willing you are to forego originality for quicker delivery time.

Drumming up support. Once you have analyzed needs, resources, and uniqueness, you should also consider a supplemental decision factor: the support, or buy-in, you can expect from people within the organization. Although this factor may not directly affect your decision to buy or build, it will certainly affect how people within the organization receive your decision. Necessary support falls into three categories:
* support of upper-level management and executives
* support of like-level colleagues
* support of end users.

Management. The need for support from upper-level management and executives is easily apparent; they have the power to feed your idea to the lions or cut you a check with lots of zeros on it. Because this group has so much influence, people often spend all their time "selling" them and forget the other two important groups whose support they need.

Colleagues. This group constitutes your peers, the people you must interact with on a daily basis. Their influence with upper-level management and executives can affect the outcome of your initiative, so it is important to gain their support. People in this group need to understand your goals for the system, how you plan to pay for it (especially since your budget may include funds they sought for their own uses), who will use it, and how their own jobs will be affected by the initiative.

End users. This group is the oft-forgotten low man on the totem pole, yet it has the hidden power to scuttle your entire plan by refusing to use the product, which will convince your colleagues and executives that the investment was wasted. End users may not intentionally sabotage the initiative; their resistance may stem from not understanding why they have to use the product, from resenting not being part of the decision-making process (whether or not their opinion would have influenced your decision), or simply from the product not meeting their needs. Do not downplay the importance of this group; in a screening organization, these are the bench and automation scientists, and their support is critical. Although it is not always possible to win over every individual, open and direct communication with each of these groups can help an implementation succeed, whether you buy or build.

There are many factors to be weighed in choosing the best tools and approaches for developing HTS systems. In our own efforts, we have benefited greatly from the use of object-oriented programming methods, which emphasize a modular approach and produce systems that are extensible, easy to maintain, and code-efficient.
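As a concrete illustration of this modular, object-oriented style, the sketch below defines an abstract interface for the instrument "data filters" discussed later in this review, together with one hypothetical concrete parser. The tab-delimited format handled here is an assumption made purely for illustration, not a real reader export format.

    from abc import ABC, abstractmethod
    from typing import Dict

    class ReaderExportParser(ABC):
        """Abstract 'data filter': parses one reader export file into a
        common well -> raw signal mapping used by downstream calculators."""

        @abstractmethod
        def can_parse(self, first_line: str) -> bool: ...

        @abstractmethod
        def parse(self, path: str) -> Dict[str, float]: ...

    class TabDelimitedParser(ReaderExportParser):
        """Hypothetical filter for a tab-delimited export whose first line
        is a 'Well<TAB>Signal' header."""

        def can_parse(self, first_line: str) -> bool:
            return first_line.strip().lower().startswith("well\tsignal")

        def parse(self, path: str) -> Dict[str, float]:
            signals: Dict[str, float] = {}
            with open(path) as fh:
                next(fh)                                  # skip the header line
                for line in fh:
                    if not line.strip():
                        continue
                    well, value = line.rstrip("\n").split("\t")[:2]
                    signals[well] = float(value)
            return signals

    # New instrument formats are supported by registering additional parser
    # classes; the calculating system itself stays unchanged.
    PARSERS = [TabDelimitedParser()]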
Every consideration should be given to ease of use and flexibility, but in the end the flexibility desired by the users must be carefully balanced against both the flexibility allowed by the automated assay systems and the IT cost of maintaining overly complex software. These early design choices have far-reaching consequences for the success of later obtaining tractable, high quality lead series. An effective and efficient HTS campaign also depends significantly on the successful procurement and management of compound structures, samples, containers and the requests coming from the various HTS stages.

Although the assay developer always comes up with a method to consistently measure an experimental outcome, assay characteristics differ depending on where one sits within the spectrum of drug discovery. At the beginning of the spectrum are assays whose primary purpose is to identify chemical leads from the thousands of compounds found in a chemical library; these assays are typically for high-throughput primary screening and are usually multiwell and very robust. At the other end of the spectrum are assays for clinical diagnostics, which are robust and well characterized and have to go through a rigorous analysis of sample lots, reagent shelf life, and intra- and interassay variability, performed in a good laboratory practice (GLP)-compliant manner, to achieve FDA approval for marketing. Because I have been fortunate enough in my tenure in the pharmaceutical industry to develop assays for both HTS and therapeutic target teams, I can highlight both the differences and similarities between assays in these two areas.

Assay development for HTS means developing an assay that is target-specific, multiwell, robust, and capable of automation. It can be cell-based or biochemical. The assay is sometimes handed to the HTS team by the therapeutic group, but often in a format that is not suitable for HTS and needs to be optimized or redeveloped from scratch. Although there are exceptions, these assays are homogeneous and low volume. They can be read by fluorescence, luminescence, absorbance, radioactivity, or any other output available on plate reading systems. The assays are suited to running thousands of compounds per run, so each plate must be designed with its own controls to ensure the quality of the results. The samples are typically run in singlet, and therefore the assay must perform so well that one can tell the difference between an active compound, or "hit", and noise in the assay; this is validated using Z-factor analysis of variability. The assays are developed only upon request by a therapeutic area, when a target is ready to enter screening and the target itself has been validated. Depending upon the throughput and the number of compounds screened, these assays can be completed in a matter of days to weeks, so there is no need to be prepared to run them on an ongoing basis. These assays are run as in a factory, where production output is important and the assay itself must fit into the process.

Given the various challenges and complexities of any HTS operation, it is imperative that all interested parties are thoroughly engaged and involved during the entire HTS campaign. From the HTS function's point of view, screening scientists should, and usually do, integrate well within the corporate drug discovery environment.
For example, the operational integration of screening activities such as compound supply, assay design and execution, data analysis and tracking, and compound quality control can obviously have a profound impact on the efficiency and effectiveness of HTS campaigns. Striving for the best possible decisions in various HTS scenarios, such as screen prioritization, the timing of the assay-to-screen transition, choices of assay format, compound input format (e.g. number, type, and concentration), choices of secondary and selectivity assays, and when to rescreen, can be a daunting task without the assistance of comprehensive, real-time information.

HTS campaign and business process standardization: importance of a controlled vocabulary

Standardizing HTS business processes can draw on the specifications that multidisciplinary industry teams have advanced to formally describe interoperable business processes and business interaction protocols for Web services orchestration. One such specification, BPEL4WS (Business Process Execution Language for Web Services), allows users to describe business process activities as Web services and to define how they can be connected to accomplish specific tasks. Its co-authors rightly view customer adoption as the most important hurdle in making a business process standard meaningful, and that means ubiquitous ISV (independent software vendor) support. "To solve real-life business problems, companies may need to invoke multiple Web services applications inside their firewalls and across networks to communicate with their customers, partners, and suppliers," said Diane Jordan of IBM, co-chair of the OASIS WSBPEL Technical Committee (OASIS, the Organization for the Advancement of Structured Information Standards, http://www.oasis-open.org, is a not-for-profit consortium that develops e-business standards, including standards for Web services). "BPEL4WS allows you to sequence and coordinate internal and external Web services to accomplish your business tasks. Thus, the result of one Web service can influence which Web service gets called next, and successful completion of multiple Web services in a process can be coordinated." John Evdemon of Microsoft, the committee's other co-chair, added that business processes are potentially very complex and require a long series of time- and data-dependent interactions, but that BPEL4WS allows companies to describe sequential interactions and exception handling in a standard, interoperable way that can be shared across platforms, applications, transports and protocols. The specification has since been taken through an open, publicly vetted standardization process within OASIS, in coordination with the W3C's Web Services Choreography Working Group.
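The orchestration idea quoted above, sequencing steps so that the outcome of one determines what runs next, with uniform exception handling, can be illustrated without a full BPEL engine. The sketch below is a deliberately simplified, hypothetical in-house scheduler written in Python; it is not BPEL4WS syntax, and the step names are invented for illustration.

    from typing import Callable, Dict, Optional

    # Each step receives a shared context dictionary and returns the name of
    # the next step to run (or None to finish), mimicking the idea that the
    # result of one service influences which service is invoked next.
    Step = Callable[[Dict], Optional[str]]

    def run_workflow(steps: Dict[str, Step], start: str, context: Dict) -> Dict:
        current: Optional[str] = start
        while current is not None:
            try:
                current = steps[current](context)
            except Exception as exc:              # uniform exception handling
                context.setdefault("errors", []).append(f"{current}: {exc}")
                current = None                    # abort the remaining steps
        return context

    # Hypothetical screening steps; names and data are illustrative only.
    def request_plates(ctx: Dict) -> Optional[str]:
        ctx["plates"] = ["PLT0001", "PLT0002"]    # ask the sample bank for plates
        return "run_assay"

    def run_assay(ctx: Dict) -> Optional[str]:
        ctx["raw_files"] = [f"{p}.txt" for p in ctx["plates"]]   # reader exports
        return "qc_check"

    def qc_check(ctx: Dict) -> Optional[str]:
        ctx["qc_passed"] = bool(ctx.get("raw_files"))
        return "publish" if ctx["qc_passed"] else None

    def publish(ctx: Dict) -> Optional[str]:
        ctx["published"] = True                   # release results for review
        return None

    result = run_workflow(
        {"request_plates": request_plates, "run_assay": run_assay,
         "qc_check": qc_check, "publish": publish},
        start="request_plates",
        context={},
    )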
HTS automation support and interface with data analysis pipelines

Assay data for HTS are generated on a wide variety of instruments, or "readers". Virtually all of these readers are controlled via software provided by their manufacturer. Instrument control software varies widely in complexity but usually has menu options allowing some customization of its exported (e.g., ASCII text) data files. At present there are no standard data formats adopted by the instrument makers, so the developers of an HTS calculating system need to write a number of routines, or "data filters", which parse the text files into a format or formats compatible with their system. To limit the complexity and variety of the data filters, the automation team can implement process controls to standardize the formats used by the various readers.

HTS data analysis, storage and QC requirements

Challenge: automation

In an overly simplified scenario, a screen essentially brings a compound plate and an assay plate together, from which a readout is taken. Of course, the compound plate typically needs to be diluted and/or transferred, which requires a pipetting station. Another station is needed to start the biochemical plate, or to remove media and replace buffer on a cell plate. The order in which reagents are added to the plate needs to be decided in advance, because it is typically the last reagent that should start the reaction. After mixing the diluted compounds and the reagents or cells, the plates may need to be incubated before finally going through the detection devices. There are two common approaches to accomplishing this workflow, which are subject to yet "another debate topic," says Roche's Garippa: screening facilities may either use a robotic system or a series of workstations. "Robotic systems produce less variability," he says. "You also need dedicated personnel and a tight partnership with your service provider since down-time on a robotics platform can be very expensive." The high capacities of robotic platforms are very attractive because the primary screen can be finished very quickly. On the other hand, workstations yield more flexibility to work around technical problems, and if problems do occur, substitutions with backup instrumentation are typically readily available. "If the primary pipettor breaks, you just use another. Each of the workstations now have stackers, so you can load a stack of plates and have walk-away capability, but you can't walk away for many hours at a time as with robotic systems," says Garippa. BMS's Cacace says that the choice depends on the assay and the flexibility of the robotic system. "Some robotic systems do not have the flexibility to deal with varying assay formats," she says.
"It's depends on what equipment is on automated system and how flexible it is, as well as how dynamic is the scheduler [informatics platform that controls the movement of the robot]. Long incubation steps, for example, may cause a lot of down-time on the robot, which decreases throughput and warrants a switch to a workstation system." Merck takes an "industrialized approach" to HTS assays and relies on robotic platforms. "The problem with semi-automated workstations is that scientists are burdened with unnecessary repetition; automation just makes our life easier," says Peter Hodder, PhD, head of HTS robotics, Merck Research Laboratories, North Wales, Pa. "Our robotic systems are flexible and modular, so we can move detectors or liquid handlers on and off depending on the needs of the assay." Merck runs their robots continuously, either validating an HTS assay or running the screen itself. The trick is to develop an assay that is scalable to a robotic system. "We make sure that the assay is robust and that assay development scientists anticipate how the protocol will run on the robotic platform," continues Hodder. "They are aware that the robot may not be able to reproduce what can be done by hand, and work with automation scientists to design the HTS protocol accordingly. When the protocol is designed with this end-point in mind, the scale-up to HTS is a gradual and straightforward process." Challenge: quality control Running a perfect screen is nearly impossible. The trick to success is detecting any problems that may have arisen along the way. Taking multi-parameter measurements during the screen, using appropriate controls, and monitoring the data in real time during the screen are all part of the quality control (QC) steps necessary to ensure meaningful data. Analyzing QC data in real time while the screen is still running allows researchers to correct any obvious equipment problems. "We need quick visualization tools or automatic cut-off points to detect pipetting problems, clogged tips, consistency in liquid volume in all wells, tip carryover from plate to plate, compound autofluorescence, bubbles in wells, and many more," says Garippa. "The QC software is key: it allows us to find a multitude of problems in the assay, from pipetting to washing to incubation. I wish I could say [QC] was all automated. But because it's so important, we end up taking a quick glance at each plate to see if there are any obvious problems. We are not yet at a point where we can completely trust in silico QC." The popular software packages for HTS data management and QC include CyBi-SIENA from CyBio AG, Jena, Germany; Screener from Genedata, Basel, Switzerland; ActivityBase suite from IDBS, Surrey, UK; and DecisionSite for Lead Discovery from Spotfire Inc., Somerville, Mass. Many pharmaceutical companies also rely on databases and informatics tools developed in-house. "We have a custom database for viewing results and QC," says Merck's Hodder. "It's a large and essential IT component in our operation, responsible for acquiring data from the robots and for data management. We define statistical parameters to determine whether the plate is 'healthy' without human intervention," he says. "The screen data is also linked to compound information." Business decisions: home-made vs. commercial software There are several commercially available software packages for HTS (Activity Base, MDL information Systems, Accelrys, Tripos). 
Business decisions: home-made vs. commercial software

There are several commercially available software packages for HTS (ActivityBase, MDL Information Systems, Accelrys, Tripos). These packages vary in their capabilities and should be examined with the point of view that pieces of the process may be best served by a given package, but that a single package supplying a complete solution is unlikely. The various elements in this case will require front- and back-end application support via their programming interfaces, or APIs (Application Programmer Interfaces). Developing a complete custom package for HTS is a formidable task requiring high levels of cooperation between the various teams that produce and consume information and those that develop the data management tools. Data storage models or objects need to be developed for each of the subsystems, and APIs will have to be developed to permit communication between them. (In this discussion, the term "object" refers to the representation of a set of data as a series of text and numerical variables in computer memory.)

Some critically important properties that HTS calculating and review software should have:
(1) Flexibility. End users should be able to place various control types and choose assay mappings for the calculators from a simple interface. Other specifications needed for proper calculations should be supplied to the calculator from predefined prototypes. Caution should be exercised in permitting too much flexibility in the calculating software, as the automated processes themselves are usually the primary reason for limiting flexibility.
(2) Graphical user interface. Numerical trends are best perceived using graphs of the data. If possible, incorporate interactive graphical tools for labeling points, navigating through the plates and subsetting the data.
(3) Speed and ease of use. The HTS laboratory is a hectic place; any tool used by screening personnel should be fast, intuitive to use, and robust. Wherever possible, precompute variables and store them in RAM to allow efficient random navigation through the data, and drive choices from list controls rather than free-text edit fields. Storing the configuration of the calculator settings with the data object generated by the calculator allows users to correct for minor changes in the protocol (e.g. the location of the controls was accidentally reversed) at run time, as sketched below.
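To illustrate that last point, the sketch below stores the calculator configuration (the plate-map prototype identifying control wells, and the normalization rule) inside the result object itself, so that a mistake such as reversed control columns can be corrected by editing the configuration and recalculating. All class names, well positions and the simple percent-inhibition formula are illustrative assumptions, not a description of any particular product.

    from dataclasses import dataclass, field
    from statistics import mean
    from typing import Dict, List

    @dataclass
    class CalculatorConfig:
        """Plate-map prototype chosen by the screener from a predefined list."""
        zero_inhibition_wells: List[str]    # uninhibited (high-signal) controls
        full_inhibition_wells: List[str]    # fully inhibited / background controls
        normalization: str = "percent_inhibition"

    @dataclass
    class PlateResult:
        barcode: str
        raw: Dict[str, float]               # well -> raw signal from the reader
        config: CalculatorConfig            # stored with the data it produced
        normalized: Dict[str, float] = field(default_factory=dict)

        def recalculate(self) -> None:
            """(Re)compute normalized activity from raw data and stored config."""
            hi = mean(self.raw[w] for w in self.config.zero_inhibition_wells)
            lo = mean(self.raw[w] for w in self.config.full_inhibition_wells)
            self.normalized = {
                w: 100.0 * (hi - s) / (hi - lo)   # simple percent inhibition
                for w, s in self.raw.items()
            }

    # Hypothetical example: two control wells of each type and two samples.
    cfg = CalculatorConfig(zero_inhibition_wells=["A23", "B23"],
                           full_inhibition_wells=["A24", "B24"])
    plate = PlateResult(
        barcode="PLT0001",
        raw={"A23": 5200.0, "B23": 5100.0, "A24": 300.0, "B24": 320.0,
             "A01": 2700.0, "A02": 900.0},
        config=cfg)
    plate.recalculate()

    # If the control columns were accidentally reversed at run time, the fix is
    # an edit to the stored configuration followed by recalculation; the raw
    # data never needs to be re-read from the instrument.
    cfg.zero_inhibition_wells, cfg.full_inhibition_wells = (
        cfg.full_inhibition_wells, cfg.zero_inhibition_wells)
    plate.recalculate()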
Computing algorithm overview (single dose vs. multiple dose), high content screening

Quality control algorithms and data visualization

Monitoring and analyzing the quality of data in high-throughput screening is becoming ever more complex as volumes increase and turn-around times decrease. Over the last two years GSK has been developing a statistical quality control system to meet this challenge. Key features of this system are: a modular, rules-based approach to defining statistical measures of quality; provision of rules both for the screening process and for individual plates; provision of tools for both real-time and offline analysis of the quality data; early warning of, and alerting to, process problems; and low-impact integration with GSK's existing ActivityBase screening environment. This system, developed with Tessella in 2005, is now being used routinely by screening scientists to analyze screening plates worldwide at GSK, and it has facilitated the application of common business rules for QC across sites when passing or failing plates before publishing HTS data. Through the application of a modular, rules-based statistical quality control system coupled with a sophisticated visualization and data analysis tool, significant improvements in the efficiency of the molecular screening process can be realized.

Data warehousing and data mining

The basic question a compound screen seeks to answer is: what are the true biological effects of a compound? The challenge for current data analysis tools is analyzing multiple readouts for signs of compound activity, specificity and cross-reactivity, in order to distinguish real effects from artifacts. Common questions include:
* Hits: Has this compound been considered a hit in other assays? Was it confirmed?
* Statistics: How often has this compound been tested? How often was it a hit?
* Details: Show me the structure and other detailed information about the compound.
* Screens: Show me every screening result for the compound.

Challenge: target selectivity

With the growing size of compound libraries, it is not uncommon for a screen to yield in excess of 10,000 active hits for a given target. Target selectivity is one of the main criteria in narrowing the leads. Most targets currently in screening are members of larger protein families, such as isozymes or receptor subtypes. Despite the high degree of homology, these protein family members have different functions, making it important for a compound to selectively affect the chosen target. "Very early on, we would like to know how much of a liability target nonselectivity is," says Ralph Garippa, PhD, research leader, cell-based HTS and robotics at Hoffmann-La Roche Inc., Nutley, N.J. "For some targets, you want to develop a pan-inhibitor; you don't care if it also knocks out other members of the protein family. In other cases, the primary side-effect profile arises from the activity of a closely related receptor or enzyme. The project team has to decide before developing the assay whether to bring other family members forward in cloning, expression, and purification or in parallel cell lines," says Garippa. In most screening facilities, it is standard follow-up procedure to run counter screens to determine whether the active compounds from the primary screen hit other related targets, which eliminates "promiscuous" compounds. For example, when screening G-protein-coupled receptor (GPCR) targets, it is common for the disease team to consider fold-selectivity over the "nearest neighbor" that may cause side effects. In functional cell-based GPCR agonist assays, where endogenous GPCRs are present in the given cell line, the specificity of the compound for the target is determined by running the parental cell line in the same assay and eliminating compounds that show activity at endogenous GPCRs in the parental cell line screen. In addition to running counter screens against the target family members, compounds are also profiled across 10 to 50 receptors to determine any other activity the compound may exhibit. Such activity would need to be eliminated, either by dropping the hit or by optimization at the medicinal chemistry stage, to avoid toxic liabilities in the clinical candidate. The more screens are run on the same library, the more historical data on compound profiling is available, indicating each compound's activity in a variety of assays.
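The historical questions listed above, how often a compound has been tested, how often it was a hit, and how selective it appears across related targets, reduce to simple aggregations over a results store. The sketch below assumes an illustrative in-memory list of single-dose results rather than any particular warehouse schema; note that fold-selectivity is normally computed from potency values (e.g. IC50 ratios), so the single-dose activity ratio shown here is only a rough proxy.

    from collections import defaultdict
    from typing import Dict, Iterable, List, NamedTuple

    class ScreenResult(NamedTuple):
        compound_id: str
        assay_id: str
        target: str
        activity: float      # e.g. percent inhibition at the screening dose
        is_hit: bool

    def hit_statistics(results: Iterable[ScreenResult]) -> Dict[str, Dict[str, int]]:
        """Per-compound counts: how often tested, how often flagged as a hit."""
        stats: Dict[str, Dict[str, int]] = defaultdict(
            lambda: {"tested": 0, "hits": 0})
        for r in results:
            stats[r.compound_id]["tested"] += 1
            stats[r.compound_id]["hits"] += int(r.is_hit)
        return dict(stats)

    def fold_selectivity(results: List[ScreenResult], compound_id: str,
                         primary_target: str) -> Dict[str, float]:
        """Activity on the primary target divided by activity on each related
        target tested for the same compound (a rough, illustrative metric)."""
        by_target = {r.target: r.activity for r in results
                     if r.compound_id == compound_id}
        primary = by_target.get(primary_target)
        if primary is None:
            return {}
        return {t: primary / a for t, a in by_target.items()
                if t != primary_target and a != 0}

    # Hypothetical example: one compound tested on a target and its neighbor.
    results = [
        ScreenResult("CMPD-1", "ASSAY-7", "KinaseA", 85.0, True),
        ScreenResult("CMPD-1", "ASSAY-9", "KinaseB", 12.0, False),
    ]
    print(hit_statistics(results))
    print(fold_selectivity(results, "CMPD-1", "KinaseA"))   # ~7-fold vs KinaseB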
Knowledge sharing and "Google"

Web 2.0 technologies can be used for effective knowledge management and real-time collaboration. In 2006, for example, Nastech launched a project-based, user-driven collaborative resource on its intranet to meet the growing need to capture institutional knowledge in a structured format. The effort involved developing a standardized ontology for the description of Nastech assays, and Web 2.0 development methodologies proved advantageous for the rapid development of a user-driven collaborative workspace (Jeremy Thompson, Development Manager, Research Information Services, Nastech Pharmaceutical Company, Inc.). Key aspects included:
* the underlying Nastech informatics infrastructure
* the challenges of developing and maintaining a standardized ontology
* the use of Web 2.0 development methodologies to allow scientists to dynamically interact with their experimental data in real time
* a decentralized approach to storing and managing the various internal and external data silos within an adaptive framework, allowing disparate data sources to be seamlessly integrated.

Enterprise-wide management of scientific data is equally important. Effective scientific collaboration is essential for enhancing R&D productivity, and it is as important within a company as it is between strategic partners. It requires the alignment of data handling and project management systems on a global scale, yet with minimal complexity, and it needs to bridge the inherent differences between individual R&D components or "silos", an aspect identified by the FDA's Critical Path initiative. Teranode, for example, is addressing this with technologies based on the next generation of the Web, which are able to address the scale and diversity within pharmaceutical organizations, particularly in the context of translational research (Eric Neumann, PhD, Senior Director Product Strategy, Teranode Corporation). Desirable capabilities include:
* universal access to all information resources
* creation of annotations between any set of data and documents
* mapping to legacy data systems
* dynamic inclusion of new data types, including biomarkers and genotypic profiles
* security and access control
* the ability to add and use a controlled vocabulary across all groups
* support for powerful query engines and inference agents.

The broader information management and data analysis landscape for HTS spans data acquisition; data analysis expectations and strategies; error detection and correction; normalization and data condensing; data standardization; statistical analysis; and binning and pooling. The underlying statistical concepts include random and systematic error, bias, type 1 and type 2 error, sample number (N), signal-to-noise and signal-to-background ratios, limit of detection, precision and accuracy, standard deviation, coefficient of variation, resolution, residual analysis (ordinary least squares), the Z' factor, and software for automated data analysis.
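Several of the quantities in the outline above have simple, standard definitions. The sketch below collects a few of them (signal-to-background, signal-to-noise and the 3-sigma limit of detection) for reference; the definitions chosen are common conventions rather than the only ones in use, and the example numbers are hypothetical.

    from statistics import mean, stdev
    from typing import List

    def signal_to_background(signal: List[float], background: List[float]) -> float:
        """S/B: ratio of the mean signal to the mean background."""
        return mean(signal) / mean(background)

    def signal_to_noise(signal: List[float], background: List[float]) -> float:
        """S/N, by one common definition: difference of the means divided by
        the standard deviation of the background."""
        return (mean(signal) - mean(background)) / stdev(background)

    def limit_of_detection(background: List[float]) -> float:
        """LOD, by the common 3-sigma convention: mean background plus three
        background standard deviations."""
        return mean(background) + 3.0 * stdev(background)

    # Hypothetical control readings from a single assay plate.
    signal_wells = [5200.0, 5100.0, 5350.0, 5280.0]
    background_wells = [310.0, 290.0, 335.0, 300.0]
    print(signal_to_background(signal_wells, background_wells))
    print(signal_to_noise(signal_wells, background_wells))
    print(limit_of_detection(background_wells))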