How the World Bank built an enterprise taxonomy -- a story with a happy ending
Denise A. D. Bedford, Ph.D.
Senior Information Officer
World Bank
ASIST Potomac Valley Chapter presentation
November 19, 2003

Storytelling
• I’m going to use a traditional Knowledge Management tool tonight to tell you how we built our enterprise taxonomy – storytelling
• My goal in using this approach is to illustrate the technical, information architecture and the social aspects of such an undertaking
• It will also allow me to speak to some of the critical foundation elements and milestones in the process
• It would not be truthful for me to tell you a story about how one day we defined our enterprise-taxonomy, and the next day we all lived happily ever after!
• I’d like to take you back to the world of medieval fiefdoms – many systems, many rules, different sets of laws, different languages and grammars

Once upon a time
• We had many different financial systems, multiple document management systems, 100’s of searchable resources, and a number of gaps in coverage of our information assets
• Then a wise and foreseeing Chief Information Officer and President helped us to establish a stable, standard institutional platform for our institutional collections (…our modern day Alexander the Great)
• This meant that instead of having multiple financial systems, human resource systems, and document management systems, we had one to suit each function (…first thoughts of unification arise…)
• And, the wise counselors advised them to select systems that functioned on a common operating system - Oracle (…we agree to talk to establish lines of communication and send ambassadors)
• Enterprise begins to think of systems at an ‘enterprise’ level – this is a crucial organizational culture aspect to implementing an enterprise taxonomy

Consolidation of Business System Fiefdoms
• Before the dawn of the Knowledge Age, we had many different business systems
• Each business system had its own (or no…)
metadata, classification schemes, indexes, search systems…
• When we standardized our primary business systems, we merged those different taxonomies into enterprise taxonomies
• In this first step, we still had multiple business systems, but one per business function

Laying Out the Information Empire
• Once we had established a common communication foundation, the people in those different fiefdoms began to talk to one another and a cultural change began to occur
• The idea of having ‘one’ business system to support a business function was accepted by the masses
• Now we find we have many different kinds of taxonomies – accounting structures, business functions/process/task taxonomies, product taxonomies, taxonomies of job classes, skills taxonomies, organizational taxonomies, personnel profiles, etc.
• We built taxonomies in these business function systems as we were implementing them - designed to suit business functions and the people who were administering the systems, not necessarily end users
• We start to understand the importance of usability and end-user training

From Business to Information Systems
• Then a wise counselor (information architect) had a vision of a common enterprise document management system
• When we began looking for such a system, though, the commercial products were not up to snuff in terms of our requirements
• We developed our own in-house system – some portions of which used the common foundation and some of which did not
• The wise counselor had another vision of an integrated enterprise information system that would support a single point of access to all the information within the information empire
• This was the spark that set the goal for an integrated enterprise architecture and taxonomy, though we were not sure we could actually achieve it

Document Management Systems
• Document management system was like a cathedral that held the church network together – smaller churches represented the units contributing to the system
• Document management system
architecture was a little bit different, though
• Took many years to convince the little churches to send their offerings to the cathedral so they could become part of the larger network
• Each church could maintain its own filing structures, which served the creators, not the users
• Eventually they agreed to use a common prayer book – common filing structure
• Churches can speak different languages but they all have to be able to communicate
[Figure label: Monasteries]

Document vs. Information Management Systems
[Figure label: Distribution]
• Caution here – goals of document and records management systems are to store and preserve information from the perspective of those who created the information
• End user access is not a primary goal of these kinds of systems
• Taxonomies that you put in place for these kinds of systems don’t necessarily serve end users’ needs
• Kinds of taxonomies – organization filing structures, record series for retention & dispositioning, economic sector and impact categories; some minimal metadata is beginning to emerge, though
• These taxonomies serve filing and storage goals, not the information access goal of our enterprise taxonomy

Renaissance – Creativity Explodes
• While we were making good progress in synchronizing different kinds of taxonomies in all of these business areas, a creative renaissance of knowledge creation and sharing began
• In about 1997, we launched a knowledge management initiative, using Lotus Notes databases to support collaboration and document libraries
• Knowledge management was a cultural change in itself – creativity of organizational units was encouraged and heightened
• It was a very important source of cultural change within the institution – beginning of a transformation to a learning organization
• It meant that the masses could become interested in taxonomies

Renaissance – Creativity Explodes
• Proliferation of writing, publishing and organizing of information
• Déjà vu all over again – creativity took the form of user-defined
metadata, publishing and navigation taxonomies
• These taxonomies were different from any of the taxonomies we had seen before – reflected the new thematic structure of the KM organization
• In some respects there was more confusion because people were talking about different kinds of taxonomies but trying to fit them into the same structures
• We began some internal QuickStart educational sessions on metadata, taxonomies, search, semantic web, etc. to provide a framework

Popular Information Revolution
• So now we have several business process systems, a decentralized document management system, knowledge management system – and there is a popular uprising – the web
• Many web towns are created - 100s of web sites, 1000s of web pages
• No central coordination of virtual villages
• Too many different places to go to look for information – going back to the medieval monastery network systems
• Masses begin to surface their discontent with the quality of access and the quality of information that is being published
• Realization among the masses that not all of the quality information assets are electronic or publicly available

Popular Information Revolution
• Begins to look like the Dark Ages again - no profiles, no taxonomies, no controlled vocabularies or values
• Different systems have different profiles, different taxonomies, controlled vocabularies or values, indexes, search systems
• We start to see information pollution – alchemists and court jesters come back onto the scene – advocating a magical approach to discovering the enterprise architecture
• But, we didn’t give up – we kept working on the components of the infrastructure in the background
• We knew that the day would come when they would be needed – and that day came

Rationalism & Enlightenment
• Wise counselor returns to bring back sense of rationalism and enlightenment
• Counselor commissions a synthesis of content types across systems, standard metadata scheme, and the rejuvenation of the World Bank
Thesaurus
• Content of the information is what we focus on for integration
• Information architecture then derives from our kinds of content
• Synthesis and integration work outside of existing systems, but leverages all the work that is done within the business systems
• Metadata is the central structure (faceted taxonomy)
• Reference sources for each facet support the governance and quality control (flat, hierarchical and network taxonomy structures)

Scientific Revolution & Industrialization
• About this time, the visionary counselor begins to lay the groundwork for a superhighway connecting all information systems – using the integrated enterprise taxonomy as a blueprint
• Content type proposal – enterprise-wide review of kinds of information is completed and accepted by Information Architecture Committee
• Establishment of Bank standard metadata – deriving from existing metadata across systems
• Long-term search strategy proposed and submitted to Information Architecture Committee
• Simplified Enterprise Taxonomy for topics is formed – looking across all systems and looking to the systems that are used by our partners

Space Travel - Portals
• The wild and crazy growth of the external website of the Bank, as well as the need to create a new internal web services platform, raised awareness of the value of an integrated enterprise taxonomy
• You need some predictability in the source and target systems before you can syndicate content from an SAP BW cube, a newsfeed source, a DM system, an RM system, Archives, and the InfoShop to a project portal or to a personal portal; they all need to have a common point of reference
• The portal team tried the vendor’s suggested approach – create and implement simple new hierarchies and use them throughout the portal
• The enterprise taxonomy actually becomes the technical and information infrastructure of the portal – metadata repository, global navigation bars, …
• Taxonomies also now must be an integral part of the content that
you are creating in the portals and in the systems that provide content to the portals

Back to Communications
• Vision of a whole-Bank search – one place to go to find information in any of the Bank’s systems, speaking any of the languages of our clients
• Vision involved having a search engine that spoke the Bank’s business language and the languages of our clients – another kind of taxonomy
• We had a print-based ‘topical’ thesaurus which needed to be updated and expanded to reflect the Bank’s business in 2000 (moved this from 10,xxx terms in 1997 to 92,xxx in 2003)
• At the same time, the Translations Department was implementing a new parallel translation system which leverages multilingual and cross-language glossaries
• Translations Department glossaries focus on business functions, WB Thesaurus focuses on topics – integration and cross-population now in progress

Transparency
• Policy on Information Disclosure (2002) approved by the Board of Executive Directors required that we:
– develop a metadata-based, cross-system Catalog to surface disclosed and disclosable documents for the external public user
– put in place a system that would support the capture and tracking of disclosure requests in the future and record changes in disclosure status
• This effort funded the first release of whole-Bank search
• Disclosed and disclosable documents lived in all of those systems above and were not tagged with their disclosure conditions or status
• In order to deliver WB Catalog, we had to integrate all of those taxonomies described above as well as the long-term search strategy

Information Universe
• Let’s jump to the 21st century – Enterprise Content Architecture and Enterprise Content Management
• All those taxonomies we worked on for the past 15 years are now integral components of the enterprise content architecture
• We’re finding that these taxonomies are critical to efficient and effective use of portal technologies
• Allows us to shift the focus to information content,
metadata management, taxonomies, search, access, security, disclosure…
• Now the impetus is to bring them all under central control so that they can be managed and used by systems across the enterprise
• Let’s see what the enterprise taxonomy looks like today, its content, how we maintain and manage it

Information Universe
• We realize that we really do want to work and travel in a 21st century universe of information
• Space travel is not magical, but is based on good engineering and maintenance
• Managers need to understand that quick fixes and solutions do not result in sustainable systems, but rather result in significant investment losses
• A multi-dimensional design approach supports flexibility, extensibility, and customization
• We can view our information universe from several different perspectives
– Individual systems landscape
– A technical architecture landscape
– User’s view of the enterprise taxonomy
– An information architecture landscape
• All of these views make up our Enterprise Content Architecture and allow us to move to the next step – Enterprise Content Management

Systems Architecture
[Diagram; labels in slide order]
• Site Specific Searching; Publications Catalog; World Bank Catalog/Enterprise Search; Recommender Engines; Personal Profiles; Portal Content Syndication; Browse & Navigation Structures
• Metadata Repository of Bank Standard Metadata (Oracle Tables & Indexes)
• Reference Tables: Topics, Countries, Document Types (Oracle data classes)
• Transformation Rules/Maps
• Data Governance Bodies
• Metadata Extracts: Doc Mgmt System; PeopleSoft; JOLIS Metadata; InfoShop Metadata; SAP Financial System; Web Content Mgmt. Metadata
• Concept Extraction, Categorization & Summarization Technologies

Technical View of the Enterprise Architecture
[Diagram; labels in slide order]
• Content Contributor; End User; Content Systems
• Metadata Management and Security Services
• DELIVERY: access rules; ePublish
• Content Access Services: view; multilingual srch; search; syndication; browsing; notification; …
• Content Management Services: retention schedule; PDS; workflow; create/del.; check in/out; versioning; declare; classification; reference data; taxonomy; thesaurus
• Content Integration and Archives Services: relate; Connector; Concept extraction; rules evaluator; harmonize; Adapter; data dic.; monitors; Archives Store; logs; Over Time
• Repositories Services: Documents, Images, Audio, Data records; Metadata warehouse
• Business Systems: SAP (R/3, BW); PeopleSoft; Notes / Domino; iLAP

User’s View of the Enterprise Taxonomy
[Figure]

Information Architecture
[Diagram labels: Title; Author; Keyword; Content Type; Topics; Bus. Activity; Format; Disclosure]

Bank Standard Metadata by Purpose
Identification/Distinction | Search & Browse | Use Management | Compliant Document Management
Agent | Country | Authorized By | Record Identifier
Title | Region | Rights Management | Disposal Status
Date | Abstract/Summary | Access Rights | Disposal Review Date
Format | Keywords | Location | Management History
Publisher | Subject-Sector-Theme-Topic | Use History | Retention Schedule/Mandate
Language | Business Function | Disclosure Status | Preservation History
[Remaining elements, column placement not recoverable: Disclosure Review Date; Aggregation Level; Version; Series & Series #; Content Type; Relation]

Taxonomies in Action
• Metadata in Fielded Search – Faceted Taxonomy
• Topics Taxonomy – Shallow Hierarchy
• Business Activity Taxonomy – Deep Hierarchy
• Organizational Taxonomy – Faceted Taxonomy
• Country – Region Taxonomy – Hierarchy
• Thesaurus in Search – Faceted Taxonomy
• Disclosure Status – Flat Taxonomy

Top Tier Content Type Examples
• Documents in IRIS, ImageBank, IRAMS…
• Data in BW, DEC SIMA queries in central, regional & agency databases, CDF indicators, GDF data reports, …
• Publications in JOLIS, Office of Publisher, Thematic Group databases…
• Communications in External Affairs, Office of President, DEC, IRIS…
• People & Communities in YourNet, PeopleSoft, WBDirectory,…
• Knowledge in Notes databases, Oral History program,…
• Services in WB Yellow Pages, Service Portal,…
• Collections in EIU database, Oxford Analytica

Lessons Learned
• You can change some of the information architecture, but some of it you will have to adapt or map
• Business functions are the most critical for standardizing to a single business taxonomy – the move towards standardization has to come from above
• Map business system taxonomies to enterprise taxonomies - help the business system owners to see the value of being part of an enterprise taxonomy (no value, no buy-in)
• Expect change and be ready to integrate and map, but educate your users to alert you to changes – make it possible for them to work with you
• Do outreach and consciousness raising (QuickStart programs on metadata, taxonomies 101, search engines, semantic engines,…)

Lessons Learned
• Move forward on the end user front while you’re working on the backend – when people can see the actual value they will buy in (now no one wants to be left out of the WB Catalog – we created it, so they are coming)
• Have to have a goal and a vision – you will never succeed at creating an enterprise taxonomy if you don’t know why you’re doing it
• We are putting in place an enterprise architecture based on well-defined and managed taxonomies that are used within and by internal systems
• This gives us flexibility to build different products and views for end users, while internally managing our information assets
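
To make the lesson about mapping business system taxonomies to an enterprise taxonomy concrete, here is a minimal sketch of what a transformation map might look like. It is only an illustration under stated assumptions: the system names come from the slides, but the local terms, the enterprise topic labels, and the names `TOPIC_MAP`, `MappingResult` and `map_terms` are all hypothetical and do not describe the Bank's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical transformation map: (source system, local term) -> enterprise topic.
# System names appear in the slides; every term and topic below is invented.
TOPIC_MAP = {
    ("JOLIS", "Rural water supply"): "Water Supply and Sanitation",
    ("InfoShop", "WATSAN"): "Water Supply and Sanitation",
    ("PeopleSoft", "Educ-Primary"): "Education",
}


@dataclass
class MappingResult:
    """Outcome of mapping one system's local terms to enterprise topics."""
    mapped: dict = field(default_factory=dict)    # local term -> enterprise topic
    unmapped: list = field(default_factory=list)  # terms needing human review


def map_terms(system, local_terms, topic_map=TOPIC_MAP):
    """Look up each local term; collect terms with no mapping for review."""
    result = MappingResult()
    for term in local_terms:
        topic = topic_map.get((system, term))
        if topic is None:
            result.unmapped.append(term)
        else:
            result.mapped[term] = topic
    return result


result = map_terms("InfoShop", ["WATSAN", "Microfinance"])
print(result.mapped)
print(result.unmapped)
```

Keeping unmapped terms in a separate list, rather than dropping them, is one way to support the "expect change" lesson: new local terms surface for review instead of silently disappearing.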