The Electronic Library and other means of finding information on the web
Patrick & Michael Krolak
See: Week 4 in Intralearn or http://www.cs.uml.edu/~pkrolak/lab18/lab18.html

The Electronic Library
E-libraries are the electronic front doors to the modern physical library.

What is the e-library
• The e-library resides on the Internet and allows the user access to the library’s catalogs and electronic collections (24 hours a day, 7 days a week).
• The user can use it to locate books in the public access catalog.
• Access electronic books and journals.
• Access newspaper archives such as the Boston Globe, Boston Herald, etc.

Electronic Journals and Newspaper Archive
• Most newspapers charge for articles from their archives. The major state newspaper archives are available through the e-lib.
• The Lexis Nexis sites have current business information, legal cases, and bios.
• Electronic journals can provide abstracts and, in some cases, the full article.

Special Electronic Reference Materials
Every library, whether public or personal, should have a readily available collection of materials that contain useful facts and data. Reference materials can be a quick and accurate means of developing marketing and business plans and of determining business law and custom, rates of exchange and interest, holidays, etc. The dictionary can provide useful insights even for the scholar.

Example of References
• Dictionaries, thesauri, and similar word-oriented materials:
  – No college student, researcher, or writer should be without an online collection of style guides, dictionaries, and related reference materials.
  – Dictionary, English Style Guides, and Reference Materials for the Writer
  – Roget's Thesaurus
  – Online dictionaries of over 230 languages, thesaurus, and other useful word and language tools, by yourdictionary.com http://yourdictionary.com/
• Quotations:
  – An extensive online reference for quotes, a dictionary with word pronunciations, etc.
    http://bartleby.com/

The Browser’s Bookmark and Favorites Lists
• Bookmarks and favorites lists are a simple means of creating a set of useful reference links…
• They can be used to link personal sites, frequently visited sports sites, and business sites.
• Travel and entertainment

Public and University e-libs in Massachusetts
• The Boston Public Library, http://www.bpl.org/
• The CLAMS library network for Cape Cod: http://www.clamsnet.org/
• The Minute Man Library Network: a consortium of libraries in the Eastern Massachusetts region, made up of Route 128 suburban public libraries and academic institutions.
• Telnet and web links to Eastern Massachusetts academic institutions.

The Library of Congress
• The Library of Congress (LOC) is the federal government's main library and is accessible electronically. The following are some important links to the LOC:
  – Library of Congress Online Catalogs, http://lcweb.loc.gov/catalog/
  – Thomas, the Congressional legislation, record, and other information, http://thomas.loc.gov/
  – Library of Congress, Collection & Services, http://www.loc.gov/library/

LOC online catalog of its holdings
• LOC attempts to have a copy of every book published.
• It has one of the largest collections of books and other media in the world.

The Virtual Library
The virtual library is an online web site created by professional librarians for the professional researcher.

What is the Virtual Library
• It is a collection of sites and materials evaluated for their content and usefulness and designed to help the specialist find material efficiently.
• The virtual library is a catalogue of special collections of data, documents, and research and reference tools for information found in web sites.
• It does not have a physical library of books but has only electronic documents.
• Finally, virtual libraries are also called Gateways, Information Portals, and Cyber libraries.

Examples of Virtual Libraries
• The WWW Virtual Library was started by Tim Berners-Lee and a large group of volunteers soon after the WWW was created. While extensive, it is not uniformly maintained. http://www.vlib.org/
• The University of Michigan's School for Information, http://www.ipl.org/: it seeks to learn by doing, using the faculty and graduate students to create a state-of-the-art information system, an online collection that will grow and follow leading-edge practices.

Special collections and databases
• Contain information about specific topics. Librarians and research specialists examine and select material based on relevance, organize it into databases, and/or create index catalogs. Scientific data, presidential documents, patents, laws, and legal precedents are examples of special collections.
• Social Science Data Archives (an extensive annotated survey of global data) http://www.spc.uchicago.edu/SocialClass/archives.html
• IBM Intellectual Property Network http://www.patents.com/ This site can provide a good starting point, but patents have many legal and technical issues that require the use of research professionals and highly specialized databases.
• US Patent Office: all US patents since 1790

Google’s Library Project
The library project is part of Google’s Print Division and intends to digitize millions of books and periodicals and put them online as part of its search engine, working with publishers and major university libraries.
Source: http://www.eweek.com/print_article2/0,1217,a=140962,00.asp

From Google’s Point of View
• What is the Library Project? Google Print makes offline information searchable. As part of this project, we're now working to index the book collections of several major research libraries and make this content searchable through Google Print alongside books provided by publishers through our Publisher Program.
Goals of the Library Project
What is the goal of Google Print for Libraries? This project's aim is simple: make it easier to find relevant books. We hope to guide more users to books – specifically books they might not be able to find any other way – all while carefully respecting authors' and publishers' copyrights. Our ultimate goal is to work with publishers and libraries to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers find new readers.
Source: http://print.google.com/googleprint/library.html

Participating Libraries
Each participating library has between 7 and 15 million volumes.
1. Stanford University (entire collection)
2. Oxford University (all volumes from 1900 and before)
3. University of Michigan (entire collection)
4. Harvard University (pilot study of a subset of the collection)
5. New York Public Library (pilot study of a subset of the collection)

Strategy based on Copyright laws
Three classes based on copyright status:
1. Books out of copyright will be placed online in their entirety.
2. Books in copyright with a publisher partnership will be placed online in their entirety.
3. Books in copyright with no publisher partnership will be given a short reference.

Progress Report
The Google library project is moving forward but has encountered legal issues on intellectual property rights.

E-books

The British Library
One of the great libraries of the world has opened its collections of rare books to be viewed over the web. The site offers digital images of the original volumes, with resource links for the scholar and the general public.
Source: http://www.bl.uk/index.shtml

The Treasures of the British Library
• Scan the digital images of the original documents of the British government
• Examine digital images of rare books that changed the world and its literature, i.e. your own private tour of the rare books room. http://www.bl.uk/treasures/treasuresinfull.html

Searching for Information on the web
1.
Search engines
2. Meta-Search engines

Search Engines
Search engines have two parts:
1. The search, which sends out onto the Internet a piece of software called a spider or bot (robot).
   • The spider traces all the links and returns all the pages found.
   • The pages are characterized by algorithms and stored in databases.
2. The retrieval system, which takes a query and maps it against the databases.
   • The retrieval system rank-orders the responses by relevance.
   • Each search engine uses a unique technique for retrieval and ranking.

Meta Search Engines
• Meta search engines are search engines that may use their own resources to answer a question, but mostly they form the query from the user input, package it, and send it to many other search engines simultaneously (a process called spawning), then wait for the replies to come back.
• After a fixed time, the meta search engine takes the responses received and pulls them together into a report.
• There are many ways to create a meta search engine based on this idea. Some allow you to search only the web; others search newsgroups, newspapers, and scientific journals.

Meta Search Engines
• Ask Jeeves -- frequently gets the answer in the first pass. Jeeves allows queries in natural language.
• Dogpile -- for its variety of sources (web, newsgroups, newspapers)
• Ixquick
• Metacrawler
• ProFusion

Why is an understanding of how a search engine works important?
• From the view of a user:
  – The user wants to find the information with as few downloads as possible.
  – The easier to use and the more accurate the ranking, the better.
• From the view of a web site developer:
  – The developer wants the site to be found in the first 5-10 ranked responses to a query.
  – The merit of a web design is often based on the search rankings. This requires knowledge of how a given search engine ranks a page.

Using Search Engines
Forming successful queries using Boolean logic:
• When searching a large-scale database, it is important to be extremely precise in characterizing the query.
• Avoid using vague or common words that will only produce millions of pages with little to do with the subject at hand.
• Boolean logic is based on a proposition being either true or false. Boolean logic has several operators, among them AND, OR, and NOT, that can be used with propositions to determine the truth value of a complex expression.
• The AND operator, written a AND b, is true only when both propositions a and b are true, and false otherwise.
• The OR operator, written a OR b, is true when either a or b is true or when both are, and false when both are false.
• The negation of a proposition a, written NOT a, is true when a is false and false when a is true.

How do we apply Boolean Logic to refine our search to only cases of interest?
• Most search engines use some form of keywords and Boolean logic to refine the query and improve the efficiency and results of the search.
• If you are looking for a proper name, a phrase, or another collection of words that are normally found together, then enclose them in double quotes, e.g. "President Gerald Ford".
• If the web page must contain one or more words, then use the logical And, e.g. President And Ford And "United States".
• If the web page may have different forms of the name, titles, etc., then use the logical Or, e.g. President Or "Vice President" Or Representative And "Gerald Ford".
• If the document should exclude a word or phrase, then use the logical Not, e.g. "Gerald Ford" Not "Ford automotive" and Not "Ford car" and Not "Ford truck".
• While not Boolean logic, some search engines also allow operators such as NEAR and FOLLOWED BY to indicate the relationship of words or phrases to other words and phrases. Normally these relations specify which comes first or whether a word is within a certain number of words of the first word. This concept is called proximity logic.
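
The Boolean operators and quoted phrases described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine works. The page texts and the helper names (contains, and_, or_, not_) are hypothetical, matching is a plain case-insensitive substring test, and the Or terms are assumed to be grouped together before the And is applied.

```python
# Hypothetical helpers for illustration only; real search engines query
# an index rather than scanning page text like this.

def contains(phrase):
    """Predicate: true when the text contains the phrase (case-insensitive)."""
    needle = phrase.lower()
    return lambda text: needle in text.lower()

def and_(*preds):
    """a AND b: true only when every predicate is true."""
    return lambda text: all(p(text) for p in preds)

def or_(*preds):
    """a OR b: true when at least one predicate is true."""
    return lambda text: any(p(text) for p in preds)

def not_(pred):
    """NOT a: true when the predicate is false."""
    return lambda text: not pred(text)

# Made-up page texts standing in for a search engine's database.
pages = {
    "bio": "President Gerald Ford took office in Washington.",
    "vp": "Vice President Gerald Ford addressed the Senate.",
    "cars": "The Ford truck and Ford car lineup for this year.",
    "plant": "President Gerald Ford once toured a Ford truck plant.",
}

# (President OR "Vice President" OR Representative) AND "Gerald Ford"
#   AND NOT "Ford truck" AND NOT "Ford car"
query = and_(
    or_(contains("President"), contains("Vice President"),
        contains("Representative")),
    contains("Gerald Ford"),
    not_(contains("Ford truck")),
    not_(contains("Ford car")),
)

matches = [name for name, text in pages.items() if query(text)]
print(matches)
```

With these made-up pages, "cars" fails the Or and And conditions, and "plant" is rejected by Not "Ford truck" even though it mentions Gerald Ford, which shows how exclusion terms narrow a result set.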
Not all search engines use the AND, OR, NOT notation; some, like Alta Vista, use "+" for AND and "-" for NOT.

When in doubt ask a librarian:
• The librarian is a trained professional who is well versed in using the various WWW resources for finding answers on a vast array of subjects.
• The librarian should be used for difficult searches, but the student will wisely observe, learn, and contemplate the librarian's techniques, resources, and methods.

Dogpile for finding non-text based files
This material is now largely obsolete. Due to security considerations, most sites do not have non-text directories that are open to search and file download. Dogpile still allows searches for images, audio, and videos.

Evaluating and Using the Information Found
The fact that the information matches your query is not the end of the trail. The material may be totally worthless, deliberately misleading, or even criminal in intent.

Did You Find Gold?
Concepts and questions to consider:
• Who is the author(s) and/or institution that created the document?
  – Does the author's supplied background support the needed expertise?
  – Is the organization reputable?
• Is the information timely?
  – Is the information current, or is it outdated or overtaken by events?
• Who is the target audience?
  – Is it written for the general public, a group with an emotional or political bias, or a professional society?

Using the UMass Lowell E-library