C/IL102Lab Fall 2003

Laboratory: Searching the Web

There are tens of millions of pages on the World Wide Web, and many of them contain useful information. In this lab we will learn how to find things on the web.

There are two sources of web information: 1) search engines and 2) directories. A search engine is a system that contains a “crawler”. A crawler is a program that automatically examines numerous web pages and obtains information from them. A directory is also a listing of information, but its entries are usually sent in by a person and evaluated by a committee of people. When you use either one of these systems, your query is compared to the information in its database, and possibly relevant sites are listed for you to examine.

There are several major sites that provide either search engines or directories. Most of these are free to use. Since they are free, the providers of these services must make money some other way in order to stay in business. Often what they do is give certain sites preferential treatment. This means that search results are often biased. As you become a veteran web user you will find sites that you trust more than others. You may also find certain sites that are good for some searches but not for others.

The following are among the largest and most famous:

AltaVista http://www.altavista.com/
Excite http://www.excite.com/
Google http://www.google.com/
HotBot http://www.hotbot.com/
Infoseek http://www.infoseek.com/
Lycos http://www.lycos.com/
Northern Light http://www.northernlight.com/
Yahoo http://www.yahoo.com/

Yahoo is a directory with a search engine; all the others are search engines. For this lab we will concentrate on Yahoo and AltaVista. That choice is mostly arbitrary and has nothing to do with the merits of the search engines. My advice to you is to experiment with various search engines until you find the one that suits your personal taste.
AltaVista is the basic search engine that is incorporated into the University of Scranton's official web page. My personal choice for search engines is Google.

Another kind of search technique is to use a metacrawler. A metacrawler is a system that uses several search engines and directories simultaneously and then arranges the results of the search for your use. Among the most interesting search engines are:

metacrawler http://www.metacrawler.com/ (the original)
Dogpile http://www.dogpile.com/ (finds everything, but do you want 467589 references to a topic?)
Ask Jeeves http://www.askjeeves.com/ (an interesting search refinement tool which has, in my opinion, become the victim of too much success)

We will use metacrawler to get experience using one of these methods.

Preliminary: There is a Word document that you will use to hold the answers for this assignment. The name of the document is "web-srch.doc" (without the " marks). It is located in F:\cil-lab.

Double click My Computer.
Double click Drive F.
Double click Web-srch. This will run Word using Web-Srch.doc as the data file.
Insert your name, the lab section, and the date.

You will be copying information from various web pages and pasting it into the Word document as the lab progresses. You will also type information into the sheet when instructed to do so. You will turn in a printout of this document. Answers must be typed on the answer sheet. Handwritten answers on the answer sheet will NOT be graded.

Exercise 1. You are building a web page and would like to have some interesting graphics for the page. You would prefer them to be free. Where can you find them? Search the Web.

To compare the various web search techniques, we will use two of the major search tools with a search request of free web page graphics. We want the search to include each of these words. That is, we don’t want to find articles on spider webs or congressional pages or the free market economy. So we need a search that requires all of these words together.
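The idea of requiring every word at once can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine works: the function name matches_all and the four page descriptions are made up for this example.

```python
def matches_all(text, keywords):
    """Return True only if every keyword appears as a whole word in text."""
    words = set(text.lower().split())
    return all(k.lower() in words for k in keywords)


# Hypothetical page descriptions, invented for illustration only.
pages = [
    "free web page graphics and buttons",
    "how a spider builds its web",
    "congressional page program",
    "free market economy explained",
]

keywords = ["free", "web", "page", "graphics"]

# Keep only the pages that contain ALL four keywords.
hits = [p for p in pages if matches_all(p, keywords)]
print(hits)
```

Only the first description survives the filter. Each of the other three contains just one of the four words, which is exactly the kind of unwanted match the exercise is trying to avoid.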
The concept that we are using here is to combine keywords using AND. Some search engines will let you just type free web page graphics, while others will demand something like +free+web+page+graphics or “free web page graphics” or free AND web AND page AND graphics. Note that some search engines are very particular. Typing +free + web + page + graphics into AltaVista used to give over 600,000 hits before changes were made to their search engine. Typing +free+web+page+graphics (without the spaces) gives 760 hits. The moral of the story is that you have to know your search engine and its quirks to search the web effectively.

Use AltaVista and Yahoo to find out about free web page graphics and answer the questions on the answer sheet. A row for AltaVista has already been filled out based on data obtained in October 2003, but you should fill it out again based on current data. Find one of the pages that looks like it would satisfy your needs, then copy its information and paste it into your Word document as described below.

In general, to perform a search do the following:

Start Netscape Navigator.
Click on File.
Select Open Page.
Type the URL in the box, for example: http://www.google.com/
Click Open.
When you get to the page, follow the instructions.

FOR TODAY'S ASSIGNMENT DO THE FOLLOWING AFTER LOADING NETSCAPE

Choose the AltaVista search engine as described above. It is at http://www.altavista.com.
Type free web page graphics into the search box and press the search button.
When the list comes up, note the number of "hits", bring up the Word document, and type the number in using the same format given in the row above. This number is found in the list below the “sponsored” matches.
Go back to Netscape.
Choose the site for Pambytes graphics. From that page choose buttons, and from the buttons page choose oval buttons.
At the oval buttons page, place your mouse pointer into the location box and click.
This should highlight the URL.
Press CTRL-C to copy the URL to the computer's memory.
Bring up the Word document, move the cursor to the line below where you typed the answer to the previous question, and press CTRL-V, which should paste the URL of the web page into your answer sheet.

Now go to the Yahoo search engine at http://www.yahoo.com and repeat the process.

At the Yahoo search engine, click on advanced search to the right of the search button.
On the options page, choose as follows: Yahoo!; type free web page graphics into the "include all words" box; English; any country; past 6 months.
Choose search.
Type the number of hits into your Word document, then go to the second page suggested and copy its link into your Word document.

Exercise 2. YOUR CHOICE HERE.

We will finish up by using the Web to find information that one doesn’t normally associate with web pages. We will use searching techniques to find businesses and to find addresses, driving directions, phone numbers, and e-mail addresses for people.

Exercise 3. We will find a restaurant. I am going to a convention in Minneapolis, MN, and would like to eat at an Indian restaurant. I don’t happen to have a phone book for Minneapolis available, so you will need to use an online yellow pages service to search for an Indian restaurant. Most search engines have yellow pages available from them. However, we will pick one of the standard yellow pages sites. Two of them are bigyellow and bigbook. Connect to http://www.bigbook.com/ and find me an Indian restaurant in Minneapolis, MN (name, address, and phone number). Enter the information into your Word document in the indicated place.

Exercise 4. I would like to take a trip to Niagara Falls this weekend for a short holiday. What is the driving mileage from here to there, and what is the mileage back? Hint: Go to a search engine such as http://www.google.com/ and search for “road maps”.
One of the first choices should be MapQuest, probably the best site for finding mileages and routes. Go there and choose Driving Directions. Enter the appropriate information to find a route from 800 Linden St., zip 18510, the official address of the University of Scranton, to Niagara Falls. Record the mileage on the answer sheet. Now find the mileage from Niagara Falls back to the University and record it on the sheet.

Exercise 5. Find out what you can about people from the web. See what you can find out about the following people: me (Dr. James R. Sidbury), Dave Rhodes, Gordon Sinclair, Amy Bruce, Craig Shergold. For me, find my address, phone number, and web presence. For each of the others, explain why they are famous enough to be asked about in this question.