Web Searching - University of Scranton: Computing Sciences Dept.

C/IL102Lab
Fall 2003
Laboratory: Searching the Web
There are literally tens of millions of pages on the World Wide Web, and many of them contain useful information. In this lab we
will learn how to find things on the web. There are two sources of web information: 1) search engines and 2) directories. A search
engine is a system built around a “crawler”, a program that automatically examines large numbers of web pages and collects
information from them. A directory is also a listing of information, but its entries are usually submitted by a person and evaluated by a
committee of people. When you use either kind of system, your query is compared to the information in its database, and
possibly relevant sites are listed for you to examine.
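As a rough illustration of how a query might be compared to a database of crawled pages, here is a minimal sketch in Python. The page texts, the example.com URLs, and the function names are all invented for this example; real search engines are far more elaborate and do not necessarily work this way.

```python
from collections import defaultdict

# Hypothetical pages a crawler might have visited (made-up URLs and text).
PAGES = {
    "http://example.com/a": "free web page graphics and buttons",
    "http://example.com/b": "spider web photographs",
    "http://example.com/c": "free market economy news",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Rank URLs by how many distinct query words they contain."""
    scores = defaultdict(int)
    for word in set(query.lower().split()):
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

INDEX = build_index(PAGES)
print(search(INDEX, "free web"))
```

The page that contains both query words is listed first, and pages that contain only one of them follow as "possibly relevant" results.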
There are several major sites that provide either search engines or directories. Most of these are free to use. Since they are free, the
providers of these services must make money some other way in order to stay in business. Often they do so by giving certain sites
preferential treatment. This means that search results are often biased. As you become a veteran web user you will find sites that
you trust more than others. You may also find certain sites that are good for some searches but not for others.
AltaVista       http://www.altavista.com/
Excite          http://www.excite.com/
Google          http://www.google.com/
HotBot          http://www.hotbot.com/
Infoseek        http://www.infoseek.com/
Lycos           http://www.lycos.com/
Northern Light  http://www.northernlight.com/
Yahoo           http://www.yahoo.com/
are among the largest and most famous. Yahoo is a directory with a search engine; all the others are search engines. For this lab
we will concentrate on Yahoo and AltaVista. That choice is mostly arbitrary and has nothing to do with the merits of the search
engines. My advice to you is to experiment with various search engines until you find the one that suits your personal taste.
AltaVista is the basic search engine that is incorporated into the University of Scranton's official web page. My personal choice for
search engines is Google.
Another kind of search technique is to use a metacrawler. A metacrawler is a system that queries several search engines and directories
simultaneously and then arranges the combined results for your use. Among the most interesting of these search tools are
metacrawler  http://www.metacrawler.com/  the original
Dogpile      http://www.dogpile.com/      finds everything, but do you want 467589 references to a topic?
Ask Jeeves   http://www.askjeeves.com/    interesting search refinement tool which has, in my opinion, become the victim of too much success
We will use metacrawler to get experience using one of these methods.
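To make the idea of "arranging the results" from several engines more concrete, here is a small sketch in Python. It is not how metacrawler or Dogpile actually work; the engine names, the example.com URLs, and the ranking rule (more engines first, then best position) are assumptions chosen only for illustration.

```python
def merge_results(results_by_engine):
    """Combine ranked URL lists from several engines, dropping duplicates.

    URLs returned by more engines come first; ties go to the URL with
    the better (lower) best position, then to alphabetical order.
    """
    votes = {}
    best_rank = {}
    for urls in results_by_engine.values():
        # dict.fromkeys removes repeats within one engine's list, keeping order.
        for rank, url in enumerate(dict.fromkeys(urls)):
            votes[url] = votes.get(url, 0) + 1
            best_rank[url] = min(best_rank.get(url, rank), rank)
    return sorted(votes, key=lambda u: (-votes[u], best_rank[u], u))

# Made-up result lists standing in for two different search engines.
SAMPLE = {
    "engine_one": ["http://example.com/x", "http://example.com/y"],
    "engine_two": ["http://example.com/y", "http://example.com/z"],
}
MERGED = merge_results(SAMPLE)
print(MERGED)
```

In this sample, the page found by both engines is listed ahead of the pages found by only one.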
Preliminary: There is a Word document that you will use to hold the answers for this assignment. The name of the
document is "web-srch.doc" (without the " marks). It is located in F:\cil-lab.
• Double-click My Computer.
• Double-click Drive F.
• Double-click Web-srch. This will run Word using Web-Srch.doc as the data file.
• Insert your name, the lab section, and the date.
• You will be copying information from various web pages and pasting it into the Word document as the lab progresses.
• You will also type information into the sheet when instructed to do so.
• You will turn in a printout of this document. Answers must be typed on the answer sheet. Handwritten answers on the answer sheet will NOT be graded.
Exercise 1. You are building a web page and would like to have some interesting graphics for the page. You would prefer them to
be free. Where can you find them? Search the Web. To compare the various web search techniques, we will use two of the major
search tools with a search request of free web page graphics. We want the search to include every one of these words. That is, we don’t
want to find articles on spider webs or congressional pages or the free market economy. So we need a search that uses all of these
ideas together. The concept we are using here is combining keywords with AND. Some search engines will let you just type
free web page graphics, while others will demand something like +free+web+page+graphics or “free web page graphics” or free
AND web AND page AND graphics. Note that some search engines are very particular. Typing +free + web + page +
graphics (with spaces) into AltaVista used to give over 600,000 hits before changes were made to their search engine. Typing
+free+web+page+graphics (without the spaces) gives 760 hits. The moral of the story is that you have to know your search engine
and its quirks to search the web effectively. Use AltaVista and Yahoo to find out about free web page graphics and answer the
questions on the answer sheet. A row for AltaVista has already been filled out based on data obtained in October 2003, but you
should fill it out again based on current data. Find one of the pages that looks like it would satisfy your needs, then copy its
information and paste it into your Word document as described below.
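The query styles mentioned in Exercise 1 can be generated mechanically from a list of keywords. The sketch below is only an illustration in Python; the function name query_forms and the dictionary keys are invented here, and which form a given search engine accepts is something you still have to find out for yourself.

```python
def query_forms(keywords):
    """Return the same keywords written in several common query styles."""
    words = keywords.split()
    return {
        # Just the words, separated by spaces.
        "plain": " ".join(words),
        # A plus sign in front of each word, with no spaces in between.
        "plus": "".join("+" + w for w in words),
        # The words enclosed in double quotes.
        "phrase": '"' + " ".join(words) + '"',
        # The words joined by an explicit AND.
        "and": " AND ".join(words),
    }

FORMS = query_forms("free web page graphics")
for name, text in FORMS.items():
    print(name + ": " + text)
```

Trying each form in the same search engine and comparing the number of hits is a quick way to learn that engine's quirks.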
In general, to perform a search, do the following:
• Start Netscape Navigator.
• Click on File.
• Select Open Page.
• Type the URL in the box -- for example: http://www.google.com/
• Click Open.
• When you get to the page, follow the instructions.
FOR TODAY'S ASSIGNMENT DO THE FOLLOWING AFTER LOADING NETSCAPE
• Choose the AltaVista search engine as described above. It is at http://www.altavista.com.
• For AltaVista, type free web page graphics into the search box and press the search button.
• When the list comes up, note the number of "hits", bring up the Word document, and type it in using the same format given in the row above. This number is found in the list below the “sponsored” matches.
• Go back to Netscape. Choose the site for Pambytes graphics. From that page choose buttons, and from the buttons page choose oval buttons.
• At the oval buttons page, place your mouse pointer in the location box and click. This should highlight the URL. Press CTRL-C to copy the URL to the clipboard.
• Bring up the Word document, move the cursor to the line below where you typed the answer to the previous question, and press CTRL-V, which should paste the URL of the web page into your answer sheet.
• Now go to the Yahoo search engine at http://www.yahoo.com and repeat the process.
• At the Yahoo search engine, click on advanced search to the right of the search button.
• On the options page choose as follows: Yahoo!, type free web page graphics into the box include all words, English, any country, past 6 months and. Choose search.
• Go to this web site and type the number of hits into your Word document, then go to the second page suggested and copy its link into your Word document.
Exercise 2. YOUR CHOICE HERE.
We will finish up by using the Web to find information that one doesn’t normally associate with web pages. We will use searching
techniques to find businesses and to find addresses, driving directions, phone numbers and e-mail addresses for people.
Exercise 3. We will find a restaurant. I am going to a convention in Minneapolis, MN, and would like to eat at an Indian restaurant.
I don’t happen to have a phone book for Minneapolis available, so you will need to use an online yellow pages to search for an Indian
restaurant. Most search engines have yellow pages available from them. However, we will pick one of the standard yellow page
selections. Two of them are bigyellow and bigbook. Connect with page http://www.bigbook.com/ and find me an Indian
restaurant in Minneapolis, MN (name, address, and phone number). Enter the information into your Word document in the
indicated place.
Exercise 4. I would like to take a trip to Niagara Falls this weekend for a short holiday. What is the driving mileage from here to
there, and what is the mileage back? Hint: Go to a search engine such as http://www.google.com/ and search for “road maps”. One
of the first choices should be MapQuest, probably the best site for finding mileages and routes. Go there and choose Driving
Directions. Enter the appropriate information to find a route from 800 Linden St., zip 18510, the official address of the University of
Scranton, to Niagara Falls. Enter the mileage on the answer sheet. Now find the mileage from Niagara Falls back to the University and
enter it on the sheet.
Exercise 5. Find out some things about people from the web. See what you can find out about the following people: me (Dr. James
R. Sidbury), Dave Rhodes, Gordon Sinclair, Amy Bruce, Craig Shergold. For me, find my address, phone number, and web presence.
For each of the others, explain why they are famous enough to be asked about in this question.