Introduction

Dealing Effectively With Data
Section 1 Introduction
Introduction.
What Happened to the Library?
The fact that you’re taking this course on the SUNY Learning Network, independent of
any physical classroom space or library, probably has already clued you in that I have
some reservations about the exact title of this course. I will ask you to visit a library as
part of your assignments. We hope that Feinberg Library is a friendly place for studying
or getting together with friends to work on an assignment. We also have around 300,000
books and 1400 magazine and journal subscriptions. Yes, you can find lots of newspaper
articles touting the “virtual library” or the “library without walls”. Ever tried to read an
entire book or journal article online? I must admit I usually make a paper copy of
magazine articles that I find online. If you think I’m wrong about this, Feinberg Library
gives you access to a couple of online book services: NetLibrary and Books 24/7. Give
them a try. Yes, you can get the book this way at any time of day or night. However, try
reading it this way on a slow Internet connection or at the beach. I guess I don’t see the
physical library building disappearing anytime soon.
Also, don’t forget about the people who work at Feinberg and other libraries. I know it’s
unlikely that anybody is going to write a movie script with a character for Julia Roberts or Brad
Pitt to star as a librarian in the next romantic thriller. OK, I can accept that. On the other
hand, most of the library staff don’t live for telling people to “shush” or quiet down in
the library. In fact, we totally redid our Reference room a couple of years ago to make it
comfortable for small groups to do research together at the computers. The reference
librarians and all members of the library staff are here to help you with your research.
Don’t be afraid to ask us for help or just stop by to chat. Enough of my soapbox. . .
Change is the Only Constant
What are the challenges for you during this class and 10 to 20 years down the road? I
know you’ve heard this one before, but it bears repeating for this class. The one constant
you will encounter when working with computing and information environments is
change. Web pages certainly do not stay the same; you’ll be lucky if the structure of the
“library research resources” pages we’ll be using for a large part of the course doesn’t change
during this semester. Yes, we have plans for a major overhaul again for next fall.
Mainstream computer program products also change. My first word processor was
WordPerfect. I think it’s still around, but you’d be hard pressed to find anyone who owns
a current version. However, one of these days Bill Gates is going to lose a government
lawsuit and have to open up the personal computing software field a bit. Databases?
We keep adding new ones. . . The InfoTrac that you used on a CD-ROM in high school is
now called IAC Searchbank. Its search interface may look a little different but it’s the
same product. All of our databases continually have minor tweaking in the search
interface or the way the results get displayed back to you. Fortunately, once you
understand how databases are set up, you will be able to adapt easily to these changes.
We’ll talk about this in much more detail in chapter three.
A Brief History of Computing and the World Wide Web
As we start this course, you need to keep in mind that the history of the computer and the
World Wide Web is extremely recent. The precursors to computers, the code-breaking
machines built during World War II, date back only to the 1940s. ENIAC, often
considered the first real computer, did not appear until the mid-1940s. It consisted of a
large room filled with vacuum tubes and weighed about 30 tons.
How did the scientists transfer information to be processed to these computers? Today
we’re accustomed to typing in our word processors or typing in HTML, Java, or Visual
Basic programs.
It was only about 20 years ago that the main way to enter information into one of these huge
(in terms of size, not processing power) computers was to use punch cards. In other words,
you’d type your BASIC program using a special typewriter that punched holes in the
cards representing special commands. Each line had to be on a separate card.
For example, the following “program” prints out the word Hello 10 times:
10 FOR X = 1 TO 10
20 PRINT "Hello"
30 NEXT X
40 END
The FOR and NEXT lines do the counting: NEXT X adds 1 to X and repeats the loop until X
passes 10. This simple program would have required at least four punch cards, one per line,
which would then have needed to be fed through a special punch card reader into the computer.
Let me give you some idea of the scale on which computing power has increased since
as recently as the 1960s. Gordon Moore, co-founder of Intel (as in your Pentium
processor), observed a phenomenon in computing in 1965 that has since become known as
Moore’s law. “Transistors per integrated circuit” is one way of measuring computing
power. In 1965 he predicted that the number of transistors per integrated circuit would
double about every year and that this growth would continue until at least 1975. In 1975
he revised the rate to a doubling about every two years; the often-quoted figure of 18
months falls between the two. This rate of increase in computing power still holds up well
today.
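To get a feel for what repeated doubling means, here is a small sketch in Python. It is only an illustration of the arithmetic, not anything taken from Intel: the function name and the ten-year window are my own assumptions, and it compares a few doubling periods, including the popular 18-month figure.

```python
def growth_factor(years_elapsed, doubling_period_years):
    """Return how many times larger a quantity becomes when it doubles
    once every `doubling_period_years` years."""
    doublings = years_elapsed / doubling_period_years
    return 2 ** doublings

# Assumption for illustration: look at a ten-year window.
for period in (1.0, 1.5, 2.0):
    factor = growth_factor(10, period)
    print(f"Doubling every {period} years: about {factor:,.0f} times larger after 10 years")
```

Notice how sensitive the result is to the doubling period: a small change in how often the doubling happens makes a very large difference after only a decade.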
Here’s an illustration from Intel’s own web site showing increases in computing power.
I’m sure you recognize some of these computing models from your own experience. My
first computer when I started here at PSU was a model 286. The computer I have now is a
Pentium III. It pretty much blows anything I could do with the 286 out of the water and
I’m sure many of you have computer setups a lot more powerful than mine!
So How Does This Internet-Web Fit into the Picture?
Sorry, Al, you didn’t invent the Internet. However, I must credit former Vice President
Al Gore with working hard to popularize the Internet for the average person and with being the
architect of legislation that benefited the growth of the Internet.
Much like Moore’s law, the Internet actually started during the 1960s, with names like
ARPANET and DARPANET. It was used primarily as a means of transferring
information (files and data) between scientists and government researchers. It did three
main things: Telnet (logging in directly to another computer), FTP (File Transfer
Protocol), and a very rudimentary form of email. Today, in addition to these three things,
it also includes HTTP (Hypertext Transfer Protocol), which is what we use to bring up web pages. Do I
expect you to remember DARPANET or what FTP stands for? No. I just want to make
sure you understand the difference between the World Wide Web and the Internet. The
World Wide Web is really a subset of the Internet. Without all the Internet components
you could not, for example, do email.
There’s another feature of Internet history that you need to know about to understand
today’s Internet. The D in DARPANET stands for Defense, as in the U.S. Department of
Defense. You’ll recall that the 1960s and 1970s were the height of the Cold War, with the
Cuban missile crisis and the nuclear ICBM buildup. Russian missiles were considered a
real threat and we got to do “duck and cover” drills in school. (I doubt this would have
done much good against a nuclear bomb, but I just want to give you some sense of the
time.) Anyway, because of this the Internet was set up to be redundant. What do I mean
by this? On the Internet, information gets passed between various points on the way
to its final destination. These various points can be changed along the way if a particular
point (location) is attacked by a nuclear bomb. It’s called routing and actually came
in very useful during the recent attack on New York City. Our SUNY campuses in and
near the city were able to get Internet access by having the Internet routed through
various other points of the system that were not affected.
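If you like to see ideas in code, the following is a toy sketch in Python of what routing around a missing point looks like. Please treat it as an illustration only: the lettered points and their links are made up, and real Internet routers use much more elaborate protocols than this simple search.

```python
from collections import deque

def find_route(links, start, goal, down=frozenset()):
    """Breadth-first search for a shortest route that avoids points in `down`."""
    if start in down or goal in down:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in links.get(node, ()):
            if neighbor not in visited and neighbor not in down:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route is left

# Hypothetical network: each letter is a made-up routing point.
links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}

print(find_route(links, "A", "F"))              # route when everything is working
print(find_route(links, "A", "F", down={"B"}))  # route when point B is unavailable
```

Because there are two independent ways to get from A to F, losing point B does not cut the two ends off from each other; the information simply travels through C and E instead.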
Ultimately, for you as the Internet user, this redundancy has one major consequence. The
Internet and the information you pull off the World Wide Web today have no central
hierarchy. There is no governing board or authority that decides what information is
available or who (person, organization, government) has the right to put information on
the Internet. The result for you as the Internet information user is that you need to be
extremely careful to figure out exactly who is the source of the data that you’re
pulling off the web.
Just how much stuff is there really on the World Wide Web part of the Internet?
Interestingly enough, the World Wide Web did not exist yet when I was in college. Most
people trace the origin of the World Wide Web to around 1993 with the advent of
Mosaic, a graphical web program that is a precursor to Netscape and Internet Explorer
(IE). Before that, online information was typically browsed with text-based (no graphics)
tools such as Gopher. This
means that in addition to the fact that the Internet has no central authority, the World
Wide Web that we take for granted when we use Netscape or IE is less than 10 years old.
Hopefully this will go a long way toward explaining the problems with Internet search
engines when we get to Module 7.
Let’s take a quick look at the growth of the World Wide Web from Hobbes’ Internet
Timeline, which is considered one of the standard references for the history of the WWW.
As you can see, in 1993 there were only 130 WWW sites. Most of these were
governmental or from universities. Remember there is a difference between a web site
and a web page. Each document that you view as part of this SLN course is considered
an individual web page. The entire course is considered a web site. Hobbes’ timeline
shows close to 40 million web sites by the end of 2001. Each of these web sites can
contain many thousands of pages. Interestingly, many of these web sites are less than
three years old. No wonder you can’t find anything on the WWW. It’s grown so quickly
that the search engine software can’t keep up!
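To put that growth into a single number, here is a rough back-of-the-envelope calculation in Python. It uses the two figures quoted above (130 sites in 1993 and close to 40 million by the end of 2001); treating the span as about eight years is my own assumption, and the function name is just for illustration.

```python
def average_yearly_growth(start_count, end_count, years):
    """Return the constant yearly multiplier that turns start_count into end_count."""
    return (end_count / start_count) ** (1 / years)

sites_1993 = 130          # figure quoted in the text
sites_2001 = 40_000_000   # "close to 40 million", rounded
years = 8                 # assumption: roughly 1993 to the end of 2001

factor = average_yearly_growth(sites_1993, sites_2001, years)
print(f"Web sites multiplied by about {factor:.1f}x per year on average")
```

In other words, on these assumptions the number of web sites grew nearly fivefold every year, which is far faster than the doubling described by Moore’s law.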