Part Two: Turning Information into Data

Jim Williams
HONP-112
Week 7



Part 1: Networking overview
Part 2: Data transfer methods
Part 3: Communication Channels



A Network consists of at least two computers,
and other peripherals (like a printer) that are
connected to each other.
It also contains the intermediate equipment
that makes these connections possible.
Computer networks allow resources to be
shared, and allow data to be transferred
from one location to another.


Efficiency: For example, you can have one
inventory database that is shared by many
different store branches.
Economy/Ecology: For example, instead of
buying and maintaining a printer for every
computer in a classroom, you can just have
one shared printer on the network.



Network Interface Card (NIC) or the equivalent
circuitry must be in each device that is
connected to the network.
Communications channel(s): the appropriate
physical wiring.
Separate machines that regulate how data is
communicated along the network (some
examples: hub, switch, router, gateway,
bridge – some functionality may overlap). We
will not discuss these details in this class.



A Local Area Network (LAN) covers a small
geographical area (a house, office building,
college campus, etc.)
A Wide Area Network (WAN) covers a large
geographical area (a college with multiple
campuses, a business with many store
locations, etc.)
Frequently, WANs consist of multiple LANs
that are connected to each other over long
distances.


A Protocol means a “set of rules” that the computers
on a network must understand in order to
communicate with each other.
Different networks can use different protocols.
Some you may have heard of include:
◦ Ethernet (common for LANs and WANs)
◦ TCP/IP – Transmission Control Protocol / Internet Protocol
◦ HTTP – Hypertext Transfer Protocol (i.e. the WWW)


There are many others. Just know that a computer
that understands one protocol cannot communicate
with another computer that understands a different
protocol.
However, in practice, MOST computers and other
digital devices are configured to “understand” TCP/IP.





Part of the network interface circuitry in a
machine includes a unique address called a
Media Access Control (MAC) address.
These addresses are binary patterns which
are 48 bits long.
How many bytes long are MAC addresses?
How many possible MAC addresses are there?
Do you think it is possible to run out of MAC
addresses? What then?
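
One way to check your answers to the first two questions is to let a few lines of Python do the arithmetic. This is only a sketch; the variable names are made up for illustration.

```python
# A MAC address is 48 bits long, and there are 8 bits in a byte.
MAC_BITS = 48

mac_bytes = MAC_BITS // 8   # length of the address in bytes
mac_count = 2 ** MAC_BITS   # every extra bit doubles the number of patterns

print(mac_bytes)            # 6
print(f"{mac_count:,}")     # 281,474,976,710,656
```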






Besides MAC addresses, another way of
identifying a computer is the IP Address.
This is only applicable, of course, to devices that
understand the TCP/IP protocol, and wish to
operate over the Internet.
These addresses are binary patterns which are (at
this time) 32 bits long.
How many bytes long are IP addresses?
How many possible IP addresses are there?
As Internet use has increased, there is a danger
of running out of available IP addresses. Think
how you would solve the issue, then research
how the issue is currently being handled.
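
The same arithmetic can be sketched for 32-bit IP addresses. The address 192.168.0.1 below is an arbitrary example chosen for illustration; Python's standard `ipaddress` module is used only to show that such an address is stored in 4 bytes.

```python
import ipaddress

IP_BITS = 32

ip_bytes = IP_BITS // 8     # length of the address in bytes
ip_count = 2 ** IP_BITS     # number of distinct 32-bit patterns

print(ip_bytes)             # 4
print(f"{ip_count:,}")      # 4,294,967,296

# An arbitrary example address, shown as the raw bytes it occupies.
example = ipaddress.IPv4Address("192.168.0.1")
print(len(example.packed))  # 4
```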

Peer-to-peer networks consist of numerous
computers connected to each other, but there is
no “main” computer in the system.
◦ Example: file-sharing networks like BitTorrent, LimeWire,
etc.

Client-Server networks consist of a main
computer called the “server,” and the multiple
computers attached to it are called the “clients.”
◦ Example: the BlackBoard system at this college. You
have to log into the BlackBoard server from your home
computers (which function as the clients).



Topologies describe the ways that machines are
connected to each other on a network.
We will briefly discuss some – try to think of the
advantages and disadvantages of each as we do.

Ring: Each machine on the
network is attached to two and
only two others.
◦ Less expensive, but less resistant to
failure

Mesh: Each machine is attached
to every other machine
◦ Very resistant to failure, but
expensive to maintain.

Star : Each machine has a single
connection to a central machine.
◦ Very common for client-server
networks. Network will fail if central
machine goes down.

Bus: Each machine is connected
to a single high-capacity data line
called a “backbone”
◦ Example: large telecommunications
networks.

Tree : This is actually several star networks
attached to a “backbone” that essentially
connects networks to other networks.
◦ Very common for wide-area networks of large
businesses, universities, etc. that may have
multiple physical locations, each with its own
network.


Networks would be useless unless we were
able to actually send data from one machine
to another.
There are many ways to do this, but we will
focus on one very common method that is
shared by many types of network protocols.


Data communication in most cases happens
asynchronously.
This means that a single logical “piece” of
data sent over the network (for instance a file,
an e-mail message, etc.) is not transmitted
“in order.” Rather it can be “broken up” into
smaller pieces, and later “reassembled” once
all the pieces arrive at the destination. It
does not matter what order the pieces arrive
in!



Imagine you had written a book, and you were
MAILING (using the Post Office) a copy to your
editor.
But for some strange reason your editor only
allows you (and all the other authors he works
with) to mail one page at a time, in separate
envelopes! There is of course no guarantee that
all of the envelopes will arrive at the publisher in
the correct order, etc.
Is this possible? Yes. But think for a minute
what would be needed for this scheme to work
successfully.








You (the sender) would need a way to communicate
directly with the editor (the receiver).
The sender needs to tell the receiver to expect a new
book.
The sender needs to know where to mail the pages to.
The receiver needs to know where the pages are being
mailed from.
The receiver also needs to know what book the page is
from (the same author may be mailing several books).
The receiver also needs to know how many pages each
book should have.
The receiver needs a way to notify the sender if any pages
are missing. The sender can then re-send.
The receiver needs a way to confirm with the sender that
the entire book was received successfully.



If we are mailing a letter, an envelope can
contain the sender and receiver address.
If we wish, we can also write the title of the
book, and something like “page number 7 of
153” on the envelope as well.
To keep things safe we can also write this
information on each page that we put into the
envelope (in case the envelope gets lost...).



The receiver will of course first divide the
many envelopes he receives into separate
stacks for each sender/book he receives
books from.
Then he opens each envelope and assembles
the book pages into the correct order as they
come in.
If after a certain amount of time (days in this
case??) there are still missing pages, based on
what the total number of pages is, he will call
the sender so the page can be re-sent.

Notice the various types of communication
that need to happen:
◦ Author tells editor to expect a book shipment (good
thing to check: what if editor is on
vacation/unavailable?)
◦ Editor needs to ask author for any missing pages
after an agreed-upon time.
◦ Editor needs to inform author that the book has
been received in its entirety, and author needs to
confirm/agree. This marks the end of that
transaction between the two parties.



When a file, or other single “piece” of data, is
sent over a network, almost the same thing
happens as in our hypothetical author/editor
example.
The data is broken into smaller chunks called
“packets” which are sent out to their
destination on the network.
All the packets may not arrive at the same
time, or even by the same means, but
hopefully they all eventually get to their
destination.




A data “packet” consists of two main parts.
There is a “header” which describes what the
packet contains, where it is coming from,
where it is going to, etc. This is similar to the
“envelope” in our example.
There is also the data itself the packet
contains. This is similar to the “page” in our
example.
Different protocols have different formats for
packets. We will not be concerned with the
technical details. Just know the principles.





The sending machine must establish communication
with the receiving machine. The receiving machine
must of course respond in kind. In technical terms
this is called a “handshake.”
When packets are received they must of course be reassembled.
If any expected packets are missing, the receiving
machine sends a special signal to the sending
machine to re-send the packet.
When the file is successfully reassembled the
receiving machine sends an “Acknowledgement”
signal to the sending machine that the file was
received.
The sending machine can then end the “handshake”
and the transaction is complete.
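
The "missing packet" check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real protocol does it; the function name `missing_packets` and the sample numbers are made up.

```python
def missing_packets(received_numbers, total_packets):
    """Return the packet numbers (counting from 1) that have not arrived."""
    expected = set(range(1, total_packets + 1))
    return sorted(expected - set(received_numbers))

# Suppose a file was split into 5 packets and only these have arrived so far.
arrived = [4, 1, 5]
to_resend = missing_packets(arrived, 5)
print(to_resend)  # [2, 3]
```

An empty list means nothing is missing, so the receiving machine can send its acknowledgement.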


Given: Our imaginary packet format is as
follows:
Header:
◦ File ID Number:
◦ From:
◦ To:
◦ Total packets for this file:
◦ Packet Number for this packet:
Data
◦ 16 Byte “chunks”

File: a text file that says “I love learning about
data communication.”
◦ Assume we are using ASCII (1 byte per character) as
per our ASCII chart to store this data digitally.

So the file is 41 bytes long. We said that it
needs to be broken up into 16-byte “chunks”.
This means we need to break it up into three
packets before we send it over the network.
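
The count of three packets can be double-checked with a short Python sketch: divide the file size by the chunk size and round up. The variable names are chosen for illustration.

```python
import math

message = "I love learning about data communication."
file_bytes = message.encode("ascii")  # ASCII stores 1 byte per character
CHUNK_SIZE = 16                       # bytes of data per packet

file_size = len(file_bytes)
packet_count = math.ceil(file_size / CHUNK_SIZE)
last_packet_size = file_size - (packet_count - 1) * CHUNK_SIZE

print(file_size, packet_count, last_packet_size)  # 41 3 9
```

The last packet carries only 9 bytes because 41 is not an exact multiple of 16.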







It is easiest at this point to express the file in
hexadecimal bytes as per our ASCII chart.
Then we can easily break up into our three
packets.
“I love learning about data communication.”
Expressed in hexadecimal characters:
49 20 6C 6F 76 65 20 6C 65 61 72 6E 69 6E 67 20
61 62 6F 75 74 20 64 61 74 61 20 63 6F 6D 6D 75
6E 69 63 61 74 69 6F 6E 2E

Here is our hexadecimal representation of the
file. Remember that a single ASCII byte is
represented by two hex digits. So:
Packet 1 data:

49 20 6C 6F 76 65 20 6C 65 61 72 6E 69 6E 67 20

Packet 2 data:

61 62 6F 75 74 20 64 61 74 61 20 63 6F 6D 6D 75

Packet 3 data:

6E 69 63 61 74 69 6F 6E 2E
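
The same split can be reproduced with a small Python sketch, which is a handy way to check hand-made conversions. It assumes the 16-byte chunk size of our imaginary format; the names are illustrative.

```python
message = "I love learning about data communication."
data = message.encode("ascii")
CHUNK_SIZE = 16

# Slice the bytes into consecutive 16-byte pieces (the last may be shorter).
chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

for number, chunk in enumerate(chunks, start=1):
    hex_text = " ".join(f"{byte:02X}" for byte in chunk)
    print(f"Packet {number} data: {hex_text}")
```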







Each packet must also contain a header. We
will be completely hypothetical and say that
we can represent this however we wish. Let’s
just keep our header in “plain English” for this
exercise. Let’s use this header:
File ID: 435-2011-10-13-08:09:76.341AM
From: 435
To: 3998
Total number of packets: 3
Packet number: (# of packet)







Here is what packet 2 for this example may
look like:
File ID: 435-2011-10-13-08:09:76.341AM
From: 435
To: 3998
Total number of packets: 3
Packet Number: 2
Packet Data: 61 62 6F 75 74 20 64 61 74 61 20 63
6F 6D 6D 75
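
As a rough sketch, all three packets for this example could be built in Python like this. The dictionary keys and the function name `make_packets` are invented for illustration; they simply mirror the fields of our imaginary header.

```python
def make_packets(text, file_id, sender, receiver, chunk_size=16):
    """Split text into chunks and attach our hypothetical header to each."""
    data = text.encode("ascii")
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [
        {
            "file_id": file_id,
            "from": sender,
            "to": receiver,
            "total": len(chunks),
            "number": number,
            "data": " ".join(f"{byte:02X}" for byte in chunk),
        }
        for number, chunk in enumerate(chunks, start=1)
    ]

packets = make_packets(
    "I love learning about data communication.",
    "435-2011-10-13-08:09:76.341AM", 435, 3998,
)
for field, value in packets[1].items():  # packets[1] is packet number 2
    print(f"{field}: {value}")
```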



Understand the basics of how asynchronous
transfer works.
I may give you packets and ask you to
reassemble the text file (and also possibly
identify any errors with the packets if that is
the case!).
I may also give you a text file and tell you to
break it into packets, using some clearly defined hypothetical scheme.
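
To practice, here is a minimal Python sketch of the reassembly step. It assumes each packet is given as a (packet number, total packets, hex data) tuple, which is a simplification of our imaginary header, and it reports an error if a packet is missing.

```python
def reassemble(packets):
    """Rebuild the text from (number, total, hex_data) tuples in any order."""
    total = packets[0][1]
    by_number = {number: hex_data for number, _, hex_data in packets}
    missing = [n for n in range(1, total + 1) if n not in by_number]
    if missing:
        raise ValueError(f"missing packets: {missing}")
    raw = b"".join(bytes.fromhex(by_number[n]) for n in range(1, total + 1))
    return raw.decode("ascii")

# The packets arrive out of order, which does not matter.
received = [
    (3, 3, "6E 69 63 61 74 69 6F 6E 2E"),
    (1, 3, "49 20 6C 6F 76 65 20 6C 65 61 72 6E 69 6E 67 20"),
    (2, 3, "61 62 6F 75 74 20 64 61 74 61 20 63 6F 6D 6D 75"),
]
print(reassemble(received))  # I love learning about data communication.
```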


The next slides will discuss data
communication channels and their units of
performance measurement.
We will also work out some hypothetical
problems to determine how much time
certain data transfers may take.




Most communication channels (wires/lines) are
suited for analog electrical signals (due to
physical properties that are beyond this class).
So, before data is sent through a channel, it has
to be modulated into an analog form (without
getting too technical: 0s and 1s are translated
into different “frequencies”).
When it is received by the other machine, it must
be de-modulated back to digital form.
The device that does this is called a
modulator/de-modulator, or “Modem” for short.


The term “bandwidth” is used to describe the amount of
data that can pass through a point in the
network during some specified length of
time.
The time element is the most important thing to
keep in mind because, in this context, the
bandwidth is what determines the speed at
which the data can travel from one point to
another.




Imagine I wanted to pump 100 gallons of
water from a lake into a storage tank.
I have two hoses I can choose from (a narrow
garden hose, and a wide firefighter’s hose).
Of course, 100 gallons of water can pass
through either hose with no problems.
But, through which hose will it take less TIME
for the 100 gallons to pass through?



Simple physics tells us that it will take longer for
the same volume of liquid to move from point A
to point B when confined to a smaller space,
versus a larger space (because at any point in time
a smaller space can hold less water than a larger space).
So, the same volume of water will take a longer
time to move through the narrow hose, as
opposed to the wider hose.
Looked at another way, we can pick a point on
the hose and measure the volume of water which
passes that point in a given time (like one
second).




Bandwidth is a figure that describes how many
bits can be communicated along the data
communications channel in a given interval of
time.
Different communications channels have
different bandwidths (for various physical and
technical reasons).
In most cases bandwidth is measured relative to
Bits (as opposed to bytes).
So you will frequently see figures relative to
kilobits (Kb), Megabits (Mb), etc. Notice the
lower case “b” – which means a single bit.

Ethernet (LAN): 100 Mbps
T1 backbone (WAN): 1.5 Mbps
T3 backbone (WAN): 44 Mbps
Telephone Modem: 56 Kbps
ADSL Modem:
◦ 2 Mbps (download) / 512 Kbps (upload)
Cable Modem:
◦ 5 Mbps (download) / 384 Kbps (upload)
Fiber Optic Internet:
◦ 15 Mbps (download) / 5 Mbps (upload)

** IMPORTANT – These are just some samples. There are
many variations – too numerous to mention. If you are
curious about the bandwidth of your own personal
network service (like Cable/DSL, etc.) contact your vendor.



Bandwidth figures do not always give an accurate
representation of real-world performance.
Many issues, including the current amount of
other network activity, can cause performance to
degrade and the data communication “speeds” to
be much slower than the stated figures.
While “bandwidth” describes what a network is
capable of, the term “throughput” is sometimes
used to describe actual performance (of course
this changes as the network conditions change).



I want to download a 6.5 MB file from the
Internet. I am using an ADSL service capable
of 2 Mbps download bandwidth. Assuming
that I am getting maximum throughput, how
long, in seconds, will it take to download the
file?
You can do this two ways: express the file in
terms of bits, or express the bandwidth in
terms of bytes.
For this example we will use the first method.




Express a 6.5 MB file size in terms of bits.
There are 8 bits for every byte, so we multiply
6.5 by 8 to get the size in bits.
6.5 MB * 8 = 52 Mb (notice the small “b”).
From now on consider the file to be 52 Mb.



Now, we know the file size is 52 Mb. We also
know the bandwidth is 2 Mbps.
So divide the file size by the bandwidth. The
Mb will “cancel out” and we will be left with a
measurement in seconds.
X = 52 Mb / (2 Mb / sec) = 26 seconds
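
The calculation can be sketched in Python both ways mentioned earlier: converting the file size to bits, or converting the bandwidth to bytes. The variable names are illustrative, and maximum throughput is assumed as in the problem.

```python
file_size_megabytes = 6.5
bandwidth_megabits_per_sec = 2

# Method 1: express the file size in megabits, then divide.
file_size_megabits = file_size_megabytes * 8
seconds_method_1 = file_size_megabits / bandwidth_megabits_per_sec

# Method 2: express the bandwidth in megabytes per second, then divide.
bandwidth_megabytes_per_sec = bandwidth_megabits_per_sec / 8
seconds_method_2 = file_size_megabytes / bandwidth_megabytes_per_sec

print(seconds_method_1, seconds_method_2)  # 26.0 26.0
```

Either way the units match before dividing, so both methods give the same answer.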




When we discussed the file size, we did NOT
account for any breaking it up into packets,
additional space for packet headers, etc.
We also did not adjust for any other issues that
may impact the network performance.
This is because I am only trying to teach the
basic principles of how we can calculate the
network performance, given a file size and a
bandwidth.
In real life there are many other factors that come
into play, beyond the scope of this class. But
hopefully these simplified problems will help you
understand the concept.



Let’s try another problem, stated in simpler
terms:
Given: File size = 45 MB, bandwidth = 384
Kbps (cable upload speed)
How many minutes will it take to upload the
file?

File Size: 45MB * 8 = 360 Mb
◦ ** file size now expressed as megabits



Bandwidth: 384 Kbps
X = 360 Mb / (384 Kb / sec)
X = 360 Mb / (.384 Mb / sec)
◦ ** both figures expressed as megabits. (Remember K
means “times a thousand” and M means “times a
million”).


X = 937.5 sec
X = 937.5 sec / (60 sec / min) = 15.625 min
◦ **answer expressed in minutes
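
The steps of this second problem can be wrapped in a small Python function. This is a sketch: the name `transfer_seconds` is made up, and it uses the prefixes as defined above (K = thousand, M = million) and ignores packet headers and other real-world overhead.

```python
def transfer_seconds(file_size_bytes, bandwidth_bits_per_sec):
    """Time to move a file, with both figures converted to plain bits."""
    file_size_bits = file_size_bytes * 8
    return file_size_bits / bandwidth_bits_per_sec

file_size_bytes = 45 * 1_000_000      # 45 MB
bandwidth_bits_per_sec = 384 * 1_000  # 384 Kbps

seconds = transfer_seconds(file_size_bytes, bandwidth_bits_per_sec)
minutes = seconds / 60

print(seconds, minutes)  # 937.5 15.625
```

Converting both figures to plain bits first avoids mixing up the Kilo- and Mega- prefixes.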





For solving performance problems:
Know the time increments the answer is
supposed to be in
Take care to do proper conversions to make sure
both the file size and bandwidth are relative to
the same units of measure (bits vs bytes).
Along with this, also make sure you handle the
prefixes like Kilo- and Mega- correctly.
We did all these things in our second example
problem. Know how to solve other problems of
similar complexity.