First Steps with Excel

Dr. M. Lawrence Clevenson
Starting Excel
If there is an icon for Excel on your screen, double-click it to start Excel.
Alternatively, click on Start in the lower left corner, and see if an icon for
New Office Document appears in the Start menu. After clicking on that, see if
one of the choices is Blank Workbook with a green X. If so, double-click on that
choice. The method that will surely be available for everyone is to first click
on Start, then Programs, then either Microsoft Excel or MS Office 2000 and then
Microsoft Excel.
Moving Around
Once “in” Excel, you will see a rectangular array of cells, above which are rows
of menu choices and icons. The cell in row 1 and column A has a darker border
than the others; it is the active cell, which means that anything that you input with
the keyboard will go in that cell, called A1. You can change the active
cell with the mouse, by clicking, or with the navigation keys: the arrow keys and
the Page Up and Page Down keys. The End key in combination with an arrow key moves
the active cell to the last nonblank cell if you start in a nonblank cell, and the first
nonblank cell if you start in an empty cell. All of your cells should start empty
(blank), so End (just press once and release; do not hold down) DownArrow and
End RightArrow should bring you to cell IV65536, which informs you that Excel
allows 65,536 rows and 256 columns, labeled A, B, C, …, Z, AA, AB, …, AZ,
…, IU, IV. To return to A1, try End UpArrow and End LeftArrow, or
Ctrl+Home, a command that (usually) returns to A1.
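To see why the 256th column is labeled IV, here is a minimal Python sketch of the lettering scheme described above (A to Z, then AA, AB, and so on). The function name column_label is made up for this illustration; it is not part of Excel.

```python
def column_label(index: int) -> str:
    """Convert a 1-based column number to its spreadsheet letter label."""
    if index < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while index > 0:
        # Shift to 0-based so that 26 maps to "Z" rather than rolling over.
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


# Columns 1, 26, 27, and 256 of the worksheet.
print(column_label(1), column_label(26), column_label(27), column_label(256))
```

The last label printed is IV, matching the cell IV65536 reached with End RightArrow.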
Entering Data
The cells in Excel are either empty or contain labels, values, or formulas. Let’s
start with a simple example: enter Marital Status of US Adults in A1. In B2 and
C2 enter Marital Status and Count (millions). Then enter a column of entries
beneath each of these two headings, so that they look like the columns below:
Never married    43.9
Married          116.7
Widowed          13.4
Divorced         17.6
Most likely the labels in B2, C2, and B3 through B6 don’t fit in the cells, but
Excel is recording all of the characters. To see them, use the mouse to click on
the space between the column labels B and C, and then click and drag it to the
right; this adjusts the width of the column. If you double click between the
column labels, Excel will make the width just large enough to cover all the
existing entries in that column. The menu command is Format > Column >
AutoFit Selection, or Alt+o c a (with the cells selected or highlighted).
First steps, p. 1 of 6, Dr. M. Lawrence Clevenson
First Graph
There are many ways to make graphs with Excel. One of the easier ways is to use
the “Chart Wizard.” Highlight the range from B2 to C6. For a small area like
this, the easiest way to highlight is to click on any “corner” (B2, C2, B6, or C6),
and drag to the opposite corner. Very large areas are highlighted more quickly by
holding the Shift key and then using the End and Arrow buttons. Try that method
also. While this area is highlighted, click on the Chart button; this icon looks like
a blue, yellow, and red bar graph. Excel gives you many choices and sub-choices
of graphs. A Pie Chart is good for the marital status data. Click on Pie, the fourth
choice down. The highlighted sub-type is described, and you can click on other
choices to see what they look like. You can always change to a different choice;
choose any one for now. Click Next. Step 2 allows you to choose or change the
input data. No changes are necessary for this example; click Next. Step 3 allows
you to input a title for your chart. Notice that Excel already “guessed” a title for
you, but you can change that by highlighting the title, and typing “over” it. Try
that. Then choose the Data Labels tab. Try the Show label and percent button;
do you like the graph? With the labels shown on the slices, the legend on the
right is no longer necessary. Click on the
Legend tab, and uncheck the Show legend box. Remember, any of this can be
changed easily. Click Next. The last choice is entirely subjective; I usually like
my charts in separate sheets; try that. There are advantages to either choice so we
should try both.
Simple Formulas
Suppose we want to find the percentage of each marital status (without the graphs
we just did). Highlight the values in C3 to C6. Click on the sum (Σ) button. Click
on C7. Notice the formula: =Sum(C3:C6). Since spreadsheets often compute
totals for rows or columns, the Σ button provides an easy way to do it. However,
many formulas can be entered at least somewhat automatically. Next to the Σ
icon is the icon fx. Choose an empty cell, say C8, and click the fx button. The
Paste Function dialog box appears and asks you to choose a function. For
category, choose All, or Math & Trig, and then scroll down for your function,
Sum. Highlight Sum and Click OK (or double click Sum). A second dialog box
appears that allows you to enter the data, or the arguments, to the Sum function.
Click on the little red arrow button to the right of the box. Now highlight the data
you want summed, from C3 to C6. Click the red arrow button again. You are
again presented with the dialog box. Note that the dialog box also displays the
answer to your function, 191.6. There are no more values to be summed, so click OK.
In D2 enter the label Percent. In D3 enter =C3/C$7. This formula divides the
entry in C3 by the entry in C7 and shows the results of this calculation in D3.
Note that when D3 is the active cell, the entry window shows the formula, but D3
shows the result of the calculation. This does not look like a percentage; make D3
active and click on Format in the menu row. (Skilled keystrokers may prefer
Alt+o, the underlined letter in Format.) Click Cells and the Number tab. Then
click Percentage, and choose the number of decimal places you want, say 1, by
clicking the up or down arrow keys in the Decimal places box. Look OK?
One reason spreadsheets are powerful calculation tools is that they can copy and
paste functions with automatic adjustment. We could enter a similar formula into
D4 through D7, but if these data were lengthy, that’d be tedious. Instead, we will
copy and paste this formula. With D3 active, click on Edit > Copy (or the Copy
icon, or Alt+E C, or Ctrl+C). Then highlight D4 through D7. Then Edit > Paste
(or the Paste icon, or Alt+E P, or Ctrl+V). Another way to do this is with the fill
handle, but we’ll discuss this later.
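As a cross-check on the worksheet, the same percentage calculation can be sketched in a few lines of Python. The variable names here are made up for the illustration; total plays the role of cell C7 and each share plays the role of a cell in column D.

```python
# Counts (millions) from column C of the worksheet.
counts = {
    "Never married": 43.9,
    "Married": 116.7,
    "Widowed": 13.4,
    "Divorced": 17.6,
}

total = sum(counts.values())  # same job as =Sum(C3:C6) in C7

# Each share is one count divided by the fixed total, like =C3/C$7.
percents = {status: count / total for status, count in counts.items()}

for status, share in percents.items():
    # The ".1%" format mimics Format > Cells > Percentage with 1 decimal place.
    print(f"{status:<15}{share:6.1%}")
```

The total stays fixed while the numerator changes from row to row, which is exactly what the $ in C$7 arranges when the formula is pasted down the column.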
Relative and Absolute References
When we copied the formula in D3, =C3/C$7, into D4, notice that the formula in
D4 became =C4/C$7. The C3 became C4, but the C$7 stayed the same. The
reason is that a $ sign before a row number or column letter makes that part of
the reference absolute, so it will not change (or adjust along with the formula’s
relative references) when the formula is copied. For example, try entering this
formula into E3: =C3/C7, and then
copying and pasting it into E4. It will become =C4/C8 and probably produce an
error message. Try putting dollar signs into the formula in E3 at various places
(and choose arbitrary rows and columns for the paste) and see how that changes
the pasted results.
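The adjustment rule can be imitated with a short Python sketch. This is only an illustration of the rule as described above, not Excel's own code: the function paste_formula and its helpers are hypothetical names, the pattern handles only simple uppercase references such as C3, C$7, or $C$7, and pasting to the left of column A is not handled.

```python
import re

# One cell reference: optional $, column letters, optional $, row digits.
REFERENCE = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def column_to_number(letters):
    """Convert column letters (A, B, ..., Z, AA, ...) to a 1-based number."""
    number = 0
    for letter in letters:
        number = number * 26 + (ord(letter) - ord("A") + 1)
    return number


def number_to_column(number):
    """Convert a 1-based column number back to letters."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def paste_formula(formula, rows_down, columns_right):
    """Rewrite a formula as if it were pasted some rows down and columns right."""
    def adjust(match):
        column_lock, column, row_lock, row = match.groups()
        if not column_lock:  # no $ before the letter: the column is relative
            column = number_to_column(column_to_number(column) + columns_right)
        if not row_lock:  # no $ before the number: the row is relative
            row = str(int(row) + rows_down)
        return f"{column_lock}{column}{row_lock}{row}"

    return REFERENCE.sub(adjust, formula)


print(paste_formula("=C3/C$7", rows_down=1, columns_right=0))
print(paste_formula("=C3/C7", rows_down=1, columns_right=0))
print(paste_formula("=C3/C$7", rows_down=0, columns_right=1))
```

The first two lines printed reproduce the D4 and E4 results described above; the third shows that C$7 still changes column when pasted to the right, because only its row is locked.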
Practice: Make a normal table of male and female heights. This should run from
heights of 4 feet 10 inches to 7 feet 2 inches in increments of 0.5 inches and
tabulate the percentage of males and females shorter than each of these heights.
The function you will need is =normdist(x, mean, standard deviation, 1), where
the values of (mean, standard deviation) are approximately (64.5”, 2.2”) for
women and (70”, 2.8”) for men. (The 1 dictates that the formula is for area to
the left of x.) Put the heights in column A, percentages for women in column B,
and percentages for men in column C. Try to use references properly, so that your
formulas look like =normdist(A10, B$1, B$2, 1) rather than
=normdist(A10, 64.5, 2.2, 1). Part of your table should look something like this:
Height (in.)   Women    Men
…              …        …
64.5           0.5000   0.0247
65             0.5899   0.0371
65.5           0.6753   0.0540
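If you want an independent check of your worksheet, the same table can be sketched in Python. This assumes Python 3.8 or later, whose standard library provides statistics.NormalDist; its cdf method gives the area to the left of x, which is what normdist(x, mean, standard deviation, 1) is described as computing above. The function name height_table is made up for this sketch.

```python
from statistics import NormalDist

# Means and standard deviations (inches) quoted in the practice exercise.
women = NormalDist(mu=64.5, sigma=2.2)
men = NormalDist(mu=70, sigma=2.8)


def height_table(start=58.0, stop=86.0, step=0.5):
    """Return rows of (height, fraction of women shorter, fraction of men shorter)."""
    rows = []
    n_steps = int(round((stop - start) / step))
    for i in range(n_steps + 1):
        height = start + i * step
        rows.append((height, women.cdf(height), men.cdf(height)))
    return rows


table = height_table()
for height, w, m in table:
    if 64.5 <= height <= 65.5:
        # Print only the rows shown in the sample table above.
        print(f"{height:5.1f} {w:.4f} {m:.4f}")
```

The three rows printed should agree with the sample rows of the table to four decimal places.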
Fill Handles – Copying and Series
The table for the work above has consecutive heights in Column A. Start a new
sheet by clicking on Sheet 2 in the sheet tabs below the spreadsheet cells. (If you
have only one (the current) sheet available, click Insert > Worksheet). In A3 enter
58 (4 feet 10 inches), and then in A4 enter 58.5. Highlight these two cells. In the
lower right-hand corner of cell A4 is a tiny square. If you place the mouse cursor
over that lower right-hand corner, the cursor will change shape to a + sign
indicating you have engaged the “Fill handle.” Click and drag down. A small
message to the right of the highlight informs you that you reach 86 (7 feet 2
inches) around cell A59. Release the click and you will have created an
arithmetic series with increments chosen by the difference between the two
highlighted cells in A3 and A4.
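The arithmetic behind the fill can be sketched in Python to confirm where the series ends. The function fill_series is a hypothetical helper written for this illustration; it only mimics the rule that the increment is the difference between the two highlighted cells.

```python
def fill_series(first, second, last, start_row):
    """Return (row, value) pairs for a series whose step is second - first."""
    step = second - first
    count = round((last - first) / step)  # number of steps needed to reach last
    return [(start_row + i, first + i * step) for i in range(count + 1)]


# 58 in A3 and 58.5 in A4, dragged down until the value reaches 86.
series = fill_series(58, 58.5, 86, start_row=3)
print(series[0], series[1], series[-1])
```

The last pair is row 59 with value 86.0, in agreement with the message Excel shows near cell A59.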
Play with the fill handle for a bit. Try entering a number like 4 in any open cell,
say C3. See what happens when you use the fill handle on it. Then try a formula,
e.g., =Average(a3:a10) entered into cell D5. Fill down; fill right.
Then finish the normal table above.
Homework: On a new sheet make the normal table indicated on pp. 40 to 41 of
the textbook. Extend that table in the negative direction, to values of z as low
as –3.99. Be careful: –2.00 + 0.04 = –1.96, not –2.04, so in the negative rows
the column value must be subtracted rather than added. Note that you can insert
rows as necessary by highlighting the place(s) where rows are to be inserted and
then clicking Insert > Rows. The interior of your table should look something
like the table below. (Note: The correct function for the standard normal
distribution in Excel 2000 is NORMSDIST(Z) (with an S in the middle). This is
equivalent to NORMDIST(Z, 0, 1, 1), a normal distribution with mean 0 and
standard deviation 1; the last entry, 1, asks for the cumulative area to the
left of Z.)
z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
…      …       …       …       …       …       …       …       …       …       …
-0.2   0.4207  0.4168  0.4129  0.4090  0.4052  0.4013  0.3974  0.3936  0.3897  0.3859
-0.1   0.4602  0.4562  0.4522  0.4483  0.4443  0.4404  0.4364  0.4325  0.4286  0.4247
-0.0   0.5000  0.4960  0.4920  0.4880  0.4840  0.4801  0.4761  0.4721  0.4681  0.4641
0.0    0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1    0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2    0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
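The sign issue in the negative rows is easy to get wrong, so here is a hedged Python sketch of how one row of the table might be computed. It assumes Python 3.8 or later (statistics.NormalDist); the function table_row and its negative flag are made up for this illustration. The flag is needed because the row labeled -0.0 cannot be told apart from 0.0 by its value alone.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def table_row(row_label, negative):
    """Return the ten cumulative areas for one row of the z table.

    In a negative row the column heading (0.00 to 0.09) is subtracted,
    so the row -0.2 under the column 0.01 means z = -0.21, not -0.19.
    """
    cells = []
    for hundredths in range(10):
        offset = hundredths / 100
        z = row_label - offset if negative else row_label + offset
        cells.append(std_normal.cdf(z))
    return cells


for label, negative in [(-0.2, True), (0.0, True), (0.0, False), (0.2, False)]:
    print(" ".join(f"{area:.4f}" for area in table_row(label, negative)))
```

The four rows printed can be compared with the rows labeled -0.2, -0.0, 0.0, and 0.2 in the table above.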
Elementary Statistical Calculations and Graphs
Retrieve the data called SurvivalTimes from the shared hard drive. The data are
the number of days guinea pigs survived after a bacterial injection. (This was a
medical study on acquisition of resistance to infection, taken from American
Journal of Hygiene 72 (1960), pp. 130-148.) Insert a column of ID numbers for
the data from 1 to 72. The missing data are:
ID    Days
26    53
27    84
28    126
56    80
57    83
58    80
Use End and the Down arrow to find the missing values efficiently and fill them
in. Then sort the data from smallest to largest, while keeping track of the IDs,
as follows:
1. Highlight the entire block of data to be sorted, from A1 to B73. (This step
is not necessary in this case, where there is only one block of data and
we’re sorting all of it.)
2. Choose Data > Sort (or Alt+D S). A dialog box asks how you want to
sort the data; note that Header row should be chosen, and use column B
ascending.
3. When all your choices seem OK, click the OK button.
Observe that your data are now sorted from smallest to largest (ascending) and
that each ID has stayed with its own measurement.
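The idea of sorting on one column while each ID travels with its value can be sketched in Python. Only the six (ID, Days) pairs listed above as missing data are used here, since the rest of the SurvivalTimes file is not reproduced in this handout.

```python
# (ID, Days) pairs taken from the list of missing data above.
records = [(26, 53), (27, 84), (28, 126), (56, 80), (57, 83), (58, 80)]

# Sort by the second field (Days), ascending; each ID stays with its value.
by_days = sorted(records, key=lambda record: record[1])

for guinea_pig_id, days in by_days:
    print(guinea_pig_id, days)
```

Python's sort keeps tied values (the two 80s) in their original order, so ID 56 is listed before ID 58.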
The median is the datum in the middle when data are ordered. This datum is in
position (n+1) / 2, where n is the number of data. In this case, n = 72, and the
median’s position is “36.5”, which means the median is the average of the two
measurements in the 36th and 37th positions of the ordered data. These two data
are 102 and 103 so that the median is 102.5. Sorting is not necessary to do this,
since Excel provides fx functions for statistical calculations. Insert a few rows at
the top of the worksheet for some statistical functions. In column A write the
name of the statistic, and in column B its value. The median is one of five
numbers often cited in a five-number summary, which consists of the minimum,
maximum, first and third quartiles (25th and 75th percentiles), and the median.
Find all these values. Note that the quartile function in Excel is designed to give
you all five of these numbers. Can you see how to do this easily with Copy and
Paste?
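For comparison outside Excel, a five-number summary can be sketched in Python. Two assumptions are made here: the data list is a small made-up sample (not the SurvivalTimes data), and the "inclusive" method of statistics.quantiles (Python 3.8 or later) is taken to interpolate in the same way as Excel's quartile function, which you may want to verify on your own data.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, median, q3, max(data))


sample = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical data, n = 8

# The median sits at position (n + 1) / 2 = 4.5, so it is the average
# of the 4th and 5th ordered values.
position = (len(sample) + 1) / 2
print(position, five_number_summary(sample))
```

With an even number of data, as in the survival times, the median is the average of the two middle values, just as described above for positions 36 and 37.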
Another statistic is the mean or average; Excel uses the name average. Compute the
mean (average) survival time. Check that the average is defined as the sum
divided by the number of data by comparing =Average(range) with =Sum(range) /
Count(range). What do you think the function Count computes?
The most commonly used measure of data spread is the standard deviation. The
Excel function is STDEV; if the data are more “spread out,” the standard
deviation is larger. Like most “lifetime” data, whether of machinery under stress
or survival with an illness, these data are right-skewed. This means that the
right-hand tail, defined as the maximum – the median, is (much) greater than the
left-hand tail, defined as the median – the minimum. One could also compare the “extreme
tails” defined as maximum – third quartile and …
A graph of the data will show this right-skewness. Numeric data like survival
times are graphed with a histogram, a contiguous column graph of the frequency
or percentage of data in each group. Here are the steps.
1. Click Tools > Data Analysis. If that is not available (click on the down
   arrows at the bottom of the menu if you don’t see it immediately), follow the
   instructions on pp. 23-24 of the text to “add it in.”
2. A dialog box with many statistical procedures appears. Scroll to Histogram
   and click.
3. Select our SurvivalTimes data, either by typing the range (like B9:B80)
   directly in the window, or by clicking on the red arrow button and selecting
   it off the spreadsheet as we have done before.
4. Choose the bin range.
   a. For now, leave the Bin range blank. Go to step 5.
   b. Later, go to step 7 to see how to choose a bin range.
5. Choose a place for the output, either on a new sheet or some (preferably
   blank) part of this sheet.
6. Make the chart look better (it is ugly).
   a. Stretch the graph by left-clicking anywhere inside it and then
      dragging, say, the lower right corner down and right (watch me).
   b. Double-click inside one of the columns. The Format Data Series dialog
      box appears.
   c. Click on the Options tab.
   d. Set the Gap width to zero. Choose Vary colors by point for a
      colorful graph. You can also choose individual colors later.
   e. Clear the legend (if you like; I like) by right-clicking inside it, and
      then clicking on Clear.
7. It’s still not too good, and that’s usually the case if we leave the Bin range
   blank. There is no rule for choosing bins, but usually we can improve the
   graph’s appearance by overriding Excel’s choice. We need a somewhat narrower
   bin width, so for starters, try again, this time setting bins of width 50, say.
   a. Choose a blank area of the spreadsheet, say D15 of the data sheet (the
      sheet with the original SurvivalTimes data).
   b. Start a series with 50, 100, and then use the automatic fill procedure to
      go to 600.
   c. When you get to step 4, choose the range of the spreadsheet that these
      numbers occupy by either method (typing or pointing).
   d. Go back to step 5 and finish your modifications.
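To see what the bin range does, here is a minimal Python sketch of the counting. It assumes that each bin value is an upper limit (a value is counted in the first bin whose limit it does not exceed) and that anything above the last limit goes into an extra “More” category; check this against the table Excel produces. The sample_days list is made up for the illustration and is not the SurvivalTimes data.

```python
from bisect import bisect_left


def histogram_counts(data, bin_limits):
    """Count data per bin; the final count is the "More" category."""
    counts = [0] * (len(bin_limits) + 1)
    for value in data:
        # Index of the first limit that is >= value; len(bin_limits) if none.
        counts[bisect_left(bin_limits, value)] += 1
    return counts


bins = list(range(50, 601, 50))  # 50, 100, ..., 600, as in step 7b
sample_days = [43, 50, 51, 99, 100, 102, 103, 250, 598, 610]  # hypothetical

for limit, count in zip(bins + ["More"], histogram_counts(sample_days, bins)):
    print(limit, count)
```

Changing the step in range(50, 601, 50) is the analogue of choosing a different bin width, which is the main way to improve the look of the histogram.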