How to Produce Statistical Graphics

General Clinical Research Center

August 15, 2005

Rachel Enriquez

What are we going to talk about?

 Why should we care about statistical graphics?

 What is the theoretical framework for statistical graphics?

 When do we make statistical graphics?

 How can we produce good quality graphics?

Why do we care about statistical graphics?

Interpretation - good graphs help you understand your data.

Data visualization is part of analysis

Communication of results is the last step in the scientific process

 Many people can comprehend the results better by seeing them in a figure than they can by reading them in a table.

 Do you have an opinion?

 Graphics can help persuade.

 Objectively correct graphics can call attention to the result you WANT the viewer to see.

Get Attention

 Can you produce exceptional statistical graphics? (me neither)

 Do you want people to know that you are committed to the scientific process?

If people understand your research, they’ll listen to you and do what you tell them to.

The Theory of Statistical Graphics

Data Visualization

 Visually encode the data.

 Viewers decode the picture

 Easy to figure out

 Learn something new

 See the right comparisons

Hierarchy of Visual Perception

Position along common scale

Position along nonaligned scale

Length

Angle / Slope

Area

Volume

Color
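As a rough illustration of the top of this hierarchy, the sketch below encodes a few values as position along a common scale, using a plain-text dot plot. The group names and values are made up for the example, and `dot_plot` is a hypothetical helper, not part of any package. It assumes all values are positive.

```python
def dot_plot(values, width=40):
    """Return text lines placing one dot per value on a shared 0..max scale."""
    top = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    # Sort so the viewer compares neighbouring positions, largest first.
    for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        column = round(value / top * width)  # position encodes the value
        lines.append(f"{name:<{label_width}} |{' ' * column}o  {value}")
    return lines


# Hypothetical percentages, for illustration only.
example = {"Group A": 12, "Group B": 30, "Group C": 21}
for line in dot_plot(example):
    print(line)
```

Because every dot sits on the same scale, the viewer compares positions directly instead of judging angles or areas.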

Aesthetics

 A personal matter

 Unless you ask Tufte

 Data / Ink ratio

 Avoid 3-D

 Fill patterns are bad

 Obtain good resolution

 Text can be small (in print)

When do we make statistical graphics?

For preliminary analysis

 Speed

[Figure: quick exploratory plot. Vertical axis: Percent (0 to 35). Horizontal axis: Age (16 to 72).]
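For a first look at a distribution, speed matters more than polish. A minimal sketch of a quick percent histogram, assuming a plain list of ages; the ages are made up, the 8-year bins starting at 16 are an assumption, and `percent_histogram` is a hypothetical helper.

```python
from collections import Counter


def percent_histogram(ages, start=16, bin_width=8):
    """Return {bin_lower_edge: percent of observations in that bin}."""
    counts = Counter(start + (age - start) // bin_width * bin_width for age in ages)
    total = len(ages)
    return {edge: 100 * n / total for edge, n in sorted(counts.items())}


# Made-up ages, for illustration only.
ages = [18, 22, 25, 31, 33, 34, 41, 47, 52, 68]
for edge, pct in percent_histogram(ages).items():
    # One '#' per two percentage points is enough for a rough look.
    print(f"{edge:>2}-{edge + 7:<2} {'#' * round(pct / 2):<20} {pct:.0f}%")
```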

For Publication in Journals

Data density is good.

Excellent resolution is required.

Color is difficult.

Column width is a consideration.

MS Office is frequently not an option.
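Resolution and column width interact: a bitmap needs enough pixels to fill the column at the density the journal requires. The arithmetic can be sketched as below. The 3.5-inch column and 600 dpi are placeholder values, not any particular journal's requirements, so check the instructions for authors.

```python
import math


def required_pixels(width_inches, height_inches, dpi):
    """Pixel dimensions needed to print a bitmap at the given size and dpi."""
    return math.ceil(width_inches * dpi), math.ceil(height_inches * dpi)


def printed_width_inches(pixel_width, dpi):
    """How wide an existing bitmap prints at the given dpi."""
    return pixel_width / dpi


# Assumed values for illustration: single column 3.5 in wide, 600 dpi.
print(required_pixels(3.5, 2.5, 600))              # (2100, 1500)
# A 1024-pixel-wide screen capture fills less than half of that column.
print(round(printed_width_inches(1024, 600), 2))   # 1.71
```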

Too many tables!

A plot is better

Confounding variable

TABLES - Consider the on-line supplement

Maybe…

[Figure: three panels, one per reflux measure, each with a vertical axis running up to 35 (axis label not recovered). Legend: Esophageal Squamous Cell Carcinoma, Adenocarcinoma of Gastric Cardia, Esophageal Adenocarcinoma. Panels: Frequency of Reflux Symptoms (1/wk, 2-3/wk, >3/wk); Reflux Symptom Score (1-2 points, 2.5-4 points, 4.5-6.5 points); Duration of Reflux (<12 yr, 12-20 yr, >20 yr).]

[Figure: grid of panels. Rows (Variable): Frequency of reflux symptoms (1/week, 2-3/week, >3/week); Reflux symptom score (1-2, 2.5-4, 4.5-6.5); Duration of reflux symptoms (<12 yrs, 12-20 yrs, >20 yrs). Columns: Esophageal Adenocarcinoma, Adenocarcinoma of the Gastric Cardia, Esophageal Squamous-Cell Carcinoma. Vertical axes run from 0 to 35 (axis label not recovered).]

The figure should be labeled!

Oral Presentations

 Data density should be moderate.

 Color is available.

 LABEL!

 Hope you have interesting data

Posters

 Smaller audience

 Experimentation is good.

 Graphics will bring you customers!

Experimentation may, or may NOT, work.

How Do I do this?

How much time do you have?

 It is not easy.

 There is no perfect, easy to use, cheap software that is going to solve your problems.

This is not too hard

Books are not very helpful

 Software changes quickly.

 People use different software.

 You want to do it NOW, not after reading for 5 hours.

 Surfing the net is frequently useful.

Vector Graphics vs Bitmaps

 Vector graphics.

 A set of instructions that tells the device how to display the document.

 Adobe software is the most common way to edit vector graphics.

 Bitmaps

 Resolution depends on the size of the computer file.

 Easy to open and publish on-line.

 Generally not accepted for publication.

 Vector graphics can be made into bitmaps.

 Bitmaps cannot be made into vector graphics.
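The difference can be sketched in a few lines. Below, a line segment is stored once as an SVG instruction (vector) and once as a grid of pixels (bitmap). `svg_line` and `rasterize_line` are hypothetical helpers written for this example, and they assume integer coordinates on a 10 by 10 canvas. Going from the instruction to pixels is straightforward, while the pixel grid no longer records that it was ever a line.

```python
def svg_line(x1, y1, x2, y2, scale=1):
    """Vector form: an instruction the device redraws exactly at any scale."""
    size = 10 * scale
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<line x1="{x1 * scale}" y1="{y1 * scale}" '
        f'x2="{x2 * scale}" y2="{y2 * scale}" stroke="black"/></svg>'
    )


def rasterize_line(x1, y1, x2, y2, size=10):
    """Bitmap form: sample the same line into a fixed size-by-size pixel grid."""
    grid = [[0] * size for _ in range(size)]
    steps = max(abs(x2 - x1), abs(y2 - y1), 1)
    for i in range(steps + 1):
        x = round(x1 + (x2 - x1) * i / steps)
        y = round(y1 + (y2 - y1) * i / steps)
        grid[y][x] = 1
    return grid


# Scaling the vector form only changes the numbers in the instruction.
print(svg_line(0, 0, 9, 4, scale=3))
# The bitmap form is a fixed set of pixels.
for row in rasterize_line(0, 0, 9, 4):
    print("".join("#" if pixel else "." for pixel in row))
```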

Bitmaps, compression, and enlarging

 Compression can be ‘lossy’.

 We are familiar with the grainy effect of enlargement.
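Both effects can be sketched on a tiny grayscale bitmap. The example uses `zlib` (the DEFLATE compression that PNG also uses) for a lossless round trip, and a crude quantization step as a stand-in for lossy compression. Real lossy formats such as JPEG work quite differently, so treat `quantize` and `enlarge` as illustrative helpers only; the image values are made up.

```python
import zlib


def quantize(pixels, levels=4):
    """Stand-in for lossy compression: keep only a few gray levels."""
    step = 256 // levels
    return [[value // step * step for value in row] for row in pixels]


def enlarge(pixels, factor):
    """Nearest-neighbour enlargement: every pixel becomes a factor x factor block."""
    return [
        [value for value in row for _ in range(factor)]
        for row in pixels
        for _ in range(factor)
    ]


# A made-up 2 x 4 grayscale image (0 = black, 255 = white).
image = [[0, 60, 130, 255], [10, 70, 140, 250]]

raw = bytes(value for row in image for value in row)
assert zlib.decompress(zlib.compress(raw)) == raw   # lossless: exact round trip

print(quantize(image))     # nearby grays collapse together and cannot be recovered
big = enlarge(image, 3)
print(len(big), len(big[0]))   # 6 12: more pixels, but no new detail
```

Enlarging only repeats existing pixels, which is where the blocky or grainy look comes from.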

Software

 SPSS

 Many chart options

 Graphics can be edited

 Can export vector graphics.

 SAS

 Known for poor graphics.

 However, some people produce very good graphs with SAS.

 Hope SAS improves and use something else for now?

Stata

 Any comments?

R

It is free.

Produces good graphics that can be exported in various formats.

Infinitely customizable

Difficult for the novice statistician / programmer

R clinic

SyStat

EpiInfo

S+

Spotfire

Prism – also available in GCRC computer lab.

Others…

Sigma Plot

 Can be used with Excel and SPSS

 Opens other data formats

 Menu driven

 Multiple graphics options

 Easily produces compound graphics

 Exports graphics in multiple formats.

MS Office

 Windows Metafile is a vector graphic format.

 Excel

 More control over graphics

 Limited selection of graph types

 User typically provides the S.E.s and effect estimates.

 PowerPoint

 Surprisingly good at managing bitmaps.

 If you already use it, then improve your graphics by applying aesthetic rules.

For example…

[Figure: two versions of the same chart. Series: hay fever, asthma, eczema, current wheeze. Horizontal axis: negative family history, positive family history. Vertical axis: 0 to 6 (axis label not recovered).]
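Returning to the Excel point above: because the user typically supplies the effect estimates and standard errors, those numbers have to be computed before the chart is drawn. A minimal sketch for a single group mean, with made-up measurements. The standard error here is the sample standard deviation divided by the square root of n, and `mean_and_se` is a hypothetical helper.

```python
import math
import statistics


def mean_and_se(values):
    """Return (mean, standard error of the mean) for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # sample SD (n - 1 denominator)
    return mean, se


# Made-up measurements, for illustration only.
measurements = [2.0, 4.0, 4.0, 6.0]
mean, se = mean_and_se(measurements)
print(f"estimate = {mean:.2f}, SE = {se:.2f}")
```

The two numbers can then be typed or pasted into the spreadsheet as the bar height and the custom error-bar value.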

Scanners

Scanned figures are an option.

Good way to clean up figures from journals if you’re proficient in Photoshop

The bitmap resolution problem remains

Which file format and program will avoid lossy compression?

Art Software

As a novice graphic preparator, I appreciate the ability to draw on graphs.

Can also ‘cover’ unwanted parts with white shapes.

 Group the resulting collection of shapes and save as a picture.

 Adobe Illustrator

 Adobe Photoshop

 These programs may seem counter-intuitive at first use.

 Paint, MS Office, etc.

 Easy to use

 Bitmap products.

Call the experts

 The Medical Illustrators at VUMC will improve your graphs.

 $50/hr

 An average graph takes 20 minutes.

 Grow your own group ‘expert’.
