Data Visualization


Outline

Plan for Tonight

A synopsis of our progress through the course

Lecture 11 Visualization in Perl

Lab 2

Based in Part on Previous Lecture of J. Grefenstette

BINF 634 Fall 2013 - Lec 11 Visualization 1

Miles to go …

Where We Are and Where We Are Going

Program 1, Program 2, turned in and graded

Midterm turned in and graded

Program 3 turned in, Program 4 assigned tonight

Program 4 due in two weeks (11/25/13)

Quiz 1, Quiz 2, Quiz 3, Quiz 4, Lab 1 turned in and graded

Lab 2 assigned tonight (11/11/13)

Lab 2 Due next week 11/18/13

Take home final provided 11/25/13

Take home final due to me 12/16/13


Viz strategy

Data Visualization

Numbers are well and good, but often we get more insight by visualizing data

As always, it's a good idea to break the process down into individual steps: data analysis script => output file => plotting script => chart file

This can all be wrapped for Web access:

CGI script to get user input => data analysis script => output file => plotting script => chart file => CGI script displays chart on web page
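As a minimal sketch of the first two steps of that chain (data analysis script => output file), the Perl below counts some made-up labels and writes them as two-column text that a separate plotting script could read. The sub names count_labels and write_pairs and the file name counts.dat are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Analysis step (hypothetical): count how often each label occurs.
sub count_labels {
    my @labels = @_;
    my %count;
    $count{$_}++ for @labels;
    return \%count;
}

# Output step: write one "label value" pair per line, sorted by label.
sub write_pairs {
    my ($file, $count) = @_;
    open my $out, '>', $file or die "Can't write $file: $!\n";
    print {$out} "$_ $count->{$_}\n" for sort keys %$count;
    close $out or die "Can't close $file: $!\n";
}

# Demo: analyze a few made-up labels and hand the result to the next step.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'counts.dat');
write_pairs($file, count_labels(qw(A C G T A A G)));

# A plotting script would open $file here; this demo just echoes it.
open my $in, '<', $file or die "Can't read $file: $!\n";
print while <$in>;
close $in;
```

Keeping the output file as plain text means each step can be run and checked on its own before the next one is added.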


GD::Graph

Created by Martien Verbruggen

GD::Graph is a Perl module to create charts using the GD module. The following classes for graphs with axes are defined:

GD::Graph::lines

-- Create a line chart

GD::Graph::bars and GD::Graph::hbars

-- Create a bar chart with vertical or horizontal bars.

GD::Graph::points

-- Create a chart, displaying the data as points.

GD::Graph::linespoints

-- Combination of lines and points.

GD::Graph::area

-- Create a graph, representing the data as areas under a line.

GD::Graph::pie

-- Create a pie chart.
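Because the class names above share one naming pattern, a script can pick the chart class from a short type name at run time. The following is only a sketch: chart_class and load_chart_class are hypothetical helpers, and the demo merely reports whether each class can be loaded on the current machine.

```perl
use strict;
use warnings;

# Chart types from the list above, mapped to their GD::Graph class names.
my %CLASS_FOR = map { $_ => "GD::Graph::$_" }
    qw(lines bars hbars points linespoints area pie);

# Hypothetical helper: return the class name for a chart type, or die.
sub chart_class {
    my ($type) = @_;
    my $class = $CLASS_FOR{$type}
        or die "Unknown chart type '$type'\n";
    return $class;
}

# Try to load a class at run time; returns 1 on success, 0 otherwise.
sub load_chart_class {
    my ($class) = @_;
    (my $file = "$class.pm") =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $type (qw(bars linespoints pie)) {
    my $class  = chart_class($type);
    my $status = load_chart_class($class) ? "available" : "not installed";
    print "$type => $class ($status)\n";
}
```

Once a class has loaded, chart_class('bars')->new(500, 300) would create the same kind of object as GD::Graph::bars->new(500, 300).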


#!/usr/bin/perl
# File: bargraph.pl (based on http://linuxgazette.net/issue83/padala.html)
use strict;
use GD::Graph::bars;

# All data arrays should have the same number of entries
my @x = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug",
         "Sep", "Oct", "Nov", "Dec");
my @y = (23, 5, 2, 20, 11, 33, 7, 31, 77, 18, 65, 52);

# create a 2-dimensional array, where row 1 is x and row 2 is y
my @data = (\@x, \@y);

# create a new bar graph object and give it a size (in pixels)
my $graph = GD::Graph::bars->new(500, 300);

# set graph features such as labels on the axes, title
$graph->set( x_label => 'Month',
             y_label => 'Number of Hits',
             title   => 'Number of Hits in Each Month in 2002')
    or warn $graph->error;

# plot the data to create an image object
my $image = $graph->plot(\@data) or die $graph->error;

# write the image to a file in PNG format
open IMG, ">hist.png" or die "Can't open hist.png\n";
binmode IMG;    # PNG data is binary
print IMG $image->png;
close IMG;
exit;


Output

% open hist.png


Let's create a web version!

#!/usr/bin/perl
# File: bargraph.cgi
use strict;
use CGI qw(:standard);
use GD::Graph::bars;

# All data arrays should have the same number of entries
my @x = ("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug",
         "Sep", "Oct", "Nov", "Dec");
my @y = (23, 5, 2, 20, 11, 33, 7, 31, 77, 18, 65, 52);

# create a 2-dimensional array, where row 1 is x and row 2 is y
my @data = (\@x, \@y);

# create a new bar graph object and give it a size (in pixels)
my $graph = GD::Graph::bars->new(500, 300);

# set graph features such as labels on the axes, title
$graph->set( x_label => 'Month',
             y_label => 'Number of Hits',
             title   => 'Number of Hits in Each Month in 2002')
    or warn $graph->error;

# plot the data to create an image object
my $image = $graph->plot(\@data) or die $graph->error;

# send the image to the browser in PNG format
binmode STDOUT;    # PNG data is binary
print "Content-type: image/png\n\n";
print $image->png;

Plotting points from a file.

#!/usr/bin/perl
# File: bargraph2.pl
use strict;
use GD::Graph::bars;

# read a file to get the data:
my @x = ();
my @y = ();
open FH, "hist.dat" or die "Can't open hist.dat\n";
while (<FH>) {
    my ($xval, $yval) = split;
    push @x, $xval;
    push @y, $yval;
}
close FH;

# create a 2-dimensional array, where row 1 is x and row 2 is y
my @data = (\@x, \@y);

# create a new bar graph object and give it a size (in pixels)
my $graph = GD::Graph::bars->new(500, 300);

# set graph features such as labels on the axes, title
$graph->set( x_label => 'Month',
             y_label => 'Number of Hits',
             title   => 'Number of Hits in Each Month in 2002')
    or warn $graph->error;

# plot the data to create an image object
my $image = $graph->plot(\@data) or die $graph->error;

# write the image to a file in PNG format
open IMG, ">hist.png" or die "Can't open hist.png\n";
binmode IMG;    # PNG data is binary
print IMG $image->png;
close IMG;

% cat hist.dat
Jan 23
Feb 5
Mar 2
Apr 20
May 11
Jun 33
Jul 7
Aug 31
Sep 77
Oct 18
Nov 65
Dec 52
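The file-reading loop in bargraph2.pl trusts every line of hist.dat to hold exactly two columns. A slightly more defensive reader might look like the sketch below; read_xy is a hypothetical helper, and the rules it enforces (skip blank lines and # comments, require a numeric second column) are assumptions about what the data should look like, not something the course files state.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Read "label value" pairs from a filehandle; die on malformed lines.
sub read_xy {
    my ($fh) = @_;
    my (@x, @y);
    while (my $line = <$fh>) {
        next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines
        my ($xval, $yval, @extra) = split ' ', $line;
        die "Line $.: expected two columns\n"
            if !defined $yval || @extra;
        die "Line $.: '$yval' is not a number\n"
            unless looks_like_number($yval);
        push @x, $xval;
        push @y, $yval;
    }
    return (\@x, \@y);
}

# Demo on an in-memory copy of the first lines of hist.dat.
my $sample = "Jan 23\nFeb 5\nMar 2\n";
open my $fh, '<', \$sample or die "Can't open in-memory file: $!\n";
my ($x, $y) = read_xy($fh);
close $fh;
print scalar(@$x), " points; last is $x->[-1] = $y->[-1]\n";
```

Because both arrays are filled in the same loop iteration, they always end up with the same number of entries, which is what the plotting code expects.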

Creating a plot with lines and points.

#!/usr/bin/perl
# File: plot.pl
use strict;
use GD::Graph::linespoints;

# read a file to get the data:
my @x = ();
my @y = ();
open FH, "hist.dat" or die "Can't open hist.dat\n";
while (<FH>) {
    my ($xval, $yval) = split;
    push @x, $xval;
    push @y, $yval;
}
close FH;

# create a 2-dimensional array, where row 1 is x and row 2 is y
my @data = (\@x, \@y);

# create a new linespoints graph object and give it a size (in pixels)
my $graph = GD::Graph::linespoints->new(500, 300);

# set graph features such as labels on the axes, title
$graph->set( x_label => 'Month',
             y_label => 'Number of Hits',
             title   => 'Number of Hits in Each Month in 2002')
    or warn $graph->error;

# plot the data to create an image object
my $image = $graph->plot(\@data) or die $graph->error;

# write the image to a file in PNG format
open IMG, ">out.png" or die "Can't open out.png\n";
binmode IMG;    # PNG data is binary
print IMG $image->png;
close IMG;

Getting the Chart on the Web


This can all be wrapped for Web access:

CGI script to get user input (e.g., wrapper) => data analysis script => output file => plotting script (e.g., plot.pl) => chart file (e.g., out.png)

CGI script (wrapper) displays chart on web page

To display a file on the web, it has to be in a readable directory in the ~/public_html directory tree

1. Create a directory that can be written to by the web server process:

% mkdir ~/public_html/cgi-images

% chmod 777 ~/public_html/cgi-images

2. In the wrapper, create the graphics file in the working directory
3. Copy it to the images directory
4. Display it using print img{src=>...}

use CGI qw(:standard);
...

# change to working directory
mkdir $dir or die "Can't create directory $dir\n";
chdir $dir or die "Can't change to directory $dir\n";

# create the data file for plotting
system "cp /userhomes/faculty/jsolka/binf634/visualization/hist.dat $dir";

# run the plotting program
system "/userhomes/faculty/jsolka/binf634/visualization/linegraph.pl";

# copy the output image to the web images directory
system "cp out.png /userhomes/faculty/jsolka/public_html/cgi-images/out.png";

# display image file on the web page
print img {src=>"/jsolka/cgi-images/out.png"};
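The wrapper above shells out to cp and does not check whether the copy worked. One possible alternative, sketched here under stated assumptions, uses the core File::Copy module so that a failed copy stops the script with a message. publish_image is a hypothetical helper, and the temporary directories in the demo merely stand in for the working directory and for ~/public_html/cgi-images.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: copy a finished chart into the web images directory
# and return the file name to use in the img tag.
sub publish_image {
    my ($image, $images_dir) = @_;
    -s $image or die "Chart $image is missing or empty\n";
    my $name = (File::Spec->splitpath($image))[2];
    my $dest = File::Spec->catfile($images_dir, $name);
    copy($image, $dest) or die "Can't copy $image to $dest: $!\n";
    return $name;
}

# Demo with temporary directories standing in for the real locations.
my $work   = tempdir(CLEANUP => 1);
my $images = tempdir(CLEANUP => 1);
my $chart  = File::Spec->catfile($work, 'out.png');
open my $out, '>', $chart or die "Can't write $chart: $!\n";
binmode $out;
print {$out} "placeholder bytes, not a real PNG\n";
close $out;

my $name = publish_image($chart, $images);
print "Published $name\n";
```

In a real wrapper the returned name would be appended to the images URL and passed to img {src=>...}.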

Limitations to GD::Graph

GD::Graph is good for simple graphs but has some severe limits for scientific plotting

X-axis data assumed evenly spaced

2-dimensional

Limited control over labels, arrows, etc.

gnuplot

gnuplot is an open source Unix application for creating high quality plots from data

http://www.gnuplot.info/

Runs interactively and in batch mode

Interactive mode:

% gnuplot

> plot sin(x)


Batch mode

Put commands into a file, for example, histogram.plt

Run gnuplot with the command file as argument

% gnuplot histogram.plt

% cat histogram.plt
# file histogram.plt
set xlabel "Length"
set ylabel "Sequences"
set title "Sequence Length Distribution"
plot "plot.dat" with boxes
exit

gnuplot accepts input files with an x,y pair on each line:

% cat plot.dat

500 22

550 14

600 12

650 13

700 3

750 7

800 0

850 7

900 6

950 3

1000 2
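A file like plot.dat could be produced by a short Perl analysis step that bins sequence lengths. The sketch below is one way to do that; bin_lengths and format_bins are hypothetical helpers, the sample lengths are made up, and it assumes the x column holds the start of each 50-wide bin (the slides do not say whether x is a bin start or a bin center).

```perl
use strict;
use warnings;

# Count lengths in bins of the given width; a length L falls in the bin
# starting at int(L / width) * width.
sub bin_lengths {
    my ($width, @lengths) = @_;
    my %count;
    $count{ int($_ / $width) * $width }++ for @lengths;
    return \%count;
}

# Format the bins as "x y" lines, writing empty bins as 0 so that every
# position between the first and last bin appears in the file.
sub format_bins {
    my ($width, $count) = @_;
    my @starts = sort { $a <=> $b } keys %$count;
    return '' unless @starts;
    my $text = '';
    for (my $x = $starts[0]; $x <= $starts[-1]; $x += $width) {
        $text .= sprintf "%d %d\n", $x, $count->{$x} // 0;
    }
    return $text;
}

my @lengths = (512, 530, 575, 601, 648, 803, 820);    # made-up sample data
print format_bins(50, bin_lengths(50, @lengths));
```

Writing the zero-count bins explicitly matches the "800 0" line in plot.dat and keeps the x values evenly spaced.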

gnuplot can create PNG format files, which work in all common browsers

# file hist.plt
# usage: gnuplot hist.plt
set terminal png medium mono     # type of output file = PNG
set output "hist.png"            # name of output file
set size 0.8, 0.8                # adjusts size of text
unset key                        # do not display key
set xlabel "Length"              # x-axis label
set ylabel "Sequences"           # y-axis label
set title "Sequence Length Distribution"
set style fill solid 1.0         # make boxes black
set boxwidth 0.9 relative        # leave small space between boxes
plot "plot.dat" with boxes       # plot the data file
exit


Output


#
# File: hist3.plt
# usage: gnuplot hist3.plt
# assumes data is stored in "myplot.dat"
#
set terminal png                  # type of output file = PNG
set output "hist3date.png"        # name of output file
set title "Hits Per Month"        # title of graph
set xlabel "Months" offset 0,-1   # x-axis label (down 1 character)
set ylabel "Hits"                 # y-axis label
set size 0.75,0.75                # reduce graph size
unset key                         # do not display key
set boxwidth 0.9 relative         # leave small space between boxes
set style fill solid 1.0          # make boxes solid
set xrange [0:13]                 # control size of x-axis

# fine tune labels on x axis
set xtics rotate ("Jan" 1, "Feb" 2, "Mar" 3, "Apr" 4, "May" 5, \
                  "Jun" 6, "Jul" 7, "Aug" 8, "Sep" 9, "Oct" 10, "Nov" 11, \
                  "Dec" 12)

# plot the data file, creating "hist3date.png"
plot "myplot.dat" with boxes

gnuplot data file

% cat myplot.dat

1 23

2 5

3 2

4 20

5 11

6 33

7 7

8 31

9 77

10 18

11 65

12 52
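myplot.dat carries the same values as hist.dat, with month names replaced by the numbers 1 to 12 so that gnuplot gets a numeric x column. A conversion step might be sketched as follows; months_to_indices and xtics_command are hypothetical helpers, and the second one only rebuilds the xtics list used in hist3.plt.

```perl
use strict;
use warnings;

my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %INDEX_OF;
@INDEX_OF{@MONTHS} = (1 .. 12);

# Turn "Jan 23" style lines into "1 23" style lines.
sub months_to_indices {
    my (@lines) = @_;
    my @out;
    for my $line (@lines) {
        my ($month, $value) = split ' ', $line;
        next unless defined $value;    # skip blank or short lines
        my $i = $INDEX_OF{$month} or die "Unknown month '$month'\n";
        push @out, "$i $value";
    }
    return @out;
}

# Build the matching xtics command for the gnuplot command file.
sub xtics_command {
    return 'set xtics rotate ('
         . join(', ', map { qq("$MONTHS[$_ - 1]" $_) } 1 .. 12)
         . ')';
}

print "$_\n" for months_to_indices("Jan 23", "Feb 5", "Mar 2");
print xtics_command(), "\n";
```

Generating the tick labels from the same month list as the data avoids the two drifting apart when the file is edited by hand.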


hist.png


Getting the Chart on the Web

2. In the wrapper, create the graphics file in the working directory
3. Copy it to the images directory
4. Display it using print img{src=>...}

use CGI qw(:standard);
$ENV{PATH} = "/bin:/usr/bin:/usr/local/bin";    # for gnuplot
...

# change to working directory
mkdir $dir or die "Can't create directory $dir\n";
chdir $dir or die "Can't change to directory $dir\n";

# create the data file for plotting
system "cp /userhomes/faculty/jsolka/binf634/visualization/plot.dat $dir";

# run the plotting program
system "gnuplot /userhomes/faculty/jsolka/binf634/visualization/hist.plt";

# copy the output image to the web images directory
system "cp hist.png /userhomes/faculty/jsolka/public_html/cgi-images/hist.png";

# display image file on the web page
print img {src=>"/jsolka/cgi-images/hist.png"};
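Instead of keeping a fixed hist.plt on disk, a wrapper could also build the gnuplot command file from Perl, which makes titles and file names easy to vary. This is only a sketch: gnuplot_script is a hypothetical helper and its option names are invented for illustration; the gnuplot commands it emits are the ones already used in hist.plt.

```perl
use strict;
use warnings;

# Hypothetical helper: build the text of a gnuplot command file.
sub gnuplot_script {
    my (%opt) = @_;
    return join "\n",
        'set terminal png',
        qq(set output "$opt{output}"),
        qq(set title "$opt{title}"),
        qq(set xlabel "$opt{xlabel}"),
        qq(set ylabel "$opt{ylabel}"),
        'unset key',
        'set style fill solid 1.0',
        'set boxwidth 0.9 relative',
        qq(plot "$opt{data}" with boxes),
        '';
}

print gnuplot_script(
    output => 'hist.png',
    data   => 'plot.dat',
    title  => 'Sequence Length Distribution',
    xlabel => 'Length',
    ylabel => 'Sequences',
);
```

The wrapper would write this text to a file and run it with system("gnuplot", $file), checking that the return value is 0. Values that come from a web form should be validated before they are interpolated into the command file.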

Perl Data Language (PDL)

http://pdl.perl.org/

A Few Discussions on PDL

The next several slides were adapted from the PDL tutorial by David Mertens at this URL:

http://www.slideshare.net/dcmertens/p-lplot-talk


More information

Online documentation for GD::Graph

% perldoc GD::Graph

Online doc for gnuplot

% gnuplot

> help

Also see: http://www.gnuplot.info

PLplot

http://plplot.sourceforge.net/examples.php

PDL

http://pdl.perl.org/

Summary


Homework and Reminders

Program 4 due in two weeks (11/25/13)

Lab 2 due by next week

No more quizzes

No more programming assignments

Only the final

Read Chapter 11

Exercise 11.1 and 11.2
