Dynamic drill down based on numeric data

Paper TT08
Dynamic Drill Down on Numeric Data
Hsiwei Yu, Ming Tech, Edison, NJ
Dong-Min Shen, Schering-Plough, Kenilworth, NJ
Abstract
Problem: Suppose we have a 2×2 table, say gender by treatment group, in which each cell
shows the average improvement score after treatment. Can we drill down through an HTTP
hyperlink on a cell to see, say, detailed statistics for the events that belong to that cell?
Solution: We can’t simply use a numeric format to map the averages in the four cells to
different locations, because the averages can be identical. For example, both men and
women in treatment group ‘B’ have the same improvement score, 5.9. Our solution is to
create a new variable whose value is the record number, i.e. _N_, and associate it with an
artificial format. The labels of this artificial format, which represent URL locations, are
built from the values of additional variables. Therefore the cells showing 5.9, one for ‘man’
and treatment group ‘B’ and one for ‘woman’ and treatment group ‘B’, can point to
different URL locations showing specific information.
Introduction
To add drill-down capability to the simple table below, we might try to create a numeric
format that maps each average score to a specific web page.
Average Score
            Treatment
Gender      A      B
Female      3.7    5.9
Male        3.4    5.9
A fragment of such a format looks like:
proc format;
   value scr2pag
      3.7= '<a href="./Gender-Female Treatment-A.htm">3.7</a>'
      5.9= '<a href="./Gender-Female Treatment-B.htm">5.9</a>'
      5.9= '<a href="./Gender-Male Treatment-B.htm">5.9</a>'
.. .. run;
Normally this works, but not in this case. It fails with
ERROR: This range is repeated, or values overlap: 5.9-5.9.
because a format maps each value to exactly one label, and here 5.9 appears twice.
Three-Step Solution
1. Run the procedure without printing to create an output data set holding the desired
statistics.
2. Create an artificial format, built from the output data set, that maps _N_ to the desired
web page.
3. Show the results using the output data set and the artificial format.
Run Proc to Get Stats Without Print
We run the procedure to create an output data set without displaying any output. Output
is easy to suppress with ODS LISTING CLOSE. We use the average score for illustration.
Here is the call:
ods listing close;   /* suppress printed output */
proc tabulate data= numeric_data out= data_for_artificial_format;
   var score;
   class gender treatment;
   table gender
      , score * mean * treatment;
run;
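The statistics do not have to come from Proc Tabulate. As a minimal sketch, assuming the input data set is named numeric_data as in the sample program at the end of this paper, Proc Means with NOPRINT can produce one record per gender and treatment combination without printed output. The output data set name stats_from_means is hypothetical and is not used elsewhere in the paper.

```sas
/* Sketch only: an alternative to Proc Tabulate for step one.           */
/* NOPRINT suppresses printed output; NWAY keeps only the records for   */
/* the full gender * treatment crossing.                                */
/* stats_from_means is a hypothetical data set name.                    */
proc means data= numeric_data noprint nway;
   class gender treatment;
   var score;
   output out= stats_from_means( drop= _type_ _freq_ ) mean= score_mean;
run;
```

The records come out ordered by the class variables, so the record number can serve as the key in the same way as with the Proc Tabulate output.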
Create Artificial Format
Obs   gender   treatment   Score_Mean
1     Female   A           3.7
2     Female   B           5.9
3     Male     A           3.4
4     Male     B           5.9
From the above output data set, use a data step to create a format that associates the
record number, _N_, with the desired web page. The data step can build this format
because each record of the input data set carries the gender, the treatment group and the
average needed to construct the hyperlink.
proc format;
   value scr2pag
      1= '<a href="./Gender-Female Treatment-A.htm">3.7</a>'
      2= '<a href="./Gender-Female Treatment-B.htm">5.9</a>'
      4= '<a href="./Gender-Male Treatment-B.htm">5.9</a>'
.. .. run;
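The sample program at the end of this paper builds the format by writing Proc Format statements to a file and including them. As an alternative sketch, not the method used in the sample program, the same format could be loaded from a control data set with the CNTLIN= option of Proc Format. The data set name scr2pag_cntl is hypothetical; the variables fmtname, type, start and label are the names CNTLIN= expects.

```sas
/* Sketch only: build the artificial format from a control data set.    */
/* Assumes data_for_artificial_format holds gender, treatment and       */
/* score_mean, one record per table cell.                               */
data scr2pag_cntl;                       /* hypothetical data set name  */
   set data_for_artificial_format;
   length fmtname $ 32 type $ 1 label $ 1024;
   retain fmtname 'scr2pag' type 'N';    /* numeric format SCR2PAG.     */
   start= _n_;                           /* record number is the key    */
   label= '<a href="./Gender-' || trim( gender )
       || ' Treatment-' || trim( treatment ) || '.htm">'
       || trim( left( put( score_mean, 3.1 )))
       || '</a>';
   keep fmtname type start label;
run;

proc format cntlin= scr2pag_cntl;
run;
```

Because the key is the record number rather than the average, two cells with the same average get separate entries and the overlap error cannot occur.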
We then add a new variable to the final display data set, named say n_to_scr2pag, whose
value is _N_, and optionally assign the artificial format, scr2pag., to it.
Show Result
Use Proc Tabulate or Proc Report to show the results with the output data set and the
artificial format.
proc tabulate data= data_for_artificial_format format= scr2pag.;
   class gender treatment;
   var n_to_scr2pag;
   /* one record per cell, so the sum is the record number itself */
   table gender
      , n_to_scr2pag * sum * treatment;
run;
Average Score
            Treatment
Gender      A      B
Female      3.7    5.9
Male        3.4    5.9
Note each number has a hyperlink for drill down.
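The same display could be produced with Proc Report. The following is a minimal sketch, assuming the display data set data_for_artificial_format already contains n_to_scr2pag and that the format scr2pag. has been created; the column layout is one plausible arrangement, not the authors' code.

```sas
/* Sketch only: Proc Report version of the final display.               */
/* Assumes n_to_scr2pag and the format scr2pag. already exist.          */
proc report data= data_for_artificial_format nowd;
   column gender treatment, n_to_scr2pag;
   define gender       / group 'Gender';
   define treatment    / across 'Treatment';
   /* one record per cell, so the sum is the record number itself */
   define n_to_scr2pag / analysis sum format= scr2pag. 'Average Score';
run;
```

As with Proc Tabulate, each cell holds a record number and the format turns it into the hyperlinked average.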
Conclusion
We have shown how to allow drill down on numeric data. The number can be an average,
a Chi-square statistic, or anything of importance at the moment, and it may be generated
by sophisticated statistical procedures. To make numeric data drillable, our technique
is to first create an output data set, then build an artificial format from it, and finally
display the drillable result. The final display can be done with Proc Tabulate or Proc
Report and doesn’t have to use the original statistical procedure. Without this technique,
we were limited to drill down on unique character data. We now have a tool to
make our presentations more lively and interactive.
Contact Information
Dong-min Shen
Schering-Plough
Kenilworth, NJ
Dong-min.shen@spcorp.com
908-740-5668
Hsiwei Yu
Ming Tech
Edison, NJ
Hsiwei_yu@yahoo.com
718-908-7681
Sample Program
data numeric_data;
input gender $ treatment $ score;
cards;
Male B 5.9
Female B 5.9
Male A 3.4
Female A 3.7
;
proc print data= numeric_data;
run;
ods listing close;
ods html file= '.\1 original result without drill down.htm';
ods proclabel= ' ';
title ;
proc tabulate data= numeric_data;
var score;
class gender treatment;
table gender= 'Gender'
, score= 'Average Score' * mean= ' ' * format= 3.1 * treatment= 'Treatment';
run;
ods html close;
ods listing;
ods listing close;
proc tabulate data= numeric_data out= data_for_artificial_format;
class gender treatment;
var score;
table gender, treatment * score * mean;
run;
ods html file= '.\2 data to build artificial format.html';
ods listing close;
proc print data= data_for_artificial_format( drop= _type_ _page_ _table_ );
run;
ods html close;
ods listing;
data data_for_artificial_format;
set data_for_artificial_format;
n_to_scr2pag= _n_;
run;
filename fmt "%sysfunc(pathname(work))\artificial_format_stmts.sas";
data _null_;
set data_for_artificial_format end= EOF;
file fmt;
attrib href length= $ 1024;
if _n_ = 1 then do;
put 'proc format;' / 'value scr2pag';
end;
href= '<a href='
|| '"./Gender-' || trim( gender ) || ' Treatment-' || trim( treatment )
|| '.htm'
|| '">' || trim( left( put( score_mean, 3.1 )))
|| '</a>';
put _n_ '= ''' href +(-1) '''';
if EOF then do;
put 'other= '' '';' / 'run;';
end;
run;
%inc fmt / source;
filename fmt;
/* *** optional: attach the format permanently to the variable
proc datasets lib= work nolist;
modify data_for_artificial_format;
format n_to_scr2pag scr2pag.;
run;
quit;
*** */
ods listing close;
ods html file= '.\3 result with hyper links.htm';
/* ***
title 'Proc Tabulate';
*** */
proc tabulate data= data_for_artificial_format format= scr2pag.;
class gender treatment;
var n_to_scr2pag;
table gender= 'Gender'
, n_to_scr2pag= 'Average Score' * sum= ' ' * treatment= 'Treatment';
run;
ods html close;
ods listing;