Ten Things I Like About CS Graduate Research


Ten Things I Like about CS Graduate Research (and Five I Don’t)

Bill Howe

PhD Student

Maseeh College of Science and Engineering

Portland State University

Portland, OR

My Background

1994-1999: BS in Industrial and Systems Engineering from Georgia Tech

1999-2001: Worked as a consultant on projects for

Excel Telecommunications

Verizon Wireless

Siebel Systems

Microsoft

2001-present: Working on a PhD in Computer Science…

Outline

Graduate Research in CS: The Good

Graduate Research in CS: The Bad

Overview of my research on data management tools for Computational Science

Graduate Degrees

CS Bachelor of Science (BS)

Programming

CS Master of Science (MS)

Advanced Programming

CS Doctor of Philosophy (PhD)

Research, Teaching

The Next 10 Years

[Chart: BS, MS, PhD (Univ), and PhD (Lab) compared over years 1 to 10; vertical axis runs from -40 to 120; a point labeled “me” sits near 20]

Sources:

Starting salaries: 2004, U.S. Department of Labor, www.bls.org

Average raise: Software Developer Magazine 2004 Salary Survey, assumes 2003-2004

Also: MS holders are harder to outsource, and PhDs harder still

A Question

Why do you study Computer Science?

Claim: All Science is Becoming Computer Science

Biology

Ensembl, GenBank, HGP, SwissProt

Folding@Home

Bioinformatics

Astronomy

Sloan Digital Sky Survey, SETI

Oceanography

Environmental Observation and Forecasting

First comes X, then “Computational X”

Modern scientists are programmers!

Claim: Everything else is also Computer Science (1/2)

Medicine

Medical Informatics, MRI Nobel Prize

Journalism, Broadcasting

Blogs, Podcasts, BitTorrent

Commerce, Advertising

 eBay, iTunes, Amazon, AdWords, Overture, Google

Communication

 email, IM, VoIP

Research, Publishing

Search Engines, Wikipedia, RSS, BioMed, Citeseer

Claim: Everything else is also Computer Science (2/2)

Economics

 eBay, Everquest, World of Warcraft

Entertainment

MP3, DVD, iTunes, GarageBand, P2P, Pixar, LoTR, Star Wars, Gaming

Social

 Friendster, MySpace, eHarmony, Match.com

Busy Work

Mechanical Turk

Travel

Write papers, submit them to conferences in exotic locales, and get the trip paid for!

Even with my stunted income, I’ve visited

Chicago (GGF 2001)

Paris (SIGMOD 2003)

Toronto (VLDB 2004)

Santa Barbara (SSDBM 2005)

Tokyo (ICDE 2005)

 ICDE 2006 is here in Atlanta April 2-7

Laziness is a Virtue

If it's boring, make the computer do it

Office: Filing, Sorting, Accounting, Reporting

Factory: Assembly, Warehousing

Science and Mathematics: Prime numbers, Calculation

Housework? Flying? Driving?

Low-hanging fruit

We do 80% of the job with 20% of the effort

Programmers do this for their company/customers

CS Researchers try to do this for everybody

Get Paid to Think

My time involves equal parts

Reading

Writing

Coding

Staring into space

Plus

Conversing with colleagues

Preparing and giving talks

Administrative tasks

Focus > Brains

Edison

“Genius is 1% inspiration and 99% perspiration”

Pasteur

“Let me tell you the secret that has led me to my goal. My strength lies solely in my tenacity.”

“Chance favors the prepared mind”

Focus is not easy

It’s easy to drift through the day as a PhD student and accomplish very little

No one forces you to be productive – which is a bad thing!

Abundant Help

Every professor

 has gone through graduate school

 is willing to give advice

 is expected to give advice

 is generally good at giving advice

At Portland State, for example, this means

130 paid experts to advise you plus 250 other students who may also have insight

Corporate “open door” policies do not quite compare

Fewer Jerks

Perhaps surprisingly, scientists tend to be open-minded

It’s ok to argue with the boss

But you must reciprocate!

Bosses (and customers) are not frequently known for open-mindedness

“Must Have Strong Communication Skills”

You'll do enough writing and public speaking that you'll be an expert

All employers require these skills

Ironically, few employers give you the opportunity to learn them

Autonomy

Working from home

Setting your own schedule

Setting your own research agenda

Scheduling your own meetings

…anyone see a downside?

“Free” Training

Take classes in a variety of CS areas

Remember: Everything is Computer Science

I took Environmental Engineering courses for my research

How about Art? Music? Language?

“Order in Pollock’s Chaos”, Richard P. Taylor, Scientific American, December 2002

Lilypond

"Chinese Character Synthesis Using Metapost", Candy YIU L.K. and Wai WONG, TeX Users Group, July 2003

Self-study with Languages and Tools

I learned Haskell, C++, Python, JavaScript, and countless other tools while in graduate school

Outline

Graduate Research in CS: The Good

Graduate Research in CS: The Bad

Research Overview

Begging for Money

Funding comes primarily from government agencies

NSF, NIH, NASA, DARPA, DOE

Scientists write proposals to convince non-experts that their future research will be productive

Anything sound strange about this?

Funding is not necessarily commensurate with the quality of the proposal

As a student, you are largely shielded from this process

Innovation not always rewarded

Best advice for writing papers and giving talks:

People like what they understand

But:

People understand best that which they already know!

Encourages incremental improvements rather than transformative ideas

Lots of papers on searching the web slightly faster, fewer papers on, e.g., self-driving cars or other really useful things

Intense Competition from Industry

Everyone can potentially do “CS Research”

We don’t need particle accelerators

Commercialization is easier (witness Google)

Databases

MySQL, PostgreSQL, Oracle, Microsoft, IBM

Amateurs are empowered as well

Open-source, Mash-ups

Difficult to Make an Impact

Pressure to innovate means you won't always have time to follow through on ideas

Prototypes and experiments are common

Polished, deployable systems less so

Difficult to Distinguish Good Research from Bad Research

There are no simple objective criteria

 sales figures? website hits? number of downloads?

Perhaps # of citations, but this number is zero for a lot of good work

There is a lot of mediocre work

Lesson I had to learn: If a paper doesn’t seem to make sense, it may not just be you

Summary

I like it!

Consider it!

Ambiguity is a Good Thing

Outline

Graduate Research in CS: The Good

Graduate Research in CS: The Bad

Overview of my research supporting Computational Science

1994 Northridge aftershock, San Fernando valley http://www.cs.cmu.edu/~quake/quakeviz.html

Animations created by Greg Foss, Pittsburgh Supercomputing Center and CMU Quake project, from synthetic data generated by the CMU Quake project


“GridFields”

[Figure: example grid annotated with numeric values between 1.4 and 17.9, with a legend of ranges from [0.0, 1.0) to [5.0, 6.0)]

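GridField Operators (1)

Scan: retrieve grid data by (grid name, context)

Result: gridfield, no data

Bind: retrieve attribute data by (grid, attr. name, context)

Result: gridfield, bound data

[Diagram: both operators are drawn against the Catalog Interface and the Storage Subsystem; the bound example shows the values 25, 19, 24, 27, 21, 26]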

GridField Operators (2)

Restrict(>23)

Input values: 27, 25, 24, 19, 21, 26

Output values: 27, 25, 24, 26
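The Scan, Bind, and Restrict operators on these two slides could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual GridField implementation: the `GridField` class, the `GRIDS` and `ATTRIBUTES` dictionaries, the cell names, and the attribute name `temp` are all hypothetical, and the "context" argument from the slides is left out for brevity. The six values are the ones shown on the slide.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory stand-in for the catalog and storage subsystem.
GRIDS = {"demo": ["c0", "c1", "c2", "c3", "c4", "c5"]}
ATTRIBUTES = {("demo", "temp"): [27, 25, 24, 19, 21, 26]}


@dataclass
class GridField:
    name: str
    cells: list
    data: dict = field(default_factory=dict)  # attribute name -> {cell: value}


def scan(grid_name):
    """Retrieve the grid only: a gridfield with no data."""
    return GridField(grid_name, list(GRIDS[grid_name]))


def bind(gf, attr_name):
    """Attach one attribute's values to the cells: a gridfield with bound data."""
    values = ATTRIBUTES[(gf.name, attr_name)]
    data = dict(gf.data)
    data[attr_name] = dict(zip(gf.cells, values))
    return GridField(gf.name, gf.cells, data)


def restrict(gf, attr_name, predicate):
    """Keep only the cells whose bound value satisfies the predicate."""
    kept = [c for c in gf.cells if predicate(gf.data[attr_name][c])]
    data = {a: {c: vals[c] for c in kept} for a, vals in gf.data.items()}
    return GridField(gf.name, kept, data)


# Restrict(>23) applied to the bound gridfield.
warm = restrict(bind(scan("demo"), "temp"), "temp", lambda v: v > 23)
print([warm.data["temp"][c] for c in warm.cells])
```

Each operator returns a new gridfield instead of modifying its input, so expressions compose the way the slides suggest: scan, then bind, then restrict.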

Cross Product (1)

 =

 =

Cross Product (2)

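[Figure: cross product of a triangle and a line segment]

Left operand: nodes 2, 3, 4; edges x, y, z; face A

Right operand: nodes 0, 1; edge w

Result: nodes 20, 21, 30, 31, 40, 41; edges x0, x1, y0, y1, z0, z1, 2w, 3w, 4w; faces A0, A1, xw, yw, zw; volume Aw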
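The labels in this figure are consistent with a cell-wise product: every cell of the first grid is paired with every cell of the second, and the dimension of the pair is the sum of the two dimensions. A minimal sketch of that idea, assuming a grid is represented simply as a dictionary from dimension to cell names (a hypothetical representation that ignores incidence between cells):

```python
from itertools import product

# Hypothetical representation: dimension -> list of cell names.
triangle = {0: ["2", "3", "4"], 1: ["x", "y", "z"], 2: ["A"]}
segment = {0: ["0", "1"], 1: ["w"]}


def cross(g1, g2):
    """Pair every cell of g1 with every cell of g2; dimensions add."""
    result = {}
    for (d1, cells1), (d2, cells2) in product(g1.items(), g2.items()):
        pairs = [a + b for a, b in product(cells1, cells2)]
        result.setdefault(d1 + d2, []).extend(pairs)
    return result


prism = cross(triangle, segment)
for dim in sorted(prism):
    print(dim, sorted(prism[dim]))
```

The result has 6 nodes, 9 edges, 5 faces, and 1 volume, matching the cell names in the figure.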

Describing Fancy Grids

[Figure: five example grids, (a) through (e)]

(a) H 0: (x), V 0: (y)

(b) H 0: (), V 0: (), with bind(0,x) and bind(0,y)

(c) H 0: (), V 0: ()

(d) H 0: (), V 0: (), with bind(0,x,y)

Remaining labels (panel unclear): bind(0,x,y), bind(1,f)

Idea: Separate Data Manipulation from Data Visualization

Our work

Client Machine

Visualization: Isosurfaces, Streamlines, Contours, Colors, Animation, Volume Rendering

GridField Expressions: Locating, Cutting, Slicing, Aggregating, Combining

Cluster of Servers

GridField Expressions

… push processing upstream

Objectives

Model generalized gridded datasets

Optimize computations over gridded datasets

Demo available after the talk

Demo: Architecture

Thick arrows are data, thin arrows are messages

[Diagram: request and response path between client and server]

Client: Client App, Client Cache

Messages: XMLRPC Message (request), XMLRPC Response

Web Service: Parse and Optimize, Evaluate, Respond

Server: Server Cache, Data Repository
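The message flow in this diagram (client app sends an XMLRPC message, the web service evaluates it against the data repository and responds) can be imitated without a network using Python's standard `xmlrpc.client` marshalling functions. This is a sketch only: the method name `evaluate`, the `REPOSITORY` dictionary, and the grid name are hypothetical, and the demo's real messages, caches, and optimizer are not reproduced here.

```python
import xmlrpc.client

# Hypothetical server-side data repository: grid name -> cell values.
REPOSITORY = {"demo_grid": [27, 25, 24, 19, 21, 26]}


def evaluate(grid_name, threshold):
    """Server side: keep the values of the named grid above the threshold."""
    return [v for v in REPOSITORY[grid_name] if v > threshold]


def handle_request(request_xml):
    """Decode an XML-RPC request, evaluate it, and encode the response."""
    params, method = xmlrpc.client.loads(request_xml)
    if method != "evaluate":
        raise ValueError(f"unknown method: {method}")
    result = evaluate(*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)


# Client side: build the message, "send" it, and decode the response.
request_xml = xmlrpc.client.dumps(("demo_grid", 23), methodname="evaluate")
response_xml = handle_request(request_xml)
(values,), _ = xmlrpc.client.loads(response_xml)
print(values)
```

In a real deployment the request string would travel over HTTP, for example through `xmlrpc.client.ServerProxy` on the client and `xmlrpc.server.SimpleXMLRPCServer` on the server; the direct function call here only stands in for that hop.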

What do we use to hold these data?

Lists?

Arrays?

Relations?


Analyzing the Results: Data Products

Computational Science

1. Governing Equations

2. Discretize the Domain

3. Convert Differential Equations into Algebraic Equations

4. Solve the Algebraic Equations

5. Analyze the Results
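As a toy walk-through of these five steps, consider the governing equation -u''(x) = 1 on (0, 1) with u(0) = u(1) = 0. The problem, the function name `solve_poisson_1d`, and the choice of finite differences with the Thomas algorithm are all assumptions made for illustration; the models discussed in the talk are far larger.

```python
def solve_poisson_1d(n):
    """Solve -u''(x) = 1 on (0, 1), u(0) = u(1) = 0, using n interior nodes."""
    # Step 2: discretize the domain into n interior nodes with spacing h.
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]

    # Step 3: replace u'' with a central difference, giving the tridiagonal
    # system -u[i-1] + 2*u[i] - u[i+1] = h*h for every interior node.
    lower = [-1.0] * n  # sub-diagonal (lower[0] is unused)
    diag = [2.0] * n
    upper = [-1.0] * n  # super-diagonal (upper[n-1] is unused)
    rhs = [h * h] * n

    # Step 4: solve with the Thomas algorithm (forward sweep, back substitution).
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]
    return xs, u


xs, u = solve_poisson_1d(9)
# Step 5: analyze the results by comparing with the exact solution x(1-x)/2.
max_error = max(abs(ui - x * (1 - x) / 2) for x, ui in zip(xs, u))
print(f"peak value {max(u):.4f}, max error {max_error:.2e}")
```

Because the exact solution is a quadratic, the central difference reproduces it up to rounding error, which makes this a convenient sanity check for the solver.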

Graduate Degrees

CS Bachelor of Science (BS)

 year 1-2: Tester: Regression testing, write unit tests

 year 3-6: Programmer: Program to spec

 year 6+: Architect: Design, Code, Manage, Plan, Sell

CS Master of Science (MS)

 2 additional years in school

 year 1-2: Programmer

 year 2+: Architect, Manager, etc.

CS Doctor of Philosophy (PhD)

 4-9 additional years in school

 year 1-2: Postdoc

 year 2+: University Faculty, Research Scientist OR Architect, Manager, etc.

Getting a PhD is like Being All 7 Dwarves

At first you feel Dopey and Bashful

In the middle, you’re Sleepy, Sneezy, and Grumpy

At the end, they call you Doc, and you’re Happy

The Next 10 Years

[Figure: career timeline over years 0 to 10]

CS Bachelor’s Degree (BS): Tester, Programmer, Software Architect, Manager

CS Master’s Degree (MS): Student, Programmer, Software Architect, Manager

CS Doctoral Degree (PhD): Student, Research Assistant, Postdoc, then Univ. Faculty, Research Sci., Architect, Manager

Usually earn a Master’s degree along the way

The Next 10 Years

[Chart: BS, MS, and PhD compared over years 1 to 10; vertical axis runs from -40 to 160]

Sources: Starting salaries: NACE, 2004, Average raise: NACE 2003-2004

Also: MS holders are harder to outsource, and PhDs harder still
