Programming Languages Research and You:
What Miracles Are We Cooking Up These Days?
#1
Talk Outline
• Wes Weimer (also Knight, Reynolds, Evans, …)
• What is PL research in general?
• Possible cool senior thesis bits!
– Based on state-of-the-art modern research
• Hint: write down a key phrase, email me later …
#2
Don’t We Already Have Compilers?
#3
Dismal View Of PL Research
C++
Java
(or C#)
#4
PL Research: Qu’est-ce que c’est?
• Study programs and languages
• 2002 US Annual Cost of Software Errors: $60B
– 0.6% of the GDP (NIST)
– Cost of 1 bug: $2k-$10k
• Programs as artifacts
– What should they be doing?
– Are they doing it?
– Are they making mistakes instead?
– How might we fix them?
• Language Design
– Make some things easier (e.g., compare Ruby and Python to C++)
#5
Program Analyses
• We write programs that analyze (or
transform) other programs
– cf. testing, >50% of a project’s budget
• Doomed in theory but successful in practice
Simplest examples: dataflow analyses and type systems
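
To make "dataflow analysis" concrete, here is a minimal sketch of one classic instance, liveness analysis, run to a fixed point over a tiny control-flow graph. The graph, the node names, and the `liveness` helper are all made up for illustration; real analyses work on a compiler's intermediate representation.

```python
# Backward liveness analysis over a tiny, hypothetical control-flow graph.
# Each node records the variables it defines, the variables it uses,
# and its successor nodes.
CFG = {
    "n1": {"defs": {"x"}, "uses": set(), "succ": ["n2"]},        # x = 1
    "n2": {"defs": {"y"}, "uses": {"x"}, "succ": ["n3"]},        # y = x + 1
    "n3": {"defs": set(), "uses": {"y"}, "succ": ["n2", "n4"]},  # if y < 10 goto n2
    "n4": {"defs": set(), "uses": {"y"}, "succ": []},            # return y
}


def liveness(cfg):
    """Iterate the dataflow equations until nothing changes (a fixed point)."""
    live_in = {n: set() for n in cfg}
    live_out = {n: set() for n in cfg}
    changed = True
    while changed:
        changed = False
        for n, node in cfg.items():
            # live_out[n] is the union of live_in over all successors of n
            out = set()
            for s in node["succ"]:
                out |= live_in[s]
            # live_in[n] = uses[n] ∪ (live_out[n] − defs[n])
            inn = node["uses"] | (out - node["defs"])
            if out != live_out[n] or inn != live_in[n]:
                live_out[n], live_in[n] = out, inn
                changed = True
    return live_in, live_out


live_in, live_out = liveness(CFG)
print(sorted(live_in["n3"]))  # variables still needed when control reaches n3
```

A variable that is not live after an assignment is a dead store, which is the kind of fact a compiler or a bug finder can act on.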
#6
Domain-Specific Bug-Finding
• Embedded components (e.g.,
cellphones) are programmed with
special languages
• Most large projects include their own
custom languages (e.g., simulations,
macros, mIRC scripts, game engines)
• These are harder to debug and have
special semantics (= meanings)
• Example: UnrealScript is C-ish but has
type qualifiers like transient and travel
It’s a fairly accurate portrayal of college, actually …
• Example: “Players of [The Sims 2] are
complaining that their artfully-crafted homes
and mansions are beginning to resemble the
Twilight Zone, thanks to an artifact of the
game's design that causes hacks to spread like
viruses from user to unwitting user.”
SecurityFocus 2004-2005
#7
Program Analyses For Security
• Don’t want rogue programs to send our info to MS
• Could we detect that (type systems for secure
information flow, format string vulnerabilities,
setuid analyses, …)?
• Could we prevent that (bytecode verifiers, proof-carrying code, “data execution prevention”, …)?
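
As a rough illustration of the "type systems for secure information flow" idea, the sketch below labels each variable LOW (public) or HIGH (secret) and rejects any assignment that moves HIGH data into a LOW variable. The variable names and the two-level lattice are assumptions for the example, and it only tracks explicit flows through assignments, not leaks through control flow.

```python
LOW, HIGH = 0, 1  # a two-point security lattice: LOW may flow to HIGH, not back


def label_of(expr_vars, env):
    """An expression is as secret as the most secret variable it reads."""
    return max((env[v] for v in expr_vars), default=LOW)


def check_assignments(assignments, env):
    """Return the indices of assignments that leak HIGH data into a LOW target."""
    leaks = []
    for i, (target, expr_vars) in enumerate(assignments):
        if label_of(expr_vars, env) > env[target]:
            leaks.append(i)
    return leaks


# Hypothetical program: each entry is (target, variables read by the right-hand side).
env = {"password": HIGH, "digest": HIGH, "greeting": LOW, "packet": LOW}
program = [
    ("digest", ["password"]),              # HIGH := HIGH, allowed
    ("packet", ["greeting"]),              # LOW := LOW, allowed
    ("packet", ["greeting", "password"]),  # LOW := HIGH, rejected
]
leaks = check_assignments(program, env)
print(leaks)
```

Only the last assignment is flagged, because it is the only one that would put the password into data leaving the machine.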
#8
Helping Out Testing
• Finding bugs (e.g., bugs in Linux, bugs in Windows
device drivers, bugs in Java systems software, …)
• Preventing bugs (change the language, or add a
step to the “make” process, cf. PREfast)
• Automatically generating test cases
• Limiting test cases that must be run on a check-in
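
One simple way to picture "limiting test cases that must be run on a check-in" is coverage-based selection: run only the tests that touch a changed file. The sketch below assumes a coverage map is already available; the test names and file names are hypothetical.

```python
def select_tests(coverage, changed_files):
    """Pick the tests whose covered files overlap the files changed in a check-in."""
    changed = set(changed_files)
    return sorted(test for test, files in coverage.items() if changed & set(files))


# Hypothetical coverage data: test name -> source files that test exercises.
coverage = {
    "test_parser": {"lexer.c", "parser.c"},
    "test_codegen": {"codegen.c"},
    "test_end_to_end": {"lexer.c", "parser.c", "codegen.c"},
}
selected = select_tests(coverage, ["codegen.c"])
print(selected)
```

This file-level approximation is conservative but coarse; finer-grained analyses (per function or per statement) can rule out more tests.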
#9
Big Example #1: CCured
• Make systems programs as safe as Java but as fast as C
– Safe = memory safety and type safety
• Take an important C program (e.g., apache, bind, openssl)
• Run a program analysis to classify all of the pointers in that
program:
– Safe Pointer = no arithmetic, no casts
– Sequence Pointer = pointer arithmetic (i++), no casts
– Wild Pointer = anything goes
• Take that classification and transform the program:
– Safe Pointer = add a null check
– Sequence Pointer = add bounds (and null) checks
– Wild Pointer = add full dynamic type checking
• Resulting program is provably safe
• But is < 30% slower than the original (cf. Purify: 50x slower)
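
The classify-then-instrument idea can be sketched in a few lines. This is a toy model of the simplified rules on this slide, written in Python rather than C, and not CCured's actual inference or runtime: `classify`, `CHECKS`, and `SeqPtr` are names invented for the illustration.

```python
from enum import Enum


class Kind(Enum):
    SAFE = "safe"
    SEQ = "sequence"
    WILD = "wild"


def classify(uses_arithmetic, uses_casts):
    """Slide's simplified rule: casts force WILD, arithmetic alone gives SEQ."""
    if uses_casts:
        return Kind.WILD
    if uses_arithmetic:
        return Kind.SEQ
    return Kind.SAFE


# Run-time checks the transformation inserts for each pointer kind (per the slide).
CHECKS = {
    Kind.SAFE: "null check",
    Kind.SEQ: "bounds (and null) checks",
    Kind.WILD: "full dynamic type checking",
}


class SeqPtr:
    """A 'fat' sequence pointer: a buffer plus an offset, checked on dereference."""

    def __init__(self, buffer, offset=0):
        self.buffer = buffer
        self.offset = offset

    def __add__(self, n):
        # Pointer arithmetic is always allowed; the check happens at dereference.
        return SeqPtr(self.buffer, self.offset + n)

    def deref(self):
        if self.buffer is None:
            raise ValueError("null dereference")
        if not 0 <= self.offset < len(self.buffer):
            raise IndexError("out-of-bounds dereference")
        return self.buffer[self.offset]


p = SeqPtr([10, 20, 30])
print(CHECKS[classify(uses_arithmetic=True, uses_casts=False)])
print((p + 2).deref())  # in bounds; (p + 3).deref() would raise IndexError
```

The payoff of the classification is that most pointers in real programs turn out to need only the cheap check, which is how the overhead stays small.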
#10
Big Example #2: SLAM
• Verify critical properties of software or find bugs
• Take an important program (e.g., a device driver)
• Merge it with a property (e.g., no deadlocks, asynchronous
IRP handling, BSD sockets, database transactions, …)
• Transform the result into a boolean program
– Same control flow, but only boolean variables
• Use a model checker to explore the resulting state space
– Result 1: program provably satisfies property
– Result 2: program violates property right here on line 92,376!
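
The last two steps can be sketched with a toy explicit-state model checker. The "boolean program" below keeps the control flow of a hypothetical driver but tracks a single boolean, `locked`, and the property is "never acquire a held lock, never release a free one". The program, the `check` function, and the breadth-first search are assumptions for illustration, not SLAM's actual algorithms.

```python
from collections import deque

# pc -> (operation, successor pcs). A "branch" stands in for a data-dependent
# test, abstracted to a nondeterministic choice as in a boolean program.
PROGRAM = {
    0: ("acquire", [1]),
    1: ("branch", [2, 3]),
    2: ("release", [3]),
    3: ("acquire", [4]),  # bug: reachable with the lock still held via 1 -> 3
    4: ("release", [5]),
    5: ("halt", []),
}


def check(program, entry=0):
    """Explore every reachable (pc, locked) state.

    Returns the list of pcs leading to a violation, or None if the
    property holds on all paths.
    """
    start = (entry, False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        pc, locked = state
        op, succs = program[pc]
        if (op == "acquire" and locked) or (op == "release" and not locked):
            # Walk parent links back to the entry to build a counterexample.
            path = []
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        if op == "acquire":
            locked = True
        elif op == "release":
            locked = False
        for nxt in succs:
            new_state = (nxt, locked)
            if new_state not in parent:
                parent[new_state] = state
                queue.append(new_state)
    return None


# A repaired variant in which the branch can no longer skip the release.
FIXED = dict(PROGRAM)
FIXED[1] = ("branch", [2])

print(check(PROGRAM))  # a counterexample path of pcs
print(check(FIXED))    # None: the property holds on every path
```

Because the state space of a boolean program is finite, the search always terminates, which is what lets the tool report either a proof or a concrete failing path.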
#11
PL: Cosmic Mayonnaise
• Two favorite areas? No problem!
• Since most of computing involves programs, it’s easy to
form a research project that crosscuts PL and …
– Systems: analyze J2EE ecommerce apps, distributed peer-to-peer
programs, “managed code operating systems”, concurrency, etc.
– Security, Embedded Systems, Games: as before
– Databases: add transactional or ACID semantics to languages, verify
inlined SQL, support persistent objects, use DB techniques on
program traces, safely inject query plans
– Theory: we make heavy use of DFAs (lexing), PDAs (parsing), NFAs
(policies), linear logic (resource mgmt), temporal logic (fairness),
approximation algorithms (e.g., graph-coloring register alloc), …
– Machine Learning: specification mining, profiling, …
– Graphics: analyze the OpenGL or Direct3D aspects of programs,
provide better support for programming on graphics cards, …
– Other: out of space on the slide …
#12
Prerequisites
• You have been pre-approved to do PL research!
“mathematical maturity”
#13
The Breakfast of Champions
• At PL Research, we’ve pretty much got it all:
theory and practice, glitzy killer apps and hardcore fundamental problems. There’s a lot to do,
and that’s why we need people like you.
• Talk to your doctor of philosophy to see whether
PL® is right for you. Side effects were generally mild
and included reliable software, resistance to viruses,
increased hacking opportunities, decreased development
times, disappearing deadlocks and race conditions, ironclad
APIs, firmer theoretical bases, …
#14
Any Questions?
• Also, send me email. Even if you don’t care
about PL (sigh!) I would be happy to give
advice about CS research, industry and grad
school.
#15