Polyglot: An Extensible Compiler Framework for Java

Nathaniel Nystrom, Michael R. Clarkson, and Andrew C. Myers
Presentation by Aaron Kimball & Ben Lerner
Purpose of Polyglot
• Allow easy extensions to the Java language
  - Security
  - Support new language designs
  - Optimization, static analysis
  - Instructional uses
Polyglot Architecture
Polyglot base is a static checker for Java source code
Extensions add AST components and compiler passes
Size of extension code proportional to complexity of changes
Parse extended-language source code and reduce it to a Java AST, which is output as .java files
javac then compiles to final bytecode form
Grammars & Passes
• “PPG” parser generator used; provides grammar inheritance
• Passes perform static analysis, type checking, and compilation steps; they run in a “scheduled” work queue
• AST rewriting is entirely functional
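To illustrate what "functional" rewriting in a pass queue can look like, here is a minimal Java sketch. It is not Polyglot's API: Node, Lit, Add, Pass, ConstantFold, and PassQueue are hypothetical names invented for this example. Each pass returns a new tree and leaves its input untouched, and passes are run in order from a simple queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.UnaryOperator;

// Hypothetical immutable AST: nodes are never mutated, passes build new trees.
abstract class Node {
    // Rebuild this node with each child replaced by f(child).
    abstract Node mapChildren(UnaryOperator<Node> f);
}

final class Lit extends Node {
    final int value;
    Lit(int value) { this.value = value; }
    Node mapChildren(UnaryOperator<Node> f) { return this; } // leaf: nothing to rebuild
    @Override public String toString() { return Integer.toString(value); }
}

final class Add extends Node {
    final Node left, right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    Node mapChildren(UnaryOperator<Node> f) {
        Node l = f.apply(left), r = f.apply(right);
        // Reuse the old node when no child changed.
        return (l == left && r == right) ? this : new Add(l, r);
    }
    @Override public String toString() { return "(" + left + " + " + right + ")"; }
}

// A pass rewrites a node after its children have been rewritten (bottom-up).
interface Pass {
    Node leave(Node n);
    default Node run(Node root) { return leave(root.mapChildren(this::run)); }
}

// Example pass (an assumption for illustration): fold additions of literals.
final class ConstantFold implements Pass {
    public Node leave(Node n) {
        if (n instanceof Add) {
            Add a = (Add) n;
            if (a.left instanceof Lit && a.right instanceof Lit) {
                return new Lit(((Lit) a.left).value + ((Lit) a.right).value);
            }
        }
        return n;
    }
}

// Passes are queued and run in order; each consumes the previous pass's output.
final class PassQueue {
    private final Deque<Pass> queue = new ArrayDeque<>();
    PassQueue add(Pass p) { queue.addLast(p); return this; }
    Node runAll(Node root) {
        Node current = root;
        while (!queue.isEmpty()) current = queue.removeFirst().run(current);
        return current;
    }
}
```

Because nodes are immutable, the input tree can still be inspected after the passes have run, which is the property the slide calls functional rewriting.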
Example: Coffer
tracked(F) class FileReader {
    FileReader(File f) [] -> [F] throws IOException[] { ... }
    int read() [F] -> [F] throws IOException[F] { ... }
    void close() [F] -> [] { ...; free this; }
}
The language adds annotations on methods to enforce linear use of “capability keys”; the “free” statement destroys the capability key for an object
Extensions & Delegates
Simple subclassing does not provide rich enough object extension; code duplication still happens
Extension objects allow additional methods to be attached to a node
Delegate objects allow overriding of existing methods using user-defined dispatch protocols
Goal is “mixin extensibility”
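The extension/delegate idea can be sketched in a few lines of Java. This is an illustration under assumptions, not Polyglot's real classes: AstNode, Ext, Del, DefaultDel, KeyFlowExt, and KeyDel are hypothetical names. The node carries an extension object (new operations) and a delegate object (how existing operations are dispatched), so a language extension does not have to subclass every node class.

```java
// Hypothetical sketch; names are illustrative and not Polyglot's actual API.
interface Ext { }                       // marker for extra operations attached to a node

interface Del {                         // controls dispatch of existing operations
    String typeCheck(AstNode n);
}

class AstNode {
    final String kind;
    final Ext ext;                      // may be null when no extension is attached
    final Del del;
    AstNode(String kind, Ext ext, Del del) {
        this.kind = kind;
        this.ext = ext;
        this.del = del;
    }
    // Base-language behavior.
    String typeCheckImpl() { return kind + ": java rules"; }
    // Callers always go through the delegate, so an extension can intercept.
    String typeCheck() { return del.typeCheck(this); }
}

// Default delegate: forward to the node's own implementation.
final class DefaultDel implements Del {
    public String typeCheck(AstNode n) { return n.typeCheckImpl(); }
}

// Extension object: adds an operation the base node does not have.
final class KeyFlowExt implements Ext {
    String checkKeys(AstNode n) { return n.kind + ": keys checked"; }
}

// Delegate that overrides an existing operation, reusing the base behavior.
final class KeyDel implements Del {
    public String typeCheck(AstNode n) { return n.typeCheckImpl() + " + key rules"; }
}
```

A base-language node would be built with DefaultDel and no extension; an extended-language node swaps in KeyDel and KeyFlowExt without touching AstNode itself.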
Additional Notes & Results
• Separate compilation through serialized class data
• Qualitative measure of “code required” vs. “departure from Java” demonstrates that simple language changes require simple compiler changes
• Many languages (PolyJ, JMatch, others) implemented in Polyglot... including Jif!
Untrusted Hosts and Confidentiality: Secure Program Partitioning
Steve Zdancewic, Lantian Zheng, Nathaniel Nystrom, Andrew C. Myers
Presentation by Aaron Kimball & Ben Lerner
Purpose of Program Partitioning (PpoPP?)
• Run programs securely on trusted hosts
• Hard parts:
  - Not everyone trusts every host
  - Not everyone trusts each other
• How to ensure security is preserved?
Technical Terms
• Principal: a person, machine, or entity
• Authority: a set of principals who can perform some action
• Confidentiality: data isn’t leaked to principals who shouldn’t see it
• Integrity: data isn’t modified by principals who shouldn’t do so
Security Labels
• Labels look like {o1: r1, r2, ..., rn;}
• Data are tagged by labels
• Each owner o specifies a set of allowed readers
• A principal can read data only if all owners permit it
  - More owners → more restrictive policy
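The reading rule above can be made concrete with a small Java model. This is a simplified sketch, not Jif's implementation: the Label class and its methods are hypothetical, and it assumes that an owner may always read its own data and that combining two labels intersects the reader sets of a shared owner.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a confidentiality label: owner -> readers that owner allows.
final class Label {
    private final Map<String, Set<String>> policies = new HashMap<>();

    // Build a label with a single policy {owner: readers}.
    static Label policy(String owner, String... readers) {
        Label l = new Label();
        l.policies.put(owner, new HashSet<>(Arrays.asList(readers)));
        return l;
    }

    // A principal may read only if every owner permits it.
    // Assumption: an owner implicitly permits itself.
    boolean canRead(String principal) {
        for (Map.Entry<String, Set<String>> e : policies.entrySet()) {
            boolean isOwner = e.getKey().equals(principal);
            if (!isOwner && !e.getValue().contains(principal)) return false;
        }
        return true;
    }

    // Combine two labels: all owners of both, so the result is at least as restrictive.
    Label join(Label other) {
        Label out = new Label();
        for (Map.Entry<String, Set<String>> e : policies.entrySet()) {
            out.policies.put(e.getKey(), new HashSet<>(e.getValue()));
        }
        for (Map.Entry<String, Set<String>> e : other.policies.entrySet()) {
            // Same owner in both labels: keep only readers allowed by both policies.
            out.policies.merge(e.getKey(), new HashSet<>(e.getValue()),
                    (mine, theirs) -> { mine.retainAll(theirs); return mine; });
        }
        return out;
    }
}
```

For example, joining {Alice: Bob, Carol} with {Bob: Carol} yields a label that Bob and Carol can read but Alice cannot, because Bob's policy does not list her.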
Confidentiality vs. Integrity
• Integrity constraints look like {?:r}
  - Data has no owner, but is trusted by r
• Confidentiality: owner trusts readers not to do something bad
• Integrity: reader trusts that the owner has not done something bad
Side channels
• if (b_secret) then x = true else x = false
• This leaks the secret data!
• The security label is restricted at every program counter (p.c.)
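The implicit flow in the if statement above can be checked with a program-counter label. The following Java sketch assumes a two-point lattice (LOW below HIGH) rather than the full label model; Level and FlowChecker are hypothetical names. An assignment is accepted only when the join of the p.c. label and the value's label flows to the label of the target.

```java
// Two-point lattice used only for illustration: LOW may flow to HIGH, not the reverse.
enum Level {
    LOW, HIGH;

    // Least upper bound of two levels.
    Level join(Level other) {
        return (this == HIGH || other == HIGH) ? HIGH : LOW;
    }

    // True when information at this level may flow to the target level.
    boolean flowsTo(Level target) {
        return this == LOW || target == HIGH;
    }
}

final class FlowChecker {
    // Is "target = value" allowed inside a branch whose guard has label pc?
    static boolean assignOk(Level pc, Level value, Level target) {
        return pc.join(value).flowsTo(target);
    }
}
```

With b_secret at HIGH, the p.c. label inside both branches is HIGH, so assigning even a constant to a LOW variable x is rejected. That is the leak the slide points out.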
Example: Oblivious transfer
• Variables are only accessible by Alice
• Assignment ok because of authority clause
• Endorse lets Alice blindly trust Bob’s data
• Declassify lets Bob read Alice’s data

public class OTExample {
    int{Alice:; ?:Alice} m1;
    int{Alice:; ?:Alice} m2;
    bool{Alice:; ?:Alice} isAccessed;

    int{Bob:} transfer{?:Alice} (int{Bob:} n)
    where authority(Alice) {
        int tmp1 = m1;
        int tmp2 = m2;
        if (!isAccessed) {
            isAccessed = true;
            if (endorse(n, {?:Alice}) == 1)
                return declassify(tmp1, {Bob:});
            else
                return declassify(tmp2, {Bob:});
        }
        return 0;
    }
}
How to split
• Each machine carries some security label
• All data carries some label
• “Just” split the computations such that
  - Label(f) <= Label(host)
• Host must be at least as confidential as the data, and its integrity label must be at most the data's (lower in the label order means more trusted)
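A placement check along these lines can be sketched in Java. This is a deliberately coarse model, not the partitioning algorithm itself: Host, DataLabel, and Placement are hypothetical names, reader sets are ignored, and trust is reduced to two sets of principals per host. It also assumes the usual reading of the integrity condition: a host may hold data only if every principal who must trust the data also trusts the host.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical host description: who trusts it with secrets, and who trusts its results.
final class Host {
    final String name;
    final Set<String> keepsSecretsFor;  // principals who trust the host with their data
    final Set<String> trustedBy;        // principals who trust data computed on the host
    Host(String name, Set<String> keepsSecretsFor, Set<String> trustedBy) {
        this.name = name;
        this.keepsSecretsFor = keepsSecretsFor;
        this.trustedBy = trustedBy;
    }
}

// Hypothetical data label: owners of confidentiality policies, plus {?:...} trusters.
final class DataLabel {
    final Set<String> owners;
    final Set<String> trustedBy;
    DataLabel(Set<String> owners, Set<String> trustedBy) {
        this.owners = owners;
        this.trustedBy = trustedBy;
    }
}

final class Placement {
    static Set<String> set(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Data may live on a host only if the host is confidential enough for every
    // owner and trusted by every principal that relies on the data's integrity.
    static boolean canHost(DataLabel data, Host host) {
        return host.keepsSecretsFor.containsAll(data.owners)
                && host.trustedBy.containsAll(data.trustedBy);
    }
}
```

Under this model, a field labeled {Alice:; ?:Alice} fits on a host that Alice trusts for both confidentiality and integrity, and is rejected on a host that only Bob trusts.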
Interesting bits: ICS
• Within a single host, security is easy to check
• Integrity Control Stack: security across hosts
  - Deep stack → many data dependencies
  - → lower data integrity
  - → less trusted data
• Stack policies enforced with nonce capabilities
Interesting bits: Label inference
• Type checking extended to infer labels for all data and constraints for all flows
• …where Polyglot is useful :)
Future work
• Adding more interesting security relations
  - “Alice actsfor Bob”
  - Dynamically generated labels – hard to split!
  - …label polymorphism