Frameworks for Research in Code Generation and Execution Mark Lewin Program Manager

Frameworks for Research in
Code Generation and Execution
Mark Lewin
Program Manager
External Research & Programs
Microsoft Research
Agenda
Phoenix
SSCLI – The Shared Source Common
Language Infrastructure
Operating Systems
PHOENIX
What Is Phoenix?
Phoenix is a framework providing a
unique and rich infrastructure for writing
compilers, development & analysis tools,
and plug-ins
Foundation of next generation of Microsoft
code generators, optimizers and analysis
tools
Platform for third-party plug-ins
Platform for research and teaching:
Phoenix Academic Program
Phoenix Goals
Provide industry leading compilation and
tools infrastructure: “VC++ and .NET
compilers and tools”
Build research/development community
around infrastructure: “the Phoenix
Platform”
Make the infrastructure scalable,
configurable, and extensible: “JIT to WPO,
compilation and analysis”
Make infrastructure quick to retarget and
rehost
Key Features
Written in C++, usable by any .NET language
Dual-Mode: entire platform compiles to run native or on
top of .NET
Phase & Plug-in model for third party extensions to:
VC++ Compiler, Binary reader/writer, Analysis Tools, …
Support for Multi-threaded clients
Support for Code and Data extensibility
A single, strongly typed, explicit dataflow/control flow IR
used throughout framework
IR & Type system capable of processing native and/or
managed code
Strong inter-phase consistency checking
Many diverse compilers and tools reuse the common
core
Compilation and analysis framework for 10+ years
[Diagram: Phoenix Building Blocks]
.NET CodeGen: JITs, PreJITs, OO and .NET optimizations
MSR Adv. Lang: language research, direct transfer to Phoenix, research insulated from code generation
Native CodeGen: advanced C++/OO optimizations, FP optimizations, OpenMP
MSR Tools: built on Phoenix APIs, both HL and LL APIs, managed APIs, program analysis
Retargetable: "machine models", ~3 months: -Od, ~3 months: -O2
Chip Vendor DevKit: ~6 month ports, sample port + docs, key ports (ARM) done at Microsoft
Academic DevKit: full sources, managed APIs, IP as DLLs, docs
[Diagram: Phoenix architecture]
Tools: Browser, Visualizer, Lint, Formatter, Obfuscator, Refactor, Xlator, Profiler, Security Checker
Compilers: HL Opts, LL Opts, Code Gen
Phx APIs over the Phoenix Core: IR, Syms, Types, CFG, SSA
Inputs: .NET assembly (C#, VB, C++, Delphi, Cobol), native image, C++ IR, C++ AST (PREfast), Phx AST (Lex/Yacc, Eiffel, Tiger), profile
Front-end: language-specific
Source text (for (row = 0; row < … sum += a[i]; }) → Scan (tokens: for, (, row, =, 0, …) → Parse (AST) → CSA (decorated AST)
Middle-end: optimizations
Linear IR: CSE, Const Prop, Dead Code, OSR, Alg Idents, Inline, Loop Unroll, Hoist Invars
Back-end: codegen
Instr Selection, RegAlloc, Schedule, Assem, Image
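To make one of the middle-end passes named above concrete, here is a minimal sketch of constant propagation with folding over straight-line three-address code. The `Instr` layout, the opcode names and `MAX_REGS` are assumptions made up for this illustration; they are not Phoenix's IR or API, and a real pass would also handle control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REGS 16              /* hypothetical register-file size */

typedef enum { OP_CONST, OP_ADD, OP_MUL } Op;

/* Hypothetical three-address instruction; not Phoenix's IR. */
typedef struct {
    Op  op;
    int dst;       /* destination register */
    int a, b;      /* source registers, used by OP_ADD and OP_MUL */
    int value;     /* immediate, used by OP_CONST */
} Instr;

/* Single forward pass over straight-line code. An ADD or MUL whose two
   sources are both known constants is rewritten in place to OP_CONST.
   Returns the number of instructions folded. */
int fold_constants(Instr *code, size_t n)
{
    bool known[MAX_REGS] = { false };
    int  val[MAX_REGS]   = { 0 };
    int  folded = 0;

    for (size_t i = 0; i < n; i++) {
        Instr *in = &code[i];
        assert(in->dst >= 0 && in->dst < MAX_REGS);

        if (in->op != OP_CONST) {
            assert(in->a >= 0 && in->a < MAX_REGS);
            assert(in->b >= 0 && in->b < MAX_REGS);
            if (known[in->a] && known[in->b]) {
                in->value = (in->op == OP_ADD) ? val[in->a] + val[in->b]
                                               : val[in->a] * val[in->b];
                in->op = OP_CONST;
                folded++;
            }
        }

        /* Record what is now known about the destination register. */
        known[in->dst] = (in->op == OP_CONST);
        if (known[in->dst])
            val[in->dst] = in->value;
    }
    return folded;
}
```

Calling `fold_constants` on a sequence such as r0 = 2; r1 = 3; r2 = r0 + r1; r3 = r2 * r4 rewrites the third instruction to a constant and leaves the fourth alone, because r4 is never defined as a constant.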
Status
Phoenix VC++ compiler backend building/running
Windows XP (40+ million lines of code)
Phoenix .NET JIT compiler passing 95+% of production
JIT compiler tests
Phoenix static analysis being incorporated into current
Enterprise Development tools, some internal tools have
already been deployed
600+ RDKs being used by researchers on a wide range
of projects, active community feedback
Working on improving optimizations to surpass current
VC++ compiler performance
Working on productization
Phoenix Academic Program
Phoenix RDK
Conventional EULA for non-commercial use.
Free to use for research and education
Free download.
Depends on Visual Studio 2005 technology.
Support forum for faculty and teaching
assistants.
Latest version, May 2006 RDK, in your
conference materials.
Research Support
17 Research projects funded through Rotor
research grants in 2006, 12 funded in 2005, 5
funded in 2004
Many Phoenix RFP project PI’s here at the
Summit – ask them about their experiences.
Academic Users of Phoenix
Constructing Compact Debugging Traces with Binary Code
Analysis and Instrumentation
Phoenix-Based Compiler Course Development
Compiler Backend Experimentation and Extensibility Using Phoenix
Adaptive Inline Substitution in Phoenix
Domain-Specific Language for Efficient Design-Rule Checking
Setpoint: An Aspect Oriented Framework Based on Semantic
Pointcuts
Phase Aware Profiling with Phoenix
Using Call Graph Analyses to Guide Selective Specialization in
Phoenix
Program Visualization with Fulcra and Phoenix
Navel: Automating Software Support Using Traces of Software
Behavior
Techniques and Tools for Software Assurance
Type-Checking the Intermediate Languages in the Phoenix JIT
Compiler
Introducing IMPACT Parallelism
Discovery and Visualization
Capabilities into Phoenix
Wen-mei Hwu
Sanders-AMD Endowed Chair
University of Illinois, Urbana-Champaign
Fulcra Integration
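[Diagram: two compilation pipelines joined by a "Phoenix to IMPACT" bridge]
Phoenix: HIR → MIR → LIR → Executable, with phases Type Check, SSA, Addr Mode, Scheduling
IMPACT: C Source Code → Pcode (AST) → Lcode (3 addr) → Executable, with phases Profile, Inline, Opti, Scheduling
Fulcra: original version works on Pcode, ported version works on Lcode; results go to the Visualizer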
Pointer analysis performed in Pcode is based on source code
semantics. The result of the analysis is annotated into Pcode
Expressions which are translated to Lcode Operations during
PtoL.
We have ported our pointer analysis to work on our lower level
IR (Lcode).
Bridge from Phoenix to Lcode feeds MIR into Fulcra
Pointer analysis information is directly annotated to Lcode
operations for later visualization.
Pcode vs. Lcode
Interface between Lcode and Fulcra is adjusted, so that the
initial constraint graph containing intra-procedural pointer
information is correctly generated based on Lcode IR.
Building initial constraint graph from Pcode requires shrinking
complex expressions to a sequence of simpler expressions.
FSM required to keep track of pointer assignment status of a
Pcode expression is complicated.
In Lcode each operation is itself a very simple expression.
Only Lcode operations that can change the pointer
assignments are important.
ld, st, mov, add, jsr and ret
Having type information annotated from Pcode into operands
of each Lcode operation helps do a more efficient analysis.
Only Lcode operations that include operands of pointer type
are considered for building initial constraint graph.
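The filtering rule described on this slide can be sketched as follows: an operation contributes to the initial constraint graph only if its opcode is one of the six listed and at least one operand is pointer-typed. The `LOperation` structure, the opcode enum and the function names are hypothetical stand-ins for this illustration, not IMPACT's or Fulcra's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set: the six named on the slide plus two others. */
typedef enum { LOP_LD, LOP_ST, LOP_MOV, LOP_ADD, LOP_JSR, LOP_RET,
               LOP_MUL, LOP_CMP } Opcode;

#define MAX_OPERANDS 3

typedef struct {
    Opcode opc;
    size_t num_operands;
    bool   is_pointer[MAX_OPERANDS];   /* type annotation carried down from Pcode */
} LOperation;

/* True for the opcodes that can change pointer assignments. */
static bool may_change_pointers(Opcode opc)
{
    switch (opc) {
    case LOP_LD: case LOP_ST: case LOP_MOV:
    case LOP_ADD: case LOP_JSR: case LOP_RET:
        return true;
    default:
        return false;
    }
}

/* An operation is considered only if its opcode qualifies and at least
   one of its operands is annotated as a pointer. */
bool feeds_constraint_graph(const LOperation *op)
{
    assert(op != NULL && op->num_operands <= MAX_OPERANDS);
    if (!may_change_pointers(op->opc))
        return false;
    for (size_t i = 0; i < op->num_operands; i++)
        if (op->is_pointer[i])
            return true;
    return false;
}

/* Counts how many operations in a sequence would be visited. */
size_t count_constraint_ops(const LOperation *ops, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (feeds_constraint_graph(&ops[i]))
            count++;
    return count;
}
```

In this sketch an integer `add` and a pointer-typed `mul` are both skipped, so the graph builder only walks the operations that can actually create or copy pointer values.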
Code Example: Image Interpolation
void InterpolateImage (Frame * small, Frame * large)
{
short *short_row, *large_row;
unsigned int i, j;
int height, width;
for (j = 0; j < small->height - 1; j++)
{
short_row = small->image + j * small->width;
large_row = large->image + 2 * j * small->width;
for (i = 0; i < small->width - 1; i++)
{
large_row[2*i] = short_row[i];
large_row[2*i+1] = ((short_row[i] +
short_row[i+2]) >> 1);
large_row[2*i+large->width] =
((short_row[i] + short_row[i+small->width])
>> 1);
large_row[2*i+large->width+1] =
((short_row[i] + short_row[i+small->width+1])
>> 1);
}
}
}
Constraint Graph in Pcode
Intra-procedural constraint graph
Fully solved constraint graph
The FSM that derives intra-procedural pointer relationships
in Pcode can lead to a more conservative initial graph.
Constraint Graph in Lcode
Intra-procedural constraint graph
Fully solved constraint graph
The simpler FSM that derives intra-procedural pointer
relationships in Lcode provides more accurate results.
Code Example: Image Interpolation
void InterpolateImage (Frame * small, Frame * large)
{
short *short_row, *large_row;
unsigned int i, j;
int height, width;
for (j = 0; j < small->height - 1; j++)
{
short_row = small->image + j * small->width;
large_row = large->image + 2 * j * small->width;
for (i = 0; i < small->width - 1; i++)
{
large_row[2*i] = short_row[i];
large_row[2*i+1] = ((short_row[i] +
short_row[i+2]) >> 1);
large_row[2*i+large->width] =
((short_row[i] + short_row[i+small->width])
>> 1);
large_row[2*i+large->width+1] =
((short_row[i] + short_row[i+small->width+1])
>> 1);
}
}
}
Annotation: If these pointers refer to different objects, there are not many false dependences left.
Annotation: Remaining dependences should be resolved through array disambiguation.
Visualizer Mockup
(a) Conceptual access pattern of stores, by iteration (1, 2, …), across rows of width large->width
(b) Actual access pattern (flattened array): large->width is 2 * small->width and stores do not overlap due to loop bounds
(c) Compiler view: because large->width is an unconstrained value, stores may overlap (POTENTIAL CONFLICT); loop iterations will not be executed in parallel
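The difference between views (b) and (c) can be checked by brute force. The sketch below enumerates the four store offsets that each inner-loop iteration of InterpolateImage writes (relative to large_row) and reports whether any two different iterations write the same element. The function names are invented for this illustration, and it considers only store-to-store overlap, which is what the mockup depicts.

```c
#include <assert.h>
#include <stdbool.h>

/* Offsets, relative to large_row, stored by inner-loop iteration i. */
static void store_offsets(unsigned i, unsigned large_width, unsigned out[4])
{
    out[0] = 2 * i;
    out[1] = 2 * i + 1;
    out[2] = 2 * i + large_width;
    out[3] = 2 * i + large_width + 1;
}

/* Returns true if two different iterations of the inner loop
   (0 <= i < small_width - 1) store to the same offset. */
bool iterations_conflict(unsigned small_width, unsigned large_width)
{
    assert(small_width >= 1);   /* keeps the loop bound well defined */
    for (unsigned i = 0; i + 1 < small_width; i++) {
        unsigned a[4];
        store_offsets(i, large_width, a);
        for (unsigned k = i + 1; k + 1 < small_width; k++) {
            unsigned b[4];
            store_offsets(k, large_width, b);
            for (int x = 0; x < 4; x++)
                for (int y = 0; y < 4; y++)
                    if (a[x] == b[y])
                        return true;
        }
    }
    return false;
}
```

With `small_width = 8`, passing `large_width = 16` (that is, 2 * small_width) finds no conflict, while an arbitrary value such as `large_width = 2` does. A compiler that cannot prove the relationship between the two widths has to assume the second case.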
Phoenix RDK Difficulties
Memory management can be tricky when connecting
managed Phoenix plugin to legacy C research infrastructure
Handoff of strings from Phoenix to IMPACT must be handled carefully
Updating between RDK releases
Feb 2005Nov 2005: memory management, plugin interface
changes
Nov 2005May 2006: minor changes to available functions
Documentation does not seem to have kept up with updates
Type information available to Phoenix phase is not complete
Structure types
Void pointers
Phoenix RDK Positives
Windows POSIX API makes it simple to migrate
Unix-based research infrastructure
Minor changes to a few function names
MSDN docs contained all necessary information to
do the port
Less than 50 lines (out of ~200,000) needed to be
updated
Phoenix documentation and sample code are
quite helpful
Phoenix Academic Forum is very responsive
Phoenix -- Downloads and Info:
http://research.microsoft.com/phoenix
SSCLI (Rotor)
What is SSCLI?
SSCLI is a shared source implementation of CORE
TECHNOLOGIES that underlie Microsoft’s .NET
architecture.
SSCLI is current with advances in the commercial
Common Language Runtime.
SSCLI is designed and documented for ACADEMIC
RESEARCH and TEACHING.
SSCLI Motivations & Milestones
Motivations:
Support ECMA/ISO standards efforts
Validate OS platform neutrality of .NET
Create an open laboratory for academic research
Track evolution of commercial CLR
Share!
Milestones:
SSCLI 1.0 released at OOPSLA 2002
Over 250,000 downloads through June, 2006
SSCLI 2.0 released March 2006!
Managed Code Execution
DEPLOYMENT
DEVELOPMENT
Source code
public static void Main(String[] args )
{ String usr; FileStream f; StreamWriter w;
try {
usr=Environment.GetEnvironmentVariable("USERNAME");
f=new FileStream("C:\\test.txt",FileMode.Create);
w=new StreamWriter(f);
w.WriteLine(usr);
w.Close();
} catch (Exception e){
Console.WriteLine("Exception:"+e.ToString());
}
}
[Diagram labels: Compiler; Assembly (PE header + MSIL + Metadata + EH Table); Assembly Loader; Evidence; Host; Policy Manager]
Policy
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<mscorlib>
<security>
<policy>
<PolicyLevel version="1">
<CodeGroup class="UnionCodeGroup"
version="1"
PermissionSetName="Nothing"
Name="All_Code"
Description="Code group grants no permissions and forms the root of the code group tree.">
<IMembershipCondition class="AllMembershipCondition"
version="1"/>
<CodeGroup class="UnionCodeGroup"
version="1"
PermissionSetName="FullTrust"
[Diagram labels: Granted permissions; GAC, app. directory, download cache; Permission request (class) (method); Class Loader; NGEN]
EXECUTION
[Diagram labels: Assembly info; Module + Class list; PEVerify; Vtable + Class info; JIT + verification; Native code + GC table; Class init; CLR Services (GC, Exception, Security)]
The Shared Source CLI (SSCLI)
[Diagram: the .NET stack, showing where the SSCLI fits]
VS.NET: C#, JScript, VB, VC/MC++, Debugger, Designers
SDK Tools: SN, ILDAsm, ILAsm, ILDbDump, CorDBG, MetaInfo, PEVerify
System.Web (ASP.NET): UI, SessionState, HtmlControls, Caching, Security, WebControls, Configuration; Simple Web Services (Protocols, Discovery, Description)
System.WinForms: Design, ComponentModel
System.Data (ADO.NET): ADO, SQL, Design, Adapters
System.Drawing: Drawing2D, Printing, Imaging, Text
System.Xml: XSLT, Serialization, XPath
System: Collections, IO, Security, Configuration, Net, ServiceProcess, Diagnostics, Reflection, Text, Remoting, Globalization, Resources, Threading, Serialization, Runtime, InteropServices
Common Language Runtime: GC, App Domain Loader, JIT, MSIL, Common Type System, Class Loader
Platform Adaptation Layer: Boot Loader, Threads, Sync, Timers, Networking, Filesystem
ECMA CLI
SSCLI: For Teaching and Research
Complete example to enable research in and support teaching of
modern programming languages, compiler design, and runtime
infrastructure.
SSCLI supports the ECMA standardization process with a real
implementation
Commercial grade code (but documented for academia)
SSCLI license allows for “safe” examination of code
For compiler writers who want to target CLI:
JScript compiler shows dynamic techniques (in C#)
C# compiler shows nearly all runtime features
IL Assembler demonstrates low-level API implementation and
use
How SSCLI Is Organized
Four major areas in source code
1. Runtime “execution engine”
2. Frameworks
3. Compilers and tools
4. Portability layer, tests, and build infrastructure
Other important points of interest
License
Documentation
Samples
Research Support
40 Research projects funded through Rotor research
grants in 2002, 40 funded in 2004
SSCLI RFP Capstone Workshop II held Fall 2005
Researchers from 18 countries
27 research and teaching projects presented
http://research.microsoft.com/workshops/SSCLI2005/
SSCLI RFP Capstone Workshop I held Fall 2003
Researchers from over 20 countries
16 refereed papers presented
IEE Journal: special Rotor research issue
Contracts for CLI with Rotor
Nam Tran
Program Manager
Phoenix
Microsoft Corporation
Research Interests
Programming languages and tools
Project on component contracts
At Monash University, Melbourne, Australia
Design-by-Contract & interaction constraints
Apply to binary components
Neutral of source languages and notations
Proof-of-concept system on the CLI
Extended the CLI platform
Implemented by extending Rotor
The Model
Contracts include
Preconditions, postconditions, invariants
Protocols, mandatory calls, forbidden calls
Binary components include
Contract representation as first class entities
Sufficient for run time monitoring
Execution environment
Aware of contracts
Provide built-in run time monitoring
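As a rough illustration of what run-time monitoring of preconditions, postconditions and invariants amounts to, the sketch below writes the checks by hand around a single operation. In the model on this slide the contract travels with the binary component and the execution environment performs the checks; the `Account` type, the result codes and `monitored_withdraw` are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { int balance; } Account;   /* hypothetical component state */

typedef enum {
    CONTRACT_OK,
    PRECONDITION_FAILED,
    POSTCONDITION_FAILED,
    INVARIANT_FAILED
} ContractResult;

/* Invariant: the balance never goes negative. */
static bool invariant(const Account *acct)
{
    return acct->balance >= 0;
}

/* Runs the withdraw operation under its contract:
   precondition  0 < amount <= balance,
   postcondition balance == old balance - amount,
   invariant     balance >= 0. */
ContractResult monitored_withdraw(Account *acct, int amount)
{
    assert(acct != NULL);

    if (amount <= 0 || amount > acct->balance)
        return PRECONDITION_FAILED;        /* the body is not run */

    int old_balance = acct->balance;
    acct->balance -= amount;               /* the component's own code */

    if (acct->balance != old_balance - amount)
        return POSTCONDITION_FAILED;
    if (!invariant(acct))
        return INVARIANT_FAILED;
    return CONTRACT_OK;
}
```

A caller that withdraws 30 from a balance of 100 gets `CONTRACT_OK`; asking for 500 is rejected with `PRECONDITION_FAILED` and the balance is left untouched.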
The Implementation
Extend CLI specification
New metadata table to represent contracts
Auxiliary methods to evaluate assertions
Extended ilasm for source-neutral contracts
Execution Engine provides monitoring
Extend Rotor to implement extended CLI
Dynamic instrumentation of IL before JIT
Built-in capability to monitor
Protocols as state machines
Mandatory/forbidden calls by walking call stacks
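Monitoring a protocol as a state machine can be sketched in a few lines. The example protocol (open, then any number of reads, then close), the enum names and `trace_conforms` are assumptions chosen for illustration; the slides do not show the actual representation used in the extended Rotor, and the call-stack checks for mandatory and forbidden calls are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { CALL_OPEN, CALL_READ, CALL_CLOSE } Call;
typedef enum { ST_CLOSED, ST_OPEN, ST_VIOLATED } State;

/* Transition function of the hypothetical protocol
   open -> read* -> close. Any other call order is a violation. */
State protocol_step(State s, Call c)
{
    switch (s) {
    case ST_CLOSED:
        return (c == CALL_OPEN) ? ST_OPEN : ST_VIOLATED;
    case ST_OPEN:
        if (c == CALL_READ)  return ST_OPEN;
        if (c == CALL_CLOSE) return ST_CLOSED;
        return ST_VIOLATED;
    default:
        return ST_VIOLATED;
    }
}

/* Replays a trace of calls through the state machine. The trace
   conforms if no step violates the protocol and the component ends
   in the closed state (every open was eventually closed). */
bool trace_conforms(const Call *trace, size_t n)
{
    assert(trace != NULL || n == 0);
    State s = ST_CLOSED;
    for (size_t i = 0; i < n; i++) {
        s = protocol_step(s, trace[i]);
        if (s == ST_VIOLATED)
            return false;
    }
    return s == ST_CLOSED;
}
```

A monitor built this way would call `protocol_step` at each instrumented call site; `trace_conforms` simply replays a recorded sequence, which is convenient for testing the transition table.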
CLI/Rotor Benefits
Open, extensible component platform
Components with contracts runnable on .NET
Full implementation with source
A great research platform
Only need to implement innovative features
E.g. fewer than 10K lines of code for contract
Thank you!
Books
Shared Source CLI Essentials – Dave Stutz,
Geoff Shilling, Ted Neward; O’Reilly (2003)
Compiling for the .NET Common Language
Runtime (CLR) – John Gough; Prentice Hall
(2002)
Inside Microsoft .NET IL Assembler – Serge
Lidin; Microsoft Press (2002, 2006)
The Annotated CLI Standard – Jim Miller,
Susann Ragsdale; Addison Wesley (2003)
Distributed Virtual Machines: Inside the Rotor
CLI – Gary Nutt; Addison Wesley (2005)
SSCLI -- Downloads and Info:
http://research.microsoft.com/sscli
Operating Systems
Windows Academic Program
Windows CE
Singularity
Windows Academic Program
CRK: Windows Operating System Internals Curriculum Resource Kit - presentation slides, experiments, labs, quizzes and assignments for introducing case studies from the Windows kernel into operating system courses.
WRK: Windows Research Kernel - the core kernel sources and binaries integrated with an environment for building and testing experimental versions of the Windows kernel for use in teaching and research.
ProjectOZ - an operating systems project environment that uses the native kernel interfaces of Windows to provide simple, clean, user-mode abstractions of the CPU, MMU, trap mechanism, and physical memory that can be used to perform experiments in operating systems principles.
WRK (Windows Research Kernel)
Source from the latest shipping Windows (NTOS) kernel
Version – Windows Server 2003 (x86/x64) and Windows
XP x64
Included sources – most everything in NTOS - processes,
threads, LPC, VM, scheduler, object manager, I/O
manager, synchronization, worker threads, kernel memory
manager, …
Excluded sources – plug-and-play, power-management,
and specialized code such as the driver verifier, splash
screen, branding, timebomb, etc.
Build environment – makefile-based with object library for
the excluded sources. Kernels boot on native hardware or
using VirtualPC.
Windows CE
For mobile and embedded devices
All the goodies: Kernel Library, File
System, Device Manager, Storage
Manager, HTTP Web Server, Explorer
Shell, SOAP Implementations, UPnP AV
toolkit, Infrared Data Association,
Microsoft Message Queuing, C run-time,
Binary Rom Image file system, Windows
Sockets Interface, Point to Point Protocol
Singularity
A multidisciplinary research OS,
Languages, and Tools project from MSR
Key approaches:
Pervasive use of safe and analyzable
programming languages
Improve system resilience despite software
errors
Design for verifiability
Microkernel architecture
Written in Spec# (talk later today)
Not a Windows/CLR replacement
Windows Academic Program
(Probert/Retik Brownbag at noon today!)
http://www.microsoft.com/resources/sharedsource/Licensing/WindowsAcademic.mspx
Windows CE
http://www.microsoft.com/resources/sharedsource/Licensing/WindowsCE_Academic.mspx
Singularity
(Larus/Hunt talk at 2:45 today!)
http://research.microsoft.com/os/singularity/
Thank you!
© 2006 Microsoft Corporation. All rights reserved.
Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation.
Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft,
and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.