The University of Texas at Dallas

Systems Group
Department of Computer Science
Erik Jonsson School of Engineering
and Computer Science
The University of Texas at Dallas
April 23, 2007
Information about the Group
• Over 10 members
• Members of editorial boards of IEEE and ACM Transactions
• Advisory boards (e.g., Purdue University CS Department)
• Funding from NSF (including CAREER awards), AFOSR, ARO, DoD, NASA and corporations
• PhDs from prestigious universities including Cornell, Princeton, USC, Purdue, UNC
• IEEE/AAAS Fellows, Senior Members, Awards
• Keynote addresses at major conferences (e.g., ACM SACMAT 04, PAKDD06, IEEE Policy 07)
• Collaboration with leading researchers
– Purdue, UMBC, U of VA, GMU, UIUC, U of MN, GATech, etc.
Technology Themes
• Our research focuses on core systems areas such as
– Embedded Systems, Distributed Systems and Networks, Data
Management Systems
• We are also conducting extensive research in systems
applications including
– Data Mining, Visualization, Graphics, Bioinformatics, Multimedia
and Animation, Geospatial information management, and Wireless
Computing
• Security cuts across all areas
– Data and applications security, Network security, Data Mining for
Security Applications, Privacy, Secure languages, Embedded
systems security, Secure data grid
Vision of the Systems Group
• Five Pronged Approach to R&D in Systems and
Applications
– 1. Basic research in systems ranging from complexity results to
systems design
• Funding from NSF, AFOSR, ARO, etc.
– 2. Applied research: Large scale design and implementation
projects (Alcatel, Raytheon, Nokia, Rockwell, etc.)
– 3. Technology Transfer: work with corporations such as Raytheon
to transfer the research to Operational programs
– 4. Standards – work with organizations such as OGC, W3C to
transfer research to standards
– 5. Commercialization: Work with Office of Sponsored Research to
commercialize our tools (e.g., Data Mining for security)
Embedded Systems & Security
Edwin Sha
Billions of units produced yearly, versus millions
of desktop units
Application Specific: more parallel,
heterogeneous, networked
Tightly-constrained: low cost, low power, small
memory
Real-time & Secure
Need both hardware & software: need design
automation and optimization: compiler, OS,
hardware
Timing & Memory Optimization
Timing optimization for loops:
Develop retiming and multi-dimensional (MD) retiming so that
all the instructions in a loop nest can be executed in parallel.
Hiding memory latency:
The CPU is fast; memory is slow. Prefetch data
before it is required, and combine prefetching with
partitioning and iterational retiming to completely
hide memory latencies.
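The retiming idea can be illustrated on a toy loop. This is a minimal sketch, not the group's tool: the loop body, the statement names S1/S2, and the function names are hypothetical. Shifting S1 one iteration ahead removes the dependency inside the loop body, so the two remaining statements could be issued in parallel, at the cost of a prologue and an epilogue.

```python
def original_loop(x):
    """Each iteration runs S1 and then S2, and S2 depends on S1."""
    n = len(x)
    a = [0] * n
    b = [0] * n
    for i in range(n):
        a[i] = x[i] * 2        # S1
        b[i] = a[i] + 1        # S2 reads the value S1 just produced
    return b


def retimed_loop(x):
    """S1 is moved one iteration ahead, so the loop body has no internal dependency."""
    n = len(x)
    if n == 0:
        return []
    a = [0] * n
    b = [0] * n
    a[0] = x[0] * 2                # prologue: first instance of S1
    for i in range(n - 1):
        # The two statements below touch different elements and are independent.
        b[i] = a[i] + 1            # S2 for iteration i
        a[i + 1] = x[i + 1] * 2    # S1 for iteration i + 1
    b[n - 1] = a[n - 1] + 1        # epilogue: last instance of S2
    return b
```

Both versions compute the same result; only the schedule of the statements changes.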
Timing: all the instructions in a loop nest can
be executed in parallel
Power: switching activity is reduced by 42.8%
Program size: code-size reduction technique
reaches 50% reduction
Security: Hardware/Software Defender protects
systems from buffer-overflow attacks
http://www.utdallas.edu/~edsha
HW/SW for Security
Protection from buffer-overflow attacks
Problems: protection capability, overhead
Solution: Hardware/Software Defender
(HSDefender).
Intrusion Detection for known worms &
viruses
Problems: performance
Solution: very high-performance specialized
parallel architectures.
Visual Languages and Communications
Kang Zhang
Objectives
• Build a Theoretical Foundation for Visual
Specification and Reasoning
• Apply Visual Techniques to Data Engineering
• Enhance Information Access on Mobile Devices
• Promote Aesthetic Aspects of Visualization for High Usability
• Funding: NSF ITR: 216K + proposal submitted; Scholarship grants: NSF CSEMS, DoEdu GAANN
Scientific/Technical Approaches
• Develop a spatial graph grammar formalism with efficient parsing
• Build a graph induction engine
• Add semantics to UML diagrams
• Design intuitive and effective graph visualization and navigation algorithms (e.g. graph labeling, mobile browsing)
• Learn from visual arts and design for aesthetic information visualization and user-interfaces
[Figure: research areas and applications. Labels: Visual Languages (Graph Grammars), Information Visualization, Round-Trip Visual Engineering, Visual Arts & Design, Mobile Display, Model-Driven Engineering, Applications, Multimedia Authoring, Data Interoperation]
Accomplishments
• Proposed a context-sensitive graph grammar formalism with polynomial parsing speed
• Applied graphical specification and reasoning to various application domains
• Developed a visual data clustering and noise removal system
Challenges
• Measurement/evaluation of aesthetics and visual effectiveness; Usability; Scalability
Next Generation Prolog Systems
Gopal Gupta
Objectives
• Develop the next generation of Prolog systems that integrate various recent advances:
– Finite Domain Constraints
– Tabled Logic Programming
– Coinductive Logic Programming
– Answer Set Programming (ASP)
– Deterministic coroutining
– Parallelism (via Multicores)
Rationale
• Research in logic programming is driven by the quest to find the optimal computation rule
– select clauses in optimal order
– select goals in optimal order
• Tabling/Parallelism allows optimal clause order
• Det. Coroutining/constraints allow optimal goal order
• Coinductive LP/ASP adds further power
Approach
• Develop simple-to-implement approaches
(otherwise the implementation becomes too complex).
• Use an existing Prolog engine (GNU Prolog)
• Exploit parallelism on multicore machines
Applications
• Model checking and verification
• Non-monotonic reasoning
• Semantic web reasoning engines
Accomplishments
• Developed coinductive logic programming and efficient ways to implement it.
• Developed scalable, easy-to-realize parallel implementation on Beowulf architectures.
• Developed easy-to-realize implementation for tabled logic programming
• Developed methods for goal-directed execution of answer set programs (non-monotonic reasoning).
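The effect of tabling can be sketched outside Prolog. The example below is a simplified illustration in Python, not the group's implementation: it evaluates a reachability predicate over a cyclic graph by growing an answer table until no new answers appear. An untabled left-recursive Prolog definition of the same predicate would loop forever on such a graph. The function name `tabled_reach` and the edge data are hypothetical.

```python
def tabled_reach(edges, start):
    """Return all nodes reachable from start, iterating until the answer table is stable.

    Mirrors the rules:
        reach(S, Y) :- edge(S, Y).
        reach(S, Y) :- reach(S, X), edge(X, Y).
    """
    table = set()          # answers recorded so far for reach(start, _)
    changed = True
    while changed:         # stop when a full pass adds no new answer
        changed = False
        for u, v in edges:
            if (u == start or u in table) and v not in table:
                table.add(v)
                changed = True
    return table


# A graph with a cycle a -> b -> c -> a, plus an unrelated edge d -> e.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
print(sorted(tabled_reach(edges, "a")))
```

Because every answer is recorded once in the table, the cycle is visited only a bounded number of times and the evaluation terminates.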
Assured Information Sharing
Bhavani Thuraisingham, Latifur Khan, Murat Kantarcioglu
Objectives
• Develop a Framework for Secure and Timely
Data Sharing across Infospheres.
• Investigate Access Control and Usage Control
policies for Secure Data Sharing.
• Develop innovative techniques for extracting information from trustworthy, semi-trustworthy and untrustworthy partners.
• Funding: AFOSR: 306K + 120K + proposal submitted; Matching funds from dean
Scientific/Technical Approach
• Conduct experiments as to how much information is lost as a result of enforcing security policies in the case of trustworthy partners
• Develop more sophisticated policies based on role-based and usage-control-based access control models
• Develop techniques based on game theoretical strategies to handle partners who are semi-trustworthy
• Develop data mining techniques to carry out defensive and offensive information operations
[Figure: Agencies A, B, and C each hold component data/policy and publish data/policy to the data/policy for the coalition]
Accomplishments
• Developed an experimental system for determining information loss due to security policy enforcement
• Developed a strategy for applying game theory for semi-trustworthy partners; simulation results
• Developed data mining techniques for conducting defensive operations for untrustworthy partners
Challenges
• Handling dynamically changing trust levels; Scalability
Malicious Code Detection using Data Mining
Latifur Khan and Bhavani Thuraisingham
Objectives
• Develop a framework for malicious code detection
• Overcome the shortcomings of traditional signature-based approaches, which are not effective against “zero day” attacks
• The proposed innovative framework will be deployed in untrustworthy partners
• Funding: AFOSR: 306K + proposal submitted; Matching funds from dean
Scientific/Technical Approach
Develop a hybrid data mining approach to
detect malicious executables. Important
features of malicious and benign executables
are identified and used to train classifiers.
Three sets of features are extracted: binary
features are extracted from the binary
executables; assembly features are extracted
from disassembled executables; function call
features are extracted from program headers.
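One common way to obtain binary features from an executable is to count byte n-grams. The slides do not state which binary features are used, so the sketch below is an assumption for illustration: the n-gram length, the function names, and the tiny vocabulary are all hypothetical.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every run of n consecutive bytes in the executable image."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def feature_vector(data: bytes, vocabulary, n: int = 4):
    """Binary presence/absence vector over a fixed list of selected n-grams."""
    grams = byte_ngrams(data, n)
    return [1 if gram in grams else 0 for gram in vocabulary]


# Hypothetical 2-gram vocabulary chosen during feature selection.
vocabulary = [b"\x90\x90", b"\xcc\xcc"]
sample = b"\x90\x90\x90\xc3"
print(feature_vector(sample, vocabulary, n=2))
```

The resulting vectors, together with assembly and function call features, could then be fed to any standard classifier.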
Accomplishments
• Developed a tool that can detect malicious
executables in near real time.
Future Work
• Detect malicious executables in real time with a very
low false alarm rate
• Extend this work to detect buffer overflow by
discriminating messages containing code (i.e.,
attack message) from messages containing no code
(i.e., non attack message)
Geospatial Information Management for National Security
Latifur Khan and Bhavani Thuraisingham
Objectives
• Develop a framework for geospatial data integration that incorporates geospatial data sources and other sources
• The framework will provide standard metadata that describes geospatial repositories and a coherent mechanism to connect repositories, enabling seamless integration of geospatial and non-geospatial information with minimal human intervention (a sample query: “Find movie theaters within 30 miles of 75080”)
• Funding: Raytheon: 200K + proposal submitted; Matching funds from dean
Scientific/Technical Approach
• Develop Semantic Web Services, the conjunction of two powerful technologies: the Semantic Web and Web Services
• Semantic Web Services provide the richer semantics required for automation of service discovery, selection and execution tasks
• Develop Geo Service Discovery and dynamic compositions to integrate geospatial information services by exploiting OWL-S to describe Web services
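The sample query “Find movie theaters within 30 miles of 75080” can be read as a composition of two services: one that maps a ZIP code to coordinates, and one that filters theaters by distance. The sketch below imitates that composition with plain Python functions. It does not use OWL-S or any real web service; the function names, coordinates, and theater records are hypothetical placeholders.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def zipcode_finder(zipcode, zip_table):
    """First service: ZIP code -> (latitude, longitude)."""
    return zip_table[zipcode]


def theater_finder(center, theaters, radius_miles):
    """Second service: names of theaters within radius_miles of center."""
    lat, lon = center
    return [name for name, t_lat, t_lon in theaters
            if haversine_miles(lat, lon, t_lat, t_lon) <= radius_miles]


def find_theaters(zipcode, radius_miles, zip_table, theaters):
    """Composed service: output of zipcode_finder feeds theater_finder."""
    return theater_finder(zipcode_finder(zipcode, zip_table), theaters, radius_miles)


# Hypothetical data; coordinates are rough placeholders, not authoritative values.
ZIP_TABLE = {"75080": (32.97, -96.74)}
THEATERS = [("Theater A", 32.95, -96.73), ("Theater B", 30.27, -97.74)]
print(find_theaters("75080", 30, ZIP_TABLE, THEATERS))
```

In a semantic web setting, the discovery step would select these two services from their descriptions rather than hard-coding the composition as done here.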
[Figure: DAGIS architecture. Labels: Client, Agent, Match Maker, Composer, Sequencer, Composer Profile; steps 1. Query, 2. Service Discovery, 3. Compose Selection, 4. Construct Sequence, 5. Return Dynamic Service URI; example services Zipcode Finder and Theater Finder for theaters within 30 miles of Richardson, TX]
Accomplishments
• Developed a tool that can handle certain types of queries with a limited number of geospatial and non-geospatial data sources
Future Work
• Complete toolkit that can handle a complex query automatically and effectively on the fly from a significant number of geospatial and non-geospatial data sources
• Extend this for national security data analysis
Securing Critical Information
I-Ling Yen
Objectives
• Many data-intensive applications hosting critical data
– Data grid
– Large-scale distributed database
• How to secure these systems under hostile Internet environment
– Secure storage
– Secure operations on the data
Problem Statements
• No matter how good the intrusion detection systems are, adversaries always manage to penetrate the system
• Need to support intrusion tolerance
• Even if the system is compromised, critical information can still stay secure
• Simple encryption won’t work
– In storage system: key management issues
– In data applications: data need to be decrypted when operated on
Data Grid
• Developed data grid storage systems
– Combine secure sharing and replication to achieve security, availability, and integrity
– Efficient data placement algo. for allocating data shares and their replicas to achieve the best access performance
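The data grid work above speaks of allocating "data shares" and their replicas. If that refers to threshold secret sharing (an assumption; the slides do not name the scheme), the core idea can be sketched with Shamir's (k, n) scheme: any k shares recover the data, while fewer reveal nothing. The prime, function names, and parameters below are illustrative choices, not the group's system.

```python
import secrets

PRIME = 2**61 - 1  # a Mersenne prime; secrets must be smaller than this


def split_secret(secret, k, n):
    """Split secret into n shares so that any k of them recover it."""
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):      # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the field of integers mod PRIME."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % PRIME
                den = (den * (xj - xm)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_secret(123456789, k=3, n=5)
print(recover_secret(shares[:3]))
```

Each share could then be replicated and placed on different storage nodes, which is where the placement problem described above comes in.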
Operating on Encrypted Data
• Developed search algorithm to support the processing of search queries on encrypted data
• Developed new encryption algorithms to allow secure computation on secret data
• Need to integrate these algorithms in systems while ensuring overall system security
Data Integrity, Quality and Provenance for Command
and Control Applications
Murat Kantarcioglu and Bhavani Thuraisingham
Objectives
• Reduce the complexity of the data integrity
assurance process
• Develop tools to decide whether to “admit” data
into a database
• Develop techniques to analyze the confidence of query results based on data provenance
• Funding: AFOSR: 300K; Matching funds from dean (Joint work with Elisa Bertino from Purdue University)
Scientific/Technical Approach
• Develop integrity and provenance policy language
• Develop risk management based approach that considers risks due to data provenance
• Apply game theoretical and incentive based techniques to enforce honest behavior in policy enforcement
[Figure: integrity control architecture. Labels: Access Request, Access Control Results, Access Controller (Conventional Access Controller, Integrity Controller), Integrity Validator, Integrity Policy Repository, Integrity Metadata Repository, Integrity Policy Supplier]
Accomplishments
• Developed comprehensive architecture for an integrity control system
• Developed integrity policy language
• Developed an initial approach to risk evaluation
Challenges
• Developing techniques against malicious behavior
Privacy-Preserving Data Mining
Murat Kantarcioglu and Bhavani Thuraisingham
Objectives
• Learn data mining results without disclosing the private data
• Measure privacy loss due to data mining results
• Explore possible trade-offs between privacy, efficiency and accuracy
• Devise techniques to use data mining results privately
Scientific/Technical Approach
• Develop secure multi-party computation based approaches for distributed data mining tasks under different adversarial assumptions
• Develop perturbation based approaches for individually adaptable privacy preservation
• Develop statistical methods to measure privacy loss due to data mining results
• Develop cryptographic framework for using data mining results privately
[Figure: Specific Secure Tools (Secure Sum, Secure Comparison, Secure Union, Secure Logarithm, Secure Poly. Evaluation) and Data Mining on Horizontally Partitioned Data (Association Rule Mining, Decision Trees, EM Clustering, Naïve Bayes Classifier, K-NN Classifier)]
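Secure Sum, one of the tools named in this section, is often explained with a ring-based masking protocol. The sketch below simulates that textbook protocol in a single process; it is not the group's implementation. It assumes honest-but-curious sites that do not collude and a true total smaller than the modulus. The function name `secure_sum` is hypothetical.

```python
import secrets


def secure_sum(private_values, modulus=2**32):
    """Simulate ring-based secure sum over the sites' private values.

    Site 0 adds a random mask to its value and passes the result on.
    Every other site adds its own value to the running total it receives,
    so each site only ever sees a masked partial sum.
    Site 0 finally removes the mask to obtain the true total.
    """
    mask = secrets.randbelow(modulus)               # known only to site 0
    running = (mask + private_values[0]) % modulus  # message sent to site 1
    for value in private_values[1:]:
        running = (running + value) % modulus       # message sent to the next site
    return (running - mask) % modulus               # site 0 unmasks the total


print(secure_sum([10, 20, 30]))
```

If two neighbors of a site collude, they can subtract the messages they exchanged and learn that site's value, which is one reason stronger adversarial assumptions need different protocols.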
Accomplishments
• Showed that various distributed data mining protocols could be implemented using few specific secure protocols (see the figure above)
• Developed a perturbation technique that allows individuals to choose their own privacy level
• Developed various secure tools for enabling privacy preserving data mining.
Challenges
• Relative inefficiency of cryptographic techniques, accuracy loss in perturbation based approaches
Classification and Prediction Models for Mining Spatial Data
Weili Wu
• Motivation and Application
Historical Examples:
– London Asiatic Cholera 1854 (Griffith)
– Dental health and fluoride in water, Colorado early 1900s
Current Examples:
– Crime hotspots (NIJ CML, police patrol)
– Environmental justice (EPA), fair lending practices
– Location aware services (Defense: Sensor networks, Mobile ad-hoc networks)
– Ecology: Spatial habitat model
• Funding
– NSF 300K + Matching funds from dean
• Research Problem Formulation
Given:
1. Spatial framework: $S = \{s_1, \ldots, s_n\}$
2. Explanatory functions: $f_{X_k} : S \to \mathbb{R}$
3. A dependent class: $f_C : S \to C = \{c_1, \ldots, c_M\}$
4. A family $\mathcal{F}$ of function mappings: $\mathbb{R} \times \cdots \times \mathbb{R} \to C$
Find: Classification model: $\hat{f}_C \in \mathcal{F}$
Objective: maximize classification_accuracy$(\hat{f}_C, f_C)$
Constraints: Spatial autocorrelation exists
• Accomplishments:
– Developed an efficient spatial-temporal model to analyze geospatial data.
– Developed a new spatial similarity measure to build a more advanced model.
– Developed a new efficient search algorithm.
Dependable Distributed Systems
Neeraj Mittal
Objectives
• Develop novel algorithms for monitoring executions of distributed systems.
• Develop new algorithms for effective sharing of resources.
Challenges
• Asynchronous system with no global clock or shared memory.
• Processes and channels may be unreliable.
• Processes may join and leave the system at any time.
Scientific Accomplishments
• Developed algorithms for detecting stable properties (e.g., termination) under a variety of conditions:
– processes may fail by crashing
– failed processes may recover
• Developed efficient algorithms for group mutual exclusion.
Future Work
• Monitoring algorithms when the system is dynamic.
• Resource management algorithms when processes and/or channels may be unreliable.
Key Management in Sensor Networks
Neeraj Mittal
Objectives
• Develop novel schemes for securing communication among sensor nodes deployed in hostile territory.
• Communication between two sensor nodes may need to be protected against snooping by another node.
Challenges
• Sensor nodes have limited resources.
• Wireless communication is vulnerable to eavesdropping.
• Sensor nodes are vulnerable to physical capture.
Scientific Accomplishments
• Developed novel schemes for predistributing keys among sensor nodes under a variety of conditions:
– limited deployment knowledge is available
– some sensor nodes may be malicious
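To make key predistribution concrete, the sketch below shows the basic random scheme known from the literature, not the group's novel schemes: before deployment, each node is loaded with a random subset (a "key ring") of a large key pool, and two neighbors can talk securely if their rings intersect. Pool size, ring size, and function names are hypothetical.

```python
import random


def predistribute(num_nodes, pool_size, ring_size, seed=0):
    """Give every node a random key ring drawn from a pool of key identifiers."""
    rng = random.Random(seed)          # seeded only to make the demo repeatable
    pool = range(pool_size)
    return [set(rng.sample(pool, ring_size)) for _ in range(num_nodes)]


def shared_keys(ring_a, ring_b):
    """Key identifiers two nodes have in common (empty set means no direct link)."""
    return ring_a & ring_b


rings = predistribute(num_nodes=5, pool_size=100, ring_size=60)
common = shared_keys(rings[0], rings[1])
print(len(common) > 0)
```

Smaller rings save memory on each sensor but lower the chance that two neighbors share a key; capturing one node also exposes every key on its ring, which is the trade-off such schemes must balance.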
Future Work
• Dynamically refresh the keys stored at uncompromised sensor nodes.
• Protect against new malicious sensor nodes joining the network.
Computational Systems Biology through Mining High
Throughput Data
Ying Liu
Objectives
• Design efficient algorithms for biological
network inference
• Integrate heterogeneous biological data
• Decompose Biological networks into functional
modules
• Discover functional hierarchy from biological
networks
Scientific/Technical Approaches
• Biological networks are modular
• Use random forests to integrate heterogeneous data
• New formulation for heavy sub-graph mining
• Design graph mining algorithms
• Propose new metrics to measure dense sub-graphs
Accomplishments
• Integrated 7 different types of data to construct protein-protein interaction networks
• Formulated the heavy sub-graph discovery problem as a quadratic function
• Proposed new graph mining algorithms based on evolutionary computing and neural networks
Challenges
• Large-scale data size; the heavy sub-graph discovery problem is NP-complete.
Physically-Based Deformable Models
Xiaohu Guo
Objectives
• Develop a physically-based simulation and
visualization platform for deformable models,
which can perform dynamic simulation,
collision detection, and material property
visualization, in real-time.
• Investigate physically-based deformable models
under a networking collaborative virtual
environment.
Scientific/Technical Approach
• Investigate the theoretical foundations for quasi-conformal surface mapping and harmonic volumetric mapping
• Based on the regular parametric domains induced by geometric mapping, develop a GPU-accelerated framework including real-time PDE/ODE solver, collision detection, and volume rendering
• Given the regular parametric domains (i.e. geometry images), use image-based (2D/3D) compression and streaming techniques for efficient transmission of deformable models.
[Figure: system overview. Labels: Harmonic Surface and Volumetric Mapping, Geometry Images, GPU-Accelerated Deformable Models (GPU-Accelerated PDE/ODE Solver, GPU-Accelerated Collision Detection, GPU-Accelerated Volume Rendering), Deformable Models Compression and Network Streaming]
Potential Applications
• Surgical training and dynamic simulation of human tissues/muscles under interactive manipulation
• 3D model registration and target localization in medical imaging, based on deformable models
Challenges
• Multiple users’ collaborative manipulation will result in data incoherency at different client sites; deformable model decomposition techniques can be further investigated
Language-based Software Security
Kevin W. Hamlen
Objectives
• Develop systems for safe execution of mobile code from untrusted sources
• Support low-level binary formats, legacy languages, etc.
• Provide formally provable security guarantees (e.g., using type theory)
Challenges
• When source is untrusted, code signing doesn’t help
• Static analyses useful when possible, but interesting security properties are undecidable
• In-lined Reference Monitors are sufficiently powerful, but need formal proof techniques to guarantee safety
Scientific Accomplishments
• Developed the first certified In-lined Reference Monitoring system
– fully automatic program-rewriter for managed .NET
– all generated code has machine-checkable soundness proof
Future Work
• Support lower level binary formats (x86 machine code rather than .NET bytecode)
• Reduce disconnect between theory & implementation by creating smaller verifiers (e.g., logic programming)
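The idea behind an In-lined Reference Monitor is that security checks are inserted into the untrusted program itself, so every security-relevant operation consults the policy before it runs. The sketch below imitates this in Python with a decorator standing in for the program rewriter. It is only an illustration: the real system rewrites .NET code and produces proofs, which this sketch does not attempt. The policy "no send after read" and all names are hypothetical.

```python
import functools


class PolicyViolation(Exception):
    """Raised when an operation would break the security policy."""


class NoSendAfterRead:
    """Security automaton: once a read has happened, sends are forbidden."""

    def __init__(self):
        self.has_read = False

    def check(self, event):
        if event == "read":
            self.has_read = True
        elif event == "send" and self.has_read:
            raise PolicyViolation("send attempted after read")


def inline_monitor(monitor, event):
    """Stand-in for the rewriter: wrap a function so the check runs before it."""
    def wrap(func):
        @functools.wraps(func)
        def guarded(*args, **kwargs):
            monitor.check(event)        # in-lined check precedes the operation
            return func(*args, **kwargs)
        return guarded
    return wrap


monitor = NoSendAfterRead()


@inline_monitor(monitor, "read")
def read_file():
    return "secret"


@inline_monitor(monitor, "send")
def send(data):
    return len(data)
```

Calling `send` before `read_file` succeeds; calling it afterwards raises `PolicyViolation`, because the monitor state travels with the program.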
Multimedia Systems and Networking
B. Prabhakaran (praba@utdallas.edu)
• 3D Motions: Motion capture and Gesture sensors data
• 3D Models: Educational instructions, Role playing games
• For delivery (streaming): focus on wireless networks
• Biomedical Applications
– Physical Medicine & Rehabilitation
– Parkinson’s and other Neurological Diseases Study
– Dynamics of Human Anticipation
• Security Applications
– Emergency Handling: Streaming Animated Instructions Over PDAs, Laptops on Wireless Ad-hoc Networks
– Optimal Sensor Placement, Suspicious activity identification.
• Arts and Technology
– Copyright Protection: Watermarking of 3D Models and Captured Motions
– Reusability of Models and Motions
• Funding from NSF Career and ARO
Our Directions and Plans
• Each technology area is making very good technical progress
• Will continue to enhance our research and follow the five pronged
model
• Also plan on developing interdisciplinary projects
– within the Group
– Across the Groups
– Across UTD and Partners (e.g., School of Management, UT
Southwestern Medical Center)
• Continue to increase the number of Fellows, Board members, Keynote
talks etc.
• Center Scale Project is our major goal