CSE 501 Research Overview

Atri Rudra

atri@cse.buffalo.edu

Research Interests

 Theoretical Computer Science

Coding Theory

Algorithmic Game Theory

Sublinear algorithms

Approximation and online algorithms

Computational Complexity

The “algorithmic” side

2

Coding Theory

3

The setup

 Mapping C

Error-correcting code or just code

Encoding: x → C(x)

Decoding: y → x, or give up

C(x) is a codeword

y = C(x) + error

4

Different Channels and Codes

Internet

Checksum used in multiple layers of TCP/IP stack

Cell phones

Satellite broadcast

TV

Deep space telecommunications

Mars Rover

5

“Unusual” Channels

 Data Storage

CDs and DVDs

RAID

ECC memory

 Paper bar codes

UPS (MaxiCode)

Codes are all around us

6

Redundancy vs. Error-correction

 Repetition code: Repeat every bit, say, 100 times

Good error correcting properties

Too much redundancy

 Parity code: Add a parity bit

Minimum amount of redundancy

Bad error correcting properties

 Two errors go completely undetected: 1 1 1 0 0 1 → 1 0 0 0 0 1

 Neither of these codes is satisfactory
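
The two codes above can be tried out in a few lines. This is a minimal sketch: the function names are made up for illustration, and the repetition factor is 3 instead of 100 to keep the example short.

```python
def repetition_encode(bits, r=3):
    """Repeat every bit r times (the slide says 100; 3 keeps the demo short)."""
    return [b for b in bits for _ in range(r)]


def repetition_decode(received, r=3):
    """Decode by majority vote inside each block of r copies."""
    blocks = [received[i:i + r] for i in range(0, len(received), r)]
    return [1 if sum(block) > r // 2 else 0 for block in blocks]


def parity_encode(bits):
    """Append one bit so that the total number of ones is even."""
    return bits + [sum(bits) % 2]


def parity_check(received):
    """Return True when the word has even parity, i.e. no error is detected."""
    return sum(received) % 2 == 0


# Repetition code: one flipped bit per block is corrected.
sent = repetition_encode([1, 0])        # [1, 1, 1, 0, 0, 0]
corrupted = [1, 0, 1, 0, 0, 0]          # second bit flipped
print(repetition_decode(corrupted))     # [1, 0]

# Parity code: the example from the slide, two flips go unnoticed.
word = parity_encode([1, 1, 1, 0, 0])   # [1, 1, 1, 0, 0, 1]
print(parity_check([1, 0, 0, 0, 0, 1])) # True, although two bits changed
```

The repetition code triples the length to fix one error per block, while the parity code adds a single bit and cannot even detect two errors.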

7

Two main challenges in coding theory

 Problem with parity example

Messages mapped to codewords which do not differ in many places

 Need to pick a lot of codewords that differ a lot from each other

 Efficient decoding

Naive algorithm: check received word with all codewords
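
A sketch of the naive algorithm, assuming the code is given as an explicit table from messages to codewords (the names `naive_decode` and `codebook` are illustrative). It returns the message whose codeword is closest to the received word in Hamming distance, and it has to look at every codeword, of which there are 2^k for k-bit messages.

```python
from itertools import product


def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def naive_decode(received, codebook):
    """Compare the received word with ALL codewords and pick the closest one."""
    return min(codebook, key=lambda msg: hamming_distance(codebook[msg], received))


# Toy code for the demo: every bit of a 2-bit message is repeated 3 times.
codebook = {
    msg: tuple(bit for bit in msg for _ in range(3))
    for msg in product((0, 1), repeat=2)
}

print(naive_decode((1, 0, 1, 0, 0, 0), codebook))  # (1, 0)
```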

8

The fundamental tradeoff

 Correct as many errors as possible with as little redundancy as possible

Can one achieve the “optimal” tradeoff with efficient encoding and decoding?

9

A “low level” view

 Think of each symbol as being a packet

 The setup

Sender wants to send k packets

After encoding sends n packets

Some packets get corrupted

Receiver needs to recover the original k packets
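
To make the k-versus-n packet picture concrete, here is a sketch of a classical polynomial-evaluation (Reed-Solomon style) encoding. It covers only the simplified case where the receiver knows which packets arrived intact; packets corrupted in unknown positions are harder to handle and are not shown. The prime `P`, the function names, and the choice of evaluation points are assumptions made for illustration. The modular inverse via `pow(x, -1, P)` needs Python 3.8 or later.

```python
P = 257  # small prime; each packet is a number modulo P (illustrative choice)


def encode(message, n):
    """Use the k message packets as polynomial coefficients; send n evaluations."""
    return [sum(c * pow(x, i, P) for i, c in enumerate(message)) % P
            for x in range(n)]


def recover(points, k):
    """Rebuild the k coefficients from any k intact (position, value) pairs
    by Lagrange interpolation modulo P."""
    points = points[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(points):
        basis = [1]   # coefficients of prod_{m != j} (x - xm), lowest degree first
        denom = 1     # prod_{m != j} (xj - xm)
        for m, (xm, _) in enumerate(points):
            if m == j:
                continue
            new = [0] * (len(basis) + 1)
            for i, b in enumerate(basis):
                new[i] = (new[i] - xm * b) % P
                new[i + 1] = (new[i + 1] + b) % P
            basis = new
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, -1, P) % P
        for i, b in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * b) % P
    return coeffs


message = [5, 17, 200]                 # k = 3 packets
codeword = encode(message, n=7)        # n = 7 packets on the wire
survivors = [(1, codeword[1]), (4, codeword[4]), (6, codeword[6])]
print(recover(survivors, k=3) == message)
```

Any 3 of the 7 packets are enough here, because a polynomial of degree less than k is fixed by k of its values.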

10

The Optimal Tradeoff

 C(x) sent, y received

 How much of y must be correct to recover x?

At least k packets must be correct

 [ Guruswami, R. STOC 2006 ]

An explicit code along with efficient decoding algorithm

Works as long as (almost) k packets are correct

11

So what is left to do?

 I cheated a bit in the last slide

 The result only holds for large packets

We do not know an “optimal” code over smaller symbols (for example bits)

12

Computational Complexity

 Collisions lead to shallower decision trees

[ Aspnes, Demirbas, O’Donnell, R., Uurtamo 2008 ]

13

Wireless Sensor Networks

Murat Demirbas’ specialty

14

Compute Aggregate Functions

 Each mote has one bit of information

Do at least 2 motes have a temperature of at least 70F?

Is the temperature at least 70F?

15

One possible solution

 Ask each mote one at a time

Is your temp at least 70F?

In the worst case might have to ask ALL motes

Is your temp at least 70F?
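
A sketch of the one-at-a-time strategy for the question "do at least t motes answer yes?" (t = 2 in the running example). The function name and the early-stopping rule are illustrative assumptions. It stops as soon as the answer is settled, yet some inputs still force it to ask every mote.

```python
from itertools import product


def ask_one_at_a_time(bits, t=2):
    """Query motes in order; return (answer, number of motes asked)."""
    ones = 0
    for asked, bit in enumerate(bits, start=1):
        ones += bit
        remaining = len(bits) - asked
        if ones >= t:                # enough "yes" answers seen
            return True, asked
        if ones + remaining < t:     # threshold can no longer be reached
            return False, asked
    return False, len(bits)


print(ask_one_at_a_time([1, 1, 0, 0]))  # (True, 2)
print(ask_one_at_a_time([1, 0, 0, 0]))  # (False, 4)

# Worst case over all inputs on 5 motes: every mote has to be asked.
print(max(ask_one_at_a_time(list(x))[1] for x in product((0, 1), repeat=5)))  # 5
```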

16

Can we do any better?

 Formalize/Generalize this question

 Decision Tree model

Inputs: $x_1, x_2, \ldots, x_n \in \{0,1\}$

Function: $f : \{0,1\}^n \to \{0,1\}$

Minimum # queries to the input to determine $f(x_1, x_2, \ldots, x_n)$ in the worst case

This worst case number of queries is called the decision tree complexity of f

 Very well studied complexity measure of functions

17

Back to our ≥ 70F example

The 2-threshold function $T_{2,n}$

Previously saw $D(T_{2,n}) \le n$

In fact $D(f) \le n$

Also $D(T_{2,n}) \ge n$

For the $t$-threshold function, $D(T_{t,n}) \ge n$

 Logical OR, Majority are special cases
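
For small n the decision tree complexity can be checked by brute force. The sketch below is illustrative (the helper names are made up, and the running time is exponential, so it is only for tiny n). It tries every query order against every answer and confirms that threshold functions need all n queries.

```python
from functools import lru_cache
from itertools import product


def decision_tree_complexity(f, n):
    """Brute-force D(f) for a function f on n-bit tuples. Tiny n only."""

    @lru_cache(maxsize=None)
    def depth(assignment):
        # assignment[i] is 0 or 1 if bit i was queried, None otherwise
        free = [i for i, v in enumerate(assignment) if v is None]
        values = set()
        for fill in product((0, 1), repeat=len(free)):
            x = list(assignment)
            for i, b in zip(free, fill):
                x[i] = b
            values.add(f(tuple(x)))
        if len(values) == 1:         # f is already determined: no more queries
            return 0
        best = n
        for i in free:               # choose the best bit to query next ...
            worst = max(             # ... against the worst possible answer
                depth(assignment[:i] + (b,) + assignment[i + 1:])
                for b in (0, 1)
            )
            best = min(best, 1 + worst)
        return best

    return depth((None,) * n)


def threshold(t):
    """The t-threshold function: 1 iff at least t input bits are 1."""
    return lambda x: int(sum(x) >= t)


print(decision_tree_complexity(threshold(2), 4))  # 4
print(decision_tree_complexity(threshold(1), 3))  # 3 (logical OR)
```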

18

So are we done?

 The central node can broad/multi-cast

Is your temp at least 70F?

ALL motes

No

Is your temp at least 70F?

19

Replies from the motes

 Answer back only if answer is yes

Is your temp at least 70F?

20

Scenario 1: all the answers are 0

Central node hears “silence”

Is your temp at least 70F?

21

Scenario 2: Only one answer is 1

Central node hears a “yes”

Is your temp at least 70F?

Yes

22

Scenario 3: ≥ 2 answers are 1

 There is a collision

Central node can detect it!

All done with ONE query!

Is your temp at least 70F?

Yes

Yes
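
The three scenarios can be modeled as a single broadcast whose outcome is silence, one reply, or a collision. This is a minimal sketch; `query_2plus` and `at_least_two` are hypothetical names, and the radio is idealized so that the central node always distinguishes the three cases.

```python
def query_2plus(bits, subset):
    """Broadcast to the motes in `subset`. The central node hears
    0 (silence), 1 (exactly one "yes"), or "2+" (a collision)."""
    ones = sum(bits[i] for i in subset)
    return ones if ones < 2 else "2+"


def at_least_two(bits):
    """Decide the 2-threshold function with ONE broadcast to all motes."""
    return query_2plus(bits, range(len(bits))) == "2+"


print(at_least_two([0, 0, 0, 0]))  # False: silence
print(at_least_two([0, 1, 0, 0]))  # False: a single "yes"
print(at_least_two([1, 0, 1, 0]))  # True: collision detected
```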

23

Feedback to complexity theory

A new “decision tree” model

Inputs: $x_1, x_2, \ldots, x_n \in \{0,1\}$

Function: $f : \{0,1\}^n \to \{0,1\}$

Minimum # queries to the input to determine $f(x_1, x_2, \ldots, x_n)$ in the worst case

Queries are more general

Query any subset of bits

Answer is 0, 1, or 2+ depending on #ones in the subset

 k+ decision trees

24

Our Results

$D^{2+}(T_{t,n})$ is $O(t \log(n/t))$

$D^{2+}(T_{t,n})$ is $\Omega(t)$

 More general results

Understand $D^{2+}(f)$ fairly well
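
To give a feel for how 2+ queries can count, here is a simple find-and-remove strategy for the t-threshold function. It is an assumption made for illustration, not the algorithm behind the bounds above: it locates one 1 at a time by halving, so it uses roughly t log n queries rather than t log(n/t). All names are hypothetical.

```python
def query_2plus(bits, subset):
    """Answer 0, 1, or "2+" depending on the number of ones in the subset."""
    ones = sum(bits[i] for i in subset)
    return ones if ones < 2 else "2+"


def threshold_with_2plus(bits, t):
    """Return (answer, queries) for "are at least t bits equal to 1?"."""
    queries = 0
    found = 0                          # ones located and removed so far
    remaining = list(range(len(bits)))
    while found < t:
        answer = query_2plus(bits, remaining)
        queries += 1
        if answer == 0:                # no further ones exist
            return False, queries
        if answer == 1:                # exactly one further one exists
            return found + 1 >= t, queries
        if found + 2 >= t:             # collision: at least two further ones
            return True, queries
        # Halve the candidate set until one position holding a 1 is isolated.
        candidates = remaining
        while len(candidates) > 1:
            half = candidates[:len(candidates) // 2]
            queries += 1
            if query_2plus(bits, half) != 0:
                candidates = half
            else:
                candidates = candidates[len(candidates) // 2:]
        remaining.remove(candidates[0])
        found += 1
    return True, queries


print(threshold_with_2plus([1, 0, 1, 1, 0, 0], t=3)[0])  # True
print(threshold_with_2plus([1, 0, 1, 0, 0, 0], t=3)[0])  # False
```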

25

Approximation Algorithms

 Ranking in Tournaments

[ Coppersmith, Fleischer, R. SODA 2006 ]

26

US Open 2005

[Figure: tournament among Venus Williams, Maria Sharapova, Kim Clijsters, and Nadia Petrova, with the players ranked #1 to #4]

 Everyone plays everyone

 Rank the players

 Min #upsets

 Rank by number of wins

Break ties
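
A sketch of the rank-by-wins heuristic and of the upset count it is judged by. The players and results below are hypothetical placeholders, not the actual 2005 matches, and an upset is taken to mean a game won by the lower-ranked player.

```python
def rank_by_wins(players, beats):
    """Rank players by number of wins. `beats` holds one (winner, loser)
    pair per game. Ties are broken by input order, i.e. arbitrarily."""
    wins = {p: 0 for p in players}
    for winner, _ in beats:
        wins[winner] += 1
    return sorted(players, key=lambda p: -wins[p])  # sorted() is stable


def count_upsets(ranking, beats):
    """Count games in which the lower-ranked player beat the higher-ranked one."""
    position = {p: i for i, p in enumerate(ranking)}
    return sum(position[winner] > position[loser] for winner, loser in beats)


players = ["A", "B", "C", "D"]
# Everyone plays everyone; A, B, C beat each other in a cycle, D loses to all.
beats = {("A", "B"), ("B", "C"), ("C", "A"),
         ("A", "D"), ("B", "D"), ("C", "D")}

ranking = rank_by_wins(players, beats)
print(ranking)                       # ['A', 'B', 'C', 'D']
print(count_upsets(ranking, beats))  # 1 (C beat A)
```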

27

Ranking in Tournament results

 [ Coppersmith, Fleischer, R. SODA 2006 ]

Ordering by number of wins is a 5-approximation

Ties broken arbitrarily

Problem shown to be NP-hard in 2005

Application in Rank Aggregation

Gives provable guarantee for Borda’s method (1781!)

Future Directions

Try to analyze (variants of) heuristics that work well in practice

28

For more information…

 My Office is Bell 123: drop by!

 atri@cse.buffalo.edu

 CSE 545 in Spring 09

Course on error correcting codes

30

Algorithmic Game Theory

 Online auction of digital goods

[ Blum, Kumar, R., Wu SODA 2003 ]

31

Online Auctions of Digital goods

 Say you want to sell mp3s of a song

Can make copies with no extra cost

 Buyers arrive one by one

Specify how much they are willing to pay

 You need to decide to sell or not

At what price?

 You want to make lots of money

32

However…

 Why not just sell at the value specified by a buyer ?

Buyers are selfish

They will lie to get a better deal

Why not charge a single fixed price ?

Do not know best price in advance

The challenge

Build an online pricing scheme that gives buyers no incentive to cheat

Our work gives pricing scheme as good as best fixed price

[ Blum, Kumar, R., Wu SODA 2003 ]
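
The benchmark "best fixed price" can be computed in hindsight once all bids are known. The sketch below shows only that benchmark, not the online pricing scheme itself. It assumes the bids are the buyers' true values and that a buyer purchases one copy whenever the price is at most the bid; the function name is illustrative.

```python
def best_fixed_price(bids):
    """Return (price, revenue) for the single price that maximizes revenue."""
    best = (0, 0)
    for price in sorted(set(bids)):   # some bid value is always optimal
        revenue = price * sum(b >= price for b in bids)
        if revenue > best[1]:
            best = (price, revenue)
    return best


print(best_fixed_price([1, 5, 3, 5, 2]))  # (5, 10)
```

An online scheme sees the buyers one by one and cannot know this price in advance, which is what makes matching its revenue nontrivial.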

33

Problems I am interested in

Problems motivated by game theory

Sometimes, “old” problem with a twist

What is the best way to pair up potential couples in a dating site?

 Twist on the classical graph matching problem

34

Sublinear Algorithms

 Data Streams

[ Beame, Jayram, R. STOC 2007 ]

35

Data Streams (one application)

Databases are huge

 Fully reside in disk memory

Main memory

Fast, not much of it

Disk memory

Slow, lots of it

Random access is expensive

Sequential scan is reasonably cheap

36

Data Streams (one application)

Given a restriction on number of random accesses to disk memory

How much main memory is required?

For computations such as join of tables

Answer: a lot

[ Beame, Jayram, R. STOC 2007 ]

Open question: computing other functions?

37
