Learning Approach to Link Adaptation in WirelessMAN
Avinash Prasad
Supervisor: Prof. Saran
Outline of Presentation
- Introduction
- Problem Definition
- Proposed Solution & Learning Automaton
- Requirements
- About Implementation
- Results
- Conclusions
- References

Introduction (Link Adaptation)

Definition
- Link adaptation refers to a set of techniques where modulation, coding rate and/or other signal transmission parameters are changed on the fly to better adjust to the changing channel conditions.

Introduction (WirelessMAN)
- WirelessMAN requires high data rates over:
  - channel conditions that vary across different links
  - channel conditions that vary over time
- Link adaptation on a per-link basis is the most fundamental step this BWA system uses to respond to these link-to-link variations and variations over time. There is an elaborate message-passing mechanism for exchanging channel information at the MAC layer.

Problem Definition
- Link adaptation requires us to know which changes in channel conditions call for a change in transmission parameters.
- The most commonly identified problem of link adaptation:
  - How do we calculate the threshold values for the various channel estimation parameters that signal a need for a change in transmission parameters?

Problem Definition (Current Approaches)
Current methods for threshold estimation:
- Model based
  - Requires analytical modeling.
  - How reliable is the model?
  - Is an appropriate model even available for the wireless scenario?
- Statistical methods
  - Thresholds are hard to obtain for even one set of channel conditions.
  - They are fixed and do not change with time.
  - Even a change in season may affect the best appropriate values.
- Heuristics based
  - Scope limited to very few scenarios.

Proposed Solution (Aim)
Come up with a machine-learning-based method such that it:
- Learns the optimal threshold values as we operate the network.
- Needs no analytical modeling in its operation.
- Is able to handle noisy feedback from the environment.
- Is generic enough to learn different parameters without much change to the core.

Proposed Solution (Idea)
Use a stochastic learning automaton.
- Informally: it essentially simulates animal learning; repeatedly make decisions based on your current knowledge, and then refine those decisions according to the response from the environment.
- Mathematically: it modifies the probability of selecting each action based on how much reward we get from the environment.

Proposed Solution (Exp. Setup)
Experimental setup used to study the stochastic learning methods:
- We learn the optimal SNR threshold values for switching among coding profiles, such that the throughput is maximized.
- Threshold Ti decides when to switch from profile i to profile i+1.
- The possible values for Ti are restricted to a limited set, to facilitate faster learning by reducing the number of options.

Proposed Solution (Exp. Setup: Mathematical Formulation)
- For N different profiles in use, (N-1) thresholds need to be determined/learned.
- At any instant these (N-1) thresholds Ti, i ∈ {1, ..., N-1}, form the input to the environment.
- In return the environment returns the reward
  β(SNR estimate, <T1, ..., TN-1>) = (1 - SER) * (K / Kmax)
  - K represents the information block size fed to the RS encoder in the selected profile.
  - Kmax is the maximum possible value of K for any profile; this makes the reward value lie in the range [0, 1].
- Clearly, β is a measure of normalized throughput.
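
A minimal sketch in C++ (the language of the learning code) of how a threshold vector maps an SNR estimate to a profile and how this reward is computed; the Profile struct and function names are illustrative assumptions, not taken from the original implementation:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical profile descriptor: K is the information block size fed to the
// RS encoder for that profile.
struct Profile { int K; };

// T_i decides when to switch from profile i to profile i+1 (0-based here), so the
// selected profile is the highest one whose switching threshold the SNR estimate clears.
std::size_t selectProfile(double snrEstimate, const std::vector<double>& thresholds) {
    std::size_t i = 0;
    while (i < thresholds.size() && snrEstimate >= thresholds[i]) ++i;
    return i;  // index into a table of N profiles, N = thresholds.size() + 1
}

// Reward beta = (1 - SER) * (K / Kmax): normalized throughput in [0, 1].
double reward(double ser, const Profile& selected, int Kmax) {
    return (1.0 - ser) * (static_cast<double>(selected.K) / Kmax);
}
```
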
Proposed Solution (Learning Automaton: Formal Definition)
A learning automaton is completely given by (A, B, LA, Pk):
- Action set A = {α1, α2, ..., αr}; we shall always assume this set to be finite in our discussion.
- Set of rewards B = [0, 1].
- The learning algorithm LA.
- State information Pk = [p1(k), p2(k), ..., pr(k)], the action probability vector.
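
As a rough illustration of this 4-tuple in code (a sketch only; names and layout are assumptions, not the project's actual data structures), one automaton's action set and probability vector might be held as:

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// The finite action set A (candidate threshold values) and the probability vector
// P_k over it. The learning algorithm LA acts on p; the reward set B = [0,1] is
// implicit in the beta values fed back by the environment.
struct Automaton {
    std::vector<double> actions;  // A = {alpha_1, ..., alpha_r}
    std::vector<double> p;        // P_k = [p_1(k), ..., p_r(k)], starts uniform

    explicit Automaton(std::vector<double> a)
        : actions(std::move(a)), p(actions.size(), 1.0 / actions.size()) {}

    // Sample an action index according to the current probabilities.
    std::size_t chooseAction(std::mt19937& rng) const {
        std::discrete_distribution<std::size_t> d(p.begin(), p.end());
        return d(rng);
    }
};
```
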
Proposed Solution (Learning Automaton: Why?)
Advantages:
- Complete generality of the action set.
- We can have an entire set of automata, each working on a different variable of a multi-variable problem, and yet they arrive at a Nash equilibrium that maximizes the overall function, much faster than a single automaton would.
- It can handle noisy reward values from the environment:
  - it performs long-time averaging as it learns,
  - but it therefore needs the environment to be stationary.

Proposed Solution (Learning Automaton: Solution to Exp. Setup)
- Each threshold is learned by an independent automaton in the group (game) of automata that solves the problem.
- For each automaton, we choose the smallest possible action set that still covers all possible variations in channel conditions in the setup, i.e. we decide the possible range of threshold values (see the hypothetical setup sketched below).
- We decide on the learning algorithm to use.
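
A hypothetical example of such a game, reusing the Automaton sketch above and assuming N = 4 profiles (three thresholds); the candidate threshold values in dB are placeholders chosen only to illustrate the restricted action sets, not the values used in the study:

```cpp
#include <vector>

// One automaton per threshold T_1..T_3, each with a small candidate set of
// SNR thresholds (four actions per threshold, as in the later experiments).
std::vector<Automaton> makeGame() {
    return {
        Automaton({ 6.0,  8.0, 10.0, 12.0}),   // candidates for T_1
        Automaton({12.0, 14.0, 16.0, 18.0}),   // candidates for T_2
        Automaton({18.0, 20.0, 22.0, 24.0})    // candidates for T_3
    };
}
```
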
Proposed Solution (Learning Automaton: Solution to Exp. Setup)
For k being the instance of the playoff, we do the following (sketched in code after this list):
- Each automaton selects an action (threshold) based on its state Pk, the probability vector.
- Based on these threshold values, we select a profile for channel transmission.
- We get feedback from the channel in the form of the normalized-throughput value defined earlier.
- We use the learning algorithm to calculate the new state, the set of probabilities Pk+1.
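
A sketch of one such playoff iteration, again reusing the Automaton type from the earlier sketch; the 'environment' callback stands in for the profile selection and channel feedback, and updateLRI is the rule defined under Learning Algorithms below. Names are assumptions, not the original code:

```cpp
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

void updateLRI(std::vector<double>& p, std::size_t chosen, double beta, double delta);

void playoff(std::vector<Automaton>& game,
             const std::function<double(const std::vector<double>&)>& environment,
             double delta, std::mt19937& rng) {
    std::vector<std::size_t> chosen(game.size());
    std::vector<double> thresholds(game.size());

    // 1. Each automaton samples a threshold according to its probability vector P_k.
    for (std::size_t i = 0; i < game.size(); ++i) {
        chosen[i] = game[i].chooseAction(rng);
        thresholds[i] = game[i].actions[chosen[i]];
    }

    // 2-3. Profile selection and channel feedback, abstracted behind 'environment';
    //      it returns the normalized-throughput reward beta in [0, 1].
    double beta = environment(thresholds);

    // 4. Every automaton updates its state P_k -> P_{k+1} using the common reward.
    for (std::size_t i = 0; i < game.size(); ++i)
        updateLRI(game[i].p, chosen[i], beta, delta);
}
```
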
Proposed Solution (Learning Automaton: Learning Algorithms)
We have explored two different algorithms (both update rules are sketched in code below).
- LRI, linear reward-inaction
  - Very much Markovian; it updates Pk+1 based only on the last action/reward pair:
    for α(k) = αi:        pi(k+1) = pi(k) + ∆*β(k)*(1 - pi(k))
    for all other j ≠ i:  pj(k+1) = pj(k) - ∆*β(k)*pj(k)
  - ∆ is a rate constant.
- Pursuit algorithm
  - Uses the entire history of selections and rewards to calculate average reward estimates for all actions.
  - Aggressively tries to move towards the simplex solution that has probability 1 for the action with the highest reward estimate, say action αM:
    P(k+1) = P(k) + ∆*(eM(k) - P(k))
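
Both update rules written out as a sketch over a plain probability vector; this is an illustration of the formulas above under assumed helper names, not the original C/C++ code:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// 'chosen' is the index of the action taken at instant k, beta the reward in
// [0, 1], delta the rate constant.

// L_RI: raise the probability of the chosen action, shrink all others.
void updateLRI(std::vector<double>& p, std::size_t chosen, double beta, double delta) {
    for (std::size_t j = 0; j < p.size(); ++j) {
        if (j == chosen)
            p[j] += delta * beta * (1.0 - p[j]);
        else
            p[j] -= delta * beta * p[j];
    }
}

// Pursuit: keep a running reward estimate d[j] for every action and move P
// toward the unit vector e_M of the action with the highest estimate.
void updatePursuit(std::vector<double>& p, std::vector<double>& d,
                   std::vector<unsigned>& counts,
                   std::size_t chosen, double beta, double delta) {
    counts[chosen] += 1;
    d[chosen] += (beta - d[chosen]) / counts[chosen];  // running average of rewards

    std::size_t M = static_cast<std::size_t>(
        std::distance(d.begin(), std::max_element(d.begin(), d.end())));
    for (std::size_t j = 0; j < p.size(); ++j) {
        double eM = (j == M) ? 1.0 : 0.0;
        p[j] += delta * (eM - p[j]);  // P(k+1) = P(k) + delta * (e_M(k) - P(k))
    }
}
```
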
Proposed Solution (Learning Automaton: Learning Algorithms cont.)
The two algorithms differ in:
- the speed of convergence to the optimal solution,
- the amount of storage required by each,
- how decentralized the learning setup (game) can be,
- the way they approach their convergence point:
  - being a greedy method, the pursuit algorithm shows a lot of deviation in the evolution phase.

Requirements
- 802.16 OFDM physical layer
- Channel model (SUI model used)
- Learning setup

About Implementation (802.16 OFDM Physical Layer)
- Implements the OFDM physical layer from 802.16d.
- Coded in Matlab™.
- Complies fully with the standard; operations were tested against the example pipeline given in the standard.
- No antenna diversity is used, and perfect channel impulse response estimation is assumed.

About Implementation (Channel Model)
- We have implemented the complete set of SUI models for the omni-antenna case.
- The complete channel model consists of one of the SUI models plus an AWGN model for noise.
- Coded in Matlab™, thus completing the entire coding + channel pipeline.
- Results from this data transmission pipeline are presented later.

About Implementation (Learning Setup)
- We implemented both algorithms for comparison.
- Coded in C/C++.
- A network model was constructed, using the symbol error rate plots obtained from the PHY layer simulations, to estimate the reward values.

Results (PHY layer): BER plots for different SUI models
Results (PHY layer): SER plots for different SUI models
Results (PHY layer): BER plots for different profiles at SUI-2
Results (PHY layer): SER plots for different profiles at SUI-2
Results (PHY layer): Reward metric for the learning automaton
Results (Learning): Convergence curve; LRI (rate = 0.0015)
Results (Learning): Convergence curve; Pursuit (rate = 0.0017)
Results (Learning): Convergence curve; LRI (rate = 0.0032)
Results (Learning, 4 actions per threshold): Convergence curve; LRI (rate = 0.0015)

Conclusions
Our plots suggest the following:
- Learning methods are indeed capable of arriving at the optimal parameter values under the type of channel conditions faced in WirelessMAN.
- The rate of convergence depends on:
  - the rate factor (∆),
  - the size of the action set,
  - how much the actions differ in terms of the reward they get from the environment,
  - the learning algorithm.
- Although we have worked with a relatively simple setup, with the assumption that the SNR estimate is perfect and available, the complete generality of the action set ensures that we can work with other channel estimation parameters as well.

References
- V. Erceg and K. V. Hari, "Channel Models for Fixed Wireless Applications," IEEE 802.16 Broadband Wireless Access Working Group, 2001.
- D. S. Baum, "Simulating the SUI Models," IEEE 802.16 Broadband Wireless Access Working Group, 2000.
- M. A. L. Thathachar and P. S. Sastry, Networks of Learning Automata: Techniques for Online Stochastic Optimization, Kluwer Academic Publishers, 2003.

Thanks