Online Dynamic Value System for Machine Learning

Fourth International Symposium on Neural Networks (ISNN)

June 3-7, 2007, Nanjing, China

Haibo He, Stevens Institute of Technology

Janusz A. Starzyk, Ohio University

Outline

* Introduction
* Online curve fitting principles
* Network architecture and operation
* Simulation analysis
* Conclusion and future research

Introduction: Why is a value system important?

From traditional AI to embodied intelligence:

[Figure: "Rat neurons can fly F-22 jet". Picture source: www.space.com]

[Figure: interaction loop between the intelligent machine and the environment. The machine receives state s_t and reward r_t and issues action a_t; the environment returns state s_{t+1} and reward r_{t+1}.]

* Make value judgments according to the received information;
* Develop sensory-motor coordination to actively interact with the environment;
* Develop an internal value system and apply it to decision making.

Introduction: What is the value signal?

Different applications define the value signal differently; here the value signal is defined as the expected reward or desired objective of the machine's action.

Motivation: Goal-driven learning

To provide a mechanism that lets intelligent machines dynamically estimate the value function in reinforcement learning (telling “good” from “bad”), thereby guiding a machine to adjust its actions to achieve the goal.

Source: Biologically inspired robot at CWRU, http://biorobots.cwru.edu/

Introduction: self-organizing learning array (SOLAR)

Characteristics:

* Self-organization

* Sparse and local interconnections

* Dynamically reconfigurable

* Online data-driven learning

[Figure: connections of a SOLAR neuron. Labels: system clock, remote neurons, nearest neighbour neuron, other neurons. II: information index; ID: information deficiency.]

How can a value system help here?

A supervisor is not always available in the learning environment

– Uncertain external environment (no prior knowledge)

A supervisor is not always necessary in the learning environment

– How learning happens in a one-year-old baby

Source: Sociable humanoid robots: Kismet at MIT Artificial Intelligence Lab

The challenges

* Unstructured environment and uncertain information;
* Limited availability of information;
* Information ambiguity and redundancy;
* High dimensionality of the data set;
* Time variability of the information.


Online dynamic curve fitting

Consider dynamic adjustment of the fit function described by a linear combination of the selected base functions:

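$$Y = a_1 \varphi_1 + a_2 \varphi_2 + \dots + a_q \varphi_q$$

$$[a_1 \; a_2 \; \dots \; a_q]^T = (\Phi^T \Phi)^{-1} \Phi^T Y$$

Written out as sums over the n received samples, where \varphi_{ki} is the value of base function \varphi_k at sample i and Y_i is the value of sample i:

$$
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_q \end{bmatrix}
=
\begin{bmatrix}
\sum_{i=1}^{n} \varphi_{1i}\varphi_{1i} & \sum_{i=1}^{n} \varphi_{1i}\varphi_{2i} & \dots & \sum_{i=1}^{n} \varphi_{1i}\varphi_{qi} \\
\sum_{i=1}^{n} \varphi_{2i}\varphi_{1i} & \sum_{i=1}^{n} \varphi_{2i}\varphi_{2i} & \dots & \sum_{i=1}^{n} \varphi_{2i}\varphi_{qi} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{n} \varphi_{qi}\varphi_{1i} & \sum_{i=1}^{n} \varphi_{qi}\varphi_{2i} & \dots & \sum_{i=1}^{n} \varphi_{qi}\varphi_{qi}
\end{bmatrix}^{-1}
\begin{bmatrix}
\sum_{i=1}^{n} \varphi_{1i} Y_i \\
\sum_{i=1}^{n} \varphi_{2i} Y_i \\
\vdots \\
\sum_{i=1}^{n} \varphi_{qi} Y_i
\end{bmatrix}
$$

Storage requirements (the symmetric q x q matrix has q(q+1)/2 distinct sums, and the right-hand vector has q):

$$s = \frac{q(q+1)}{2} + q$$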
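The running-sum form of the fit can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the class name `OnlineLeastSquares`, the quadratic basis, and the use of `numpy.linalg.lstsq` (chosen so the solve does not fail while too few samples have arrived) are all assumptions. For simplicity the sketch stores the full q x q matrix rather than only its q(q+1)/2 distinct entries.

```python
import numpy as np


class OnlineLeastSquares:
    """Least-squares fit kept as running sums (hypothetical helper)."""

    def __init__(self, basis):
        self.basis = basis                 # list of q base functions phi_k
        q = len(basis)
        self.A = np.zeros((q, q))          # running sums of phi_j * phi_k
        self.b = np.zeros(q)               # running sums of phi_j * Y

    def _phi(self, x):
        return np.array([f(x) for f in self.basis], dtype=float)

    def update(self, x, y):
        """Fold one new sample into the sums; no past samples are stored."""
        phi = self._phi(x)
        self.A += np.outer(phi, phi)
        self.b += phi * y

    def coefficients(self):
        # lstsq returns a minimum-norm solution if A is still singular.
        return np.linalg.lstsq(self.A, self.b, rcond=None)[0]

    def predict(self, x):
        return float(self._phi(x) @ self.coefficients())

    def storage(self):
        """Distinct numbers needed: q(q+1)/2 matrix sums plus q vector sums."""
        q = len(self.basis)
        return q * (q + 1) // 2 + q


# Assumed basis for illustration: 1, x, x^2 (q = 3).
basis = [lambda x: 1.0, lambda x: x, lambda x: x * x]
fit = OnlineLeastSquares(basis)
for x in np.linspace(-1.0, 1.0, 21):
    fit.update(x, 2.0 + 3.0 * x - x * x)   # noiseless toy target
coeffs = fit.coefficients()
```

Because each sample only adds to the sums, the cost per update depends on q and not on how many samples have been seen.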

Three curve fitting versus single curve fitting

[Figure: two panels plotting value against data dimension, each marking points A and B. One panel shows a single fitted curve; the other shows the upper, neutral, and lower curves.]

Three curve fitting:

* Neutral curve: a least-squares fit (LSF) to all the data samples in the space;
* Upper curve: fits only the data points that lie above the neutral curve;
* Lower curve: fits only the data points that lie below the neutral curve.
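A batch version of the three curves can be sketched as below. It is an illustration under assumptions: the slides do not fix the base functions, so a straight-line fit through `numpy.polyfit` stands in for them, and the function name `three_curve_fit` and the toy data are hypothetical.

```python
import numpy as np


def three_curve_fit(x, y, degree=1):
    """Return (upper, neutral, lower) polynomial coefficients.

    The neutral curve is fitted to every sample; the upper and lower
    curves use only the samples above and below it.
    """
    neutral = np.polyfit(x, y, degree)
    residual = y - np.polyval(neutral, x)
    above = residual > 0
    below = residual < 0
    upper = np.polyfit(x[above], y[above], degree)
    lower = np.polyfit(x[below], y[below], degree)
    return upper, neutral, lower


# Toy data (made up): the line y = x with alternating +1 / -1 offsets.
x = np.arange(10, dtype=float)
y = x + np.where(np.arange(10) % 2 == 0, 1.0, -1.0)
upper, neutral, lower = three_curve_fit(x, y)
```

On this toy data the upper curve recovers y = x + 1 and the lower curve y = x - 1, so the gap between them reflects the spread of the samples around the neutral curve.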

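Decision integration

Differential based voting:

$$d_{1i} = |v_{ni} - v_{ui}|, \quad d_{2i} = |v_{ni} - v_{li}|, \quad d_i = \frac{d_{1i} + d_{2i}}{2}$$

$$w_i = \frac{1}{d_i}, \quad v_{vote} = \frac{\sum_{i=1}^{k} v_{ni} w_i}{\sum_{i=1}^{k} w_i}$$

[Figure: value plotted against input. At the input data point the upper, neutral, and lower curves give V_ui, V_ni, and V_li; d_1i is the gap between the upper and neutral curves, and d_2i is the gap between the neutral and lower curves.]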
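The voting rule can be sketched in a few lines of Python. The function name `differential_vote` and the small `eps` guard against a zero distance are assumptions added for illustration; the slides do not say how d_i = 0 is handled.

```python
import numpy as np


def differential_vote(v_upper, v_neutral, v_lower, eps=1e-12):
    """Combine k neutral-curve values, weighting each by 1 / d_i."""
    v_upper = np.asarray(v_upper, dtype=float)
    v_neutral = np.asarray(v_neutral, dtype=float)
    v_lower = np.asarray(v_lower, dtype=float)

    d1 = np.abs(v_neutral - v_upper)       # gap to the upper curve
    d2 = np.abs(v_neutral - v_lower)       # gap to the lower curve
    d = (d1 + d2) / 2.0
    w = 1.0 / np.maximum(d, eps)           # eps guard is an assumption
    v_vote = float(np.sum(v_neutral * w) / np.sum(w))
    return v_vote, w


# Two hypothetical processing elements: the first has the tighter band.
v_vote, w = differential_vote([3.0, 6.0], [2.0, 4.0], [1.0, 2.0])
```

A narrow band between the upper and lower curves gives a large weight, so processing elements whose fits are tight around the neutral curve dominate the vote.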

Implementation of TCF

[Figure: value plotted against data dimension for a newly received point with true value V_true and neutral-curve value V_ni. The neutral curve and the upper curve are each shown before the new point is received and after modification; the lower curve is kept unchanged.]

Pseudo code:

{ New data sample comes;
  Modify the neutral curve;
  Difference = v_ni - v_true;
  If (Difference >= 0)
  { Modify the lower curve;
    Keep the upper curve unchanged; }
  else
  { Modify the upper curve;
    Keep the lower curve unchanged; }
  end
}
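The update rule in the pseudo code can be turned into a small runnable sketch. Everything beyond the branching logic is an assumption made for illustration: the helper `RunningFit` (a straight-line least-squares fit kept as running sums), the class name `ThreeCurveFit`, and the toy data stream.

```python
import numpy as np


class RunningFit:
    """Straight-line least-squares fit kept as running sums (hypothetical)."""

    def __init__(self):
        self.A = np.zeros((2, 2))
        self.b = np.zeros(2)

    def update(self, x, y):
        phi = np.array([1.0, x])
        self.A += np.outer(phi, phi)
        self.b += phi * y

    def value(self, x):
        coeffs = np.linalg.lstsq(self.A, self.b, rcond=None)[0]
        return float(coeffs @ np.array([1.0, x]))


class ThreeCurveFit:
    def __init__(self):
        self.upper = RunningFit()
        self.neutral = RunningFit()
        self.lower = RunningFit()

    def update(self, x, v_true):
        self.neutral.update(x, v_true)            # modify the neutral curve
        difference = self.neutral.value(x) - v_true
        if difference >= 0:
            self.lower.update(x, v_true)          # upper curve unchanged
        else:
            self.upper.update(x, v_true)          # lower curve unchanged


# Toy stream (made up): the line y = x with alternating +1 / -1 offsets.
tcf = ThreeCurveFit()
for i in range(40):
    tcf.update(float(i), float(i) + (1.0 if i % 2 == 0 else -1.0))

v_u = tcf.upper.value(20.0)
v_n = tcf.neutral.value(20.0)
v_l = tcf.lower.value(20.0)
```

Following the pseudo code, a sample that falls exactly on the neutral curve (Difference equal to 0) is treated as a lower-curve sample, which is why the very first sample of a stream always updates the lower curve.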


Value system architecture

A pipelined dynamic architecture:

[Figure: pipelined value system architecture. A value channel delivers the value signal to all the processing elements in each layer. Data samples feed the data PEs (DPN), whose outputs (V_n1, W_1), (V_n2, W_2), ..., (V_ni, W_n) travel over bidirectional signal channels to the information PEs (IPN). Each information PE l forms the sums Σ w_i v_i and Σ w_i and forwards a value v_l and weight w_l over the communication channel, ending in the final value.]

Inside a value system

[Figure: inside a processing element (PE). Input 1 and Input 2 pass through the input space transform function; the transform function output goes to curve fitting, which also receives the value signal, and to another PE's input. The fitted value goes to differential voting.]


Simulation analysis

Financial data analysis: bank prime loan rate prediction

Data sets are available from: www.forecasts.org

Input:

* Monthly bank prime loan rate;
* Discount rate;
* Federal funds rate;
* Ten-year treasury constant maturity rate.

Output:

* Next month's bank prime loan rate

Training period:

* January 1995 to December 2000

Testing period:

* February 2001 to September 2002

“Market is unpredictable”:

* Random Walk Hypothesis;
* Efficient Market Hypothesis.
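One way to arrange such monthly series for next-month prediction is sketched below. It assumes that the four indicators of month t are paired with the prime loan rate of month t+1, which is how the input and output lists read; the function name `make_next_month_pairs` is hypothetical, and the numbers are placeholders, not real rates.

```python
import numpy as np


def make_next_month_pairs(rates, target_column=0):
    """Pair each month's indicators with next month's target column.

    rates: array of shape (months, indicators), oldest month first.
    """
    rates = np.asarray(rates, dtype=float)
    inputs = rates[:-1]                     # months 0 .. T-2
    targets = rates[1:, target_column]      # months 1 .. T-1
    return inputs, targets


# Placeholder values only (NOT real data): 5 months x 4 indicators,
# with column 0 standing in for the bank prime loan rate.
rates = np.arange(20, dtype=float).reshape(5, 4)
X, y = make_next_month_pairs(rates)
```

The last month has no target yet, so T months of data yield T-1 training pairs.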

Prediction results

[Figure: bank prime loan rate prediction by the value system, February 2001 to September 2002.]

Result comparison: MSE error

[Figure: bar chart "Performance comparison". Vertical axis: MSE error, from 0 to 0.6. Bar groups: learning accuracy and prediction accuracy. Series: hybrid iterative evolutionary fuzzy neural network in [8]; genetic fuzzy neural learning algorithm in [9]; proposed value system.]
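The slides do not spell out the error measure. Assuming the usual definition of mean squared error over n test months, with Y_i the actual rate and \hat{Y}_i the predicted rate (both symbols are notation introduced here), it reads:

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2
```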


Conclusion and future research

* Provides a mechanism that lets intelligent machines dynamically estimate the value function;
* Dynamic, online, data-driven learning;
* No backpropagation required;
* Three curve fitting method;
* General framework for different implementations.

Future research

* Dynamically self-reconfigurable;
* Investigate different input transformations and base functions;
* Hardware implementation;
* Facilitate goal-driven learning;
* Integration with reinforcement learning within a realistic environment.

A promising future?

Ray Kurzweil predicted:

We achieve one Human Brain capability for $1,000 around the year 2023, for one cent around the year 2037;

We achieve one Human Race capability for $1,000 around the year 2049, for one cent around the year 2059.

From “The Law of Accelerating Returns” by Ray Kurzweil. Source: www.kurzweilai.net
