Zhi-Ming Ma

Web Markov Skeleton
Processes and Applications
Zhi-Ming Ma
10 June 2013, St. Petersburg
Email: mazm@amt.ac.cn
http://www.amt.ac.cn/member/mazhiming/index.html
• Y. Liu, Z. M. Ma, C. Zhou:
Web Markov Skeleton Processes and Their
Applications, Tohoku Math. J. 63 (2011), 665-695
• Y. Liu, Z. M. Ma, C. Zhou:
Further Study on Web Markov Skeleton
Processes, in Stochastic Analysis and
Applications to Finance, World Scientific, 2012
• C. Zhou: Some Results on Mirror Semi-Markov Processes, manuscript
Web Markov Skeleton Process
Markov Chain
conditionally independent given
Define
WMSP
by :
Simple WMSP:
Many simple WMSPs are
Non-Markov Processes
[LMZ2011a,b]
Mirror Semi-Markov Process
A mirror semi-Markov process is not a
Markov skeleton process in the sense of Hou and Liu,
i.e. it does not satisfy
Multivariate Point Process
associated with WMSP
WMSP
Let
Consequently
Define
We can prove that
where
where
Time-homogeneous mirror
semi-Markov processes
are all independent of n
More properties of time homogeneity
Renewal Theory
Contribution probability
Staying times and first entry times
Limit distribution for
semi-Markov process
Limit distribution for
mirror semi-Markov processes
Reconstruction of Mirror
Semi-Markov Processes
Why it is called a Web
Markov Skeleton Process?
PageRank, a
ranking algorithm used by the
Google search engine.
1998, Sergey Brin and Larry Page,
Stanford University
From probabilistic point of view,
PageRank is the stationary
distribution of a Markov chain.
A simple Markov Skeleton Process
Markov chain describing
surfing behavior
Web surfers usually have two basic
ways to access web pages:
1. with probability α, they visit a web
page by clicking a hyperlink.
2. with probability 1-α, they visit a
web page by inputting its URL
address.
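The two access modes above can be sketched as a single Markov chain whose stationary distribution is the PageRank vector. This is a minimal illustration, not the slide's own formula: the function name `pagerank`, the default α = 0.85, the uniform choice among all pages when a URL is typed, and the uniform jump from pages without out-links are all assumptions made for this example.

```python
import numpy as np

def pagerank(adjacency, alpha=0.85, tol=1e-12, max_iter=1000):
    """Stationary distribution of the random-surfer chain (illustrative sketch)."""
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_degree = A.sum(axis=1)
    # Hyperlink clicks: choose uniformly among a page's out-links.
    # Assumption: a page with no out-links jumps to a uniformly chosen page.
    P_link = np.where(out_degree[:, None] > 0,
                      A / np.maximum(out_degree, 1.0)[:, None],
                      1.0 / n)
    # With probability alpha follow a link, with 1 - alpha type a URL
    # (assumed uniform over all pages).
    P = alpha * P_link + (1.0 - alpha) / n
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_pi = pi @ P  # one step of power iteration
        if np.abs(new_pi - pi).sum() < tol:
            return new_pi
        pi = new_pi
    return pi

# Three pages linked in a cycle 0 -> 1 -> 2 -> 0.
cycle_rank = pagerank([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
```

Because the mixed matrix has strictly positive entries whenever α < 1, the chain is irreducible and aperiodic, so the power iteration converges to a unique stationary distribution.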
where
Weak points of PageRank
• Using only static web graph structure
• Reflects only the will of web managers
and ignores the will of users, e.g. the staying
time of users on a page.
• Cannot effectively combat spam and junk
pages.
Data Mining
Browsing Process
• Markov property
• Time-homogeneity
Computation of the Stationary Distribution
– Stationary distribution:
P(t)
– The mean staying time on page i:
the more important a page is, the longer
users stay on it.
– The mean first re-visit time at page i:
the more important a page is, the shorter the re-visit time and the higher the visit frequency.
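The link between the stationary distribution, the mean staying time, and the mean first re-visit time can be illustrated with a standard identity for continuous-time Markov processes. The stationary probability of page i is proportional to the embedded chain's stationary probability times the mean staying time, which equals the mean staying time divided by the mean re-visit time. The sketch below is an assumption-laden illustration, not the BrowseRank estimation procedure itself: the function names, the embedded transition matrix, and the mean staying times are hypothetical inputs.

```python
import numpy as np

def stationary_of_chain(P):
    """Solve pi P = pi, sum(pi) = 1 for an irreducible chain (least squares)."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0  # normalization constraint
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def browsing_stationary(P_embedded, mean_stay):
    """Stationary distribution and mean re-visit times of the browsing process."""
    pi_tilde = stationary_of_chain(P_embedded)  # jump-chain stationary law
    m = np.asarray(mean_stay, dtype=float)      # mean staying time per page
    weighted = pi_tilde * m
    pi = weighted / weighted.sum()              # time-stationary distribution
    mean_revisit = weighted.sum() / pi_tilde    # mean time between entries to i
    return pi, mean_revisit

# Hypothetical two-page example: users alternate between the pages and
# stay three times longer on page 1 than on page 0.
pi, mean_revisit = browsing_stationary([[0, 1], [1, 0]], [1.0, 3.0])
```

In this toy example the longer staying time on page 1 gives it the larger stationary probability, even though both pages are visited equally often.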
BrowseRank: Letting Web Users Vote
for Page Importance
Yuting Liu,
Bin Gao, Tie-Yan Liu, Ying Zhang,
Zhiming Ma, Shuyuan He, and Hang Li
July 23, 2008, Singapore
the 31st Annual International ACM SIGIR
Conference on Research & Development on
Information Retrieval.
Best student paper!
• BrowseRank: the next
PageRank,
says Microsoft
• Browsing Processes will be a
Basic Mathematical Tool in
Internet Information Retrieval
Beyond:
--General framework of Browsing
Processes?
--How about inhomogeneous processes?
--Marked point process
--Mobile Web: not really Markovian
ExtBrowseRank and
semi-Markov processes
MobileRank and
Mirror Semi-Markov Processes
Web Markov Skeleton Process
[10] B. Gao, T. Liu, Z. M. Ma, T. Wang, and H. Li:
A General Markov Framework for Page
Importance Computation, in Proceedings of
CIKM 2009
[11] B. Gao, T. Liu, Y. Liu, T. Wang, Z. M. Ma, and H. Li:
Page Importance Computation Based on
Markov Processes, Information Retrieval,
online first:
http://www.springerlink.com/content/7mr7526x21671131
Research on Random Complex
Networks and Information Retrieval:
In recent years we have been involved in the research
direction of Random Complex Networks
and Information Retrieval.
Below are some of the related outputs by our group
(in collaboration with Microsoft Research Asia)
More properties of time homogeneity
right continuous, piecewise
constant functions
Theorem [LMZ 2011a]
for all n
Theorem [LMZ 2011b] General case
The statistical properties of a time-homogeneous
mirror semi-Markov
process are completely determined by:
Reconstruction of Mirror
Semi-Markov Processes
Given:
,
,
Theorem [LMZ 2011b]
We can construct
such that
uniformly
Staying times and first entry times
Staying time on the state j:
Distribution
Expectation
First entry time into the state k:
where
into k
Distribution
Expectation
Contribution probability
from state i to state j:
Renewal Theory
Proposition
Renewal Equation [LMZ2011a]
Renewal functional:
where
Below are the results on the renewal functional
[LMZ2011a]
Thank you!
Time Homogeneous WMSP
right continuous, piecewise
constant functions
More properties of time homogeneity
Theorem [LMZ 2011b]
for all
Reconstruction of WMSP
[LMZ2011b]
Write
is expressed as
Ranking Websites,
a Probabilistic View
Internet Mathematics, Volume 3 (2007), Issue 3
Ying Bao, Gang Feng, Tie-Yan Liu, Zhi-Ming Ma,
and Ying Wang
AggregateRank:
Bring Order to Web Sites
29th Annual International Conference on Research &
Development on Information Retrieval (SIGIR’06).
G. Feng, T. Y. Liu, Y. Wang, Y. Bao, Z. M. Ma, et al.