An Analysis of the PLSA Modeling Approach
Zhang Xiaohong (张小洪)

Contents
- What is modeling
- The LSA approach
- PLSA image modeling
- Conditions and assumptions of PLSA modeling
- Applications and development of PLSA

What is modeling
Modeling in software development:
- Business model
- Requirements model
- Design model
- Implementation model
- Database model
Lexical analysis → extract objects → characterize objects (attributes or methods) → object relationships.
A model reflects the relationships between things or objects.

What is a model: examples
[Figure: images mapped to category labels]
- Mapping: building, car, telephone, portrait, bicycle, book, tree
- Mapping: person, hand, horse, turtle, elephant, dog, crocodile

What is modeling
[Diagram: input x, target function, output y; blocks G, S and LM]
Machine learning: a mapping or function from x to y.

What is modeling
- Mathematical modeling: model, function, functional. Modeling is the process of finding a function that satisfies the objective and the constraints.
- Modeling from empirical data: the machine learning problem.
  - The learning problem is the problem of selecting the desired dependency on the basis of empirical data.
  - The learning process is the process of choosing an appropriate function from a given set of functions.
- Pattern recognition: the function value Y ranges over an index set.

What is modeling: pattern recognition
Collect data → select features → select a model → train the classifier → evaluate the classifier.
Selecting a model fixes the set of functions; training and evaluating the classifier is the process of selecting a function.

The LSA method
Problem: how to classify articles?
Technical memo titles:
c1: Human machine interface for ABC computer applications
c2: A survey of user opinion of computer system response time
c3: The EPS user interface management system
c4: System and human system engineering testing of EPS
c5: Relation of user perceived response time to error measurement
m1: The generation of random, binary, ordered trees
m2: The intersection graph of paths in trees
m3: Graph minors IV: Widths of trees and well-quasi-ordering
m4: Graph minors: A survey

The LSA method
How to represent articles: the Vector Space Model
1. Build the vocabulary: human, interface, computer, user, system, response, time, EPS, survey, trees, graph, minors.
2. Count word frequencies (rows are terms, columns are titles):

            c1  c2  c3  c4  c5  m1  m2  m3  m4
human        1   0   0   1   0   0   0   0   0
interface    1   0   1   0   0   0   0   0   0
computer     1   1   0   0   0   0   0   0   0
user         0   1   1   0   1   0   0   0   0
system       0   1   1   2   0   0   0   0   0
response     0   1   0   0   1   0   0   0   0
time         0   1   0   0   1   0   0   0   0
EPS          0   0   1   1   0   0   0   0   0
survey       0   1   0   0   0   0   0   0   1
trees        0   0   0   0   0   1   1   1   0
graph        0   0   0   0   0   0   1   1   1
minors       0   0   0   0   0   0   0   1   1

Problem? In the raw counts, related words do not correlate:
r(human, user) = -.378
r(human, minors) = -.286

The LSA method: SVD
Singular Value Decomposition: $A = U S V^T$
Dimension reduction: $\tilde{A} \approx \tilde{U} \tilde{S} \tilde{V}^T$

{U} = (one row per term, in vocabulary order; the first two columns are kept when reducing to 2 dimensions)

   0.22  -0.11   0.29  -0.41  -0.11  -0.34   0.52  -0.06  -0.41
   0.20  -0.07   0.14  -0.55   0.28   0.50  -0.07  -0.01  -0.11
   0.24   0.04  -0.16  -0.59  -0.11  -0.25  -0.30   0.06   0.49
   0.40   0.06  -0.34   0.10   0.33   0.38   0.00   0.00   0.01
   0.64  -0.17   0.36   0.33  -0.16  -0.21  -0.17   0.03   0.27
   0.27   0.11  -0.43   0.07   0.08  -0.17   0.28  -0.02  -0.05
   0.27   0.11  -0.43   0.07   0.08  -0.17   0.28  -0.02  -0.05
   0.30  -0.14   0.33   0.19   0.11   0.27   0.03  -0.02  -0.17
   0.21   0.27  -0.18  -0.03  -0.54   0.08  -0.47  -0.04  -0.58
   0.01   0.49   0.23   0.03   0.59  -0.39  -0.29   0.25  -0.23
   0.04   0.62   0.22   0.00  -0.07   0.11   0.16  -0.68   0.23
   0.03   0.45   0.14  -0.01  -0.30   0.28   0.34   0.68   0.18

{S} = (diagonal entries; the first two are kept when reducing to 2 dimensions)

   3.34   2.54   2.35   1.64   1.50   1.31   0.85   0.56   0.36

{V} = (displayed with one column per title c1 ... m4, that is, as $V^T$; the first two rows are kept when reducing to 2 dimensions)

   0.20   0.61   0.46   0.54   0.28   0.00   0.01   0.02   0.08
  -0.06   0.17  -0.13  -0.23   0.11   0.19   0.44   0.62   0.53
   0.11  -0.50   0.21   0.57  -0.51   0.10   0.19   0.25   0.08
  -0.95  -0.03   0.04   0.27   0.15   0.02   0.02   0.01  -0.03
   0.05  -0.21   0.38  -0.21   0.33   0.39   0.35   0.15  -0.60
  -0.08  -0.26   0.72  -0.37   0.03  -0.30  -0.21   0.00   0.36
   0.18  -0.43  -0.24   0.26   0.67  -0.34  -0.15   0.25   0.04
  -0.01   0.05   0.01  -0.02  -0.06   0.45  -0.76   0.45  -0.07
  -0.06   0.24   0.02  -0.08  -0.26  -0.62   0.02   0.52  -0.45

The LSA method: SVD and the synonym problem
Matrix reconstructed from the first two dimensions:

              c1     c2     c3     c4     c5     m1     m2     m3     m4
human       0.16   0.40   0.38   0.47   0.18  -0.05  -0.12  -0.16  -0.09
interface   0.14   0.37   0.33   0.40   0.16  -0.03  -0.07  -0.10  -0.04
computer    0.15   0.51   0.36   0.41   0.24   0.02   0.06   0.09   0.12
user        0.26   0.84   0.61   0.70   0.39   0.03   0.08   0.12   0.19
system      0.45   1.23   1.05   1.27   0.56  -0.07  -0.15  -0.21  -0.05
response    0.16   0.58   0.38   0.42   0.28   0.06   0.13   0.19   0.22
time        0.16   0.58   0.38   0.42   0.28   0.06   0.13   0.19   0.22
EPS         0.22   0.55   0.51   0.63   0.24  -0.07  -0.14  -0.20  -0.11
survey      0.10   0.53   0.23   0.21   0.27   0.14   0.31   0.44   0.42
trees      -0.06   0.23  -0.14  -0.27   0.14   0.24   0.55   0.77   0.66
graph      -0.06   0.34  -0.15  -0.30   0.20   0.31   0.69   0.98   0.85
minors     -0.04   0.25  -0.10  -0.21   0.15   0.22   0.50   0.71   0.62

r(human, user) = .94
r(human, minors) = -.83

The LSA method: SVD
LSA titles example: correlations between titles in raw data

        c1     c2     c3     c4     c5     m1     m2     m3
c2   -0.19
c3    0.00   0.00
c4    0.00   0.00   0.47
c5   -0.33   0.58   0.00  -0.31
m1   -0.17  -0.30  -0.21  -0.16  -0.17
m2   -0.26  -0.45  -0.32  -0.24  -0.26   0.67
m3   -0.33  -0.58  -0.41  -0.31  -0.33   0.52   0.77
m4   -0.33  -0.19  -0.41  -0.31  -0.33  -0.17   0.26   0.56

Correlations in the space of the first two dimensions

        c1     c2     c3     c4     c5     m1     m2     m3
c2    0.91
c3    1.00   0.91
c4    1.00   0.88   1.00
c5    0.85   0.99   0.85   0.81
m1   -0.85  -0.56  -0.85  -0.88  -0.45
m2   -0.85  -0.56  -0.85  -0.88  -0.44   1.00
m3   -0.85  -0.56  -0.85  -0.88  -0.44   1.00   1.00
m4   -0.81  -0.50  -0.81  -0.84  -0.37   1.00   1.00   1.00

The LSA method: discussion
- Why does the SVD approach work? What does it assume?
- LSA does not define a properly normalized probability distribution.
- There is no obvious interpretation of the directions in the latent space.
- Statistically, the use of the L2 norm in LSA corresponds to a Gaussian error assumption, which is hard to justify for count variables.
- The polysemy problem.
- How can the SVD results be visualized?

PLSA: the problem
[Figure: example images by category] building, car, telephone, portrait, bicycle, book, tree
Questions:
- How is an image represented as feature vectors?
- How are feature vectors turned into "visual words"?
- How is the training image set represented as a co-occurrence (word-frequency) matrix?
- Which model should be chosen?
[Figure: histogram of codeword frequencies; axes: codewords, frequency]
[Figure: an object and its bag of 'words']
Pipeline:
1. Feature detection & representation
2. Codewords dictionary
3. Image representation
- Learning: category models (and/or) classifiers
- Recognition: category decision

PLSA: Feature detection and representation
- Detect patches [Mikolajczyk and Schmid '02] [Matas, Chum, Urban & Pajdla, '02] [Sivic & Zisserman, '03]
- Normalize patch
- Compute SIFT descriptor [Lowe '99]
Slide credit: Josef Sivic

PLSA: Codewords dictionary formation
- Vector quantization

PLSA: Image representation
[Figure: histogram of codeword frequencies; axes: codewords, frequency]
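The representation steps above (vector quantization into a codeword dictionary, then a histogram of codeword frequencies per image) can be sketched as follows. This is a minimal sketch in Python with NumPy, not the method used on the slides: it assumes the patch descriptors are already extracted and uses random stand-in data instead of real SIFT descriptors. The function names `build_codebook`, `assign_codewords` and `bow_histogram`, the plain k-means loop, and the dictionary size of 5 are all illustrative assumptions.

```python
import numpy as np

def assign_codewords(descriptors, codebook):
    """Index of the nearest codeword (Euclidean distance) for each descriptor."""
    # Pairwise squared distances, shape (n_descriptors, n_codewords).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_codebook(descriptors, k, n_iter=20, seed=0):
    """Vector-quantize descriptors with plain k-means; returns (k, dim) codewords."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords with k distinct descriptors chosen at random.
    start = rng.choice(len(descriptors), size=k, replace=False)
    centers = descriptors[start].copy()
    for _ in range(n_iter):
        labels = assign_codewords(descriptors, centers)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, codebook):
    """Codeword frequency histogram for one image."""
    labels = assign_codewords(descriptors, codebook)
    return np.bincount(labels, minlength=len(codebook))

# Hypothetical stand-in data: 3 "images", each with 50 descriptors of dimension 8.
rng = np.random.default_rng(42)
images = [rng.normal(size=(50, 8)) for _ in range(3)]

codebook = build_codebook(np.vstack(images), k=5)
# Word-frequency (co-occurrence) matrix: one row per codeword, one column per image.
counts = np.stack([bow_histogram(img, codebook) for img in images], axis=1)
print(counts.shape)        # (5, 3)
print(counts.sum(axis=0))  # [50 50 50]: every descriptor maps to exactly one codeword
```

The resulting `counts` matrix plays the same role for images that the term-by-title count matrix plays in the LSA example: each column is one image described only by how often each codeword occurs in it.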
Representation
1. Feature detection & representation
2. Codewords dictionary
3. Image representation

Learning and Recognition
[Diagram: codewords dictionary → category models (and/or) classifiers → category decision]

PLSA: Learning and Recognition
Category models (and/or) classifiers:
1. Generative method: graphical models
2. Discriminative method: SVM

Generative models
1. Naïve Bayes classifier
   - Csurka, Bray, Dance & Fan, 2004
2. Hierarchical Bayesian text models (pLSA and LDA)
   - Background: Hofmann 2001; Blei, Ng & Jordan, 2004
   - Object categorization: Sivic et al. 2005; Sudderth et al. 2005
   - Natural scene categorization: Fei-Fei et al. 2005

First, some notation
- $w_n$: each patch in an image, $w_n = [0, 0, \ldots, 1, \ldots, 0, 0]^T$
- $w$: the collection of all $N$ patches in an image, $w = [w_1, w_2, \ldots, w_N]$
- $d_j$: the $j$th image in an image collection
- $c$: category of the image
- $z$: theme or topic of the patch

Case #1: the Naïve Bayes model
[Graphical model: c → w, with a plate of size N]
Object class decision:
$$c^* = \arg\max_c \, p(c \mid w) \propto p(c)\, p(w \mid c) = p(c) \prod_{n=1}^{N} p(w_n \mid c)$$
- $p(c)$: prior probability of the object classes
- $p(w \mid c)$: image likelihood given the class
Csurka et al. 2004

Case #2: Hierarchical Bayesian text models
Probabilistic Latent Semantic Analysis (pLSA)
[Graphical model: d → z → w, with plates of size N and D; example topic: "face"]
Sivic et al. ICCV 2005

The pLSA model
$$p(w_i \mid d_j) = \sum_{k=1}^{K} p(w_i \mid z_k)\, p(z_k \mid d_j)$$
- $p(w_i \mid d_j)$: observed codeword distributions
- $p(w_i \mid z_k)$: codeword distributions per theme (topic)
- $p(z_k \mid d_j)$: theme distributions per image
Slide credit: Josef Sivic

Recognition using pLSA
$$z^* = \arg\max_z \, p(z \mid d)$$
Slide credit: Josef Sivic

Learning the pLSA parameters
- Data: the observed counts of word $i$ in document $j$
- Maximize the likelihood of the data using EM
- $M$: number of codewords
- $N$: number of images
Slide credit: Josef Sivic

PLSA: discussion
What are the characteristics of the data, and under which conditions and assumptions does PLSA apply?
- pLSA is not a well-defined generative model of documents: d is a dummy index into the list of documents in the training set (it takes as many values as there are documents).
- There is no natural way to assign a probability to a previously unseen document.
- The number of parameters to be estimated grows with the size of the training set.

Applications and development of PLSA
[Diagram: PLSA at the centre, surrounded by its applications]
- Image classification
- Text classification
- Face recognition
- Face detection
- Shape classification
- Video behavior analysis
- Log classification
- ……
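The EM learning step and the recognition rule from the pLSA slides can be sketched as follows. This is a minimal sketch in Python with NumPy under stated assumptions, not the implementation behind the slides. It assumes the usual pLSA log-likelihood $\sum_{i=1}^{M} \sum_{j=1}^{N} n(w_i, d_j) \log p(w_i \mid d_j)$, where $n(w_i, d_j)$ is the observed count of codeword $i$ in image $j$. The function name `plsa_em`, the random initialisation, the iteration count and the toy count matrix are all made up for illustration.

```python
import numpy as np

def plsa_em(counts, n_topics, n_iter=100, seed=0):
    """Fit pLSA by EM.

    counts: (M, N) array with counts[i, j] = n(w_i, d_j).
    Returns p(w|z) of shape (M, K), p(z|d) of shape (K, N),
    and the log-likelihood recorded at the start of each iteration.
    """
    rng = np.random.default_rng(seed)
    M, N = counts.shape
    # Random initialisation, normalised so that every column is a distribution.
    p_w_z = rng.random((M, n_topics))
    p_w_z /= p_w_z.sum(axis=0, keepdims=True)
    p_z_d = rng.random((n_topics, N))
    p_z_d /= p_z_d.sum(axis=0, keepdims=True)

    history = []
    for _ in range(n_iter):
        # Log-likelihood of the current parameters: sum n(w,d) log p(w|d).
        p_w_d = p_w_z @ p_z_d
        history.append(float((counts * np.log(np.maximum(p_w_d, 1e-12))).sum()))
        # E-step: p(z_k | w_i, d_j) is proportional to p(w_i|z_k) p(z_k|d_j).
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]           # shape (M, K, N)
        post = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-12)
        # M-step: re-estimate both distributions from the expected counts.
        expected = counts[:, None, :] * post                    # n(w,d) p(z|w,d)
        p_w_z = expected.sum(axis=2)                            # shape (M, K)
        p_w_z /= np.maximum(p_w_z.sum(axis=0, keepdims=True), 1e-12)
        p_z_d = expected.sum(axis=0)                            # shape (K, N)
        p_z_d /= np.maximum(p_z_d.sum(axis=0, keepdims=True), 1e-12)
    return p_w_z, p_z_d, history

# Made-up toy counts: 6 codewords x 4 images, forming two visible groups.
counts = np.array([
    [5, 4, 0, 0],
    [3, 6, 0, 1],
    [4, 5, 1, 0],
    [0, 0, 6, 4],
    [1, 0, 5, 5],
    [0, 1, 4, 6],
], dtype=float)

p_w_z, p_z_d, history = plsa_em(counts, n_topics=2)
# Recognition: the most probable topic for each image, z* = argmax_z p(z | d).
topics = p_z_d.argmax(axis=0)
print(topics)
print(history[0], history[-1])
```

The topic indices printed depend on the random initialisation, so only the grouping of images is meaningful, not the label values. Note that `p_z_d` holds one column per training image, which is exactly why the number of parameters grows with the training set, as the discussion slide points out.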