Understanding SMT without the “S” (Statistics)
Robert Frederking
Carnegie Mellon School of Computer Science

Statistical modelling
• Think about statistical modelling as fitting a curve to data points
  – Start with parameterized function, error metric, and data points
  – After fitting the function to data using parameters, you can make predictions

[Figure: data points with fitted line y = a*x + b; Err = sqrt(sum(di^2)); axes X and Y]

[Figure: fitted line y = a*x + b; Err = sqrt(sum(di^2)); prediction labelled “Y??” on the Y axis for a given X]

[Figure: fitted curve (Y-y0)^2/a + (X-x0)^2/b = r^2; Err = sqrt(sum(di^2)); labels Y1 and Y2 on the Y axis for a given X]

Statistical modelling
• Think about statistical modelling as fitting a curve to data points
  – Parameterized function, error metric, data points
  – After fitting parameters, you can make predictions
  – But you will get some fit for any data set
• Human researchers need to come up with “good” family of functions, and error metric, for the data you see
  – Want low error number, good predictions
  – Tractable, both in training and decoding
    • including data availability, sparseness issues
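
The curve-fitting view above can be made concrete with a minimal sketch. It fits the slides' line y = a*x + b by ordinary least squares, reports Err = sqrt(sum(di^2)), and then makes a prediction. Several things here are assumptions rather than part of the slides: di is taken to be the vertical distance from data point i to the line, the helper names (fit_line, err, predict) are hypothetical, and both data sets are invented for illustration. The second data set is deliberately not linear, to show that you still get some fit for any data set.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns the parameters (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def err(xs, ys, a, b):
    """Err = sqrt(sum(di^2)); di is assumed to be the vertical residual."""
    return math.sqrt(sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)))


def predict(x, a, b):
    """Use the fitted parameters to predict y for a new x."""
    return a * x + b


# Invented, roughly linear data: the line is a "good" family of functions here.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
linear_ys = [1.1, 2.9, 5.2, 7.1, 8.8]
a, b = fit_line(xs, linear_ys)
linear_err = err(xs, linear_ys, a, b)

# Invented data following y = (x - 2)^2: fitting still returns parameters,
# but the line is the wrong family of functions for this data.
curved_ys = [4.0, 1.0, 0.0, 1.0, 4.0]
a2, b2 = fit_line(xs, curved_ys)
curved_err = err(xs, curved_ys, a2, b2)

print(f"linear data: a={a:.2f} b={b:.2f} Err={linear_err:.2f}")
print(f"curved data: a={a2:.2f} b={b2:.2f} Err={curved_err:.2f}")
print(f"prediction at x=5 from the linear fit: {predict(5.0, a, b):.2f}")
```

On the curved data the fitted line comes out flat (slope 0) and the error is far larger than on the linear data, yet the procedure still returns parameters and will still produce predictions. That is the reason the choice of function family and error metric is left to the researcher.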