2/26/16

CS6140: Machine Learning
Spring 2016
Instructor: Lu Wang
College of Computer and Information Science
Northeastern University
Webpage: www.ccs.neu.edu/home/luwang
Email: luwang@ccs.neu.edu

Time and Location
• Time: Thursdays from 6:00pm–9:00pm
• Location: Behrakis Health Sciences Center 325

Course Webpage
• http://www.ccs.neu.edu/home/luwang/courses/cs6140_sp2016.html

Prerequisites
• Programming
  – Being able to write code proficiently in some programming language (e.g. Python, Java, C/C++)
• Courses
  – Algorithms
  – Probability and statistics
  – Linear algebra
• Some knowledge in machine learning
• Background Check

Textbook and References
• Main Textbook
  – Kevin Murphy, "Machine Learning - a Probabilistic Perspective", MIT Press, 2012.
• Other textbooks
  – Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
  – Tom Mitchell, "Machine Learning", McGraw Hill, 1997.
• Machine learning lectures

Content of the Course
• Regression: linear regression, logistic regression
• Dimensionality Reduction: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis
• Probabilistic Models: Naive Bayes, maximum likelihood estimation, Bayesian inference
• Statistical Learning Theory: VC dimension
• Kernels: Support Vector Machines (SVMs), kernel tricks, duality
• Sequential Models and Structural Models: Hidden Markov Model (HMM), Conditional Random Fields (CRFs)
• Clustering: spectral clustering, hierarchical clustering
• Latent Variable Models: K-means, mixture models, expectation-maximization (EM) algorithms, Latent Dirichlet Allocation (LDA), representation learning
• Deep Learning: feedforward neural network, restricted Boltzmann machine, autoencoders, recurrent neural network, convolutional neural network
• and others, including advanced topics for machine learning in natural language processing and text analysis

The Goal
• Not only what, but also why!

Grading
• Assignments
  – 3 assignments, 10% for each
• Exam
  – 1 exam, 30%
  – March 31, 2016
• Project
  – 1 project, 35%
• Participation
  – 5%
  – Classes
  – Piazza

Exam
• Open book
• March 31, 2016

Course Project
• Option 1: A machine learning relevant research project
• Option 2: Yelp Challenge
• 2-3 students as a team

Research Project
• Machine learning relevant
  – Natural language processing
  – Computer vision
  – Robotics
  – Bioinformatics
  – Health informatics
  – …
• Novelty

Yelp Challenge
[figure slides]

Course Project Grading
• We want to see novel and interesting projects!
  – The problem needs to be well-defined, novel, useful, and practical
  – Machine learning techniques
  – Reasonable results and observations
• Three reports
  – Proposal (5%)
  – Progress, with code (10%)
  – Final, with code (10%)
• One presentation
  – In class (10%)

Submission and Late Policy
• Each assignment or report, both electronic copy and hard copy, is due at the beginning of class on the corresponding due date.
• Electronic version
  – On blackboard
• Hard copy
  – In class
• An assignment or report turned in late will be charged 10 points (out of 100 points) for each late day (i.e. 24 hours).
• Each student has a budget of 5 days throughout the semester before a late penalty is applied.

How to find us?
• Course webpage:
  – http://www.ccs.neu.edu/home/luwang/courses/cs6140_sp2016.html
• Office hours
  – Lu Wang: Thursdays from 4:30pm to 5:30pm, or by appointment, 448 WVH
  – Gabriel Bakiewicz (TA): Mondays and Tuesdays from 4:00pm to 5:00pm, 362 WVH

What is Machine Learning?
• “A set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty.”

Piazza
• http://piazza.com/northeastern/spring2016/cs6140
• All course relevant questions go here

Real World Applications
[figure slides]

Relations with Other Areas
• Natural Language Processing
• Computer Vision
• Robotics
• A lot of other areas…

Today’s Outline
• Basic concepts in machine learning
• K-nearest neighbors
• Linear regression
• Ridge regression

Supervised vs. Unsupervised Learning
• Supervised learning
  – Training set: training samples, each with a gold-standard label
  – Classification, if the label is categorical
  – Regression, if the label is numerical

Supervised Learning
• Goal:
  – Generalizable to new input samples
  – Overfitting vs. underfitting
  – We use probabilistic models
• Typical setup:
  – Training set, test set, development set
  – Features
  – Evaluation
• Regression
  – Predicting stock price
  – Predicting temperature
  – Predicting revenue
  – …

Supervised vs. Unsupervised Learning
• Unsupervised learning
• More about “knowledge discovery”

Unsupervised Learning
• Dimension reduction
  – Principal component analysis
• Clustering (e.g. graph mining)
  – [Figure from: RolX: Role Extraction and Mining in Large Networks, by Henderson et al., 2011]
• Topic modeling

Parametric vs. Non-parametric model
• Fixed number of parameters?
  – If yes, parametric model
• Number of parameters grows with the amount of training data?
  – If yes, non-parametric model

A non-parametric classifier: K-nearest neighbors (KNN)
• Basic idea: memorize all the training samples
  – The more training data you have, the more the model has to remember
• Nearest neighbor:
  – Testing phase: find the closest sample, and return the corresponding label
• K-nearest neighbor:
  – Testing phase: find the K nearest neighbors, and return the majority vote of their labels
• Computational tractability

About K
• K=1: just piecewise constant labeling
• K=N: global majority vote (class)

Problems of kNN
• Can be slow when training data is big
  – Searching for the neighbors takes time
• Needs lots of memory to store training data
• Needs to tune K and the distance function
• Distance function
  – Euclidean distance
  – Mahalanobis distance: weights on components
• Not a probability distribution

Probabilistic kNN
• We prefer a probabilistic output because sometimes we may get an “uncertain” result
  – 99 samples as “yes”, 101 samples as “no” → ?
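
The K-nearest-neighbor vote described above can be sketched as follows. This is a minimal illustration, not the course's reference implementation: it assumes NumPy, Euclidean distance, and a hypothetical helper named `knn_predict`; the toy data is made up. Returning the vote shares next to the majority label makes a near-tie (such as 99 vs. 101) visible instead of hiding it behind a single label.

```python
import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Return (majority_label, vote_share) for a single query point.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    # Distance from the query to every memorized training sample
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k closest training samples
    nearest = np.argsort(dists)[:k]
    # Count how often each label occurs among those neighbors
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    vote_share = {int(label): count / k for label, count in zip(labels, counts)}
    majority_label = int(labels[np.argmax(counts)])
    return majority_label, vote_share


# Made-up toy data: two well-separated groups of three points each
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

label, share = knn_predict(X_train, y_train, np.array([0.5, 0.5]), k=3)
print(label, share)
```

With `k=5` on this toy data, the same query collects three votes for class 0 and two for class 1, so the majority label stays 0 but its vote share drops to 0.6. Note that the function stores and scans the whole training set on every query, which is the memory and speed cost listed under "Problems of kNN".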
• Probabilistic kNN: the probability of class c is the fraction of the K nearest neighbors labeled c
[Figure: 3-class synthetic training data]

Smoothing
• Class 1: 3, class 2: 0, class 3: 1
• Original probability:
  – P(y=1) = 3/4, P(y=2) = 0/4, P(y=3) = 1/4
• Add-1 smoothing:
  – Class 1: 3+1, class 2: 0+1, class 3: 1+1
  – P(y=1) = 4/7, P(y=2) = 1/7, P(y=3) = 2/7

Softmax
• Class 1: 3, class 2: 0, class 3: 1
• Original probability:
  – P(y=1) = 3/4, P(y=2) = 0/4, P(y=3) = 1/4
• Redistribute probability mass into different classes
  – Define a softmax over per-class scores $a_1, \dots, a_C$ as $\mathrm{softmax}(a)_c = \exp(a_c) / \sum_{c'=1}^{C} \exp(a_{c'})$

A parametric model: linear regression
• Assumption: the response is a linear function of the inputs: $y(x) = w^T x + \epsilon$
  – $w^T x$: inner product between the input sample $x$ and the weight vector $w$
  – $\epsilon$: residual error, the difference between the prediction and the true label
• Assume the residual error has a normal distribution
• We can further assume $p(y \mid x, \theta) = \mathcal{N}(y \mid w^T x, \sigma^2)$, i.e. the mean is linear in $x$ and the noise variance $\sigma^2$ is fixed
• Basis function expansion: replace $x$ with a feature map $\phi(x)$, giving $p(y \mid x, \theta) = \mathcal{N}(y \mid w^T \phi(x), \sigma^2)$

Learning with Maximum Likelihood Estimation (MLE)
• Maximum Likelihood Estimation (MLE): $\hat{\theta} = \arg\max_{\theta} \log p(D \mid \theta)$
• Log-likelihood: $\ell(\theta) = \log p(D \mid \theta) = \sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)$
• Maximizing the log-likelihood is equivalent to minimizing the negative log-likelihood (NLL): $\mathrm{NLL}(\theta) = -\sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)$
• With our normal distribution assumption: $\ell(\theta) = -\frac{1}{2\sigma^2} \mathrm{RSS}(w) - \frac{N}{2} \log(2\pi\sigma^2)$
  – $\mathrm{RSS}(w) = \sum_{i=1}^{N} (y_i - w^T x_i)^2$ is the residual sum of squares (RSS) → We want to minimize it!
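
The link between the Gaussian negative log-likelihood and the residual sum of squares can be checked numerically. The sketch below is an illustration under stated assumptions: it uses NumPy, synthetic data generated for this example, a fixed noise variance `sigma2`, and hypothetical function names `rss` and `gaussian_nll`. It shows that two candidate weight vectors are ranked the same way by NLL and by RSS, because the two differ only by a positive factor and an additive constant that do not depend on the weights.

```python
import numpy as np


def rss(w, X, y):
    """Residual sum of squares: sum_i (y_i - w^T x_i)^2."""
    residual = y - X @ w
    return float(residual @ residual)


def gaussian_nll(w, X, y, sigma2=1.0):
    """Negative log-likelihood assuming y_i ~ N(w^T x_i, sigma2)."""
    n = len(y)
    return rss(w, X, y) / (2.0 * sigma2) + 0.5 * n * np.log(2.0 * np.pi * sigma2)


# Synthetic data (an assumption for illustration): bias column plus one feature
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
w_true = np.array([1.0, 2.0])
y = X @ w_true + 0.1 * rng.normal(size=50)

w_good = np.array([1.0, 2.0])   # the weights used to generate the data
w_bad = np.array([0.0, 0.0])    # a deliberately poor guess

print(rss(w_good, X, y), rss(w_bad, X, y))
print(gaussian_nll(w_good, X, y), gaussian_nll(w_bad, X, y))
```

The gap in NLL between the two candidates equals the gap in RSS divided by `2 * sigma2`, so whichever weights minimize RSS also minimize the NLL.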

Derivation of MLE for Linear Regression
• Rewrite our objective function as $\mathrm{NLL}(w) = \frac{1}{2}(y - Xw)^T (y - Xw) = \frac{1}{2} w^T (X^T X) w - w^T (X^T y) + \frac{1}{2} y^T y$, where $X$ is the matrix whose rows are the inputs $x_i^T$, $y$ is the vector of labels, and factors and terms that do not depend on $w$ have been dropped
• Get the derivative (or gradient): $g(w) = X^T X w - X^T y$
• Set our derivative to 0: $X^T X w = X^T y$
  – Ordinary least squares solution: $\hat{w}_{OLS} = (X^T X)^{-1} X^T y$

Geometric Interpretation
• Remember we have $\hat{w} = (X^T X)^{-1} X^T y$
• Therefore the projected value of $y$ is $\hat{y} = X \hat{w} = X (X^T X)^{-1} X^T y$
  → This corresponds to an orthogonal projection of $y$ onto the column space of $X$

Overfitting
[figure slides]

A Prior on the Weight
• Zero-mean Gaussian prior: $p(w) = \prod_j \mathcal{N}(w_j \mid 0, \tau^2)$
• New objective function: $\arg\max_w \sum_{i=1}^{N} \log \mathcal{N}(y_i \mid w^T x_i, \sigma^2) + \sum_j \log \mathcal{N}(w_j \mid 0, \tau^2)$

Ridge Regression
• We want to minimize $J(w) = \sum_{i=1}^{N} (y_i - w^T x_i)^2 + \lambda \|w\|_2^2$, with $\lambda = \sigma^2 / \tau^2$
  – The penalty $\lambda \|w\|_2^2$ is the L2 regularization
• New estimation for the weight: $\hat{w}_{ridge} = (\lambda I + X^T X)^{-1} X^T y$

What we learned
• Basic concepts in machine learning
• K-nearest neighbors – non-parametric
• Linear regression – parametric
• Ridge regression – parametric

Homework
• Reading Murphy ch1, ch2, and ch7
• Sign up at Piazza
  – http://piazza.com/northeastern/spring2016/cs6140
• Start thinking about course project and find a team!
  – Project proposal due Jan 28th
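
As a companion to the ordinary least squares and ridge regression slides in this lecture, the two closed-form estimates can be sketched as below. This is a minimal illustration under assumptions, not the course's reference code: it uses NumPy, synthetic data made up for the example, hypothetical function names `ols_fit` and `ridge_fit`, no separate offset term, and a penalty applied to every weight. It solves the linear systems with `np.linalg.solve` rather than forming an explicit matrix inverse.

```python
import numpy as np


def ols_fit(X, y):
    """Ordinary least squares: solve the normal equations X^T X w = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def ridge_fit(X, y, lam):
    """Ridge regression: solve (lam * I + X^T X) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)


# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=30)

w_ols = ols_fit(X, y)
w_ridge = ridge_fit(X, y, lam=10.0)

# Geometric check: the OLS residual is orthogonal to every column of X
residual = y - X @ w_ols
print(X.T @ residual)

# The L2 penalty shrinks the weights toward zero
print(np.linalg.norm(w_ols), np.linalg.norm(w_ridge))
```

Setting `lam=0.0` in `ridge_fit` recovers the ordinary least squares estimate, and increasing `lam` shrinks the weight vector further, which is the effect of the zero-mean Gaussian prior on the weights.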