CS 6140: Machine Learning

2/26/16

CS 6140: Machine Learning
Spring 2016

Instructor: Lu Wang
College of Computer and Information Science
Northeastern University
Webpage: www.ccs.neu.edu/home/luwang
Email: luwang@ccs.neu.edu

Time and Location
•  Time: Thursdays from 6:00pm – 9:00pm
•  Location: Behrakis Health Sciences Cntr 325

Course Webpage
•  http://www.ccs.neu.edu/home/luwang/courses/cs6140_sp2016.html

Prerequisites
•  Programming
   –  Being able to write code in some programming languages (e.g. Python, Java, C/C++) proficiently
•  Courses
   –  Algorithms
   –  Probability and statistics
   –  Linear algebra
•  Some knowledge in machine learning
•  Background Check

Textbook and References
•  Main Textbook
   –  Kevin Murphy, "Machine Learning: A Probabilistic Perspective", MIT Press, 2012.
•  Other textbooks
   –  Christopher M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
   –  Tom Mitchell, "Machine Learning", McGraw Hill, 1997.
•  Machine learning lectures

Content of the Course
•  Regression: linear regression, logistic regression
•  Dimensionality Reduction: Principal Component Analysis (PCA), Independent Component Analysis (ICA), Linear Discriminant Analysis
•  Probabilistic Models: Naive Bayes, maximum likelihood estimation, Bayesian inference
•  Statistical Learning Theory: VC dimension
•  Kernels: Support Vector Machines (SVMs), kernel tricks, duality
•  Sequential Models and Structural Models: Hidden Markov Model (HMM), Conditional Random Fields (CRFs)
•  Clustering: spectral clustering, hierarchical clustering
•  Latent Variable Models: K-means, mixture models, expectation-maximization (EM) algorithms, Latent Dirichlet Allocation (LDA), representation learning
•  Deep Learning: feedforward neural network, restricted Boltzmann machine, autoencoders, recurrent neural network, convolutional neural network
•  and others, including advanced topics for machine learning in natural language processing and text analysis

The Goal
•  Not only what, but also why!

Grading
•  Assignments
   –  3 assignments, 10% for each
•  Exam
   –  1 exam, 30%
   –  March 31, 2016
•  Project
   –  1 project, 35%
•  Participation
   –  5%
   –  Classes
   –  Piazza

Exam
•  Open book
•  March 31, 2016

Course Project
•  Option 1: A machine learning relevant research project
•  Option 2: Yelp Challenge
•  2-3 students as a team

Research Project
•  Machine learning relevant
   –  Natural language processing
   –  Computer vision
   –  Robotics
   –  Bioinformatics
   –  Health informatics
   –  …
•  Novelty

Yelp Challenge

Course Project Grading
•  We want to see novel and interesting projects!
   –  The problem needs to be well-defined, novel, useful, and practical
   –  Machine learning techniques
   –  Reasonable results and observations
•  Three reports
   –  Proposal (5%)
   –  Progress, with code (10%)
   –  Final, with code (10%)
•  One presentation
   –  In class (10%)

Submission and Late Policy
•  Each assignment or report, both electronic copy and hard copy, is due at the beginning of class on the corresponding due date.
•  Electronic version
   –  On Blackboard
•  Hard copy
   –  In class
•  An assignment or report turned in late loses 10 points (out of 100 points) for each late day (i.e. 24 hours).
•  Each student has a budget of 5 late days throughout the semester before a late penalty is applied.

How to find us?
•  Course webpage:
   –  http://www.ccs.neu.edu/home/luwang/courses/cs6140_sp2016.html
•  Office hours
   –  Lu Wang: Thursdays from 4:30pm to 5:30pm, or by appointment, 448 WVH
   –  Gabriel Bakiewicz (TA): Mondays and Tuesdays from 4:00pm to 5:00pm, 362 WVH
•  Piazza
   –  http://piazza.com/northeastern/spring2016/cs6140
   –  All course relevant questions go here

What is Machine Learning?
•  "A set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty." (Murphy, 2012)

Real World Applications

Relations with Other Areas
•  Natural Language Processing
•  Computer Vision
•  Robotics
•  A lot of other areas…

Today's Outline
•  Basic concepts in machine learning
•  K-nearest neighbors
•  Linear regression
•  Ridge regression

Supervised vs. Unsupervised Learning
•  Supervised learning
   –  Training set: training samples, each with a gold-standard label
   –  Classification, if the label is categorical
   –  Regression, if the label is numerical

Supervised Learning
•  Goal:
   –  Generalizable to new input samples
   –  Overfitting vs. underfitting
   –  We use probabilistic models
•  Typical setup:
   –  Training set, test set, development set
   –  Features
   –  Evaluation

Supervised Learning
•  Regression
   –  Predicting stock price
   –  Predicting temperature
   –  Predicting revenue
   –  …

Supervised vs. Unsupervised Learning
•  Unsupervised Learning
•  More about "knowledge discovery"

Unsupervised Learning
•  Dimension reduction
   –  Principal component analysis

Unsupervised Learning
•  Clustering (e.g. graph mining)
   [Figure from: RolX: Role Extraction and Mining in Large Networks, by Henderson et al, 2011]

Unsupervised Learning
•  Topic modeling

Parametric vs. Non-parametric model
•  Fixed number of parameters?
   –  If yes, parametric model
•  Number of parameters grows with the amount of training data?
   –  If yes, non-parametric model
•  Computational tractability

A non-parametric classifier: K-nearest neighbors (KNN)
•  Basic idea: memorize all the training samples
   –  The more training data you have, the more the model has to remember
•  Nearest neighbor:
   –  Testing phase: find the closest sample, and return the corresponding label
•  K-nearest neighbors:
   –  Testing phase: find the K nearest neighbors, and return the majority vote of their labels

About K
•  K=1: just piecewise constant labeling
•  K=N: global majority vote (class)

Problems of kNN
•  Can be slow when training data is big
   –  Searching for the neighbors takes time
•  Needs lots of memory to store training data
•  Needs to tune k and the distance function
•  The output is not a probability distribution
•  Distance function
   –  Euclidean distance
   –  Mahalanobis distance: weights on components

Probabilistic kNN
•  We prefer a probabilistic output because sometimes we may get an "uncertain" result
   –  99 samples as "yes", 101 samples as "no" → ?
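
The K-nearest-neighbor procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: the function name `knn_predict`, the toy data, and the choice of Euclidean distance are all assumptions made for the example. Besides the majority label it returns the fraction of the K votes each class received, which is one simple way to expose how "uncertain" a vote was.

```python
from collections import Counter
import math

def knn_predict(train_x, train_y, query, k):
    """Return (majority_label, vote_fractions) for `query` (hypothetical helper)."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    # "Memorize" the training set, then rank every stored sample by its
    # Euclidean distance to the query point.
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    # Count the labels of the K closest samples.
    votes = Counter(label for _, label in ranked[:k])
    majority_label = votes.most_common(1)[0][0]
    # Fraction of the K neighbors that voted for each class.
    fractions = {label: count / k for label, count in votes.items()}
    return majority_label, fractions

# Toy data, made up for illustration: two small clusters in 2-D.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["no", "no", "no", "yes", "yes", "yes"]

print(knn_predict(train_x, train_y, query=(0.5, 0.5), k=3))
print(knn_predict(train_x, train_y, query=(4.0, 4.0), k=5))
```

Every prediction scans and sorts the whole training set, which is the speed and memory cost listed under "Problems of kNN".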
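
Probabilistic kNN
•  Probabilistic kNN: $p(y = c \mid x, D, K) = \frac{1}{K} \sum_{i \in N_K(x, D)} \mathbb{I}(y_i = c)$, the fraction of the K nearest neighbors of x in the training set D whose label is c
   [Figure: 3-class synthetic training data]

Smoothing
•  Class 1: 3, class 2: 0, class 3: 1
•  Original probability:
   –  P(y=1) = 3/4, P(y=2) = 0/4, P(y=3) = 1/4
•  Add-1 smoothing:
   –  Class 1: 3+1, class 2: 0+1, class 3: 1+1
   –  P(y=1) = 4/7, P(y=2) = 1/7, P(y=3) = 2/7

Softmax
•  Class 1: 3, class 2: 0, class 3: 1
•  Original probability:
   –  P(y=1) = 3/4, P(y=2) = 0/4, P(y=3) = 1/4
•  Redistribute probability mass into different classes
   –  Define a softmax as $S(a)_c = \frac{\exp(a_c)}{\sum_{c'} \exp(a_{c'})}$, where $a_c$ is the score of class c

A parametric model: linear regression
•  Assumption: the response is a linear function of the inputs
   $y(x) = w^T x + \epsilon$
   –  $w^T x$: inner product between input sample x and weight vector w
   –  $\epsilon$: residual error, the difference between prediction and true label
•  Assume residual error has a normal distribution: $\epsilon \sim \mathcal{N}(0, \sigma^2)$
•  We can further assume the noise variance $\sigma^2$ is fixed, so that $p(y \mid x, \theta) = \mathcal{N}(y \mid w^T x, \sigma^2)$ with $\theta = (w, \sigma^2)$
•  Basis function expansion: replace x with a non-linear function $\phi(x)$ of the inputs, giving $p(y \mid x, \theta) = \mathcal{N}(y \mid w^T \phi(x), \sigma^2)$

Learning with Maximum Likelihood Estimation (MLE)
•  Maximum Likelihood Estimation (MLE): $\hat{\theta} = \arg\max_{\theta} \log p(D \mid \theta)$
•  Log-likelihood: $\ell(\theta) = \log p(D \mid \theta) = \sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)$
•  Maximizing the log-likelihood is equivalent to minimizing the negative log-likelihood (NLL): $\text{NLL}(\theta) = -\sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)$
•  With our normal distribution assumption:
   $\ell(\theta) = -\frac{1}{2\sigma^2} \text{RSS}(w) - \frac{N}{2} \log(2\pi\sigma^2)$, where $\text{RSS}(w) = \sum_{i=1}^{N} (y_i - w^T x_i)^2$
   –  RSS(w) is the residual sum of squares (RSS) → We want to minimize it!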
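
To make the link between the normal-distribution assumption and the residual sum of squares concrete, the following sketch evaluates the negative log-likelihood in two ways: by summing the Gaussian log-densities sample by sample, and through the standard identity NLL = RSS / (2σ²) + (N/2) log(2πσ²). The data, the value of `sigma2`, and all function names are made up for illustration.

```python
import math

def predict(w, x):
    """Inner product between the weight vector w and one input sample x."""
    return sum(wj * xj for wj, xj in zip(w, x))

def rss(w, X, y):
    """Residual sum of squares of a linear model with weights w."""
    return sum((t - predict(w, x)) ** 2 for x, t in zip(X, y))

def gaussian_nll(w, sigma2, X, y):
    """NLL computed directly: minus the sum of log N(y_i | w^T x_i, sigma2)."""
    total = 0.0
    for x, t in zip(X, y):
        log_density = (-0.5 * math.log(2 * math.pi * sigma2)
                       - (t - predict(w, x)) ** 2 / (2 * sigma2))
        total -= log_density
    return total

def nll_from_rss(w, sigma2, X, y):
    """The same quantity written as RSS / (2 sigma2) plus a constant in w."""
    n = len(y)
    return rss(w, X, y) / (2 * sigma2) + 0.5 * n * math.log(2 * math.pi * sigma2)

# Toy data, made up for illustration: four samples with two features each.
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [2.1, -0.9, 1.2, 2.8]
sigma2 = 0.5  # assumed fixed noise variance

for w in [(2.0, -1.0), (1.0, 0.0), (0.0, 0.0)]:
    print(w, rss(w, X, y), gaussian_nll(w, sigma2, X, y), nll_from_rss(w, sigma2, X, y))
```

Because the second term does not depend on w, ranking candidate weight vectors by NLL gives the same order as ranking them by RSS.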

Derivation of MLE for Linear Regression
•  Let X be the N × D matrix whose rows are the input samples and y the vector of the N labels. Up to a positive scale factor and an additive constant, rewrite our objective function as
   $\text{NLL}(w) = \frac{1}{2} (y - Xw)^T (y - Xw) = \frac{1}{2} w^T (X^T X) w - w^T (X^T y) + \frac{1}{2} y^T y$
•  Get the derivative (or gradient)
   $g(w) = X^T X w - X^T y$
•  Set our derivative to 0
   $X^T X w = X^T y$
   –  Ordinary least squares solution: $\hat{w}_{OLS} = (X^T X)^{-1} X^T y$

Geometric Interpretation
•  Remember we have $\hat{w} = (X^T X)^{-1} X^T y$
•  Therefore the projected value of y is $\hat{y} = X \hat{w} = X (X^T X)^{-1} X^T y$
   → This corresponds to an orthogonal projection of y onto the column space of X

Overfitting

A Prior on the Weight
•  Zero-mean Gaussian prior: $p(w) = \prod_{j} \mathcal{N}(w_j \mid 0, \tau^2)$
•  New objective function:
   $\arg\max_{w} \sum_{i=1}^{N} \log \mathcal{N}(y_i \mid w^T x_i, \sigma^2) + \sum_{j} \log \mathcal{N}(w_j \mid 0, \tau^2)$

Ridge Regression
•  We want to minimize
   $J(w) = \sum_{i=1}^{N} (y_i - w^T x_i)^2 + \lambda \|w\|_2^2$, where $\lambda = \sigma^2 / \tau^2$
   –  The penalty term $\lambda \|w\|_2^2$ is L2 regularization
•  New estimation for the weight
   $\hat{w}_{ridge} = (\lambda I_D + X^T X)^{-1} X^T y$

What we learned
•  Basic concepts in machine learning
•  K-nearest neighbors: non-parametric
•  Linear regression: parametric
•  Ridge regression: parametric

Homework
•  Reading Murphy ch1, ch2, and ch7
•  Sign up at Piazza
   –  http://piazza.com/northeastern/spring2016/cs6140
•  Start thinking about course project and find a team!
   –  Project proposal due Jan 28th
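
The ordinary least squares and ridge regression estimates from this lecture have the standard closed forms w = (XᵀX)⁻¹Xᵀy and w = (λI + XᵀX)⁻¹Xᵀy. The sketch below solves both for the special case of exactly two features and no intercept, so that the linear system is 2×2 and can be solved with Cramer's rule in plain Python. The helper name `normal_equation_2d`, the toy data, and the value λ = 1 are assumptions made for the example; for general problems one would normally hand the same system to a linear-algebra library.

```python
def normal_equation_2d(X, y, lam=0.0):
    """Solve (X^T X + lam * I) w = X^T y for samples with exactly two features.

    lam = 0 gives the ordinary least squares solution; lam > 0 gives ridge.
    """
    # Entries of the symmetric 2x2 matrix A = X^T X + lam * I.
    a11 = sum(x1 * x1 for x1, _ in X) + lam
    a12 = sum(x1 * x2 for x1, x2 in X)
    a22 = sum(x2 * x2 for _, x2 in X) + lam
    # Entries of the vector b = X^T y.
    b1 = sum(x1 * t for (x1, _), t in zip(X, y))
    b2 = sum(x2 * t for (_, x2), t in zip(X, y))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("X^T X + lam * I is singular; no unique solution")
    # Cramer's rule for the 2x2 system A w = b.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Toy data, made up for illustration: labels generated exactly by w = (2, -1).
X = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [2.0 * x1 - 1.0 * x2 for x1, x2 in X]

w_ols = normal_equation_2d(X, y)             # recovers the generating weights
w_ridge = normal_equation_2d(X, y, lam=1.0)  # shrunk toward zero by the L2 penalty

print("OLS:  ", w_ols)
print("Ridge:", w_ridge)
```

With noise-free labels the least squares fit recovers the generating weights, while the ridge estimate has a smaller norm; that shrinkage is the effect of the zero-mean Gaussian prior on the weights.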
