Introduction to Git and Discussion on assignment 1 Gang Luo


Sept. 14, 2010

Git

• Source code management

• Version control

• Enable team collaboration

– One central repository, multiple local copies

– Synchronize your local copy with the central one so that everybody sees the latest modifications

You should access the central repository from linux.cs.duke.edu, instead of hadoop21.cs.duke.edu

Before we start

• Install Git

– PuTTY + Git (for Windows)

– Eclipse + EGit (for Windows/Linux)

– linux.cs.duke.edu (Git already installed)

– apt-get install git-core (for Ubuntu Linux)

– yum install git-core (for Fedora/other Linux)

• Initialization

– Set your user name and email, and enable colored output (git config --global user.name, user.email, color.ui)

• Clone

– Make a local copy of the remote repository

– git clone ssh://USERNAME@linux.cs.duke.edu/usr/research/proj/git/cps216/USERNAME.git

Using Git

• Adding files

– git add . (don’t forget the dot, which means everything in the current directory)

• Commit changes

– git commit -m "message" -a

– “message” could be anything you want to appear in the log

• Synchronize with remote repository

– git push

• Push your modification to the central repository

– git pull

• Update your local copy from the central repository

Conventions for your submission

• Put your code in the appropriate directories

– e.g. cps216/assignment1/parta

• Provide a README file

– Briefly describe the organization of your code, the purpose of each class, and how to run your code

• Demo Time

Some issues for assignment 1

• Output key/value type setting

– By default, setOutputKeyClass() and setOutputValueClass() set the output key/value types for both map and reduce.

• What if your mapper output types are different from the reducer output types?

– Specify the map output types with setMapOutputKeyClass() and/or setMapOutputValueClass()

Some issues for assignment 1

• Input/output types for combiner

– Input types must be the same as the map output types. (Obviously)

– Output types must also be the same as the map output types. (Why?)

• The combiner is not guaranteed to run on every map output record. If the combiner had different output types, the reducer would receive two different types: combined records and uncombined map output.

Mapper: (K1, V1) → (K2, V2)

Combiner: (K2, V2) → (K2, V2)

Reducer: (K2, V2) → (K3, V3)
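
The type rule above can be illustrated with a small simulation. This is a plain-Java sketch, not Hadoop code: the class CombinerTypesDemo, its methods, and the word-count job are hypothetical names chosen for illustration. It shows that when the combiner emits the same (K2, V2) types it consumes, the reducer produces the same result whether or not the combiner ran on a given map output.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntPredicate;

// Plain-Java simulation (no Hadoop classes); all names here are hypothetical.
final class CombinerTypesDemo {

    // "Mapper": (K1, V1) = (line index, line) -> (K2, V2) = (word, 1).
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            out.add(new AbstractMap.SimpleEntry<String, Integer>(word, 1));
        }
        return out;
    }

    // "Reducer": (K2, V2) in, (K3, V3) = (word, total) out.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            totals.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return totals;
    }

    // "Combiner": (K2, V2) in and (K2, V2) out, so its output can be mixed
    // freely with uncombined map output on the way to the reducer.
    static List<Map.Entry<String, Integer>> combine(List<Map.Entry<String, Integer>> pairs) {
        return new ArrayList<>(reduce(pairs).entrySet());
    }

    // Runs the job; the predicate decides, per map output, whether the
    // combiner is applied. This mimics a combiner that may or may not run.
    static Map<String, Integer> run(List<String> lines, IntPredicate combineThisOutput) {
        List<Map.Entry<String, Integer>> reducerInput = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            List<Map.Entry<String, Integer>> mapOutput = map(lines.get(i));
            reducerInput.addAll(combineThisOutput.test(i) ? combine(mapOutput) : mapOutput);
        }
        return reduce(reducerInput);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a b a", "b a");
        System.out.println(run(lines, i -> false));      // combiner never runs: {a=3, b=2}
        System.out.println(run(lines, i -> i % 2 == 0)); // combiner runs on some outputs: {a=3, b=2}
    }
}
```

If combine() returned a different type than it received, the list passed to reduce() could not hold both kinds of records, which is the problem described above.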

Some issues for assignment 1

• Splitting a string on the separator “|”

– String.split() takes a regular expression, and “|” is a regex metacharacter. If split("|") doesn't work, use split("\\|")

• Need to ship more than one value in one value object?

– Implement your own Writable type, or

– Use Text. “23#16#87” contains three values in one string!

• configure(JobConf conf)

– Put your initialization in this method

– Good place to retrieve parameters from the JobConf ( conf.getXXX() )
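
The two string-handling tips on this slide (splitting on “|” and packing several values into one string) can be sketched in plain Java. The class RecordFieldsDemo and its methods are hypothetical names for illustration, and a plain String stands in for Hadoop's Text so the sketch runs without Hadoop on the classpath.

```java
import java.util.Arrays;

// Plain-Java sketch; a String stands in for Hadoop's Text. Names are hypothetical.
final class RecordFieldsDemo {

    static final String SEPARATOR = "#";

    // String.split() takes a regular expression, so the pipe is escaped.
    // The limit -1 keeps trailing empty fields instead of dropping them.
    static String[] splitRecord(String line) {
        return line.split("\\|", -1);
    }

    // Packs several ints into one string, e.g. 23, 16, 87 -> "23#16#87".
    static String pack(int... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(SEPARATOR);
            }
            sb.append(values[i]);
        }
        return sb.toString();
    }

    // Reverses pack(); assumes the input is a non-empty string that pack() produced.
    static int[] unpack(String packed) {
        String[] parts = packed.split(SEPARATOR);
        int[] values = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i]);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitRecord("1|Alice|42"))); // [1, Alice, 42]
        System.out.println(pack(23, 16, 87));                           // 23#16#87
        System.out.println(Arrays.toString(unpack("23#16#87")));        // [23, 16, 87]
    }
}
```

Packing into a string is quick to write, but every record pays for parsing on the reducer side; a custom Writable avoids that at the cost of more code.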
