2012/06/22
Email: nomura@am.ics.keio.ac.jp

Contents
  GPU (Graphic Processing Unit)
  CUDA Programming
  Target: Clustering with Kmeans
  How to use toolkit1.0
  Towards the fastest program

GPU (Graphic Processing Unit)
  Multicore processor
    ○ Several hundred cores
    ○ SP: Core in GPU
    ○ SM: Composed of SPs
  High memory bandwidth
  (Figure: a GPU made of several SMs, each containing SPs, connected to Global Memory.)

  Table: Specification of GeForce GTX280
    SM                 30 (each of them has 8 SPs)
    SP                 240
    Memory Bandwidth   141.7 GB/s

  SP: Streaming Processor
  SM: Streaming MultiProcessor

Flow of CUDA Program
  1. Allocate GPU memory: cudaMalloc()
  2. Transfer input data: cudaMemcpy()
  3. Execute kernel
  4. Transfer result data: cudaMemcpy()
  5. Free GPU memory: cudaFree()
  (Figure: the host (CPU) holds the arrays input 1 ... input N and output 1 ... output N in Main Memory. They are transferred to and from matching arrays in the Global Memory of the device (GPU), where each SP runs the kernel.)

Target application: clustering with Kmeans
  A famous method for clustering.
  A program using the Kmeans method for a host processor is given.
  Modify it so that it works on GPU as fast as possible.
  GeForce Tesla (GTX280) in Amano Lab. can be used for this contest.

Kmeans method (1/5)
  Initial state: Nodes, each with a certain color, are distributed randomly.
    (Here, 100 nodes with 5 colors are shown.)
  STEP1: The centre of gravity is computed for each colored node set.
    (X in the figure marks each centre.)
  Reference URL: http://d.hatena.ne.jp/nitoyon/20090409/kmeans_visualise

Kmeans method (2/5)
  STEP2: The color of each node is changed to that of the nearest centre.
  STEP1: Again, the centre of gravity is computed for each node set with the same color.

Kmeans method (3/5)
  STEP2: Again, the color of each node is changed to that of the nearest centre.
  STEP1: Again, the centre of gravity is computed for each node set with the same color.

Kmeans method (4/5)
  STEP2: Again, the color of each node is changed to that of the nearest centre.
  STEP1: Again, the centre of gravity is computed for each node set with the same color.
Kmeans method (5/5)
  STEP2: Again and again, the color of each node is changed to that of the nearest centre.
  Terminate condition: Every node already has the color of its nearest centre, so no color needs to change. → Terminate.

How to start
  Log in with: ssh 131.113.69.98
    ○ Your account is already available.
    ○ If you have not received mail about your account, please send mail to nomura@am.ics.keio.ac.jp .
  Download kmeans.tar.gz and extract it.
    ○ There are useful sample codes in kmeans.
  Mission1: Make a GPU version based on the CPU version.
    ○ Write gpuKMeans in kmeans.cu.
    ○ cpuKMeans in main.cu is a CPU version for reference.
  Mission2: Optimize the CPU code so that it runs as fast as possible.

Toolkit1.0
  kmeans.cu   K-means program for GPU (please modify this file)
  main.cu     Reads input data and contains the CPU program (modification forbidden)
  check.c     Visualizes output data with OpenCV
  gen.c       Generates input data
  Makefile
  data/       Input data
  result/     Output data

How to use Toolkit1.0
  $ make         Compile
  $ make gpu     Execute GPU program
  $ make cpu     Execute CPU program
  $ ./gen SEED   Generate input data (SEED = 0,1,2,…)

Sample Code
  Vector addition program for GPU
    $ make     Compile
    $ ./main   Run the program
  Points
    Memory allocation on GPU
      ○ cudaMalloc(), cudaFree()
    Data transfer between CPU and GPU
      ○ cudaMemcpy()
    Format of GPU kernel function

Towards the fastest program
  Minimum requirement
    ○ Implement the K-means program on GPU
    ○ Parallelize STEP1 or STEP2 of K-means
  How to optimize the program
    ○ Parallelize both STEP1 and STEP2
    ○ Shared memory, constant memory
    ○ Coalesced memory access
    ○ etc.

Web Site
  NVIDIA GPU Computing Document: http://developer.nvidia.com/nvidia-gpu-computingdocumentation
  Fixstars CUDA Information Site: http://gpu.fixstars.com/index.php/

Announcement
  If you do not have an account, send mail to nomura@am.ics.keio.ac.jp
    ○ Your name should be included in the mail.
  Deadline: 7/22 (Fri) 24:00
  Copy the following into ~/comparch
    ○ Source code and a simple report
  Please check the web site.
Additional information will be posted on the web site.

If you have any questions about the contest, please send mail to: nomura@am.ics.keio.ac.jp
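
Appendix: sketch of the CUDA flow applied to STEP2

The five-step flow of a CUDA program (cudaMalloc(), cudaMemcpy(), kernel execution, cudaMemcpy(), cudaFree()) can be illustrated on STEP2 of Kmeans, where each node takes the color of its nearest centre. The following is only a minimal sketch, not the toolkit's gpuKMeans. The names assignColors and gpuAssignColors, the 2-D data layout with separate x and y arrays, and the block size of 256 threads are all assumptions made for this example, because the slides do not show the data format used by main.cu.

```cuda
#include <cfloat>
#include <cuda_runtime.h>

// Hypothetical kernel for STEP2: one thread per node picks the nearest centre.
__global__ void assignColors(const float *x, const float *y,
                             const float *cx, const float *cy,
                             int *color, int n, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;  // threads past the last node do nothing

    int best = 0;
    float bestDist = FLT_MAX;
    for (int c = 0; c < k; ++c) {
        float dx = x[i] - cx[c];
        float dy = y[i] - cy[c];
        float d = dx * dx + dy * dy;  // squared distance is enough to compare
        if (d < bestDist) { bestDist = d; best = c; }
    }
    color[i] = best;
}

// Hypothetical host wrapper following the five-step flow.
// Assumes n > 0 and k > 0. Returns 0 on success, 1 on a CUDA error.
int gpuAssignColors(const float *x, const float *y,
                    const float *cx, const float *cy,
                    int *color, int n, int k)
{
    float *dX = NULL, *dY = NULL, *dCx = NULL, *dCy = NULL;
    int *dColor = NULL;
    size_t nodeBytes = n * sizeof(float);
    size_t centreBytes = k * sizeof(float);

    // 1. Allocate GPU memory.
    cudaMalloc((void **)&dX, nodeBytes);
    cudaMalloc((void **)&dY, nodeBytes);
    cudaMalloc((void **)&dCx, centreBytes);
    cudaMalloc((void **)&dCy, centreBytes);
    cudaMalloc((void **)&dColor, n * sizeof(int));

    // 2. Transfer input data from the host to the device.
    cudaMemcpy(dX, x, nodeBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dY, y, nodeBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dCx, cx, centreBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dCy, cy, centreBytes, cudaMemcpyHostToDevice);

    // 3. Execute the kernel with enough blocks to cover all n nodes.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    assignColors<<<blocks, threads>>>(dX, dY, dCx, dCy, dColor, n, k);
    cudaError_t err = cudaGetLastError();  // catches launch failures

    // 4. Transfer the result back; this copy waits for the kernel to finish.
    if (err == cudaSuccess)
        err = cudaMemcpy(color, dColor, n * sizeof(int),
                         cudaMemcpyDeviceToHost);

    // 5. Free GPU memory.
    cudaFree(dX);
    cudaFree(dY);
    cudaFree(dCx);
    cudaFree(dCy);
    cudaFree(dColor);

    return err == cudaSuccess ? 0 : 1;
}
```

Calling gpuAssignColors(x, y, cx, cy, color, n, k) from the host fills color[i] with the index of the centre nearest to node i. The same wrapper pattern could be reused for STEP1, although computing each centre of gravity requires summing over all nodes of one color, which is generally harder to parallelize than STEP2.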