Lecture2

WSE 187:
INTRODUCTION TO PARALLEL PROGRAMMING*
*Prepared with the help of free online resources.
Lecture 2
Jesmin Jahan Tithi
LOGIN TO SSH
Steps (Windows)
- Connect to the host.
- Enter the provided password when prompted.
For first-time users:
- Accept the security keys.
- Change the password:
  - First enter the old password.
  - Then type the new password.
  - Repeat the new password.
Do not worry if no characters appear on screen while you type; just type carefully.
- Save the host info in SSH when prompted.
Mac users:
- Use the terminal to connect directly to the server.
- The other steps are the same.
- You may try using FileZilla to transfer files from the Mac to the server.
HELLO PARALLEL WORLD!
Intel Cilk Plus
CILK PLUS
Intel® Cilk™ Plus is an extension to the C and C++ languages, implemented by the Intel® C++ Compiler.
It adds three keywords to C and C++: cilk_for, cilk_spawn, and cilk_sync.
cilk_spawn - Specifies that a function call can execute asynchronously, without requiring the caller to wait for it to return. This is an expression of an opportunity for parallelism, not a command that mandates parallelism. The Intel Cilk Plus runtime chooses whether to run the function in parallel with its caller.
cilk_sync - Specifies that all spawned calls in a function must complete before execution continues. There is an implied cilk_sync at the end of every function that contains a cilk_spawn.
cilk_for - Allows iterations of the loop body to be executed in parallel.
The cilk_spawn and cilk_for keywords express opportunities for parallelism; the runtime decides how much of that parallelism is actually used.
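The spawn/sync pattern described above can also be tried without a Cilk Plus compiler. The following is only a rough sketch in standard C++, not Cilk Plus: std::async stands in for cilk_spawn and future::get() stands in for cilk_sync. The names slow_square and sum_of_squares are made up for illustration. One difference to keep in mind: std::launch::async always starts a new thread, while cilk_spawn only gives the runtime permission to run the call in parallel.

```cpp
#include <future>

// Hypothetical helper: some work that is worth running asynchronously.
long slow_square(long x) {
    long result = 0;
    for (long i = 0; i < x; ++i)
        result += x;   // adds x to itself x times, i.e. x * x
    return result;
}

// Runs slow_square(a) asynchronously while the caller computes slow_square(b).
long sum_of_squares(long a, long b) {
    // Roughly "cilk_spawn slow_square(a)": the caller does not wait here.
    std::future<long> left = std::async(std::launch::async, slow_square, a);
    // The caller keeps working in the meantime.
    long right = slow_square(b);
    // Roughly "cilk_sync": wait for the asynchronous call to finish.
    return left.get() + right;
}
```

Calling sum_of_squares(3, 4) computes both squares, potentially at the same time, and adds them only after the asynchronous half has finished.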
CILK_SPAWN
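Compile: icc -O3 -o hello Hello_parallel_world.cpp
Run: ./hello
Run with 4 workers: CILK_NWORKERS=4 ./hello
#include <stdio.h>
#include <cilk/cilk.h>
static void hello(){
    int i=0;
    // Delay loop so the function takes a noticeable amount of time.
    for(i=0;i<1000000;i++)
        printf("");
    printf("Hello ");
}
static void world(){
    int i=0;
    // Delay loop so the function takes a noticeable amount of time.
    for(i=0;i<1000000;i++)
        printf("");
    printf("world! ");
}
int main(){
    cilk_spawn hello();
    cilk_spawn world();
    //cilk_sync; // uncomment to make main wait for hello() and world() before printing "Done! "
    printf("Done! ");
    return 0;
}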
CILK_SPAWN EXERCISE
Order of placement
-------------------------
Wheels, Chassis
-------------------------
Engine, Frame
-------------------------
Steering wheel
#include <stdio.h>
#include <cilk/cilk.h>
void make(const char* str){
    int i=0;
    // Delay loop so the function takes a noticeable amount of time.
    for(i=0;i<1000000;i++)
        printf("");
    printf("%s has/have been created.\n",str);
}
void place(const char* str){
    int i=0;
    // Delay loop so the function takes a noticeable amount of time.
    for(i=0;i<1000000;i++)
        printf("");
    printf("%s has/have been placed.\n",str);
}
int main(){
    //Place your code here
}
CILK_FOR
One way to run loop iterations in parallel is to spawn each one:
for (int i = 0; i < 8; ++i) {
    cilk_spawn do_work(i);
}
cilk_sync;
A better approach is to use a cilk_for loop:
cilk_for (int i = 0; i < 8; ++i)
{
    do_work(i);
}
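Part of the reason cilk_for is preferred is that the runtime splits the iteration range recursively (divide and conquer) instead of having a single worker spawn every iteration one at a time. The following is a rough sketch of that idea in standard C++; it is not the actual Cilk Plus implementation. The names parallel_range and add_vectors and the grain size of 256 are illustrative assumptions.

```cpp
#include <functional>
#include <future>
#include <vector>

// Applies body(i) for every i in [begin, end), splitting the range in half
// recursively until a chunk holds at most `grain` iterations.
void parallel_range(int begin, int end, int grain,
                    const std::function<void(int)>& body) {
    if (grain < 1) grain = 1;          // guard against a non-positive grain
    if (end - begin <= grain) {
        for (int i = begin; i < end; ++i)
            body(i);                   // small chunk: run serially
        return;
    }
    int mid = begin + (end - begin) / 2;
    // The first half runs asynchronously (the "spawn")...
    std::future<void> left = std::async(std::launch::async, [=, &body] {
        parallel_range(begin, mid, grain, body);
    });
    // ...while this thread handles the second half.
    parallel_range(mid, end, grain, body);
    left.get();                        // the "sync"
}

// Example use: add two vectors of equal length element by element.
// Each iteration writes a different element of c, so there is no race.
std::vector<int> add_vectors(const std::vector<int>& a,
                             const std::vector<int>& b) {
    std::vector<int> c(a.size());
    parallel_range(0, static_cast<int>(a.size()), 256,
                   [&](int i) { c[i] = a[i] + b[i]; });
    return c;
}
```

The grain size controls the trade-off: smaller chunks expose more parallelism but add more scheduling overhead.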
#include <stdio.h>
#include <iostream>
#include <cilk/cilk.h>
#include "cilktime.h"
using namespace std;
#define n 16384
int main(){
    // First input vector.
    int A[n];
    // Second input vector.
    int B[n];
    // Sum vector.
    int C[n];
    // Initialize the vectors or arrays with input.
    cilk_for (int i = 0; i < n; i++){
        A[i] = i;
        B[i] = i+1;
    }
    // Compute the sum
    unsigned long long tstart = cilk_getticks(); //beginning time stamp
    cilk_for (int i = 0; i < n; i++){
        C[i] = A[i] + B[i];
    }
    unsigned long long tend = cilk_getticks(); //end time stamp
    cout<<"Time to run:"<<cilk_ticks_to_seconds(tend-tstart)<<endl;
    // Check the sum to verify.
    int pos; // must be between 0 and n-1
    cout<<"Enter position of element to inspect"<<endl;
    cin>>pos;
    cout<<C[pos]<<endl;
    return 0;
}
CILK_FOR
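A loop whose iterations all update the same variable cannot be parallelized as is:
#include <stdio.h>
#include <cilk/cilk.h>
int main(){
    long int sum = 0;
    cilk_for (int i = 0; i <= 100000000; i++)
        sum += i; //wrong! race condition: workers update sum at the same time
    printf("%ld\n",sum);
    return 0;
}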
CILK_FOR
There are several ways of dealing with race conditions.
First option: use locks!
We will learn more later.
#include <stdio.h>
#include <cilk/cilk.h>
#include <pthread.h> //pthread library
int main(){
    long int sum = 0;
    pthread_mutex_t m; //define the lock
    pthread_mutex_init(&m,NULL); //initialize the lock
    cilk_for (int i = 0; i <= 1000000; i++){
        pthread_mutex_lock(&m); //lock - prevents other threads from running this code
        sum += i;
        pthread_mutex_unlock(&m); //unlock - allows other threads to access this code
    }
    pthread_mutex_destroy(&m); //release the lock once it is no longer needed
    printf("%ld\n",sum);
    return 0;
}
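Taking the lock on every iteration is correct, but it forces the additions to happen one at a time. The sketch below, in standard C++ rather than Cilk Plus, compares that approach with one where each thread accumulates a private partial sum and takes the lock only once. The function names locked_sum and partial_sum and the way iterations are dealt out to threads are assumptions made for illustration; both functions expect workers to be at least 1.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Sums 0..n using `workers` threads that all update one shared variable,
// protected by a mutex (the same idea as the pthread lock above).
long long locked_sum(long long n, int workers) {
    long long sum = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Worker w handles i = w, w + workers, w + 2*workers, ...
            for (long long i = w; i <= n; i += workers) {
                std::lock_guard<std::mutex> guard(m);  // unlocks at end of scope
                sum += i;
            }
        });
    }
    for (std::thread& t : pool) t.join();
    return sum;
}

// Same result, but each worker accumulates privately and locks only once.
long long partial_sum(long long n, int workers) {
    long long sum = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            long long local = 0;                       // private to this thread
            for (long long i = w; i <= n; i += workers)
                local += i;
            std::lock_guard<std::mutex> guard(m);      // one lock per worker
            sum += local;
        });
    }
    for (std::thread& t : pool) t.join();
    return sum;
}
```

Both functions return n(n+1)/2; the second one simply spends far less time waiting for the lock.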
CHANGING NUMBER OF CORES/THREADS
Run with: CILK_NWORKERS=4 ./executable
Or change inside the main program (requires #include <cilk/cilk_api.h>; call it before the first cilk_spawn or cilk_for runs):
if (0 != __cilkrts_set_param("nworkers","16"))
{
    cout<<"Failed to set worker count"<<endl;
    return 1;
}
Check to verify:
int num_threads = __cilkrts_get_nworkers();
cout<< num_threads <<endl;