
Parallel Programming in
Artificial Neural Networks
By
Harold Cruzado J.
Contents
• Overview of Artificial Neural Network (ANN)
• Why Parallelize?
• Parallelizing using MPI in C/C++: the perceptron, back propagation, some useful functions, the training phases, C structures to MPI structures
• Conclusions
Artificial Neural Network (ANN)
An ANN (or net) is used to solve many problems that have no direct solution.
An ANN is composed of perceptrons. Each perceptron (also called a neuron) has inputs, an output, and a constant called the bias.
Why Parallelize?
Training an ANN (a process also called learning) is very time-consuming. Training is carried out by algorithms such as back propagation, and those algorithms are dominated by matrix operations.
A Perceptron and Its Training
[Figure: a perceptron with inputs x_1 … x_n, weights w_1 … w_n, a summation Σ, an activation f, a bias θ, and output y]

y = f(w_1 x_1 + … + w_n x_n + θ) = f(WX + θ)

Training the perceptron: let t be a target; then

do {
  1: y ← f(WX + θ)
  2: update W and θ
} while |t − y| > ε

- Forward phase (1): calculate the actual output y.
- Backward phase (2): adjust the weights to reduce the error (a C sketch follows).
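A minimal serial sketch of the loop above in C. The step activation, the name train_step, and the learning rate eta are assumptions; the slides only give the abstract rule.

double f(double s) { return s >= 0.0 ? 1.0 : 0.0; }   /* step activation */

/* one pass over a sample: forward, then update W and theta */
void train_step(const double x[], double w[], double *theta,
                int n, double t, double eta)
{
    double s = *theta;
    for (int i = 0; i < n; i++)
        s += w[i] * x[i];                   /* s = WX + theta */
    double y = f(s);                        /* forward: actual output */
    double err = t - y;
    for (int i = 0; i < n; i++)             /* backward: error-driven update */
        w[i] += eta * err * x[i];
    *theta += eta * err;
}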
A Net of Three Layers
[Figure: a three-layer net; inputs x_1 … x_n feed Layer 1, Layers 1-3 are connected in sequence, and Layer 3 produces the outputs y. The forward pass runs left to right, the backward pass right to left.]
Back Propagation Algorithm
1. Define the net
2. Initialize the net
3. Forward: propagate through the net
4. Backward: adjust the weights of the net
5. Send (from the root process) the new net to all processes
6. If SSE > ε, repeat from step 3 (sketched as a loop below)
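Steps 3-6 amount to the following loop; Forward and Backward stand for the phases shown on the later slides, and EPSILON is an assumed tolerance name.

do {
    Forward(&Net, rank, size);                        /* step 3 */
    Backward(&Net, rank, size);                       /* step 4 */
    MPI_Bcast(&Net, 1, mpi_net, 0, MPI_COMM_WORLD);   /* step 5 */
} while (Net.SSE > EPSILON);                          /* step 6 */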
Parallelizing by Using MPI
[Figure: the training loop: the forward phase, then, while SSE > ε, the backward phase and repeat]
Function main
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;
    /* other C type declarations ... */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    /* C types to MPI types ... */
    /* initializing the net ... */
    /* forward phase ... */
    /* ... */
    /* backward phase ... */
    MPI_Finalize();
    return 0;
}
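The skeleton is built with the MPI compiler wrapper and launched on several processes, for example mpicc ann.c -o ann followed by mpirun -np 4 ./ann (the file name ann.c is illustrative).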
Partitioning Data
The following C function computes, for each process, its share of an interval (for example, the row indices of a matrix):

void Partition(int n, int rank, int d, int I[2])
{
    int r = n % d;        /* remainder when n is split among d processes */
    int i = rank + 1;     /* 1-based index of this process */
    int q = n / d;        /* base block size */
    I[0] = q * (i - 1);
    I[1] = q * i;
    if (i <= r) {         /* the first r processes take one extra element */
        I[0] += i - 1;
        I[1] += i;
    } else {
        I[0] += r;
        I[1] += r;
    }
}
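A small test of Partition under MPI (the printout is an illustration, not part of the original code):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, I[2];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    Partition(10, rank, size, I);    /* split 10 rows among the processes */
    printf("rank %d: rows [%d, %d)\n", rank, I[0], I[1]);
    MPI_Finalize();
    return 0;
}

With 3 processes this prints rows [0, 4), [4, 7), and [7, 10).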
Matrix-Vector Product
Each process computes only its block of rows of the product y = Ax (f and c are global constants holding the matrix dimensions):

void Ax(int A[f][c], int x[c], int I[2], int rank, int size, int y_loc[])
{
    for (int i = I[0]; i < I[1]; i++) {   /* only the rows I[0] .. I[1]-1 */
        y_loc[i - I[0]] = 0;
        for (int j = 0; j < c; j++)
            y_loc[i - I[0]] += A[i][j] * x[j];
    }
}
C type structure to MPI type structure

typedef struct {
    double Y[N+1];
    double Delta[N];
    double Weights[N][N+1];
    int    bias;
} LAYER;

typedef struct {
    LAYER  Layer[L];
    double Target[N][N];
    double Eta;
    double SSE;
    double X[DataRows][DataCols];
} NET;
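As a quick count for the derived types on the next slides: LAYER holds (N+1) + N + N(N+1) = 2N + 1 + N(N+1) doubles followed by a single int, and counts like these are what the blocklengths of the MPI types describe.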
C structure to MPI structure

int blocklengths[2];
MPI_Aint displacements[2], double_extent, layer_extent;
MPI_Datatype dtypes[2] = {MPI_DOUBLE, MPI_INT}, mpi_layer;

MPI_Type_extent(MPI_DOUBLE, &double_extent);
blocklengths[0] = processors*(2*N+1) + N*(N+1);   /* the doubles */
blocklengths[1] = 1;                              /* the int bias */
displacements[0] = 0;
displacements[1] = blocklengths[0] * double_extent;
MPI_Type_struct(2, blocklengths, displacements, dtypes, &mpi_layer);
MPI_Type_commit(&mpi_layer);
C structure to MPI structure (continued)

MPI_Type_extent(mpi_layer, &layer_extent);
MPI_Datatype dtypes1[2] = {mpi_layer, MPI_DOUBLE}, mpi_net;
blocklengths[0] = L;                                          /* the L layers */
blocklengths[1] = 2 + N*N + DataRows*DataCols + processors;   /* the doubles */
displacements[0] = 0;
displacements[1] = L * layer_extent;
MPI_Type_struct(2, blocklengths, displacements, dtypes1, &mpi_net);
MPI_Type_commit(&mpi_net);
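MPI_Type_extent and MPI_Type_struct are MPI-1 calls that were removed in MPI-3. A sketch of the same construction with the current API, reusing the blocklengths, displacements, and dtypes above:

MPI_Aint lb, double_extent;
MPI_Type_get_extent(MPI_DOUBLE, &lb, &double_extent);
MPI_Type_create_struct(2, blocklengths, displacements, dtypes, &mpi_layer);
MPI_Type_commit(&mpi_layer);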
Initializing the Net
..
if (rank == 0) {
    LoadData("TrainingData.txt", Rows, Cols, Net.X);
    SetBias(&Net, size);
    GenerateTarget(&Net);
    RandomWeights(&Net);
    Net.Eta = 0.02;   /* learning rate */
}
/* MPI_Bcast is collective, so every rank must call it */
MPI_Bcast(&Net, 1, mpi_net, 0, MPI_COMM_WORLD);
..
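LoadData, SetBias, GenerateTarget, and RandomWeights are the author's helpers; the slides only show the calls. A possible body for RandomWeights, assuming uniform values in [-0.5, 0.5):

#include <stdlib.h>

void RandomWeights(NET *Net)
{
    for (int l = 0; l < L; l++)               /* every layer */
        for (int i = 0; i < N; i++)           /* every neuron */
            for (int j = 0; j < N + 1; j++)   /* every weight, incl. bias column */
                Net->Layer[l].Weights[i][j] = (double)rand() / RAND_MAX - 0.5;
}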
Forward phase
..
Partition(f, rank, size, I);
n_loc = I[1] - I[0];
for (int l = 0; l < L; l++) {
    Ax(Net.Layer[l].Weights, Net.X, I, rank, size, y_loc);
    /* MPI_Gather is collective, so every rank calls it */
    MPI_Gather(y_loc, n_loc, MPI_DOUBLE, Net.Layer[l].Y, n_loc,
               MPI_DOUBLE, 0, MPI_COMM_WORLD);
}
MPI_Bcast(&Net, 1, mpi_net, 0, MPI_COMM_WORLD);
EvaluateSSE(&Net, 0, rank, size);
..
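MPI_Gather with a single n_loc assumes every rank owns the same number of rows; when f is not divisible by size, Partition produces uneven blocks and MPI_Gatherv is the safer collective. A sketch, where recvcounts and displs are built from each rank's interval (they are not in the original code):

int recvcounts[size], displs[size], J[2];
for (int p = 0; p < size; p++) {
    Partition(f, p, size, J);        /* interval owned by rank p */
    recvcounts[p] = J[1] - J[0];
    displs[p] = J[0];
}
MPI_Gatherv(y_loc, n_loc, MPI_DOUBLE,
            Net.Layer[l].Y, recvcounts, displs, MPI_DOUBLE,
            0, MPI_COMM_WORLD);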
Backward phase
..
DeltaAdjust(&Net, Net.Target[i], rank, size);
WeightsAdjust(&Net, rank, size);
/* again collective: every rank calls the broadcast */
MPI_Bcast(&Net, 1, mpi_net, 0, MPI_COMM_WORLD);
..
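DeltaAdjust and WeightsAdjust are the author's helpers. In standard back propagation the update they correspond to is w_ij ← w_ij + η δ_j y_i, where η is the learning rate (Net.Eta), δ_j is the delta of the receiving neuron (stored in Delta), and y_i is the output of the sending neuron (stored in Y).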
Conclusions
MPI is a good alternative for implementing parallel code.
It is possible to parallelize serial code that uses programmer-defined types.
Questions?