PGENESIS Tutorial – GUM’02
Greg Hood
Pittsburgh Supercomputing Center

What is PGENESIS?
• Library extension to GENESIS that supports communication among multiple processes – so nearly everything available in GENESIS is available in PGENESIS
• Allows multiple processes to perform multiple simulations in parallel
• Allows multiple processes to work together cooperatively on a single simulation
• Runs on workstations or supercomputers

History
• PGENESIS developed by Goddard and Hood at PSC (1993-1998)
• Current contact: pgenesis@psc.edu

Tutorial Outline
• Installation
• What PGENESIS provides
• Using PGENESIS for parallel parameter searching
• Using PGENESIS for simulating large networks more quickly
• Scaling up for large runs
• A comparison of PGENESIS with alternatives

PGENESIS Installation

Installation: Requirements
At least one Unix-like computer on which GENESIS will run. The same account name on all computers. If multiple machines are to be used together, it is best if they are all on the same network segment (e.g. the same 100 Mbit/s Ethernet switch).

Installation: GENESIS
1. Install regular (serial) GENESIS:
a. Make sure you have configured serial GENESIS to include all libraries that you will ever want to use with PGENESIS.
b. make all; make install
c. make nxall; make nxinstall if you want an Xodus-less version of PGENESIS

Installation: ssh
2. Configure ssh to allow process startup across machines without password entry:
a. You probably already have ssh/sshd. If not, download from http://www.openssh.org and install according to instructions.
b. Run ssh-keygen -t rsa on each machine from which you will launch PGENESIS to generate private/public keys.
c. Append all of the public keys (stored in ~/.ssh/id_rsa.pub) to ~/.ssh/authorized_keys on all hosts on which you want to run PGENESIS processes.
d. Test: ssh remote_host_name remote_command should not ask you for a password.

Installation: PVM
3. Install the PVM message-passing library:
a.
Download from http://www.csm.ornl.gov/pvm
b. Modify .bashrc to set PVM_ROOT to where PVM was installed: export PVM_ROOT=/usr/share/pvm3
c. Modify .bashrc to set PVM_RSH to the ssh executable: export PVM_RSH=/usr/bin/ssh
d. Build PVM ("cd $PVM_ROOT; make")
e. Test PVM:
% pvm
pvm> add otherhost
pvm> halt

Installation: PGENESIS
4. Install the PGENESIS package:
a. Download from http://www.genesis-sim.org
b. cp Makefile.dist Makefile
c. Edit Makefile
d. make install
e. make nxinstall for the Xodus-less version

Installation: Simple
• Cluster of similar machines
• Shared filesystem
• Home directory is located on the shared filesystem

Installation: Complex
• Heterogeneous cluster
• Novel processor/OS
• No shared filesystems
• Custom libraries linked into GENESIS
Recommended approach:
• Install on each machine independently and make sure PGENESIS works locally before trying to use all machines together

The "pgenesis" Startup Script (1)
Purpose: checks that the proper PVM files are in place, starts the PVM daemon, then starts the appropriate PGENESIS executable.
Basic syntax: pgenesis scriptname.g

The "pgenesis" Startup Script (2)
Options:
-config <filename> where <filename> contains a list of hosts to use
-debug <mode> where <mode> is one of the following: tty dbx gdb
-nox do not use Xodus
-v verbose mode
-help list the valid pgenesis script flags

PGENESIS Functionality
This lecture will introduce you to PGENESIS, which is a derivative of GENESIS that has been augmented with additional commands to support parallelism.

Contents
• What is PGENESIS?
• How PGENESIS Runs in Parallel
• Nodes and Zones
• Who am I?
• Threads
• Synchronization
• Remote Function Calls
• Asynchronous Calls
• Useful Commands

How PGENESIS Runs in Parallel
Workstations:
• typically one process starts and then spawns n-1 other processes
• the mapping of processes to processors is often 1 to 1, but may be many to 1 during debugging
Massively parallel machines:
• all n processes are started simultaneously by the operating system
• the mapping of processes to processors is nearly always 1 to 1
On both: every process runs the same script (this is not a real limitation).

Nodes and Zones
Each process is referred to as a "node". Nodes may be organized into "zones". A node is fully specified by a numeric string of the form “<node>.<zone>”. Simulations within a zone are kept synchronized in simulation time. Each node joins the parallel platform using the paron command. Each node should gracefully terminate by calling paroff.

Every node in its own zone: simulations on each node are not coupled temporally. Useful for parameter searching. We refer to nodes as “0.0”, “0.1”, “0.2”, …

All nodes in one zone: simulations on each node are coupled temporally. Useful for large network models. Zone numbers can be omitted since we are dealing with only one zone; we can thus refer to nodes as “0”, “1”, “2”, …

Hybrid schemes: parameter searching on large network models. Example: the network is partitioned over 8 nodes; we run 16 simulations in parallel to do parameter searching on this model, thus using a total of 128 nodes.

Nodes have distinct namespaces: /elem1 on node 0 refers to an element on node 0; /elem1 on node 1 refers to an element on node 1. To avoid confusion we recommend that you use distinct names for elements on different nodes within a zone.

GENESIS Terminology
GENESIS term = Computer Science term
Object = Class
Element = Object
Message = Connection
Value = Message

Who am I?
PGENESIS provides several functions that allow a script to determine its place in the overall parallel configuration:
mytotalnode - number of this node in the platform
mynode - number of this node in this zone
myzone - number of this zone
ntotalnodes - number of nodes in the platform
nnodes - number of nodes in this zone
nzones - number of zones
npvmcpu - number of processors in the configuration
mypvmid - PVM task identifier for this node
(all numbering starts at 0)

Styles of Parallel Scripts
Symmetric – each node executes the same script commands.
Master/Worker – one node (usually node 0) coordinates processing and issues commands to the other nodes.

Explicit Synchronization
barrier - causes the thread to block until all nodes within the zone have reached the corresponding barrier
barrier - wait at the default barrier
barrier 7 - wait at a named barrier
barrier 7 100000 - timeout is 100000 seconds
barrierall - causes the thread to block until all nodes in all zones have reached the corresponding barrier
barrierall - wait at the default barrier
barrierall 7 - wait at a named barrier
barrierall 7 100000 - timeout is 100000 seconds

Implicit Synchronization
Two commands implicitly execute a zone-wide barrier:
step - implicitly causes the thread to block until all nodes within the zone are ready to step (this behavior can be disabled with “setfield /post sync_before_step 0”)
reset - implicitly causes the thread to block until all nodes have reset
These commands require that all nodes in the zone participate, hence the barrier.

Remote Function Calls (1)
An "issuing" node directs a procedure to run on an "executing" node. Examples:
some_function@2 params...
some_function@all params...
some_function@others params...
some_function@0.4 params...
some_function@1,3,5 params...

Remote Function Calls (2)
Each remote function call causes the creation of a new thread on the executing node. All parameters are evaluated on the issuing node.
Example: if called from node 1, some_function@2 {mynode} will execute some_function 1 on node 2.

Remote Function Calls (3)
When does the executing node actually perform the remote function call, since we don't use hardware interrupts?
While waiting at a barrier or barrierall.
While waiting for its own remote operations to complete, e.g. func@node, raddmsg.
When the simulator is sitting at the prompt waiting for user input.
When the executing script calls clearthread or clearthreads.

Threads
A thread is a single flow of control within a PGENESIS script being executed. When a node starts, there is exactly one thread on it – the thread for the script. There may potentially be many threads per node. These are stacked up, with only the topmost actually executing at any moment.
clearthread – yield to one thread awaiting execution (if one exists)
clearthreads – yield to all threads awaiting execution

Asynchronous Calls (1)
The async command allows a script to dispatch an operation on a remote node without waiting for its completion. Example:
async some_function@2 params...

Asynchronous Calls (2)
One may wait for an async call to complete, either individually:
future = {async some_function@2 ...}
... // do some work locally
waiton {future}
or for an entire set:
async some_function@2 ...
async some_function@5 ...
...
waiton all

Asynchronous Calls (3)
Asynchronous calls may return a value. Example:
int future = async myfunc@1 // start thread on node 1
... // do some work locally
int result = waiton {future} // wait for thread's result
Thus the term "future" - it is a promise of a value at some time in the future; waiton calls in that promise.

Asynchronous Calls (4)
async returns a value which is only to be used as the parameter of a waiton call, and waiton must only be called with such a value. Remote function calls from a particular issuing node to a particular executing node are guaranteed to be performed in the sequence they were sent.
There is no guaranteed order among calls involving multiple issuing or executing nodes.

Advice about Barriers (1)
It is very easy to reach deadlock if barriers are not handled correctly. PGENESIS tries to warn you by printing a message that it is waiting at a barrier. Examples of incorrect barrier usage:
Each node executes: barrier {mynode}
Each node executes: barrier@all
A single node executes: barrier@others; barrier
However: async barrier@others; barrier will work!

Advice about Barriers (2)
Guideline: if your script is operating in the symmetric style (all nodes execute all statements), never use barrier@. If your script is operating in the master/worker style, the master must ensure it calls a function on each worker that executes a barrier before it enters the barrier itself. barrier; async barrier@others will not work.

Commands for Network Creation
Several new commands permit the creation of "remote" (internode) messages:
raddmsg /local_element /remote_element@2 \
    SPIKE
rvolumeconnect /local_elements \
    /remote_elements@2 \
    -sourcemask ... -destmask ... \
    -probability 0.5
rvolumedelay /local_elements -radial 10.0
rvolumeweight /local_elements -fixed 0.2
rshowmsg /local_elements

Parallel I/O: Display
How can one display from more than one node?
1. Use an xview object.
2. Add an index field to the displayed elements.
3. Use the ICOORDS and IVAL1 ... IVAL5 messages instead of the COORDS and VAL1 ... VAL5 messages:
raddmsg /src_elems /xview_elem@0 \
    ICOORDS io_index_field x y z
raddmsg /src_elems /xview_elem@0 \
    IVAL1 io_index_field Vm

Interaction with Xodus
Xodus introduces another degree of parallelism via the X11 event processing mechanism. PGENESIS periodically instructs the X server to process any X events. Some of those events may result in some script code being run. Race condition: processing order is unpredictable.
Safe 1: ensure all affected nodes are at a barrier (or equivalent).
Safe 2: ensure mouse/keyboard events do not cause remote operations that require the participation of another node.

Parallel I/O: Writing a File
How can one write a file from more than one node?
1. Use a par_asc_file or par_disk_out object.
2. Add an index field to the source elements.
3. raddmsg /src_elems \
    /par_asc_file_elem@0 \
    SAVE io_index_field Vm

Tips for Avoiding Deadlocks
Use lots of echo statements. Use barrier IDs. Do not execute barriers remotely (e.g., barrier@all). Remember that step usually does an implicit barrier. Have each node do its own step command, or have one controlling node do a step@all (similarly for reset). Do not use the stop command. Keep things simple.

Motivation
Parallel control of setup can be hard. Parallel control of simulation can be hard. Debugging parallel scripts is hard.

How PGENESIS Fits into Schedule
The schedule controls the order in which GENESIS objects get updated. At the beginning of a step, all internode data is transferred. There will be equivalence to serial GENESIS only if remote messages do not pass from earlier to later elements in the schedule.
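Returning to the barrier advice above, the property that matters can be illustrated outside PGENESIS. Below is a minimal Python sketch (illustrative only — PGENESIS nodes are separate processes, not Python threads, and the node count is made up) showing that a barrier completes only when every node in the zone reaches it, which is exactly why a script in which only some nodes execute barrier deadlocks.

```python
import threading

# Hypothetical 4-"node" zone modeled as Python threads; threading.Barrier
# plays the role of PGENESIS's zone-wide barrier.
N_NODES = 4
zone_barrier = threading.Barrier(N_NODES)

results = []
lock = threading.Lock()

def node(rank):
    # ... per-node setup work would happen here ...
    zone_barrier.wait()          # blocks until ALL nodes arrive
    with lock:
        results.append(rank)     # safe: every node has finished setup

threads = [threading.Thread(target=node, args=(r,)) for r in range(N_NODES)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))           # all four nodes passed the barrier
```

If one of the threads never called zone_barrier.wait(), the remaining three would block forever — the Python analogue of a PGENESIS deadlock in which only some nodes execute barrier.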
How PGENESIS Fits into Schedule
addtask Simulate /##[CLASS=postmaster] -action PROCESS
addtask Simulate /##[CLASS=buffer] -action PROCESS
addtask Simulate /##[CLASS=projection] -action PROCESS
addtask Simulate /##[CLASS=spiking] -action PROCESS
addtask Simulate /##[CLASS=gate] -action PROCESS
addtask Simulate /##[CLASS=segment][CLASS!=membrane]\
[CLASS!=gate][CLASS!=concentration] -action PROCESS
addtask Simulate /##[CLASS=membrane] -action PROCESS
addtask Simulate /##[CLASS=hsolver] -action PROCESS
addtask Simulate /##[CLASS=concentration] -action PROCESS
addtask Simulate /##[CLASS=device] -action PROCESS
addtask Simulate /##[CLASS=output] -action PROCESS

Adding Custom "C" Code
Uses: data analysis, interfacing, custom objects.
PGENESIS allows users' custom libraries to be linked in, similarly to GENESIS. We recommend that you first incorporate your custom library into serial GENESIS before trying to use it with PGENESIS.

Modifiable Parameters
/post/sync_before_step – boolean (default 1)
/post/remote_info – boolean (default 1) enables rshowmsg
/post/perfmon – boolean (default 0) enables performance monitoring
/post/msg_hang_time – float (default 120.0) seconds before giving up on a remote operation
/post/pvm_hang_time – float (default 3.0) seconds between printing dots while waiting for a message
/post/xupdate_period – float (default 0.01) seconds between checking for X events when at a barrier

Limitations of PGENESIS
No rplanarweight, rplanardelay – use the corresponding 3-D routines rvolumeweight, rvolumedelay.
Cannot delete remote messages.
getsyncount, getsynindex, getsyndest no longer return the correct values.

Parameter Searching with PGENESIS

Model Characteristics
The following are prerequisites to using PGENESIS for optimization on a particular parameter-searching problem:
The model must be expressed in GENESIS.
Decide on the parameter set.
Have a way to evaluate the parameter set.
Have some range for each of the parameter values.
The evaluations over the parameter space should be reasonably well-behaved.
Have a stopping criterion.

Trivial Model
Rather than do a simulation, we will just optimize a function f of four parameters a, b, c, and d:
f(a, b, c, d) = 10.0 - (a-1)*(a-1) - (b-2)*(b-2) - (c-3)*(c-3) - (d-4)*(d-4)
Evaluation of the model: fitness = f(a, b, c, d)
Range of parameters: -10 < a,b,c,d < 10
Evaluation is definitely well-behaved.
Stopping criterion: stop after 1000 individuals.

Master/Worker Paradigm
All nodes are in separate zones. Node 0.0 will control the search. Nodes 0.1 through 0.{n-1} will run the model and perform the evaluation.

Commands for Optimization
Typically these are organized in a master/worker fashion with one node (the master) directing the search, and all other nodes evaluating parameter sets. Remote function calls are useful in this context for:
sending tasks to workers: async task@{worker} param1...
having workers return evaluations to the master: return_result@{master} result

Choose a Search Strategy
Genetic Search
Simulated Annealing
Monte Carlo (for very ill-behaved search spaces)
Nelder-Mead (for well-behaved search spaces)
Use as many constraints as you can to restrict the search space. Always do a sanity check on results.

A Parallel Genetic Algorithm
We adopt a population-based approach as opposed to a generation-based one. We will keep a fixed population "alive" and use the workers to evaluate the fitness of candidate individuals. If a candidate turns out to be better than some member of the current population, then we replace the worst member of the current population with the new individual.

Parameter Representation
We represent the set of parameters that define an individual as a string of bits. Each 16-bit string (one "gene") is interpreted as a signed integer and then divided by 1000.0 to yield the floating point value. To generate a new candidate from the existing population:
1. Pick a member of the population at random.
2.
Go through each bit of the bit string, and mutate it with some small probability.

Main Script
paron -farm -silent 0 -nodes {n_nodes} \
    -output o.out -executable nxpgenesis
barrierall
if ({mytotalnode} == 0)
    search
end
barrierall 7 1000000
paroff
quit

Master Conducts the Search
function search
    int i
    init_search
    init_farm
    for (i = 0; i < individuals; i = i + 1)
        if (i < population)
            init_individual
        else
            mutate_individual {rand 0 actual_population}
        end
        delegate_task {i} {bs_a} {bs_b} {bs_c} {bs_d}
    end
    finish
end

function delegate_task
    while (1)
        if (free_index >= 0)
            async worker@0.{getfield \
                /free[{free_index}] value} \
                {bs_a} {bs_b} {bs_c} {bs_d}
            free_index = free_index - 1
            return
        else
            clearthreads
        end
    end
end

Workers Evaluate Individuals
function worker (bs_a, bs_b, bs_c, bs_d)
    int bs_a, bs_b, bs_c, bs_d
    float a, b, c, d, fit
    a = (bs_a - 32768.0) / 1000.0
    b = (bs_b - 32768.0) / 1000.0
    c = (bs_c - 32768.0) / 1000.0
    d = (bs_d - 32768.0) / 1000.0
    fit = {evaluate {a} {b} {c} {d}}
    return_result@0.0 {mytotalnode} {bs_a} {bs_b} \
        {bs_c} {bs_d} {fit}
end

function evaluate (a, b, c, d)
    float a, b, c, d, fit
    fit = 10.0 - (a-1)*(a-1) - (b-2)*(b-2) \
        - (c-3)*(c-3) - (d-4)*(d-4)
    return {fit}
end

Master Integrates the Results (1)
function return_result (node, bs_a, bs_b, bs_c, bs_d, fit)
    int node, bs_a, bs_b, bs_c, bs_d
    float fit
    if (actual_population < population)
        least_fit = actual_population
        min_fitness = -1e+10
        actual_population = actual_population + 1
    end

Master Integrates the Results (2)
    if (fit > min_fitness)
        setfield /population[{least_fit}] fitness {fit}
        setfield /population[{least_fit}] a_value {bs_a}
        setfield /population[{least_fit}] b_value {bs_b}
        setfield /population[{least_fit}] c_value {bs_c}
        setfield /population[{least_fit}] d_value {bs_d}
        if (actual_population == population)
            recompute_fitness_extremes
        end
    end
    free_index = free_index + 1
    setfield /free[{free_index}] value {node}
end

A More Realistic Model
We
have a one-compartment cell model of a spiking neuron. Dynamics are probably well-behaved. Parameters are the conductances for the Na, Kdr, Ka, and KM channels. We know a priori that the conductance values are in the range 0.1 to 10.0. We write spike times to a file, then compare these, using a C function spkcmp, to "experimental" data. Stop when our match fitness exceeds 20.0.

Improved Parameter Representation
As before, we still represent the set of parameters that define an individual as a string of bits. However, now each 16-bit string will map logarithmically into the range 0.1 to 10.0, so that we have increased resolution at the low end of the scale.

Crossover Mutations
1. Pick a member of the population at random.
2. Decide whether to do crossover according to the crossover probability. If we are doing crossover, pick another random member of the current population, and combine the "genes" of those individuals. If we aren't doing crossover, just copy the bits of the original individual.
3. Go through each bit of the bit string, and mutate it with some small probability.
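The candidate-generation recipe above can be sketched in ordinary Python. This is an illustration only: the single-point crossover, the exact form of the logarithmic mapping, and the function names are assumptions for the sketch, not the tutorial's actual population.g implementation.

```python
import random

N_BITS = 16  # one 16-bit "gene" per parameter

def decode_log(gene):
    """Map a 16-bit integer logarithmically into [0.1, 10.0]."""
    frac = gene / (2 ** N_BITS - 1)        # 0.0 .. 1.0
    return 0.1 * (10.0 / 0.1) ** frac      # 0.1 .. 10.0

def make_candidate(population, crossover_prob=0.5, bit_mutation_prob=0.02):
    """Generate a new candidate gene from a list of existing genes."""
    parent = random.choice(population)     # step 1: random member
    if random.random() < crossover_prob:   # step 2: maybe crossover
        other = random.choice(population)
        point = random.randrange(N_BITS)   # single-point crossover (assumed)
        mask = (1 << point) - 1
        child = (parent & ~mask) | (other & mask)
    else:
        child = parent                     # just copy the bits
    for bit in range(N_BITS):              # step 3: per-bit mutation
        if random.random() < bit_mutation_prob:
            child ^= 1 << bit
    return child
```

Note that with the logarithmic mapping, half of the 16-bit codes fall below 1.0 (the geometric midpoint of 0.1 and 10.0), which is the increased low-end resolution mentioned above.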
Main Script (1)
int n_nodes = 4
int individuals = 1000
int population = 10
float stopping_criterion = 20.0
float crossover_prob = 0.5
float bit_mutation_prob = 0.02

Main Script (2)
include population.g  // functions for GA population-based parameter searches
// model-specific files
include siminit.g     // defines parameters of simulation
include fI.g          // sets up table of currents
include channels.g    // defines the channels
include simcell.g     // functions to load in the cell model
include eval.g        // functions to evaluate the model

Main Script (3)
paron -farm -silent 0 -nodes {n_nodes} \
    -output o.out -executable nxpgenesis
barrierall
if ({mytotalnode} == 0)
    init_master
    pb_search {individuals} {population}
else
    init_worker
end
barrierall 7 1000000
paroff

Parameters Are Customizable
function init_params
    setfield /params[0] label "Na" scaling "log"
    setfield /params[0] min_value 0.1 max_value 10.0
    setfield /params[1] label "Kdr" scaling "log"
    setfield /params[1] min_value 0.1 max_value 10.0
    setfield /params[2] label "Ka" scaling "log"
    setfield /params[2] min_value 0.1 max_value 10.0
    setfield /params[3] label "KM" scaling "log"
    setfield /params[3] min_value 0.1 max_value 10.0
end

Worker Evaluates Individuals (1)
function evaluate
    float match, fitness
    // first run the simulation
    newsim {getfield /params[0] value} \
        {getfield /params[1] value} \
        {getfield /params[2] value} \
        {getfield /params[3] value}
    runfI
    call /out/{sim_output_file} FLUSH

Worker Evaluates Individuals (2)
    // then find the simulated spike times
    gen2spk {sim_output_file} {delay} \
        {current_duration} {total_duration}
    // then compare the simulated spike
    // times with the experimental data
    match = {spkcmp {real_spk_file} \
        {sim_spk_file} -pow1 0.4 -pow2 0.6 \
        -msp 0.5 -nmp 200.0}
    fitness = 1.0 / {sqrt {match}}
    return {fitness}
end

Tuning Search
representation
parameter selection
generation- vs population-based approach
generation/population size
crossover probability
crossover method
mutation probability
initial ranges

Large Networks with PGENESIS

Parallel Network Creation
In parallel network creation, make sure elements exist before connecting them up, e.g.
create_elements(...)
barrier
create_messages(...)

Goals of decomposition
Keep all processors busy all the time on useful work. Use as many processors as are available. Key concepts are:
Load-balancing
Minimizing communication
Minimizing synchronization
Scalable decomposition
Parallel I/O

Load balancing
Attempt to parcel out the modeled cells such that each CPU takes the same amount of time to simulate one step. This is static load balancing - cells do not move. Dedicated access to the CPUs is required for effective decomposition. It is easier if the CPUs are identically configured. PGENESIS provides no automated load-balancing, but there are some performance monitoring tools.

Minimizing communication
Put highly connected clusters of cells on the same PGENESIS node. Think of each synapse with a presynaptic cell on a remote node as expensive. The same network distributed among more nodes will result in more of these expensive synapses; hence, more nodes can be counterproductive. The time spent communicating can overwhelm the time spent computing.

Orient_tut Example
Non-scalable decomposition

Scalable decomposition (1)
Goal: as the number of available processors grows, your model naturally partitions into finer divisions.

Scalable decomposition (2)

Scalable decomposition (3)
To the extent that you can arrange your decomposition to scale with the number of processors, it is a very good idea to create the scripts using a function of the number of nodes anywhere that a node number must be explicitly specified. E.g.,
createmap /library/rec /retina/recplane \
    {REC_NX / n_slices} {REC_NY} \
    -delta {REC_SEPX} {REC_SEPY} \
    -origin {-REC_NX * REC_SEPX / 2 + \
        slice * REC_SEPX * REC_NX / n_slices} \
    {-REC_NY * REC_SEPY / 2}

Case Study: Cerebellar Model
Howell, D.
F., Dyhrfjeld-Johnsen, J., Maex, R., Goddard, N., De Schutter, E., “A large-scale model of the cerebellar cortex using PGENESIS”, Neurocomputing, 32/33 (2000), p. 1041-1046.
16 Purkinje cells embedded in an environment of other simpler, but more numerous cells. Simulated on 128 processors of PSC’s Cray T3E.

Cell Populations & Connectivity
3-D Representation of Network Model
Partitioning
Timings on 128 Processors of T3E
Timings vs. Model Size
Timings on Workstation Network
Significant Overhead on Cluster

Scaling Up

Getting Cycles
NSF-Funded Supercomputing Centers:
Pittsburgh Supercomputing Center (http://www.psc.edu) – PGENESIS installed on 512-processor Cray T3E
NPACI (http://www.npaci.edu) – Worked on MPI-based PGENESIS
Alliance (http://www.ncsa.uiuc.edu)
The High End: 3000-processor Terascale computer at PSC

Parallel Script Development
1. Develop single-cell prototypes using serial GENESIS.
2. (a) For network models, decide partitioning and develop scalable scripts. (b) For parameter searches, develop scripts to run and evaluate a single individual, and a scalable script that will control the search.
3. Try out scripts on a single processor using the minimum number of nodes.
4. Try out scripts on a single processor but increase the number of nodes.
5. Try out scripts on a small multiprocessor platform.
6. Try out scripts on a large multiprocessor platform.

Resource Limits and Other Tips
On the Cray T3E set PVM_SM_POOL to ensure adequate PVM buffer space. This should be set to the maximum number of messages that might arrive at any PE before it gets a chance to process them. On other machines, you may need to set PVMBUFSIZE to address similar issues. When debugging interactively, set the timeout so that other nodes do not time out:
setfield /post msg_hang_time 10000.0

Reducing Synchronization Delay
In network models, axonal delays L are large compared to the simulation time step.
A spike generated at simulation time T on one node need not be physically delivered to the destination synapse on another node until simulation time T+L. PGENESIS can use this to reduce unnecessary waiting. Node B can get ahead of node A by the minimum of all the axonal delays on the connections from cells on A to synapses on B. This is called the lookahead of B with respect to A. You must set /post/sync_before_step to 0 to allow this looser synchronization.

Reducing Synchronization Delay
A goal when you are partitioning a network across nodes is to make the lookahead between any pair of nodes large. PGENESIS provides the setlookahead command for you to inform it of the lookahead between nodes:
setlookahead 0.01    // sets lookahead to 10 ms
setlookahead 3 0.01  // sets lookahead to 10 ms w.r.t. node 3
The getlookahead command reports the current setting with respect to a particular node, and the showlookahead command reports the minimum lookahead to all other nodes:
getlookahead 3  // gets lookahead with respect to node 3
showlookahead   // gets lookahead with respect to all nodes

Parallel I/O
Currently the I/O facilities (disk elements and Xodus elements) are tightly synchronized with the simulation (no lookahead). Therefore sending messages to Xodus objects or disk objects on remote nodes usually slows the simulation to a crawl. Use Xodus only for post-processing. Try to arrange input and output to be via local elements. On workstations it is preferable to access a local disk. If access is via a shared file system (e.g., NFS, AFS), use different output disk files for different nodes, and amalgamate the data after the simulation is over.

Performance Monitoring (1)
A script can turn on performance monitoring with
setfield /post perfmon 1
and turn it off with
setfield /post perfmon 0
Whenever performance monitoring is active, the categories listed below accumulate time.
To ignore the time involved in construction of a model, do not activate performance monitoring until just prior to the first simulation step. The accumulated time values can be dumped to a file with the perfstats command. This writes a file to /tmp (usually a local disk) called pgenesis.ppp.nnn.acct, where ppp is the process id and nnn is the node number. Each time perfstats is called it dumps the accumulated values, but it does not reset them.

Performance Monitoring (2)
The monitoring package tracks the amount of time spent in various operations:
PGENESIS_PROCESS_SNDREC_SND - time sending data to other nodes
PGENESIS_PROCESS_SNDREC_REC - time receiving data from other nodes
PGENESIS_PROCESS_SNDREC_GETFIELD - time spent gathering local data for transmission to other nodes
PGENESIS_PROCESS_SNDREC - time spent sending and receiving data not accounted for by the three preceding categories
PGENESIS_PROCESS_SYNC - time spent explicitly synchronizing nodes prior to each step

Performance Monitoring (3)
PGENESIS_PROCESS - time spent in parallel overhead of exchanging data with other nodes which is not accounted for by the preceding categories
PGENESIS_EVENT - time spent handling incoming spikes
PGENESIS - time spent in PGENESIS not accounted for by the preceding overhead categories (in other words, the time spent doing useful work)
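Before moving on, the lookahead rule from the synchronization section can be made concrete: node B may run ahead of node A by the minimum axonal delay over all connections from cells on A to synapses on B. The Python sketch below only illustrates the arithmetic; the connection table is hypothetical and nothing like this appears in a PGENESIS script.

```python
def lookahead(connections, src, dst):
    """Minimum delay (seconds) over connections src -> dst; None if unconnected."""
    delays = [d for (a, b, d) in connections if a == src and b == dst]
    return min(delays) if delays else None

# (source node, destination node, axonal delay in seconds) -- made-up values
connections = [
    (0, 1, 0.010),
    (0, 1, 0.015),
    (1, 0, 0.020),
]

print(lookahead(connections, 0, 1))  # prints 0.01: node 1 may run 10 ms ahead
```

This is the value you would pass to setlookahead for each node pair; the larger you can make the smallest inter-node delay when partitioning, the less time nodes spend waiting for each other.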
Comparisons and Summary

Alternatives to PGENESIS (1)
Batch scripts (Perl, Python, bash) for parameter searching:
Incur GENESIS process startup and network setup overheads
If simulations are long, and the evaluation step is already done externally, they may be simpler
NEURON:
Parallel parameter searching (talk with Mike Hines)
Vectorized NEURON, if you happen to have a vector machine handy

Alternatives to PGENESIS (2)
NEOSIM (http://www.neosim.org/):
Prototype stage (Java kernel released)
Integration with NEURON simulation engine
Supports automatic network partitioning
Modular architecture
Designed for scalability
Hand-coded simulation (Java, C++, C, Fortran):
Very time-consuming (especially parallel coding)
Difficult to share models
Specialized code can run much faster
Possibly appropriate for large, but simple models (e.g. connectionist-style approaches)

Summary
PGENESIS is a GENESIS extension which lets you use multiple computers to:
Perform large parameter searches much more quickly
Simulate large network models more quickly

Discussion

References
• Goddard, N.H. and Hood, G., Large-scale simulation using parallel GENESIS, The Book of GENESIS, 2nd ed., Bower, J.M. and Beeman, D. (Eds.), Springer-Verlag, 1998.
• Goddard, N.H. and Hood, G., Parallel Genesis for large scale modeling, Computational Neuroscience: Trends in Research 1997, Plenum Publishing, NY, 1997, p. 911-917.
• Howell, D. F., Dyhrfjeld-Johnsen, J., Maex, R., Goddard, N., De Schutter, E., A large-scale model of the cerebellar cortex using PGENESIS, Neurocomputing, 32/33 (2000), p. 1041-1046.