
Robert’s Rules of Matlab
Disclaimer:
My usual processing flow is to use C or Fortran codes linked
together via tcsh scripts.
I find Matlab to be really great for easy-to-code routines
that require user interaction. Instances include SplitLab
and FuncLab, as you will learn later this week, quality
control of ambient noise correlations, and determination of
discontinuities from 1D shear velocity and
receiver functions.
Poker
Matlab is not only for data processing.
From the Matlab command window, type “poker” to launch
a very simple draw poker game I wrote in 2004.
What is this code?
Input user parameters
Set the random number generator
While (player has chips and has not decided to cash out)
Generate a hand
display hand
allow user to exchange cards
display new hand
analyze results and pay out
End while
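The outline above can be sketched in a few lines of Matlab. This is a hypothetical, non-interactive illustration of the loop structure only, not the author's poker game: the starting stake, the ante, the fixed number of rounds (standing in for the player's decision to cash out), the automatic card exchange, and the payout rule are all assumptions.

```matlab
% Hypothetical sketch of the loop structure; not the original poker game.
rng(2004);                 % set the random number generator (arbitrary seed)
chips     = 10;            % assumed starting stake
maxRounds = 5;             % stands in for "player decides to cash out"
nRounds   = 0;

while chips > 0 && nRounds < maxRounds
    nRounds = nRounds + 1;
    chips   = chips - 1;                   % assumed ante of one chip per hand

    deck  = randperm(52);                  % shuffled card indices 1..52
    hand  = deck(1:5);                     % generate a hand
    ranks = mod(hand - 1, 13) + 1;         % rank 1..13 of each card
    fprintf('Round %d hand ranks: %s\n', nRounds, mat2str(ranks));

    % Exchange step: swap the first card for the next card in the deck.
    % A real game would ask the user which cards to exchange.
    hand(1) = deck(6);
    ranks   = mod(hand - 1, 13) + 1;
    fprintf('Round %d new ranks:  %s\n', nRounds, mat2str(ranks));

    % Analyze and pay out: 2 chips if any rank appears at least twice.
    counts = accumarray(ranks(:), 1, [13 1]);
    if any(counts >= 2)
        chips = chips + 2;
    end
end
fprintf('Cashed out with %d chips after %d rounds.\n', chips, nRounds);
```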
Getting help
The help text is the first block of comments in the .m file. It is displayed when the user types:
>> help command.m
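As a minimal sketch of what such a comment block can look like, here is a small function file. The function name demean and its file name demean.m are hypothetical, chosen only for illustration.

```matlab
function y = demean(x)
%DEMEAN Remove the mean from a vector.
%   Y = DEMEAN(X) returns X with its mean subtracted.
%
%   Example:
%       y = demean([1 2 3]);   % y is [-1 0 1]

% This comment follows a blank line, so it is not part of the help text.
y = x - mean(x);
end
```

With this file on the path, typing help demean at the command window should print the comment lines up to the first blank line.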
Pre-allocate your arrays
30x speedup!
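A minimal sketch of the comparison, assuming a simple loop that fills a vector. The array size and variable names are illustrative; the 30x figure is the slide's, and the gain you measure will depend on the Matlab version, the array size, and the machine.

```matlab
n = 1e5;

% Growing the array: Matlab has to reallocate as the vector gets longer.
tic;
grown = [];
for k = 1:n
    grown(k) = k^2; %#ok<SAGROW>
end
tGrow = toc;

% Pre-allocated: memory is reserved once with zeros, then filled in place.
tic;
prealloc = zeros(1, n);
for k = 1:n
    prealloc(k) = k^2;
end
tPre = toc;

fprintf('grown: %.4f s, pre-allocated: %.4f s\n', tGrow, tPre);
```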
Matrices and array notation
Matlab is built on the idea that everything is a matrix (or vector).
Array operations are much faster than nested for loops.
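To illustrate the point, the sketch below computes the same result with nested for loops and with a single array statement, then shows colon and logical indexing. The matrices are arbitrary examples.

```matlab
A = magic(4);      % 4x4 example matrix
B = ones(4);

% Nested-loop version: element-by-element arithmetic.
C_loop = zeros(size(A));
for i = 1:size(A, 1)
    for j = 1:size(A, 2)
        C_loop(i, j) = A(i, j) + 2 * B(i, j);
    end
end

% Array-notation version: one statement, no loops.
C_vec = A + 2 * B;

% Colon and logical indexing also avoid loops.
col2 = A(:, 2);        % the whole second column
big  = A(A > 10);      % all elements greater than 10, as a column vector
```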
The cell array
A cell array can be seen as an array of matrices. Each matrix can have its
own size and type. Where matrix/vector elements are accessed
by enclosing the indices in parentheses ( ), cell array contents are
accessed with curly braces { }.
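A short sketch of the difference between the two kinds of indexing. The contents (a matrix, a made-up station label, and a short vector) are illustrative only.

```matlab
c = cell(1, 3);          % pre-allocate a 1x3 cell array
c{1} = magic(3);         % a 3x3 numeric matrix
c{2} = 'STA01';          % a character string (made-up label)
c{3} = [1.5 2.5];        % a 1x2 vector

m   = c{1};              % curly braces return the contents (the matrix)
sub = c(1);              % parentheses return a 1x1 cell array
val = c{1}(2, 3);        % index into the matrix stored in the first cell
```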
Structures
variable.element
Think of structures as a handy way to keep track of, pass, and handle
‘objects’ with common properties. In the example here, I use a ‘station’
with some common parameters. I can then operate on a structure variable
as long as I know which properties it contains.
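Since the original slide example is not reproduced in this text, here is a minimal sketch of a 'station' structure. The field names and values are assumptions made up for illustration.

```matlab
% Build a structure field by field using variable.element notation.
station.name      = 'STA01';      % made-up station label
station.latitude  = 34.02;        % degrees
station.longitude = -118.29;      % degrees
station.elevation = 0.10;         % km

% Structures can be arrays; each element has the same fields.
station(2).name      = 'STA02';
station(2).latitude  = 35.00;
station(2).longitude = -117.50;
station(2).elevation = 0.85;

% Collect one field across the whole array.
lats = [station.latitude];

% Operate on a structure knowing only which fields it contains.
describe = @(s) sprintf('%s (%.2f, %.2f)', s.name, s.latitude, s.longitude);
disp(describe(station(1)));

% Check for a field before relying on it.
hasElevation = isfield(station, 'elevation');
```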
Avoid the parfor temptation
To parallelize a Matlab for loop, all you have to do is:
1. Change “for” to “parfor”
2. Open a pool of Matlab workers to run the parallel loop
BUT I suggest avoiding this.
Why?
Matlab relies on built-in algorithms that are already optimized and
often are already parallel. Therefore, if you override that optimized
parallel code with a non-optimized block, you’re likely to see drops
in performance because the resources are not available for the
optimized function.
Furthermore, there is a cost of parallelization – namely message passing.
Each time a parallel block is entered, the processor has to figure out
and send information to the nodes before the parallel job even begins.
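A rough way to check this on your own machine is to time a hand-written parfor loop against the equivalent built-in call. The sketch below is illustrative: the problem size is arbitrary, timings vary by machine, and it assumes that parfor falls back to an ordinary serial loop when the Parallel Computing Toolbox is not installed.

```matlab
x = rand(1, 1e6);

% Hand-written loop, parallelized only by changing "for" to "parfor".
% Any pool start-up and message passing is included in this timing.
tic;
total = 0;
parfor k = 1:numel(x)
    total = total + x(k)^2;    % "total" is a reduction variable
end
tPar = toc;

% Built-in, already-optimized equivalent.
tic;
totalBuiltin = sum(x.^2);
tVec = toc;

fprintf('parfor: %.4f s, built-in sum: %.4f s\n', tPar, tVec);
```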
Avoid toolboxes in distributed code
Toolboxes are great extensions to Matlab, but for a single-user license they cost
~$45 each in addition to the ~$150 base cost. Institutional costs are on the order of
$500 for the base and ~$100 per toolbox. Therefore you can’t assume other users
of your code will have access to the same set of toolboxes.
Need a toolbox function?
Is it simple?
Write it yourself.
Is it not simple?
Google the usage; sometimes you will find a drop-in replacement.
For instance, some of the SRF codes we use at USC rely on reckon.m from
the mapping toolbox. We recently found geodreckon.m, which is nearly a drop-in
replacement freely available online from the author (Charles Karney).
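As a sketch of the "write it yourself" route, the snippet below uses reckon when it is on the path and otherwise falls back to a hand-written spherical formula built from base Matlab functions. The fallback assumes a spherical Earth with arc length and azimuth in degrees, and the input values are arbitrary; it is not the code used in the SRF packages.

```matlab
% Start point, azimuth, and arc length, all in degrees (example values).
lat1 = 0; lon1 = 0; az = 90; arclen = 90;

if exist('reckon', 'file')
    % Mapping toolbox is available: use its function.
    [lat2, lon2] = reckon(lat1, lon1, arclen, az);
else
    % Fallback: forward problem on a sphere, base Matlab only.
    lat2 = asind(sind(lat1) .* cosd(arclen) + ...
                 cosd(lat1) .* sind(arclen) .* cosd(az));
    lon2 = lon1 + atan2d(sind(az) .* sind(arclen) .* cosd(lat1), ...
                         cosd(arclen) - sind(lat1) .* sind(lat2));
end

fprintf('End point: lat %.2f, lon %.2f\n', lat2, lon2);
```

Checking for the function at run time keeps the code usable for people with and without the toolbox.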
Pandora’s box: UI control
uimenu sets menu items such as file->open or edit->select all
uipanel creates a grouped background useful for arranging user controls
uicontrol creates an instance of some user-interactive widget such as a button,
popup menu, editable text box, radio buttons, etc. When one of these items is
activated (i.e., a button is pressed) it invokes what is called a ‘callback function’.
Making a GUI is all about designing the button layout and then coding the set of
callback functions to process the data as input by the user.
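A minimal sketch that puts the three pieces together: one menu, one panel, and one button whose callback updates a text label. The layout, labels, and strings are arbitrary choices for illustration.

```matlab
fig = figure('Name', 'Callback demo', 'MenuBar', 'none', ...
             'NumberTitle', 'off');

% uimenu: a File menu with one item that closes the window.
fileMenu = uimenu(fig, 'Label', 'File');
uimenu(fileMenu, 'Label', 'Close', 'Callback', @(src, evt) close(fig));

% uipanel: a titled background that groups the controls.
panel = uipanel(fig, 'Title', 'Controls', 'Position', [0.05 0.05 0.9 0.9]);

% uicontrol: a text label and a push button inside the panel.
lbl = uicontrol(panel, 'Style', 'text', 'String', 'Waiting...', ...
                'Units', 'normalized', 'Position', [0.1 0.6 0.8 0.2]);

% The callback runs when the button is pressed and updates the label.
btn = uicontrol(panel, 'Style', 'pushbutton', 'String', 'Press me', ...
                'Units', 'normalized', 'Position', [0.1 0.2 0.8 0.3], ...
                'Callback', @(src, evt) set(lbl, 'String', 'Button pressed'));
```

In a real tool the callback would usually be a named function that reads the user's input and runs the processing step.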
Search for existing code!
The real value of Matlab is its extensive code/user base.
Developers from across the sciences all use
Matlab and contribute new code which may
help you make a new discovery. Many of these
codes can be found on
http://www.mathworks.com/matlabcentral/fileexchange
http://github.com
Think “Processing Flow” rather than just processing
Who wants to click once per seismogram?
When you’re dealing with big data, it’s only practical to have the data (not
just waveforms) flow through filters and processes to develop the final
product.
You will see this in FuncLab and SplitLab. Half of the work is just setting up
the workflow and then hitting a button to actually run the processing flow.
Things like SAC or the Matlab command line are great for interactively
experimenting with small data. But once you want to scale up, you need
to think through as many possibilities as you can.
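One way to picture a processing flow in Matlab is as an ordered list of function handles applied to every trace in a data set. The sketch below is a hypothetical illustration with synthetic traces and two simple steps (demean, then normalize by peak amplitude); it is not how FuncLab or SplitLab are implemented.

```matlab
% Synthetic "seismograms" of different lengths stand in for real data.
rng(1);
traces = {randn(1, 200) + 5, randn(1, 300) - 2, 10 * randn(1, 250)};

% The flow: an ordered list of processing steps.
steps = {@(x) x - mean(x), ...          % remove the mean
         @(x) x ./ max(abs(x))};        % normalize to unit peak amplitude

% Run every trace through every step, with no clicking per trace.
processed = cell(size(traces));
for k = 1:numel(traces)
    y = traces{k};
    for s = 1:numel(steps)
        y = steps{s}(y);
    end
    processed{k} = y;
end

% A simple quality-control summary of the output.
peaks = cellfun(@(x) max(abs(x)), processed);
fprintf('Processed %d traces; peak amplitudes: %s\n', ...
        numel(processed), mat2str(peaks, 3));
```

Adding, removing, or reordering a step then means editing the steps list rather than the loop.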
Some useful packages:
m_map
seizmo
the waveform suite
coral
processRFmatlab
FuncLab
SplitLab
irisFetch.m + IRIS-WS jar