Using the “Matrix Laboratory” aka MATLAB to Process Seismological Datasets

Andy Frassetto, Bob Woodward
Instrumentation Services, IRIS

USArray Data Processing and Analysis Short Course
August 4-8, 2014
Northwestern University, Evanston, Illinois

Courtesy Maggie Benoit (Friend of IRIS, NSF Program Officer)

The “Bad” News…
• MATLAB is a commercial product:
  – Considerable startup cost for license/toolboxes
  – Yearly maintenance fees
• MATLAB can be clunky
  – Slow to open, difficult to use with strings

The “Good” News…
• MATLAB is a commercial product:
  – Major discounts for students, universities often purchase group licenses, etc.
  – Extensive resources, support from the source
  – Widespread, multi-disciplinary base of users
  – Established, prominent, and well supported within the seismological community

The Basics
• Data acquisition
• Data file format
• Reading data
• Data preprocessing
• Analysis
  – Speed, speed, speed
• The importance of sanity checks
  – Use the plotting tools!
• Results
  – Use the plotting tools!

Data Files
• Data file formats
  – SAC files are traditionally used
  – Changing trends:
    • Chad’s talk tomorrow covers reading data directly via irisFetch.m
    • With time and effort you can read any ASCII / binary file format
  – Gotcha: byte order
• Data reading
  – Use wildcards in file names to limit the number of files that need to be read
  – Typically make a pass to read headers
  – Allocate memory for data
  – Read data into one matrix
  – Gotcha: SNCL (KNETWK, KCMPNM, KSTNM, KHOLE)

Prepping the Data
• Organize data in matrix
  – Seismograms down columns
    • Fastest for working with single seismograms
    • Facilitates using MATLAB functions
• Use epochal time for all time-related arithmetic
  – Gotcha: MATLAB epochal time is different from UNIX
• Decimate as you go
  – To minimize total memory required
• The Signal Processing Toolbox contains common filtering and tapering functions
• Consider using the MATLAB “save” command to save a pre-processed (e.g., collated, decimated, etc.) copy of the data
  – Can save significant time when doing large numbers of runs to explore other parameters or to debug

Analysis
• Typically, this is all about speed
  – See “resources” section at end of slides
• Rich built-in function environment
  – Build your own functions for code re-use
• Use development tools, e.g. checkcode
  – http://www.mathworks.com/help/matlab/ref/checkcode.html
• Use the simple plot commands to sanity check your results
  – This is one of the key benefits of developing in MATLAB
  – Debugging code
  – Debugging algorithm
  – Gotcha: label those axes
  – Especially useful:
    • “hist” to make a histogram of just about anything
    • “plot” – time series and other continuous functions
    • “scatter” – discrete x-y data points

MATLAB Resources
• MATLAB Central
  – Functions and help threads for all types of calculations, great for custom needs
  – http://www.mathworks.com/matlabcentral/
• The GISMO Waveform Suite – Celso Reyes
  – Seismic analysis and display
  – http://www.giseis.alaska.edu/Seis/EQ/tools/GISMO/
• SEIZMO – Garrett Euler
  – http://epsc.wustl.edu/~ggeuler/codes/m/seizmo/
• M_Map – R. Pawlowicz
  – Mapping tools
  – http://www.eos.ubc.ca/~rich/map.html
• Libmseed – IRIS DMC (C. Trabant)
  – Comprehensive library of miniSEED manipulation functions
  – C code, with a MATLAB interface module
• matTaup – a MATLAB version of the TauP toolkit
• Ttbox – travel time toolbox – M. Knapmeyer
  – http://www.dlr.de/pf/en/desktopdefault.aspx/tabid4880/8104_read-36233/
• Function library – Frederik Simons
  – http://geoweb.princeton.edu/people/simons/software.html
• Accelerating MATLAB
  – Loren Shure’s blog on accelerating MATLAB
    • http://blogs.mathworks.com/loren/2008/06/25/speeding-up-matlab-applications/
  – Summary / tutorial on accelerating MATLAB
    • http://www.getreuer.info/matopt.pdf?attredirects=0

Or maybe don’t use MATLAB at all… enter Octave, an open source alternative: http://www.gnu.org/software/octave/
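
Sketch: the byte-order gotcha

The byte-order gotcha from the Data Files slide can be illustrated with a small sketch. It assumes a made-up file of raw 32-bit floats (not a real SAC file) written in big-endian order; the file name and sample values are hypothetical. The third argument of fopen selects the byte order used by fread and fwrite.

```matlab
% Write three 32-bit floats in big-endian order to a temporary file.
% The file and its contents are made up for illustration.
fname = [tempname '.bin'];
x = [1; 2.5; -3];
fid = fopen(fname, 'w', 'ieee-be');
fwrite(fid, x, 'float32');
fclose(fid);

% Read with the correct byte order.
fid = fopen(fname, 'r', 'ieee-be');
good = fread(fid, 3, 'float32');
fclose(fid);

% Read with the wrong byte order: no error is raised, the values are
% simply wrong.
fid = fopen(fname, 'r', 'ieee-le');
bad = fread(fid, 3, 'float32');
fclose(fid);
delete(fname);

disp([good bad]);   % column 1: correct values, column 2: garbage
```

Because the wrong byte order fails silently, a quick look at the values (or a plot) right after reading is a cheap sanity check.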
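
Sketch: wildcard, header pass, allocate, read

The data-reading steps from the Data Files slide (wildcard the file names, make a pass for the headers, allocate, read into one matrix) might look roughly like this. To stay self-contained, the sketch writes three made-up files of raw float32 samples into a scratch directory; the file names and the use of file size in place of a real header read (e.g., NPTS in a SAC header) are assumptions for illustration.

```matlab
% Create a scratch directory with three made-up "seismograms" of different
% lengths, stored as raw float32 samples (a stand-in for real SAC files).
d = tempname;
mkdir(d);
npts_true = [100 80 120];
for k = 1:3
    fid = fopen(fullfile(d, sprintf('sta%d.BHZ.bin', k)), 'w');
    fwrite(fid, k * ones(npts_true(k), 1), 'float32');
    fclose(fid);
end

% Wildcard: select only the files that need to be read.
files  = dir(fullfile(d, '*.BHZ.bin'));
nfiles = numel(files);

% Pass 1: get the length of each trace. Here the file size stands in for
% reading the number of points from a header.
npts = zeros(1, nfiles);
for k = 1:nfiles
    npts(k) = files(k).bytes / 4;        % 4 bytes per float32 sample
end

% Allocate once, then pass 2: read each trace down its own column.
% Shorter traces are left zero-padded at the end.
data = zeros(max(npts), nfiles);
for k = 1:nfiles
    fid = fopen(fullfile(d, files(k).name), 'r');
    data(1:npts(k), k) = fread(fid, npts(k), 'float32');
    fclose(fid);
end

% Clean up the scratch files.
delete(fullfile(d, '*.bin'));
rmdir(d);
```

Allocating the full matrix before the read loop avoids growing the array on every iteration, which is the point of the separate header pass.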
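
Sketch: MATLAB vs. UNIX epochal time

The epochal-time gotcha from the Prepping the Data slide comes down to units and origin: MATLAB serial date numbers (datenum) count days, while UNIX time counts seconds since 1970-01-01 00:00:00 UTC. A minimal conversion sketch follows; the helper names matlab2unix and unix2matlab are made up here, and leap seconds are ignored.

```matlab
% MATLAB serial date numbers are in days; UNIX time is in seconds
% since 1970-01-01 00:00:00 UTC.
unix_origin = datenum(1970, 1, 1);       % serial date number of the UNIX origin
sec_per_day = 86400;

% Hypothetical helper names, defined here for illustration.
matlab2unix = @(dn) (dn - unix_origin) * sec_per_day;
unix2matlab = @(t)  t / sec_per_day + unix_origin;

t0_matlab = datenum(2014, 8, 4, 0, 0, 0);    % first day of the short course
t0_unix   = matlab2unix(t0_matlab);

fprintf('MATLAB: %.4f   UNIX: %d\n', t0_matlab, t0_unix);
fprintf('Round trip: %s\n', datestr(unix2matlab(t0_unix), 'yyyy-mm-dd HH:MM:SS'));
```

One design note: a double-precision serial date number for 2014 resolves time to roughly 10 microseconds, so for high sample rates it can be safer to do arithmetic in seconds relative to a reference time.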
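
Sketch: seismograms down columns, then save

The Prepping the Data steps can be sketched with synthetic traces standing in for real data (the sample rate, trace count, and signal are all made up). The sketch stores one seismogram per column, applies a column-wise built-in, downsamples, and saves the pre-processed copy. Note that the plain downsampling used here is a stand-in, not proper decimation.

```matlab
% Synthetic stand-in for real data: 5 traces of 1000 samples at 40 sps,
% each a 1 Hz sine plus an offset and a linear drift, one trace per column.
dt   = 1 / 40;
npts = 1000;
ntr  = 5;
t    = (0:npts-1)' * dt;
data = zeros(npts, ntr);                 % allocate once
for k = 1:ntr
    data(:, k) = sin(2*pi*1*t) + k + 0.1*k*t;
end

% Column-wise built-ins process every seismogram in one call.
data = detrend(data);                    % remove best-fit line from each column

% Keep every 4th sample (40 sps -> 10 sps). This plain downsampling is only
% safe because the signal is far below the new Nyquist frequency (5 Hz);
% real data need an anti-alias filter first, e.g. decimate in the
% Signal Processing Toolbox.
q       = 4;
data_ds = data(1:q:end, :);
dt_ds   = dt * q;

% Save the pre-processed copy, then reload it as later runs would.
fname = [tempname '.mat'];
save(fname, 'data_ds', 'dt_ds');
S = load(fname);
delete(fname);
```

Later parameter-exploration runs can then start from the load call and skip the reading and pre-processing steps entirely.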
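
Sketch: loops vs. column-wise operations

The "speed, speed, speed" point from the Analysis slide usually means replacing explicit loops with operations on whole columns. The following sketch compares the two on random synthetic data (sizes chosen arbitrarily); the actual timings depend on the machine and MATLAB version, so no particular speedup is claimed.

```matlab
% Synthetic data: 2000 samples x 200 traces, one seismogram per column.
data = randn(2000, 200);
[npts, ntr] = size(data);

% Loop version: RMS amplitude accumulated one sample at a time.
tic;
rms_loop = zeros(1, ntr);
for k = 1:ntr
    acc = 0;
    for n = 1:npts
        acc = acc + data(n, k)^2;
    end
    rms_loop(k) = sqrt(acc / npts);
end
t_loop = toc;

% Vectorized version: one expression operating down every column.
tic;
rms_vec = sqrt(mean(data.^2, 1));
t_vec = toc;

fprintf('loop: %.4f s   vectorized: %.4f s\n', t_loop, t_vec);
fprintf('max difference: %.2e\n', max(abs(rms_loop - rms_vec)));
```

Printing the difference between the two results doubles as a sanity check that the faster version computes the same thing.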
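
Sketch: sanity-check plots with labeled axes

The three plot commands named on the Analysis slide can be combined into a quick sanity-check figure. The data below are synthetic and the axis labels and units are placeholders; the figure is created hidden so the sketch also runs without a display.

```matlab
% Synthetic results to inspect: a noisy 2 Hz time series and some x-y pairs.
t  = (0:0.01:10)';
y  = sin(2*pi*2*t) + 0.2*randn(size(t));
xs = rand(50, 1);
ys = 3*xs + 0.1*randn(50, 1);

fig = figure('Visible', 'off');          % use 'on' for interactive work

% "plot": time series and other continuous functions.
subplot(3, 1, 1);
plot(t, y);
xlabel('Time (s)'); ylabel('Amplitude');

% "hist": bin the amplitudes, then draw the counts as bars.
subplot(3, 1, 2);
[counts, centers] = hist(y, 20);
bar(centers, counts);
xlabel('Amplitude'); ylabel('Number of samples');

% "scatter": discrete x-y data points.
subplot(3, 1, 3);
scatter(xs, ys);
xlabel('x'); ylabel('y');
```

hist is still available, although newer MATLAB releases recommend histogram instead.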