Python for Science
Shane Grigsby

What is Python? Why Python?
• Interpreted, object-oriented language
• Free and open source
• Focus is on readability
• Fast to write and debug code
• Large community
  – Lots of documentation
  – Lots of packages…
• General-purpose language

The Python scientific stack:

Python: Fast to write, slow to run?
• Depends on how you use it; if something is slow, there is probably a faster way to do it!
Are you using a numeric library?
Are you slicing through arrays, or looping over lists?
Is your code vectorized?
NumPy calls compiled C and Fortran code to do array operations
Many other libraries call C for operations…
…or have functions written in both C and Python
e.g., scipy.spatial.KDTree vs. scipy.spatial.cKDTree

How is Python different from MATLAB?
• Indexing starts at 0
• Whitespace delimited (indentation marks code blocks)
• Default behavior is element-by-element when dealing with arrays
• Functions and tuples use ()’s, indexes use []’s, dictionaries use {}’s
• You don’t have to use a ‘;’ on every command
• Object oriented
See also: http://mathesaurus.sourceforge.net/matlab-python-xref.pdf

Today’s tutorial
• Intro to the scientific stack
• Importing modules
• Intro to the development environments:
  – Spyder
  – IPython
• Indexing
• Defining functions
• Plotting and graphics
• Intro to data structures
• Basic looping (maybe…)
• Additional pandas
  – data import from clipboard
  – time series (on your own)

Notebooks:
• SARP python tutorial
• Prism Data
• Regression loops

Files:
• Leaf_Angles.csv
• good_12_leafangle.h5
• monthly.nc # Optional
• *_spec.txt

Terminals and Prompts
• We’ll use Python and Python tools from three different ‘prompts’:
  – The system prompt (i.e., cmd)
  – From Spyder
  – From the IPython Notebook
• Note that these will all run the same version of Python, but with slightly different behaviors

Imports
Basic Python is sparse…
…but we can import!
    import tables
    import numpy as np
    from pylab import *
    from scipy import ones, array

Example time: Using the IPython notebook
• Notes:
  – ‘%’ commands are specific to IPython (including the notebook); they won’t work at the system prompt or in plain python
  – We’ll use: %pylab inline
    • This doesn’t work for 3D or interactive plots (yet)
    • Use Spyder or IPython (without the notebook) to access interactive graphics.

Imports
• Python searches the current directory (or the script’s directory) first
• It then searches the Python install directory – i.e., lib/python2.7/site-packages
• When imported names conflict, the last import wins

Defining a function

    from scipy.constants import *
    from numpy import exp  # exp() is otherwise supplied by 'from pylab import *'

    # Unit multipliers (wavelength units in meters, frequency units in Hz)
    ang = 1E-10
    nm = 1E-9
    um = 1E-6
    Cm = 1E-2
    hH = 1.0
    kH = 1E3
    mH = 1E6
    gH = 1E9
    tH = 1E12

    def Xwave(wavelength, temp, unit=1):
        X_wave = (h*c)/(k*(wavelength*unit)*temp)
        return X_wave

    def Lwave(wavelength, temp, unit=1):
        """Calculates L given wavelength and Temp
        To get M, multiply by pi
        Note that units are: W * m**-2 * sr**-1 * m**-1
        I.e, units are given in meter of spectrum
        multiply by nm to get: W * m**-2 *sr**-1 nm**-1"""
        X_funct = Xwave(wavelength, temp, unit)
        L = (2*h*(c**2))/(((wavelength*unit)**5)*(exp(X_funct)-1))
        return L

• Multiline comment (the triple-quoted docstring)
• Keyword arguments
  – can use default values
• Definition syntax
  – return is optional
• Constants defined at the top of the script
• Top line brings in physical constants, so we don’t have to define them ourselves…

Defining Functions
• Functions are defined using ‘def’, the function name, ‘()’’s with parameters, and an ending ‘:’
• The function body is demarcated using white space (as in for loops)
• Functions are called using the function name, ‘()’’s, and input parameters
  – Note that the names of the variables you pass in don’t have to match the parameter names in the definition…

Looping in Python

    for item in list:
        print item

    for i in range(len(list)):   # NO…
        print list[i]            # executes, but is not idiomatic

• ‘item’ is a variable name; it is not declared in advance and it is arbitrary (e.g., it could be ‘i’, ‘point’, or some other label).
  There could also be more than one variable here; see the next slide…
• ‘for’ and ‘in’ are syntactically required; they bracket our variables.
• ‘list’ could be another data structure (dictionary, array, etc.), or, in advanced use, a function that produces items to loop over.
• Note: else and elif are not required, but can be used
• Note: the white space is required: either a tab or four spaces, used consistently
• Note: we don’t need to know how many items are in the data structure

A more advanced loop:

    from liblas import file
    import scipy
    f = file.File('/Users/grigsbye/Downloads/Alameda_park_trees_pts.las', mode='r')
    treeData = scipy.ones((len(f), 3))
    for i, p in enumerate(f):
        treeData[i,0], treeData[i,1], treeData[i,2] = p.x, p.y, p.z

• First line imports a special module to read .las files
• Third line opens a .las file as a Python object we can loop over
• Fourth line creates a scipy/numpy array to hold our data
• ‘enumerate’ returns an index number for each point that we are looping over
• The last line assigns the x, y, and z values to our array, incrementing the index by one with each loop
• For a more complete guide to looping, see: http://nedbatchelder.com/text/iter.html
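
Trying the loops without a .las file:
The looping patterns above can be tried without liblas. This is a minimal sketch, assuming a made-up list named `points` in place of the .las file object. The names `points`, `tree_data`, and `tree_data_vec` are illustrative and do not come from the slides. `print(...)` is written with parentheses so the snippet should run under both Python 2.7 and Python 3.

```python
import numpy as np

# Hypothetical stand-in for points read from a .las file:
# each tuple holds (x, y, z). The values are made up for illustration.
points = [(1.0, 2.0, 0.5), (3.0, 4.0, 1.5), (5.0, 6.0, 2.5)]

# Direct iteration: no index and no length needed.
for point in points:
    print(point)

# enumerate() supplies an index alongside each item, and the
# tuple unpacks straight into x, y, z.
tree_data = np.ones((len(points), 3))
for i, (x, y, z) in enumerate(points):
    tree_data[i, 0], tree_data[i, 1], tree_data[i, 2] = x, y, z

# Vectorized alternative: build the whole array in one call.
tree_data_vec = np.array(points)
```

The final line produces the same array as the loop. When the data are already in memory, the vectorized form is usually the faster choice.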