Lesson 18 Using text files to share data with other programs Python Mini-Course University of Oklahoma Department of Psychology 1 Python Mini-Course: Lesson 18 5/07/09 Lesson objectives 1. Identify and describe the common file formats for data exchange, including text files, csv files, and tab-delimited files. 2. Read and write delimited text files in Python 3. Use a spreadsheet to read and create delimited text files 2 Python Mini-Course: Lesson 18 5/07/09 Data exchange Key principle: Data must be stored in a common format Industry standard formats for encoding text ASCII, Unicode, etc. Industry standard formats for exchanging data Delimited text files 3 Python Mini-Course: Lesson 18 5/07/09 Delimiting Method of marking the boundaries between data fields in databases, spreadsheets, and tabular data 4 Python Mini-Course: Lesson 18 5/07/09 Common delimited text files White-space delimited Any non-printable character code (space, tab, newline, etc) Common in older systems and for numeric data (e.g. SAS data files) 5 Python Mini-Course: Lesson 18 5/07/09 Common delimited text files Comma delimited Most common format Files are designated as csv files (comma-separated values) Example: USF norms http://w3.usf.edu/FreeAssociation/ 6 Python Mini-Course: Lesson 18 5/07/09 Common delimited text files Tab delimited Excellent format for tabular data Can be read directly by Excel Sometimes called tsv files Example: Substance Abuse and Mental Health Data Archive http://www.icpsr.com/SAMHDA/index.html 7 Python Mini-Course: Lesson 18 5/07/09 Example: sqrtable.py # Create a table of squares and square roots import math start, stop, step = 1, 100, 1 delimiter = '\t' filename = 'squares.txt' num, sqr, sqr_rt = [], [], [] for val in range(start, stop+1, step): num.append(val) sqr.append(val**2) sqr_rt.append(math.sqrt(val)) 8 Python Mini-Course: Lesson 18 5/07/09 Example: sqrtable.py # Save to a delimited text file fout = open(filename, 'w') hdr = 'Num%sSquare%sSqrRoot\n' % (delimiter, delimiter) print hdr fout.write(hdr) for row in zip(num, sqr, sqr_rt): line = '%d%s%d%s%2.4f\n' % \ (row[0], delimiter, row[1], delimiter, row[2]) print line fout.write(line) fout.close() 9 Python Mini-Course: Lesson 18 5/07/09 Exercises 1. Open the file square.txt with a text editor 2. Open the file square.txt with a spreadsheet application (e.g., Excel) 3. Create a similar spreadsheet and save as a text file 10 Python Mini-Course: Lesson 18 5/07/09 Example: readsqr.py delimiter = '\t' filename = 'squares.txt' num, sqr, sqr_rt = [], [], [] fin = open(filename, 'r') hdr = fin.readline().strip() for row in fin: line = row.strip() entries = line.split(delimiter) num.append(int(entries[0])) sqr.append(int(entries[1])) sqr_rt.append(float(entries[2])) fin.close() 11 Python Mini-Course: Lesson 18 5/07/09 Example: readsqr.py # Print a table of values print hdr for row in zip(num, sqr, sqr_rt): line = '%d%s%d%s%2.4f' % \ (row[0], delimiter, row[1], delimiter, row[2]) print line 12 Python Mini-Course: Lesson 18 5/07/09