Chapter 5

Computer Science 121
Scientific Computing
Winter 2012
Chapter 5
Files and Scripts
Files and Scripts
● File (non-technical): (Word) document, image, recording, video, etc.
● File (technical): a named collection of bytes on disk.
ASCII vs. Binary
● “ASCII file” means “file that can be viewed as text by a program (Notepad) that interprets each byte as an ASCII code”.
● Binary file is anything that cannot be viewed that way:
– “JPEG file” means “file that can be viewed as an image by using a program (Photoshop) that interprets the bytes as a JPEG-encoded image”.
– “MP3 file” means “file that can be heard as an audio recording by using a program that interprets the bytes as an MP3-encoded audio stream”.
– “Foo file” means “file whose contents can be experienced by using a program that interprets the bytes as a Foo encoding”.
● XML (eXtensible Markup Language) is an attempt at a compromise between binary and ASCII: keep all data human-readable by storing it as structured text.
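As a minimal sketch of what "interpreting each byte as an ASCII code" means, the snippet below converts a short piece of text to its numeric codes and back, using the built-in double and char functions. The sample text is chosen only for illustration.

```matlab
% Each character of text is stored as a numeric code.
codes = double('Call');    % numeric ASCII codes of the four characters
disp(codes);

% Interpreting the same numbers as ASCII codes gives the text back.
str = char(codes);
disp(str);
```

A text viewer such as Notepad does the second step for every byte in the file; a binary file is one where that interpretation produces nothing readable.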
5.1 Filenames
● General format: name . extension
● For historical reasons, extension is usually three characters.
● Extension tells the OS what program to use to open the file (MS Word, Excel, Matlab, ...)
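The name/extension split can be inspected from within Matlab. This is a small sketch using the built-in fileparts function on one of the example filenames; no file needs to exist for it to run.

```matlab
% Split a filename into folder, name, and extension.
[folder, name, ext] = fileparts('hamlet.doc');

% The extension includes the leading dot; folder is empty because
% the filename was given without any path.
fprintf('name = %s, extension = %s\n', name, ext);
```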
Aside: File Deletion
● Q.: What happens when you “delete” a file?
Directory entries: sort.m, foo.m, OMFG.jpg, hamlet.doc
Data on disk: 011010, 110101, 000100, 111011
(Drag OMFG.jpg to trash and empty trash…)
● A.: What appears to happen: the directory entry and its data both disappear.
Directory entries: foo.m, sort.m, hamlet.doc
Data on disk: 011010, 110101, 111011
● A.: What actually happens: only the directory entry is removed; the bytes stay on disk until something overwrites them.
Directory entries: foo.m, sort.m, hamlet.doc
Data on disk: 011010, 110101, 000100, 111011
● Then use WinUnDelete (e.g.) to get back OMFG.jpg
Directory Structure
● Directories (folders) are organized hierarchically (one inside another)
● So we are forced to choose a single organization method (like a library with a card catalog indexed only by author)
● But we can use links (shortcuts) to add additional organization, without copying files.
Pathnames
● Pathname is the “full name” of a directory in a linear form
– e.g., C:\MyDocuments\cs121\myproj\new\
● Complete filename includes path
– e.g., C:\MyDocuments\cs121\myproj\new\myprog.m
● This becomes important because of the ...
Working Directory
>> pwd % print working directory
ans = C:\MATLAB\work
● Without extra effort, we can only access files in our working directory
>> myprog % run myprog.m script
ERROR: myprog? LOL!!
● Solutions
– Make shortcuts from working directory (annoying)
– Change the working directory with cd (but files in other directories are still not found):
>> cd('C:\MyDocuments\cs121\myproj\new\')
>> myprog
ERROR: Can't find someOther.m… loser!
– Use Matlab File menu to add paths: File / Set Path...
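The same two remedies can also be applied from the command line. The sketch below uses the built-in cd and addpath functions; the folder name cs121_demo_proj is hypothetical and is created under the system temp directory so that the example does not depend on any existing files. Note that addpath only affects the current session, unlike the Set Path dialog, which can save the path.

```matlab
% Hypothetical project folder, created only for this demonstration.
projdir = fullfile(tempdir, 'cs121_demo_proj');
if ~exist(projdir, 'dir')
    mkdir(projdir);
end

startdir = pwd;       % remember the current working directory
cd(projdir);          % remedy 1: move into the project folder
cd(startdir);         % ...and come back

addpath(projdir);     % remedy 2: put the folder on the search path
onpath = ~isempty(strfind(path, projdir));   % is the folder listed now?

rmpath(projdir);      % clean up: take it off the path, delete the folder
rmdir(projdir);
```

With the folder on the search path, scripts inside it can be run from any working directory.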
How Matlab Uses Paths
When we type a name foo into the interpreter, Matlab follows this sequence:
1. Looks for foo as a variable. If not found, ...
2. Looks in the current directory for a file named foo.m. If not found, ...
3. Searches the directories on the MATLAB search path, in order, for foo.bi (built-in function) or foo.m. If not found, ...
4. Reports ERROR
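One consequence of step 1 is that a variable hides any function or script with the same name. The following sketch illustrates this by temporarily shadowing the built-in pi; the variable names shadowed and restored are chosen only for this example.

```matlab
pi = 3;              % create a variable named pi (step 1 now finds it)
shadowed = pi;       % the variable is used, not the built-in function

clear pi             % remove the variable
restored = pi;       % lookup falls through to the built-in again

fprintf('shadowed = %g, restored = %.4f\n', shadowed, restored);
```

The same thing happens if a variable shares its name with one of your own scripts: the variable is found first and the script never runs.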
5.2 File operators
● File write/read operators allow us to save/restore values from previous Matlab sessions.
● File / Save Workspace As... is the simplest way to do this: it saves everything to a .mat file
● If we want to save/restore specific variables, we can use the save and load commands:
>> a = 'foo'; b = 2; c = pi;
>> save myvariables a b
>> clear
>> load myvariables
>> who
Your variables are:
a  b
– I never use the other syntax ( >> save('myvariables', 'a', 'b') )
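For completeness, here is a sketch of the same round trip written with the function syntax, which is handy when the filename is stored in a variable. The temporary filename comes from the built-in tempname function and is an assumption of this example, not part of the session above.

```matlab
a = 'foo'; b = 2; c = pi;

fname = [tempname '.mat'];   % temporary file, used only for this demo
save(fname, 'a', 'b');       % save just a and b (c is not saved)

clear a b c                  % wipe the variables from the workspace
load(fname);                 % a and b come back; c does not

delete(fname);               % clean up the temporary file
```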
5.3 Importing and Exporting Data
• Often want to get data from other programs (Excel, LabView, text editor) into Matlab, and save data in a format that other programs can read.
• Excel saves data in binary, proprietary (of course!) .xls format
• Generally, other formats will all be text-based (ASCII)
– .csv : comma-delimited values (no commas in vals)
– .dlm : other delimiter (allows commas in vals)
– .xml : eXtensible Markup Language (newer)
• Spreadsheet data should have all cells filled (“flat format”), or Matlab will get confused.
(Figure: two example spreadsheets, labeled YES and NO.)
csvread operator allows us to read numerical data, but we need to cut off the header in the file:
• Remove it by hand from the file:
>> d = csvread('sunspots-noheader.csv');
• Specify # of lines to ignore in csvread:
>> d = csvread('sunspots.csv', 1); % ignore first line
>> d = csvread('sunspots.csv', 1)
d = 1749 1 58
    1749 2 62.6
    1749 3 70
    etc.
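To try header skipping without the sunspots file, the sketch below writes a tiny CSV file and reads it back with csvread, passing both a row offset and a column offset. The third column name (Count) and the temporary filename are made up for this example. Newer Matlab releases recommend readmatrix in place of csvread.

```matlab
% Write a small CSV file with one header line (hypothetical contents).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Year,Month,Count\n');
fprintf(fid, '1749,1,58\n1749,2,62.6\n1749,3,70\n');
fclose(fid);

% Skip 1 row (the header) and 0 columns, then read the numbers.
d = csvread(fname, 1, 0);
disp(d);

delete(fname);   % clean up the temporary file
```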
● importdata command is useful for heterogeneous data.
● Returns a data structure:
>> d = importdata('sunspots.csv')
d = data: [2820x3 double]
    textdata: {'Year', 'Month', ...
    colheaders: {'Year', 'Month', ...
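The fields of the returned structure are accessed with dot notation. This sketch builds a small stand-in file (the column name Count and the temporary filename are assumptions of the example) and pulls out the numeric block and the column headers. The exact set of fields importdata returns depends on what it detects in the file, so treat this as an illustration rather than a guarantee.

```matlab
% Small stand-in for sunspots.csv: one header line, three data rows.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Year,Month,Count\n');
fprintf(fid, '1749,1,58\n1749,2,62.6\n1749,3,70\n');
fclose(fid);

d = importdata(fname);   % header and numbers are separated for us
nums = d.data;           % numeric matrix, one row per data line
disp(size(nums));
disp(d.colheaders);      % cell array of column names

delete(fname);           % clean up the temporary file
```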
Non-numerical ASCII Files
• txt files : anything we want to treat as text (ASCII characters)
>> fid = fopen('mobydick.txt');
>> s = fread(fid);
>> fclose(fid)
>> s
s =
32
67
97 ...
% need to munge this
>> s = char(s') % transpose, textify
s = Call me Ishmael. Some years ago - never mind how long precisely - having
little or no money in my purse, and
nothing particular to interest me on
shore, I thought I would sail about a
little and see the watery part of the
world....
textread does this for us, and tokenizes words into cell array:
>> s = textread('mobydick.txt', '%s') % '%s' : treat as strings
s = {'Call', 'me', 'Ishmael.', …
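The whole read, convert, and tokenize sequence can be tried on a throwaway file. The sketch below writes one sentence to a temporary file (the filename is an assumption of the example), reads the bytes with fread, converts them with char, and splits the text into words. It uses strsplit for the last step rather than textread, which newer Matlab releases no longer recommend.

```matlab
% Create a small text file to read back.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'Call me Ishmael.');
fclose(fid);

fid = fopen(fname);
bytes = fread(fid);       % column vector of numeric codes
fclose(fid);

str = char(bytes');       % transpose to a row, convert codes to characters
words = strsplit(str);    % split on whitespace into a cell array

disp(str);
disp(words);

delete(fname);            % clean up the temporary file
```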
5.4 Scripts
● You know most of this stuff already ☺
● You can run a script (e.g., myprog.m) from the interpreter:
>> myprog
● Tips
− Don't name any variables myprog
− Don't use any blank spaces in script names
− Re-read search path stuff from a few pages back
5.5 Scripts as Computations
● Scripts are (mostly) like typing directly into the interpreter, so variables can get overwritten
● This also means that a script returns no value:
>> x = myprog
ERROR: loser trying to execute SCRIPT myprog as a function.
● Nor can we pass arguments:
>> myprog(7)
ERROR: My name is Donnie, and you suck at Matlab.
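The overwriting behavior is easy to see in a small experiment. This sketch writes a one-line script to a temporary folder and executes it with the built-in run function; the script name myprog_demo.m and the folder are hypothetical and exist only for this example.

```matlab
% Create a one-line script in a temporary folder.
scriptdir = tempname;
mkdir(scriptdir);
scriptfile = fullfile(scriptdir, 'myprog_demo.m');
fid = fopen(scriptfile, 'w');
fprintf(fid, 'x = 99;\n');
fclose(fid);

x = 1;               % our own variable, set before running the script
run(scriptfile);     % the script shares our workspace...
disp(x);             % ...so x has been overwritten by the script

rmdir(scriptdir, 's');   % clean up the temporary folder
```

Because the script runs in the same workspace, it communicates through shared variables rather than through arguments or return values.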