Stata Housekeeping

Session 2
Housekeeping:
Variable labels, value labels,
calculations and recoding
Review

You have used Stata
• largely through the menus and dialogues
• but also with a few commands
We hope you found it (surprisingly?) easy.

Discuss
• what you liked
• any difficulties so far
Housekeeping tasks
By housekeeping, we mean the small jobs that organise the data and add labels to it.
They make life easier later.
This includes:
• labelling and adding notes to datasets
• labelling variables
• labelling the categories (or values) taken by a variable
• recoding variables and dealing with codes for missing values
• using log files to keep a record of what you have done
Labels and notes
Open the file named E_HouseholdComposition.dta
Use Data > Labels > Label dataset
Dialogue for labelling data set
Type the label into the dialogue, or use the command:
label data "Young Lives Study……"
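
The same step, together with the dataset notes mentioned earlier, can be sketched as a few commands. This is only an illustration: it assumes E_HouseholdComposition.dta is in the current working directory, and the label and note texts are placeholders, not the wording used in the course.

```stata
* Assumes E_HouseholdComposition.dta is in the current working directory.
use "E_HouseholdComposition.dta", clear

* Attach a descriptive label to the whole dataset (placeholder text).
label data "Household composition (placeholder label)"

* Add a free-text note to the dataset; _dta refers to the dataset itself.
notes _dta: Labels added during Session 2 (placeholder note)

* Check the result: describe shows the data label, notes lists the notes.
describe, short
notes
```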
Labelling variables
Use the menu sequence
Data > Labels > Label variable
Or type the command:
label variable relcare "What is your relationship to child?"
Defining value labels
Use:
Data > Labels > Label values > Define or modify value labels
and complete the dialogue box that follows.
The corresponding commands show that two steps are needed to label the values.
• First, a label must be defined, e.g.
label define sexlabel 1 "male" 2 "female"
• Then this label is attached to the variable, e.g. for the variable called sex use the command
label values sex sexlabel
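
Put together, labelling a variable and its values might look like the sketch below. The three-row dataset and the variable label text are made up for illustration; only the label define and label values commands come from the slide.

```stata
* Build a tiny illustrative dataset (values are made up for the sketch).
clear
input id sex
1 1
2 2
3 1
end

* Describe what the variable holds (hypothetical wording).
label variable sex "Sex of household member"

* Step 1: define the value label; step 2: attach it to the variable.
label define sexlabel 1 "male" 2 "female"
label values sex sexlabel

* Inspect the definition and see the labels used in a table.
label list sexlabel
tabulate sex
```

Because the label is defined separately from the variable, the same label (for example a yes/no label) can be attached to several variables.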
Your turn

Work through Section 4.1 of the Stata Guide

Note down any difficulties you have and
clarify your difficulties with a resource person
Recoding a variable
Data > Create or change variables > Other variable transformation commands > Recode categorical variable
Also use the options to define a new variable.
Information on the recoded variable

Always safer to recode into a new variable,
e.g. seedad2.

The effect of the recoding can be seen by typing
codebook seedad2

If seedad is later no longer needed, it can be
dropped.

Use File > Save to save information on the new variable in the data set.
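
As a command, recoding into a new variable could look like the following sketch. The codes for seedad used here (1 to 4, with 99 for "not known") and the category names are assumptions made for the example; check the real codes with codebook seedad before writing your own rules.

```stata
* Toy data: seedad coded 1-4, with 99 as a "not known" code (all hypothetical).
clear
input seedad
1
2
3
4
99
end

* Recode into a new variable, leaving the original untouched.
recode seedad (1 2 = 1 "often") (3 4 = 2 "rarely") (99 = .a), generate(seedad2)

* Check the result against the original before relying on it.
codebook seedad2
tabulate seedad seedad2, missing

* Once seedad is no longer needed it can be dropped.
drop seedad
```

For a dataset that was opened from a file, the command equivalent of File > Save is save, replace.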
Your turn again

Work through Section 4.2 of the Stata Guide

Note down any difficulties you have and
clarify your difficulties with a resource
person
Missing values

Symbols for missing values in Stata:
. and .a, .b, .c and so on, up to .z

These are used to distinguish between the different reasons for values to be missing.

When making calculations, comparisons or sorting, the following rules are observed:
• all non-missing numbers are less than .
• . is less than .a
• .a is less than .b, and so on, up to .z
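
One practical consequence of these ordering rules is that a condition such as "greater than 50" also picks up missing values. The sketch below uses a made-up variable named age to show the effect and one way to guard against it.

```stata
* Toy data: one ordinary missing value and one extended missing value.
clear
input age
25
60
.
.a
end

* Missing values count as larger than any number, so this counts 3 rows
* (60, . and .a), not 1.
count if age > 50

* Excluding missing values explicitly counts only the row with 60.
count if age > 50 & !missing(age)

* Sorting places missing values last, with . before .a.
sort age
list
```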
Memory





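The initial memory in Stata is 1 megabyte
This can be changed, but first type clear to clear memory
To increase the current memory to 20 megabytes, type
set memory 20m
To set the memory permanently, use
set memory 20m, permanently
(These settings apply to Stata 11 and earlier; from Stata 12 onwards memory is managed automatically.)
For problems processing large datasets, use the compress command, which stores each variable in the smallest type that holds its values.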
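
The effect of compress can be seen on a small made-up example. The variable name id and the 1000 observations are assumptions for the sketch; the point is only to compare the storage type before and after.

```stata
* Toy data: whole numbers stored wastefully as doubles (8 bytes each).
clear
set obs 1000
generate double id = _n

* describe shows the storage type before and after compressing.
describe id
compress
describe id
```

compress never changes the values themselves, only how they are stored, so it is safe to run before saving a large file.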
Log files

To keep a record of the output while using Stata:
• Open a log file by clicking on the Log icon.
• This opens a dialogue so you can name the log file in your working directory.
• It suggests the extension .smcl, which stands for Stata Markup and Control Language.
• Log files in Stata record both commands and output.
Remarks

You can change the extension to "log" to produce a simple ASCII file.

Other packages use the idea of a log file to record just the commands, not the output as well.
You can do this in Stata (but not from the menus).

Notice that the command Stata used for its log file was
. log using "name of file"
Do the same again, but using
. cmdlog using "name of file"
If at a later stage you need to append to or replace this file, add the option replace or append after a comma at the end of the above commands, e.g.
. log using "name of file", append
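
A typical logging sequence, typed as commands, might look like this sketch. The file names session2 and session2_commands are hypothetical; both files are written to the current working directory.

```stata
* File names below are hypothetical; files go to the working directory.
* Full log (commands and output) in SMCL format; replace overwrites an old file.
log using "session2", replace

* Command-only log, kept alongside the full log.
cmdlog using "session2_commands", replace

* Commands to be recorded go here, for example:
display "Session 2 started"

* Close both logs when finished.
cmdlog close
log close

* Reopen the full log later and add to it rather than overwrite it.
log using "session2", append
log close
```

Naming the file with the extension .log, as in log using "session2.log", produces the plain-text version directly.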
Your turn

Practice the above ideas by working through Sections 4.6, 4.7 and 4.8 of the Stata Guide.

Then either read your own data into Stata and perform some simple analyses using the methods covered so far,
or use a dataset suggested by the resource persons.
So if you have a dataset…

Open, within Stata, the data file in Stata format
that you created in the previous session.

Identify the key variables in your data set and set up
labels for each of these variables.

Identify any categorical variables in your data set.
Then define and attach value labels that describe the
levels of each categorical variable.

Finally, re-save your data file.