cpslink is a Stata program for matching adjacent-year records from the Current Population Survey (CPS). The CPS has an in-4-out-8-in-4 rotation: a household is interviewed in four successive months, rests for eight months, and is then interviewed in four more successive months. With notable exceptions when household identifiers have been scrambled to prevent matching, households can be linked across surveys.

The cross-year link follows from the observation that the second set of four interviews falls in the same calendar months as the first four interviews. Thus, if there is no attrition, in any given survey half of the respondents would have been interviewed the previous year and the other half will be interviewed the next year. For the March Surveys (which include the Annual Demographic Supplement), a person identifier, the line number, was added in 1979. In principle, the line number does not change once records for an individual are added to the survey. Because of attrition, about 25 percent of the households cannot be matched across years. Household membership also varies, so, based on counts alone, only about 95 percent of the individuals can be matched.

The matching program begins with data in long form, where matching months for two successive years are stacked. For example, for 1980 and 1981, observations from month-in-sample (mis) 1 in 1980 can be matched with the 1981 data for mis 5. The same is true of mis 2 with 6, 3 with 7, and 4 with 8, where the lower values of mis refer to the first year of the adjacent pair. The input data are a set of separate data sets, each of which includes the stacked records for one month of an adjacent-year pair. The cpslink input data must include a variable named "year" (the final two digits of the survey year) and another named "mis". mis for second-year records should exceed mis for first-year records by 4. The output data retain mis and year, but mis is reduced by 4 in the second-year records.
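
The pairing of months-in-sample described above can be illustrated with a short sketch. It is written in Python only for illustration and is not part of cpslink; the function names partner_mis and output_mis are hypothetical.

```python
def partner_mis(mis):
    """Return the month-in-sample that pairs with `mis` in the adjacent year.

    First-year interviews (mis 1-4) pair with second-year interviews
    (mis 5-8) taken in the same calendar month one year later.
    """
    if not 1 <= mis <= 8:
        raise ValueError("mis must be between 1 and 8")
    return mis + 4 if mis <= 4 else mis - 4


def output_mis(mis):
    """Return mis as it appears in the output data.

    The text states that mis is reduced by 4 in second-year records,
    so both members of a pair carry the first-year value.
    """
    if not 1 <= mis <= 8:
        raise ValueError("mis must be between 1 and 8")
    return mis - 4 if mis > 4 else mis
```

For instance, partner_mis(2) gives 6, and output_mis(6) gives 2, so the two stacked records of a pair share one mis value after matching.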
The output data are like the input data: the records for the two years are stacked, and each data set refers to one month-in-sample pair for adjacent years. The individual records carry two new variables, level and match_id. level is -1 for records in unmatched households and 0 for unmatched records in matched households. For matched records, match_id, together with mis and year, provides a unique identifier.

The matching process consists of a user-specified series of steps. At each step, the user specifies a list of variables for matches in that step. The data are then sorted into cells distinguished by a user-supplied household identifier and the variables in the step's match list. Matches occur when a cell contains one or more first-year and one or more second-year records. If a match is not unique, a tie-breaking phase of the step is entered. The user specifies a secondary list of variables for breaking ties. The tie-breaker first checks for matches at the most detailed partition permitted by the variables given. (In this round, secondary ties or multiple matches are separated randomly.) If a match is not found, the list of variables is trimmed by dropping the leftmost variable and the process is repeated.

For matched records, level indicates the step where the match occurred. For step s, level=2*s-1 for unique matches and level=2*s for matches formed in the tie-breaker part of the step.

There is an option to retain unmatched records so that the attributes and behavior of the matched individuals can be compared with those of the unmatched ones.

There is a companion set of programs, makewide and utilities, with the suffix .wid, that rearrange the matched records into wide form with the first- and second-year variables in the same record. (See makewide.doc.)

The cpslink programs carry the suffix .cps. Components are: cpslink.cps, getyrmo.cps, testmacs.cps, first.cps, match.cps, resdups.cps, this.cps, that.cps, other.cps, last.cps, cleanup.cps, and macclear.cps.
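
The step-and-tie-breaker logic and the level coding described above can be sketched as follows. This is a simplified Python illustration, not the cpslink code. It assumes that each record is a dictionary with a field "yr" equal to 1 or 2 (first or second year of the pair), that match_id is a simple counter shared by the two members of a pair, and that trimming of the tie-breaker list stops before the list becomes empty. The function name link and these conventions are hypothetical.

```python
import random
from collections import defaultdict


def link(records, family, steps, seed=0):
    """Assign 'level' and 'match_id' to stacked first/second-year records.

    family : list of household identifier variables
    steps  : list of (match_vars, break_vars) pairs, one per step
    """
    rng = random.Random(seed)
    next_id = 1

    def cells(recs, keys):
        # Partition records into cells; each cell holds year-1 and year-2 lists.
        out = defaultdict(lambda: {1: [], 2: []})
        for r in recs:
            out[tuple(r[k] for k in keys)][r["yr"]].append(r)
        return list(out.values())

    def pair(a, b, level):
        nonlocal next_id
        a["level"] = b["level"] = level
        a["match_id"] = b["match_id"] = next_id
        next_id += 1

    # Household pass: -1 for unmatched households, 0 otherwise (not yet matched).
    for cell in cells(records, family):
        matched_hh = bool(cell[1]) and bool(cell[2])
        for r in cell[1] + cell[2]:
            r["level"] = 0 if matched_hh else -1
            r["match_id"] = None

    for s, (match_vars, break_vars) in enumerate(steps, start=1):
        pool = [r for r in records if r["level"] == 0]
        for cell in cells(pool, family + match_vars):
            first, second = cell[1], cell[2]
            if len(first) == 1 and len(second) == 1:
                pair(first[0], second[0], 2 * s - 1)      # unique match
            elif first and second:
                # Tie-breaker: most detailed partition first, then drop the
                # leftmost variable and repeat for records still unpaired.
                for k in range(len(break_vars)):
                    still_open = [r for r in first + second if r["level"] == 0]
                    for sub in cells(still_open, break_vars[k:]):
                        rng.shuffle(sub[1])               # random separation
                        rng.shuffle(sub[2])               # of secondary ties
                        for a, b in zip(sub[1], sub[2]):
                            pair(a, b, 2 * s)
    return records
```

With one step, a record pair that is unique within its cell receives level 1, a pair resolved by the tie-breaker receives level 2, and records left over in matched households keep level 0.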
cpslink requires three user-supplied do-files: addvars, dropvars, and setmacs. addvars and dropvars can be dummies containing the single command <exit> if no action is to be taken. addvars is a program to construct variables that are used in matching. (There is an example included where "compage" is constructed; compage is age for first-year records and age-1 for second-year records.) dropvars drops variables not intended for the permanent file. setmacs.do is a list of controlling macros. In addition to the do-files, there is a user-supplied list, yearlist.raw, that lists the years for which data are to be matched and, if a subset of months-in-sample is used, the months as well.

setmacs.do includes:

    logname        (required)
    datain         (required)
    dataout        (required)
    temploc        (optional)
    bloat          (optional)
    compress       (optional)
    unix           (required only if $bloat or $compress)
    retain         (optional)
    family         (required)
    steps          (required)
    match1         (required)
    .              (required)
    .              (required)
    match#steps    (required)
    break1         (required)
    .              (required)
    .              (required)
    break#steps    (required)

logname     The location and name of the log file to be created. (The log will replace a pre-existing file of the same name.)

datain      The location and prefix for the input data, e.g., /cpsdata/cps.

dataout     The location and prefix for the output, e.g., /cpsdata/matched/mth. NOTE: $datain==$dataout is not permitted.

temploc     The directory used for working space, e.g., /usr/temp. (If temploc is not specified, the current directory is assumed.)

bloat       1 (the input data are compressed) or 0 (the input data are not compressed).

compress    1 (compress output) or 0 (do not compress). The default for bloat and compress is 0, so they are required only if compressed input is used or compressed output is desired. NOTE: compress and bloat are unix options only.

unix        1 (the OS is unix).

retain      Followed by one token in double quotes, retain defines the records to be retained in the output. The default (retain not specified) is matched records only; "none" also retains only the matched records. The other options are:

                "all"      keeps all matched and unmatched records;
                "hhmatch"  keeps all matched and unmatched records from the matched households;
                "year1"    keeps all first-year records plus all second-year records from the matched households (whether matched or not);
                "year2"    adds the second-year records from the unmatched households.

            retain is cumulative. Matched records are always retained. "hhmatch" adds the unmatched records from the matched households, "year1" adds the first-year records from the unmatched households, and "year2" adds the second-year records from the unmatched households. NOTE: "year2" is equivalent to "all".

family      One or more variable names, in double quotes, specifying the variables that uniquely identify the household.

steps       An integer indicating the number of levels or steps where matches will be selected. For each step there are two macros, match{n} and break{n}, where {n} is the step.

match{n}    A list of variables, in double quotes. Matches occur within matched households when first-year records match second-year records on the variables specified in this list.

break{n}    A supplementary list of variables, in double quotes, for breaking ties. The tie-breaking process is sequential. Observations that match on the $match{n} variables are then matched on $break{n}. When second-tier ties occur, they are broken by random assignment. If second-tier matches do not occur, the macro break{n} is reduced by dropping one variable (from the left) and the process is repeated.

yearlist.raw includes a list of the first years (one to a line) of the adjacent-year matches. Although the default is to match all four pairs of months-in-sample, there is an option to match only selected months. In this case, on the line specifying a given year, the user adds the months to be matched. As with year, only the month-in-sample in the first year is specified.
Example:

    88
    85 3 2

The two-line file shown above, if used for yearlist.raw, requests that all possible months be matched from the 1988 and 1989 Surveys, and that months-in-sample 2 and 3 from the 1985 Survey be matched with months-in-sample 6 and 7, respectively, from the 1986 Survey.

Examples of setmacs.do, addvars.do, dropvars.do, and yearlist.raw are included along with the *.cps programs.
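
To make the yearlist.raw layout concrete, the following Python sketch reads text in that layout and lists the month-in-sample pairs each line requests. It is an illustration only and is not how cpslink itself reads the file; the function name parse_yearlist is hypothetical, and the sketch assumes whitespace-separated integers with blank lines ignored.

```python
def parse_yearlist(text):
    """Parse yearlist.raw-style text into (first_year, mis_pairs) requests.

    Each non-blank line holds a two-digit first year, optionally followed
    by the first-year months-in-sample to match. With no months given,
    all four pairs (1-5, 2-6, 3-7, 4-8) are requested.
    """
    requests = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue                      # skip blank lines
        year = int(tokens[0])
        months = [int(t) for t in tokens[1:]] or [1, 2, 3, 4]
        for m in months:
            if not 1 <= m <= 4:
                raise ValueError("first-year mis must be 1-4, got %d" % m)
        # Pair each first-year mis with its second-year partner (mis + 4).
        requests.append((year, [(m, m + 4) for m in months]))
    return requests
```

Applied to the two-line example above, the first request covers all four pairs for first year 88, and the second covers the pairs (3, 7) and (2, 6) for first year 85, in the order given in the file.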