BASIC COMMANDS
codebook, summarize, browse, edit, describe, rename, tabstat, tabulate, drop

CHANGING DATA
o rename: To rename a variable
o replace: To change the contents of a variable
o destring: To convert numbers stored as strings into numeric variables
o encode: To convert a string variable into a labeled numeric variable
o tsset: To declare the time variable (daily, monthly, yearly) for time-series data

GENERATING NEW VARIABLES
o generate: To create new variables
o egen: Extended version of generate
o codebook: To inspect a variable once it has been created

REGRESSION
o regress: Used to regress the dependent variable (DV) on the independent variables (IV)
    regress y k l
    predict residuals, residuals
    corr residuals k l
Without the residuals option, predict stores the fitted values, not the residuals.

COUNTING
    count if var2>3
Counts the observations where var2 is greater than three. Stata treats missing values as larger than any number, so add & !missing(var2) to leave them out.
    by var1: count if var2>3
Repeats the count within each category of var1 (the data must be sorted by var1).

LABELING A VARIABLE
    label variable id "serial of observations"

_n AND _N
Stata has two built-in variables called _n and _N. _n is Stata notation for the current observation number. _n is 1 in the first observation, 2 in the second, 3 in the third, and so on. _N is the total number of observations. After by, both are counted within each group.
    generate id = _n
    generate nt = _N
    list
id is the serial number 1, 2, 3, 4, 5…

Within each household:
    sort hhcode idc ucode
    by hhcode: generate n1 = _n
    by hhcode: generate n2 = _N
    generate RBO = (sondau-1)/(n2-1)

Within each group:
    sort group score
    by group: generate n1 = _n
    by group: generate n2 = _N
    list

        score  group  id  nt  n1  n2
    1.     72      1   1   7   1   4
    2.     76      1   3   7   2   4
    3.     85      1   7   7   3   4
    4.     90      1   6   7   4   4
    5.     82      2   5   7   1   2
    6.     84      2   2   7   2   2
    7.     89      3   4   7   1   1

    list if n1==1

        score  group  id  nt  n1  n2
    1.     72      1   1   7   1   4
    5.     82      2   5   7   1   2
    7.     89      3   4   7   1   1

    list if n1==n2

        score  group  id  nt  n1  n2
    4.     90      1   6   7   4   4
    6.     84      2   2   7   2   2
    7.     89      3   4   7   1   1

SIBLING ORDER FOR SONS/DAUGHTERS (STEPS)
    sort sondau hhcode id
    by sondau hhcode: replace sondau = _n if relation==3
    sort id

WORK FILE COMMAND
    generate check = idc<=1 & relation<=1
check is 1 when both idc and relation are at most 1, and 0 otherwise.

FOR PRIMARY EDUCATION
    by hhcode: gen PEDU1 = education>=5 if relation==1 & education<8
PEDU1 is 1 for heads with class 5 to 7, 0 for heads below class 5, and missing for everyone else.
Earlier versions:
    by hhcode: generate PEDU1 = 1 if idc==relation
generates 1 where the values in the two columns are the same
    by hhcode: generate PEDU1 = 1 if relation<=1 & sc1q05<=5
OR
    by hhcode: generate PEDU1 = 1 if relation==1 & sc1q05<=5
generates 1 where the family head has at most class 5 education

FOR MIDDLE EDUCATION
Earlier version:
    by hhcode: generate PEDU2 = 1 if relation==1 & sc1q05>5 & sc1q05<=8
Reset and redo:
    replace PEDU2 = .
    by hhcode: replace PEDU2 = education>=8 if relation==1 & education<10
equals 1 where the head has middle education (class 8 or 9)

MATRIC
    by hhcode: gen PEDU3 = education>=10 if relation==1 & education<12
Earlier version:
    by hhcode: generate PEDU3 = 1 if relation==1 & sc1q05>8 & sc1q05<=10

INTERMEDIATE
    by hhcode: gen PEDU4 = education>=12 if relation==1 & education<13
Earlier version:
    by hhcode: generate PEDU4 = 1 if relation==1 & sc1q05>10 & sc1q05<=12

BACHELOR
    by hhcode: generate PEDU5 = education>=13 if relation==1 & education<14
Earlier version:
    by hhcode: generate PEDU5 = 1 if relation==1 & sc1q05>12 & sc1q05<=14

MASTER
    by hhcode: gen PEDU6 = education>=14 if relation==1 & education<22
Earlier version:
    by hhcode: generate PEDU6 = 1 if relation==1 & sc1q05>14 & sc1q05<=16

MS/MPHIL
    by hhcode: gen PEDU7 = education>=22 if relation==1 & education<25 & education != 23
education != 23 leaves out PhD
Earlier version:
    by hhcode: generate PEDU7 = 1 if relation==1 & inlist(sc1q05, 22, 24)
where the head's education is MS/MPhil. (The original condition sc1q05==22&24 only tests sc1q05==22, because & 24 is always true.)

PHD
    by hhcode: replace PEDU8 = education>=23 if relation==1 & education<24
where the head's education is PhD. Use generate instead of replace if PEDU8 does not exist yet.

TO CREATE DUMMY VARIABLES
1. tab seaq06, gen(seaq06)
2. tabulate region, gen(r)
3. tabulate seaq06, gen(workD)
4. tab sc1q01, gen(schoolD)
Each command creates one dummy variable per category, named with the given stub (r1, r2, …).

TO CHANGE THE ORDER OF VARIABLES
1. order education, before(EDU1)
   moves the variable before EDU1
2. order education, last
   moves the variable to the last position
3. order education
   moves the variable to the first position

TO CONVERT A NUMERIC VALUE INTO TEXT
    decode relation, gen(Relation)
creates a string variable from the value labels (1 for head, 2 for spouse)

TO CREATE DUMMY VARIABLES AUTOMATICALLY
    regress age i.relation
creates the dummy variables for the regression automatically
    regress age b2.relation
changes the base group to value 2 (spouse)

CHILD EVER ATTENDED SCHOOL
    generate schoolD4 = school>1 if relation==3 & school !=.
Earlier version:
    generate school4 = 1 if sc1q01 !=1
education dummy equal to 1 where the child never attended school

EDUCATION LEVEL OF SONS/DAUGHTERS
    generate EDULEVEL1 = sc1q05<5 & relation==3
generates 1 if education is less than primary, only for sons/daughters
    replace EDULEVEL1 = 2 if (sc1q05>=5 & relation==3) & (sc1q05<10)
replaces the value with 2 if education is primary but not secondary, only for sons/daughters
    replace EDULEVEL1 = 3 if (sc1q05>=10 & relation==3) & (sc1q05<12)
replaces the value with 3 if education is secondary
The earlier version of these commands used the name EDULEVEL instead of EDULEVEL1.

FOR CONSTRUCTION OF EDUCATION VARIABLE DUMMY
1. less than primary
   a. generate eduD1 = education<5 if relation==3
2. primary level
   a. generate eduD2 = education>=5 if relation==3 & education<10
3. secondary level and higher
   a. generate eduD3 = education>=10 if relation==3 & education<25

SCHOOL LEVEL (PRIMARY, SECONDARY)
    generate schlevel = "less than primary" if education<5 & sb1q2==3
    replace schlevel = "primary" if education>=5 & sb1q2==3 & education<10
    replace schlevel = "secondary school" if education>=10 & sb1q2==3 & education<24
    encode schlevel, gen(schlevel1)
    drop schlevel

TO DROP OBSERVATIONS
    drop in 870172/921656
Type this in the Command window.

TO VERIFY IF VALUES IN TWO COLUMNS ARE NOT THE SAME
    list if age != AGE
lists the observations where the two columns differ

SIBLING SIZE/ORDER IN FAMILY
1. gsort hhcode relation -age
   the minus sign sorts age in descending order
2. by hhcode relation: replace siblingorder = _n if relation==3
   orders the siblings, only for sons/daughters
3. by hhcode relation: replace siblingsize = _N if relation==3
   counts the siblings in the family
replace assumes siblingorder and siblingsize already exist; use generate the first time.

RELATIVE BIRTH ORDER
    replace RBO = (siblingorder-1) / (siblingsize-1)
RBO is 0 for the eldest and 1 for the youngest sibling. It is missing for an only child, because the denominator is zero.

SUMMARY OF VARIABLES
    tab work province, col chi2
two-way table with column percentages and a chi-squared test

MULTINOMIAL LOGIT MODEL
    mlogit schoollevel familysize relativebirthorder i.work
    mlogit schoollevel familysize relativebirthorder i.work, rrr
    mlogit schoollevel familysize relativebirthorder if age<=17 & relation==3
The rrr option reports relative-risk ratios instead of coefficients.

How to set the 'Time variable' for time series analysis in STATA?
https://www.projectguru.in/setting-time-variable-time-seriesanalysis-stata/

COUNTING FROM _N TO _N
https://stats.oarc.ucla.edu/stata/seminars/notes/counting-from_n-to_n/
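WORKED EXAMPLE: SIBLING ORDER AND RELATIVE BIRTH ORDER
The sibling order, sibling size and relative birth order steps in these notes can be tried on a small made-up dataset before running them on the real file. This is only a sketch: the six observations are invented for illustration, and relation==1 for head and relation==3 for son/daughter is assumed to follow the coding used in the notes. generate is used instead of replace because the variables do not exist yet in the toy data.

```stata
* Toy household data; all values are invented for illustration
clear
input hhcode relation age
1 1 45
1 3 12
1 3 17
1 3 9
2 1 50
2 3 15
end

* Oldest child first within each household and relation
gsort hhcode relation -age

* Position among siblings and number of siblings, sons/daughters only
by hhcode relation: generate siblingorder = _n if relation == 3
by hhcode relation: generate siblingsize  = _N if relation == 3

* Relative birth order: 0 for the eldest, 1 for the youngest
generate RBO = (siblingorder - 1) / (siblingsize - 1)

list, sepby(hhcode)
```

In household 1 the three children get RBO 0, .5 and 1 from eldest to youngest. In household 2 the only child gets a missing RBO because the denominator is zero, so such cases need a decision of their own before RBO goes into a model.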