T-Test Tutorial

A t-test is used instead of finding Z-scores when either:
● The sample size is small (n < 30)
● The population standard deviation is not known.
Assumptions (a t-test can be used if any of the following is satisfied):
● The samples must come from a normally distributed population, or:
o The distribution of the sample is symmetric, unimodal, and without outliers, and the sample
size is 15 or less.
o The distribution of the sample is moderately skewed, unimodal, and without outliers, and
the sample size is between 16 and 40.
o The sample size is greater than 40 and there are no outliers.
● If the normality assumption is not met, check the flow chart on the BOSS website for
non-parametric tests to use.
How to load data:
● Word document: copy the data directly into MATLAB and assign it to a
variable of your choice.
● Text file: use the following command in MATLAB (make sure the data file and the
MATLAB file are in the same folder):
load filename.txt
This will produce a variable called filename with your data.
● Excel file: use the following command, where filename is the name of the Excel file
given as text (for example 'filename.xlsx'):
num = xlsread(filename)
This will produce a variable called num with the numeric data.
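As a rough illustration of the text-file route, the sketch below writes a small matrix to a temporary file and reads it back. The file name ttest_demo.txt and the three-row matrix are made up for the example, and the functional form of load (with an output argument) is used so the result lands in a variable of your choosing rather than one named after the file.

```matlab
% Illustration only: create a small two-column text file to load.
data = [1.2782 1.2504
        1.0017 1.0232
        1.0758 1.5195];
txtFile = fullfile(tempdir, 'ttest_demo.txt');   % hypothetical file name
save(txtFile, 'data', '-ascii');                 % plain-text, whitespace-delimited

% Functional form of load: returns the numeric matrix directly.
loaded = load(txtFile);

% Split the columns into separate samples.
x = loaded(:, 1);
y = loaded(:, 2);
disp(size(loaded));
```

For Excel files, newer MATLAB releases also provide readmatrix as an alternative to xlsread; whether it is available depends on your MATLAB version.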
Quick Overview
To run t-tests right away, use the code below. To learn more about t-tests, continue
reading after this quick overview.
data = [1.2782 1.2504
        1.0017 1.0232
        1.0758 1.5195
        1.2908 1.5116
        1.5965 1.2464
        1.1203 1.8685
        1.7430 2.1392
        1.0559 0.7924
        1.4935 1.5810
        0.9070 1.1693];      % enter the data you want to test here
x = data(:,1);               % parsing the data so it's only one column vector
y = data(:,2);               % parsing the data so it's only one column vector
mu0 = 1.3;                   % the mean to compare the one-sample t-test to
                             % (named mu0 so it does not shadow MATLAB's mean function)
[h1,p1] = ttest(x,mu0)       % 1 sample t-test
[h2,p2] = ttest(x,y)         % paired sample t-test
[h3,p3] = ttest2(x,y)        % 2 sample equal variance t-test
Different inputs:
● For the 1 sample t-test, make sure your input data is in one column or one row.
● For the other two tests, make sure both data sets are in one column or one row.
Six different outputs:
● h1, h2, and h3 will each be either 0 or 1. A 1 means you can reject the null hypothesis
at the significance level used, while a 0 means you cannot.
● p1, p2, and p3 give the p-value of each t-test.
Note that this is the simplest way of conducting t-tests, using all the MATLAB defaults: a
significance level of 0.05, a two-tailed test, and an equal-variance two-sample t-test.
Commands to change these defaults are presented in the rest of the document.
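Beyond h and p, ttest can return a confidence interval and a structure with the test statistic, which helps when reporting results. The sketch below assumes the sample is the first column of the tutorial's data matrix and uses mu0 as an illustrative name for the hypothesized mean.

```matlab
% Sample: assumed to be the first column of the tutorial's data matrix.
x = [1.2782; 1.0017; 1.0758; 1.2908; 1.5965; ...
     1.1203; 1.7430; 1.0559; 1.4935; 0.9070];
mu0 = 1.3;   % hypothesized mean (illustrative name)

% h: test decision, p: p-value, ci: confidence interval for the mean,
% stats: structure with fields tstat, df, and sd.
[h, p, ci, stats] = ttest(x, mu0);

fprintf('t(%d) = %.4f, p = %.4f\n', stats.df, stats.tstat, p);
fprintf('95%% CI for the mean: [%.4f, %.4f]\n', ci(1), ci(2));
if h == 1
    disp('Reject the null hypothesis at the 0.05 level.');
else
    disp('Cannot reject the null hypothesis at the 0.05 level.');
end
```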
Conducting the T-tests:
If the data set is the one given below with each column representing a different sample:
data = [1.2782 1.2504
        1.0017 1.0232
        1.0758 1.5195
        1.2908 1.5116
        1.5965 1.2464
        1.1203 1.8685
        1.7430 2.1392
        1.0559 0.7924
        1.4935 1.5810
        0.9070 1.1693];
First, split this matrix into two column vectors, each containing a different sample:
Sample1=data(:,1);
Sample2=data(:,2);
Now you can conduct t-tests on these two samples.
One-sample t-test
● To compare a sample to a known mean at a significance level of 0.05 (the MATLAB
default) using a one-sample t-test, use the command below.
● Make sure your data is in a row or column vector.
mu0=1;   % hypothesized mean (named mu0 so it does not shadow MATLAB's mean function)
[H,PValue]=ttest(Sample1,mu0)
● To conduct a one-sided t-test, a t-test with a different significance level, or both
(here a right-tailed test at a significance level of 0.005):
mu0=1;
alpha=0.005;
[H,PValue]=ttest(Sample1,mu0,alpha,'right')
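Recent MATLAB releases also document a name-value form of these options ('Alpha' and 'Tail'), which can be easier to read than positional arguments. The sketch below assumes Sample1 is the first column of the tutorial's data and compares a right-tailed test with the default two-tailed one; mu0 is an illustrative variable name.

```matlab
% Sample1: assumed to be the first column of the tutorial's data matrix.
Sample1 = [1.2782; 1.0017; 1.0758; 1.2908; 1.5965; ...
           1.1203; 1.7430; 1.0559; 1.4935; 0.9070];
mu0 = 1;   % hypothesized mean (illustrative name)

% Right-tailed test (alternative: true mean > mu0) at significance level 0.005.
[H, PValue] = ttest(Sample1, mu0, 'Alpha', 0.005, 'Tail', 'right');

% Default two-tailed test at significance level 0.05, for comparison.
[H2, P2] = ttest(Sample1, mu0);

fprintf('right-tailed p = %.5f, two-tailed p = %.5f\n', PValue, P2);
```

Because the sample mean here is above mu0, the right-tailed p-value is half the two-tailed one; 'Tail' can also be 'left' or 'both'.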
Paired t-test
● Use a paired t-test when the two sets of data are naturally paired (for example, you
measure the heart rate of the same person at two different exercise levels).
● Again, make sure the samples are in row or column vectors and that they have the same
length. The two samples must be in the same order in MATLAB, so that each element of one
is paired with the corresponding element of the other.
● In MATLAB, the command is:
[H,PValue]=ttest(Sample1,Sample2)
● You still use the ttest command, but instead of a mean you pass a second sample.
● The significance level and the tail of the test are changed in the same way as for the
one-sample t-test.
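One way to see what the paired test does: it is a one-sample t-test on the element-wise differences against a mean of zero. The sketch below assumes Sample1 and Sample2 are the two columns of the tutorial's data and checks that both calls give the same p-value.

```matlab
% Sample1 and Sample2: assumed to be the two columns of the tutorial's data.
Sample1 = [1.2782; 1.0017; 1.0758; 1.2908; 1.5965; ...
           1.1203; 1.7430; 1.0559; 1.4935; 0.9070];
Sample2 = [1.2504; 1.0232; 1.5195; 1.5116; 1.2464; ...
           1.8685; 2.1392; 0.7924; 1.5810; 1.1693];

% Paired t-test on the two samples.
[Hpaired, Ppaired] = ttest(Sample1, Sample2);

% One-sample t-test on the differences (ttest tests against mean 0 by default).
d = Sample1 - Sample2;
[Hdiff, Pdiff] = ttest(d);

fprintf('paired p = %.6f, one-sample-on-differences p = %.6f\n', Ppaired, Pdiff);
```

This is also why the pairing order matters: shuffling one sample changes the differences and therefore the result.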
Two Sample t-test
● If you have two samples and they are not paired (for example, heart rates of different
people at different exercise levels), use a two-sample t-test.
● There are two types of two-sample t-tests: one assuming equal variances and one assuming
unequal variances.
● You should generally use the unequal variances test: when the variances are the same it
gives nearly the same answer as the equal variances test, and it remains valid when they
differ.
● MATLAB defaults to equal variances. To change to unequal variances, use the
command below:
alpha=0.05;
[H,PValue]=ttest2(Sample1,Sample2,alpha,'both','unequal')
● The only change in the command is from ttest to ttest2; everything else remains the
same.
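To compare the two variants side by side, the sketch below runs ttest2 with its default (equal variances) and with the 'Vartype','unequal' name-value option, assuming Sample1 and Sample2 are the two columns of the tutorial's data. With equal sample sizes the t statistic is identical in both versions; only the degrees of freedom, and therefore the p-value, differ slightly.

```matlab
% Sample1 and Sample2: assumed to be the two columns of the tutorial's data.
Sample1 = [1.2782; 1.0017; 1.0758; 1.2908; 1.5965; ...
           1.1203; 1.7430; 1.0559; 1.4935; 0.9070];
Sample2 = [1.2504; 1.0232; 1.5195; 1.5116; 1.2464; ...
           1.8685; 2.1392; 0.7924; 1.5810; 1.1693];

% Equal-variance (pooled) two-sample t-test: the MATLAB default.
[hEq, pEq, ~, statsEq] = ttest2(Sample1, Sample2);

% Unequal-variance (Welch) two-sample t-test.
[hUn, pUn, ~, statsUn] = ttest2(Sample1, Sample2, 'Vartype', 'unequal');

fprintf('equal:   t = %.4f, df = %.2f, p = %.4f\n', statsEq.tstat, statsEq.df, pEq);
fprintf('unequal: t = %.4f, df = %.2f, p = %.4f\n', statsUn.tstat, statsUn.df, pUn);
```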
Conclusion:
To summarize, there are two t-test commands in MATLAB: ttest, for one-sample and paired
t-tests, and ttest2, for two-sample t-tests. The commands, inputs, and outputs are shown
above and can also be viewed by typing help ttest or help ttest2. The output produced by
MATLAB tells you whether the null hypothesis can be rejected and what the p-value is. Be
sure to state your null hypothesis correctly and to decide whether the test is right-tailed,
left-tailed, or two-tailed, because this affects the output MATLAB provides.