Answer Key for HW1

GDA homework #1
Due MATLAB TIME: 735276
1) What date is this homework due (see command datestr)?
datestr(735276) % MATLAB is case-sensitive; returns 11-Feb-2013
2) Use the MATLAB command rand to generate a 1000x100 array of random numbers.
Consider this array as 100 records (columns) of random numbers, each 1000 elements long.
x=rand(1000,100);
a. Use the corrcoef command to generate the correlation array (r) of
100x100 of these random numbers along with the p-value of each of
the correlations.
[r,p]=corrcoef(x);
subplot(2,2,1);
pcolor(r); shading interp; colorbar
subplot(2,2,2);
pcolor(p);shading interp; colorbar
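b. This array is symmetric with diagonal values of r=1. Why is this?
Because the diagonal terms are the correlation between each
record and itself, which by definition has a correlation coefficient
of 1. The array is symmetric because the correlation between records
i and j is the same as the correlation between records j and i.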
c. What percent of these correlations are significant at the 95%
confidence level? Is this expected? Why?
i=find(p<=0.05); % indices of all the p-values that are significant
pct=100*length(i)/numel(p) % percent of the correlations that are significant
Note that the p-values of the diagonal elements are 1, so they are never
counted as significant. Each time you run this you will get a slightly
different number, but you will generally find that about 5% of these
random series have a significant correlation at the 95% confidence level. The
reason is that, by the definition of the test, two uncorrelated random series
have a 5% chance of appearing significantly correlated at the 95% confidence level.
This also means that if you keep searching for some record that is correlated
with your record (ENSO, NAO, the price of balsamic vinegar in Modena, etc.) you
will eventually find something that is correlated at the 95% confidence level.
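Because r and p are symmetric, every pair of records appears twice in the full array, and the diagonal is included in the count above. A minimal sketch of the same check restricted to the unique pairs is given here; the variable names mask, npairs and pct are illustrative and not part of the original answer.

```matlab
% Sketch: percent of significant correlations among unique pairs only.
x = rand(1000,100);                 % 100 random records, each 1000 long
[r,p] = corrcoef(x);                % correlations and their p-values
mask = triu(true(size(p)),1);       % strictly upper triangle: each pair once, no diagonal
npairs = nnz(mask);                 % number of unique pairs, 100*99/2
pct = 100*nnz(p(mask)<=0.05)/npairs;
fprintf('%d unique pairs, %.1f%% significant at p<=0.05\n', npairs, pct);
```

The percentage should again come out near 5%, changing slightly from run to run.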
3) The MATLAB command randn(N,M) returns an N-by-M array of random numbers
drawn from the standard normal distribution.
a. Plot the histogram of 100000 randomly chosen numbers obtained
with randn.
x=randn(1,100000); % randn (normal), not rand (uniform)
hist(x,100);
b. Use randn to generate the chi-squared distribution for 1, 3, 5 and 10
degrees of freedom. Draw the histogram of each of these along with
the chi-squared distribution using the MATLAB command chi2pdf. If
you’re unclear how to do this, watch the Khan Academy video I had sent
out.
Here’s an example for 3 degrees of freedom
dof=3;
x=randn(dof,100000);
chi2=sum(x.*x,1); % sum down the columns (dimension 1) so that dof=1 also works
hist(chi2,100);
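The example above draws only the histogram. A possible way to overlay the theoretical curve for all four cases is sketched below, assuming the Statistics Toolbox is available for chi2pdf. The histogram counts are divided by N times the bin width so that the bars have unit area and can be compared with the density. The names dofs, N, binw and sample_mean are illustrative.

```matlab
% Sketch: simulated chi-squared histograms with the chi2pdf curve on top.
dofs = [1 3 5 10];                  % degrees of freedom to compare
N = 100000;                         % number of samples per case
sample_mean = zeros(size(dofs));    % the mean of a chi-squared variable equals its dof
for j = 1:numel(dofs)
    dof = dofs(j);
    z = randn(dof,N);               % dof independent standard normals per sample
    chi2 = sum(z.^2,1);             % sum of squares down each column
    [n,c] = hist(chi2,100);         % counts and bin centers
    binw = c(2)-c(1);               % bin width
    subplot(2,2,j);
    bar(c,n/(N*binw),1);            % normalized so the bars have unit area
    hold on;
    plot(c,chi2pdf(c,dof),'r','LineWidth',1.5);
    title(sprintf('%d degrees of freedom',dof));
    sample_mean(j) = mean(chi2);
end
```

A quick sanity check is that sample_mean is close to dofs.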
4) The MATLAB file battery.mat on the class web page contains hourly sea-level data
from the Battery in Lower Manhattan from 1927 through 2012. The sea-level elevation data (H) are in meters; the time variable is tm.
a. Plot the raw data.
load battery
plot(tm,H);
datetick('x',10,'keepticks')
b. For each year of the record compute the mean sea-level for each year
(Hm) and plot the result. The MATLAB commands datevec and find
will come in handy here. With each data point also plot the standard
error. Plotting the standard error can be done by drawing a vertical
line at each data point that extends +/- the standard error from the
point.
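load battery;
dv=datevec(tm); % columns are year, month, day, hour, minute, second
yr=dv(:,1); % keep only the year; avoids shadowing built-ins such as min
yrs=1927:2012;
Hb=nan(size(yrs)); % yearly mean sea level (Hm in the problem statement)
se=nan(size(yrs)); % standard error of each yearly mean
for iy=yrs
k=find(yr==iy & ~isnan(H(:))); % valid (non-NaN) hourly samples in this year
Hb(iy-1926)=mean(H(k));
se(iy-1926)=std(H(k))/sqrt(length(k));
end
plot(yrs,Hb);
hold on
for ii=1:length(yrs)
% vertical line extending +/- one standard error from the yearly mean
plot([yrs(ii) yrs(ii)],[Hb(ii)-se(ii) Hb(ii)+se(ii)],'k');
end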
c. Use the MATLAB routine xcorr to find the decorrelation time-scale for
the yearly averaged sea-level data. Note that you will need to remove
both the mean and a linear trend from the data. This can be done with
the MATLAB command detrend. While it’s not clear what the value
returned by the function xcorr means (if you can figure it out let
me know!), we can be sure that the zero-lag correlation is 1 and thus
we can divide the correlations by the value of the zero lag (you can
find that by using the max function since it’s the highest value, i.e.
r=xcorr(Hm); plot(r/max(r)) ).
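x=detrend(Hb); % removes the mean and the linear trend
[r,lags]=xcorr(x); % lags are in years because Hb is yearly
figure(2);
r=r/max(r); % normalize so that the zero-lag value is 1
plot(lags,r);
hold on;
plot(lags,r,'.');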
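The plot still has to be read to get a time-scale. One possible way to turn it into a number is sketched below, under two assumptions that are not in the original answer: the decorrelation time is taken as the first lag at which the normalized autocorrelation falls below 1/e (the first zero crossing is another common choice), and a synthetic AR(1) series with a hypothetical coefficient phi stands in for detrend(Hb) so the block runs without battery.mat. The loop computes the sum of lagged products, which is what xcorr returns by default for non-negative lags.

```matlab
% Sketch: decorrelation time from the normalized autocorrelation.
n = 86;                             % same length as the 1927-2012 yearly record
phi = 0.6;                          % hypothetical AR(1) coefficient, for illustration only
e = randn(1,n);
x = zeros(1,n);
x(1) = e(1);
for t = 2:n
    x(t) = phi*x(t-1)+e(t);         % synthetic stand-in for the yearly means
end
x = detrend(x);                     % remove mean and linear trend
x = x(:).';                         % force a row vector
acf = zeros(1,n);
for k = 0:n-1
    acf(k+1) = sum(x(1:n-k).*x(1+k:n));   % sum of lagged products at lag k
end
acf = acf/acf(1);                   % normalize by the zero-lag value
tau = find(acf<exp(-1),1)-1;        % first lag (in years) below 1/e
fprintf('decorrelation time-scale: about %d year(s)\n',tau);
```

With the real data, replace the synthetic x with detrend(Hb).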
d. Next break the record into multiple parts (for example before and
after 1965, or perhaps 3 parts if you see a change in the trend) and
calculate the sea-level rise in each of those segments by using the
MATLAB command polyfit. Next put error bars on the estimates of
slope (95% confidence limits) to see if there has been a statistically
meaningful difference in the rate of sea-level rise. Use the MATLAB
command tinv to get the student-t values to put error bars on your
estimates of the slope.
Here’s a piece of code to put error bars on the slope. In the end the data show a
decrease in the rate of sea-level rise, but the change in slope is not significant at the
95% confidence level.
http://marine.rutgers.edu/dmcs/ms615/slr.m
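For readers without access to slr.m, the following is an independent sketch of the standard least-squares calculation and is not the contents of that file. It assumes a split at 1965, uses synthetic yearly data with a hypothetical trend and noise level in place of Hb, and needs the Statistics Toolbox for tinv. The standard error of the slope is the residual standard deviation (with n-2 degrees of freedom) divided by the square root of the sum of squared time deviations.

```matlab
% Sketch: 95% confidence limits on the slope of each segment.
t = 1927:2012;
y = 0.003*(t-1927)+0.03*randn(size(t));   % hypothetical stand-in for Hb (m)
segs = {t<1965, t>=1965};           % assumed split into two segments
slope = zeros(1,2);
halfwidth = zeros(1,2);
for s = 1:2
    ts = t(segs{s});
    ys = y(segs{s});
    n = numel(ts);
    tc = ts-mean(ts);               % center time to keep polyfit well conditioned
    c = polyfit(tc,ys,1);           % c(1) is the slope in m/yr
    res = ys-polyval(c,tc);         % residuals about the fitted line
    s2 = sum(res.^2)/(n-2);         % residual variance with n-2 degrees of freedom
    seb = sqrt(s2/sum(tc.^2));      % standard error of the slope
    slope(s) = c(1);
    halfwidth(s) = tinv(0.975,n-2)*seb;   % two-sided 95% limit
    fprintf('segment %d: %.2f +/- %.2f mm/yr\n',s,1000*slope(s),1000*halfwidth(s));
end
```

If the two intervals overlap substantially, the difference in slope is unlikely to be significant. The calculation treats the residuals as independent; if successive years are correlated, as part c examines, the effective degrees of freedom are smaller and the error bars wider.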