CSCI 4390/6390 Database Mining
Assignment 1
Instructor: Wei Liu
Problem 1 (10 points)
Solve Q2 of 1.7 EXERCISES in the textbook (page 30). (Hint: You should derive a lower bound and an upper
bound of the $L_p$ distance, and then take the limits of both bounds.)
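The exercise statement itself is not reproduced here. As a rough numerical aid for the hint, the sketch below computes the $L_p$ distance between two made-up vectors for increasing $p$ and prints it next to the largest absolute coordinate difference, so that any conjectured bounds can be checked empirically before proving them. The function name `lp_distance` and the sample vectors are illustrative assumptions, not part of the assignment.

```python
def lp_distance(x, y, p):
    """L_p distance between two equal-length sequences, for p >= 1."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Made-up sample vectors (assumption, for illustration only).
x = [1.0, 5.0, -2.0]
y = [4.0, 1.0, -2.5]

# Largest absolute coordinate difference between x and y.
max_gap = max(abs(a - b) for a, b in zip(x, y))

for p in (1, 2, 4, 16, 64):
    print(p, lp_distance(x, y, p), max_gap)
```

A numerical check of this kind does not replace the proof; it only suggests what the two bounds should look like.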
Problem 2 (10 points)
Suppose that there are $N$ independent and identically distributed (i.i.d.) random variables $x_1, x_2, \cdots, x_N$, each of which
has mean $\mu$ and variance $\Sigma$. Let
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$
denote the sample mean. Show that the sample variance
$$S = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}$$
is an unbiased estimator of the true variance $\Sigma$.
(Hint: You should prove $E[S] = \Sigma$ using the fact $\mathrm{Var}(\bar{x}) = \frac{\Sigma}{N}$.)
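Before attempting the proof, the claim can be sanity-checked by simulation. The sketch below is one possible setup, not a prescribed method: it assumes Gaussian samples with true variance 4 and averages the sum of squared deviations over many trials, once divided by $N - 1$ and once by $N$. The function name `average_sample_variance`, the distribution, and all parameter values are assumptions made for illustration.

```python
import random


def average_sample_variance(n, trials, seed=0):
    """Average the (N-1)-normalized and N-normalized variance estimates over many trials."""
    rng = random.Random(seed)
    total_unbiased = 0.0
    total_biased = 0.0
    for _ in range(trials):
        # Assumed distribution: Gaussian with mean 0 and standard deviation 2 (variance 4).
        xs = [rng.gauss(0.0, 2.0) for _ in range(n)]
        mean = sum(xs) / n
        squared_dev = sum((x - mean) ** 2 for x in xs)
        total_unbiased += squared_dev / (n - 1)
        total_biased += squared_dev / n
    return total_unbiased / trials, total_biased / trials


unbiased_avg, biased_avg = average_sample_variance(n=5, trials=20000)
print("divide by N-1:", unbiased_avg)
print("divide by N:  ", biased_avg)
```

Under these assumptions the first average should land near the true variance of 4, while the second should be noticeably smaller. The simulation only illustrates the statement; the assignment still asks for a derivation of $E[S] = \Sigma$.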
Problem 3 (10 points)
There are two locations a and b, and a walker commutes between them. At each time step, the walker has
two choices: jump to the other location with probability p, or stay at the current location for one time unit with
probability 1 − p. Assume that one jump takes one time unit, and that the walker initially stays at location a.
Please answer:
1) after two time units, what is the probability of the walker being at location a?
2) after four time units, what is the probability of the walker being at location a?
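A hand-derived answer can be checked numerically by propagating the location probabilities one time unit at a time. The sketch below assumes the reading that, in every time unit, the walker switches location with probability p and stays with probability 1 − p. The function name `prob_at_a` and the value p = 0.3 are illustrative assumptions.

```python
def prob_at_a(p, steps):
    """Probability of being at location a after `steps` time units, starting at a."""
    at_a, at_b = 1.0, 0.0  # the walker starts at location a
    for _ in range(steps):
        # One time unit: stay with probability 1 - p, switch with probability p.
        at_a, at_b = at_a * (1 - p) + at_b * p, at_a * p + at_b * (1 - p)
    return at_a


p = 0.3  # arbitrary example value (assumption)
for steps in (2, 4):
    print(steps, prob_at_a(p, steps))
```

Evaluating the function at a few values of p gives numbers to compare against a closed-form expression in p.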
Problem 4 (20 points)
In class, we described the standard convex quadratic form. Now we want to discuss another quadratic form:
$$f(x) = x^\top A x + b^\top x + c,$$
where $x \in \mathbb{R}^d$ is a variable vector, $A \in \mathbb{R}^{d \times d}$ is an asymmetric matrix, $b \in \mathbb{R}^d$ is a constant vector, and $c$ is a
constant. Please answer:
1) under what condition is the function $f(x)$ convex?
2) if $f(x)$ is a strictly convex function, give one globally optimal solution to the problem $\min_{x \in \mathbb{R}^d} f(x)$.
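One standard fact that may be a useful starting point: for any square matrix $A$, the quadratic term $x^\top A x$ is unchanged when $A$ is replaced by its symmetric part $\frac{1}{2}(A + A^\top)$. The sketch below checks this identity numerically on a random asymmetric matrix. The helper names `quad_form` and `symmetric_part`, the dimension, and the random values are assumptions for illustration; the code deliberately does not answer either part of the problem.

```python
import random


def quad_form(A, x):
    """Evaluate x^T A x for a square matrix A (list of rows) and a vector x."""
    d = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(d) for j in range(d))


def symmetric_part(A):
    """Return (A + A^T) / 2."""
    d = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(d)] for i in range(d)]


rng = random.Random(0)
d = 3
A = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]  # generally asymmetric
x = [rng.uniform(-1, 1) for _ in range(d)]
S = symmetric_part(A)

# The two values should agree up to floating-point rounding.
print(quad_form(A, x), quad_form(S, x))
```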
Problem 5 (20 points)
Suppose that there are $N$ independent and identically distributed (i.i.d.) multivariate random vectors $x_1, x_2, \cdots, x_N$,
each of which has mean $\mu$ and covariance $\Sigma$. The dimension of each vector is $d$, and all these vectors are
linearly independent. Please answer:
1) if $d = N$, what are the mean and covariance of a random vector $q$ in the linear space $\mathrm{Span}(x_1, x_2, \cdots, x_N)$?
2) if $d > N$, what are the mean and covariance of a random vector $q$ in the linear subspace
$\mathrm{Span}(x_1, x_2, \cdots, x_N)$?
(Hint: You should write $q$ as a linear combination of $x_1, x_2, \cdots, x_N$, i.e., $q = \sum_{i=1}^{N} \alpha_i x_i$, and then solve for the
coefficients $\alpha_1, \cdots, \alpha_N$.)