CSCI 4390/6390 Database Mining
Assignment 1
Instructor: Wei Liu

Problem 1 (10 points)
Solve Q2 of 1.7 EXERCISES in the textbook (page 30). (Hint: Derive a lower bound and an upper bound on the $L_p$ distance, and then take the limits of both bounds.)

Problem 2 (10 points)
Suppose that there are $N$ independent and identically distributed (i.i.d.) random variables $x_1, x_2, \cdots, x_N$, each of which has mean $\mu$ and variance $\Sigma$. Let $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$ denote the sample mean. Show that the sample variance
$S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$
is an unbiased estimator of the true variance $\Sigma$. (Hint: Prove $E[S] = \Sigma$ using the fact that $\mathrm{Var}(\bar{x}) = \Sigma / N$.)

Problem 3 (10 points)
There are two locations $a$ and $b$, and a walker commutes between them. At each time step, the walker has two choices: jump to the other location with probability $p$, or stay at the current location for one time unit with probability $1 - p$. Assume that one jump takes one time unit and that the walker initially stays at location $a$. Please answer:
1) After two time units, what is the probability that the walker is at location $a$?
2) After four time units, what is the probability that the walker is at location $a$?

Problem 4 (20 points)
In class, we described the standard convex quadratic form. Now we want to discuss another quadratic form:
$f(x) = x^\top A x + b^\top x + c$,
where $x \in \mathbb{R}^d$ is a variable vector, $A \in \mathbb{R}^{d \times d}$ is an asymmetric matrix, $b \in \mathbb{R}^d$ is a constant vector, and $c$ is a constant. Please answer:
1) Under what condition is the function $f(x)$ convex?
2) If $f(x)$ is a strictly convex function, give one globally optimal solution to the problem $\min_{x \in \mathbb{R}^d} f(x)$.

Problem 5 (20 points)
Suppose that there are $N$ independent and identically distributed (i.i.d.) random vectors $x_1, x_2, \cdots, x_N$, each of which has mean $\mu$ and covariance $\Sigma$. The dimension of each vector is $d$, and all these vectors are linearly independent. Please answer:
1) If $d = N$, what are the mean and covariance of a random vector $q$ in the linear space $\mathrm{Span}(x_1, x_2, \cdots, x_N)$?
2) If $d > N$, what are the mean and covariance of a random vector $q$ in the linear subspace $\mathrm{Span}(x_1, x_2, \cdots, x_N)$?
(Hint: Write $q$ as a linear combination of $x_1, x_2, \cdots, x_N$, i.e., $q = \sum_{i=1}^{N} \alpha_i x_i$, and then solve for the coefficients $\alpha_1, \cdots, \alpha_N$.)
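
Before writing up Problems 2 and 3, it may help to check a derived answer numerically. The sketch below is only an illustration and not part of the required solution. It assumes Gaussian samples with true variance 4 for Problem 2, and it propagates the two-location probabilities step by step for Problem 3. The function names `mean_sample_variance` and `prob_at_a` are hypothetical helpers introduced here.

```python
import random


def mean_sample_variance(n, trials, divisor_offset, rng):
    """Average of sum((x_i - xbar)^2) / (n - divisor_offset) over many trials.

    Assumption for illustration: samples are Gaussian with mean 0 and
    standard deviation 2, so the true variance is 4.
    """
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 2.0) for _ in range(n)]
        xbar = sum(xs) / n
        total += sum((x - xbar) ** 2 for x in xs) / (n - divisor_offset)
    return total / trials


def prob_at_a(p, steps):
    """Probability that the walker is at location a after `steps` time units,
    starting at a, jumping with probability p and staying with 1 - p."""
    at_a, at_b = 1.0, 0.0
    for _ in range(steps):
        # One time unit: mass stays with probability 1 - p, moves with p.
        at_a, at_b = at_a * (1 - p) + at_b * p, at_a * p + at_b * (1 - p)
    return at_a


if __name__ == "__main__":
    rng = random.Random(0)
    # Dividing by N - 1 should average close to the true variance 4;
    # dividing by N should come out visibly smaller.
    print("divide by N-1:", mean_sample_variance(5, 20000, 1, rng))
    print("divide by N:  ", mean_sample_variance(5, 20000, 0, rng))
    # Compare these against your closed-form answers for Problem 3.
    print("P(at a after 2 units), p = 0.3:", prob_at_a(0.3, 2))
    print("P(at a after 4 units), p = 0.3:", prob_at_a(0.3, 4))
```

A numerical agreement of this kind does not replace the proof or the closed-form expression in $p$; it only helps catch algebra mistakes.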