Class 1: Expectations, variances, and basics of estimation
Basics of matrix (1)

I. Organizational Matters

(1) Course requirements:
1) Exercises: There will be seven (7) exercises, the last of which is optional. Each exercise will be graded on a scale of 0-10. In addition to the graded exercise, an answer handout will be given to you in lab sections.
2) Examination: There will be one in-class, open-book examination.

(2) Computer software: Stata

II. Teaching Strategies

(1) Emphasis on conceptual understanding.
Yes, we will deal with mathematical formulas, actually a lot of mathematical formulas. But I do not want you to memorize them. What I hope you will do is understand the logic behind the mathematical formulas.

(2) Emphasis on hands-on research experience.
Yes, we will use computers for most of our work. But I do not want you to become a computer programmer. Many people think they know statistics once they know how to run a statistical package. This is wrong. Doing statistics is more than running computer programs. What I will emphasize is using computer programs to your advantage in research settings. Computer programs are like automobiles. The best automobile is useless unless someone drives it. You will be the driver of statistical computer programs.

(3) Emphasis on student-instructor communication.
I happen to believe in students' judgment about their own education. Even though I will be ultimately responsible if the class should not go well, I hope that you will feel part of the class and contribute to the quality of the course. If you have questions, do not hesitate to ask in class. If you have suggestions, please come forward with them. The class is as much yours as mine.

Now let us get to the real business.

III(1). Expectation and Variance

Random Variable: A random variable is a variable whose numerical value is determined by the outcome of a random trial.

Two properties: random and variable.

A random variable assigns numeric values to uncertain outcomes. In common language, “give a number”. For example, income can be a random variable. There are many ways to do it. You can use the actual dollar amounts. In this case, you have a continuous random variable. Or you can use levels of income, such as high, medium, and low. In this case, you have an ordinal random variable: 1=high, 2=medium, 3=low. Or if you are interested in the issue of poverty, you can have a dichotomous variable: 1=in poverty, 0=not in poverty.

In sum, the mapping of numeric values to outcomes of events in this way is the essence of a random variable.

Probability Distribution: The probability distribution for a discrete random variable X associates with each of the distinct outcomes $x_i$ ($i = 1, 2, \ldots, k$) a probability $P(X = x_i)$.

Cumulative Probability Distribution: The cumulative probability distribution for a discrete random variable X provides the cumulative probabilities $P(X \le x)$ for all values x.

Expected Value of Random Variable: The expected value of a discrete random variable X is denoted by E(X) and defined:

\[ E(X) = \sum_{i=1}^{k} x_i P(x_i) \]

where: $P(x_i)$ denotes $P(X = x_i)$. The notation E (read “expectation of”) is called the expectation operator.

In common language, expectation is the mean. But the difference is that expectation is a concept for the entire population that you never observe. It is the result of an infinite number of repetitions. For example, if you toss a coin, the proportion of tails should be .5 in the limit. Or the expectation is .5. Most of the time you do not get exactly .5, but a number close to it.

Conditional Expectation: It is the mean of a variable conditional on the value of another random variable. Note the notation: E(Y|X).

In 1996, per-capita average wages in three Chinese cities were (in RMB):
Shanghai: 3,778
Wuhan: 1,709
Xian: 1,155

Variance of Random Variable: The variance of a discrete random variable X is denoted by V(X) and defined:

\[ V(X) = \sum_{i=1}^{k} (x_i - E(X))^2 P(x_i) \]

where: $P(x_i)$ denotes $P(X = x_i)$. The notation V (read “variance of”) is called the variance operator.

Since the variance of a random variable X is a weighted average of the squared deviations, $(X - E(X))^2$, it may be defined equivalently as an expected value: $V(X) = E[(X - E(X))^2]$. An algebraically identical expression is: $V(X) = E(X^2) - [E(X)]^2$.

Standard Deviation of Random Variable: The positive square root of the variance of X is called the standard deviation of X and is denoted by $\sigma(X)$:

\[ \sigma(X) = \sqrt{V(X)} \]

The notation $\sigma$ (read “standard deviation of”) is called the standard deviation operator.
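
The definitions of expected value, variance, and standard deviation can be checked by hand on a small case. The following is a minimal sketch, not part of the course materials (the course software is Stata). It assumes a fair coin coded as X = 1 for tails and X = 0 for heads, and the helper names `expectation` and `variance` are made up for illustration.

```python
from fractions import Fraction
from math import sqrt


def expectation(dist):
    """E(X): sum of x_i * P(x_i) over a dict {x_i: P(x_i)}."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """V(X): probability-weighted average of squared deviations from E(X)."""
    mean = expectation(dist)
    return sum((x - mean) ** 2 * p for x, p in dist.items())


# Assumed example: X = 1 if a fair coin shows tails, 0 if heads.
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

mean = expectation(coin)
var = variance(coin)

# Shortcut form: V(X) = E(X^2) - [E(X)]^2
second_moment = sum(x ** 2 * p for x, p in coin.items())
var_shortcut = second_moment - mean ** 2

# Standard deviation: positive square root of the variance.
std = sqrt(var)

print(mean, var, var_shortcut, std)
```

For the fair coin this gives E(X) = 1/2 and V(X) = 1/4 by either formula, so the standard deviation is 1/2. Exact fractions are used only so that the two variance formulas can be compared without rounding error.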
Standardized Random Variables: If X is a random variable with expected value E(X) and standard deviation $\sigma(X)$, then:

\[ Y = \frac{X - E(X)}{\sigma(X)} \]

is known as the standardized form of random variable X.

Covariance: The covariance of two discrete random variables X and Y is denoted by Cov(X, Y) and defined:

\[ Cov(X, Y) = \sum_{i} \sum_{j} (x_i - E(X))(y_j - E(Y)) P(x_i, y_j) \]

where: $P(x_i, y_j)$ denotes $P(X = x_i \cap Y = y_j)$. The notation Cov (read “covariance of”) is called the covariance operator.

When X and Y are independent, Cov(X, Y) = 0.

$Cov(X, Y) = E[(X - E(X))(Y - E(Y))]$; $Cov(X, Y) = E(XY) - E(X)E(Y)$.

(Variance is a special case of covariance.)

Coefficient of Correlation: The coefficient of correlation of two random variables X and Y is denoted by $\rho(X, Y)$ (Greek rho) and defined:

\[ \rho(X, Y) = \frac{Cov(X, Y)}{\sigma(X)\,\sigma(Y)} \]

where: $\sigma(X)$ is the standard deviation of X; $\sigma(Y)$ is the standard deviation of Y; Cov(X, Y) is the covariance of X and Y.

Sum and Difference of Two Random Variables: If X and Y are two random variables, then the expected value and the variance of X + Y are as follows:

Expected Value: $E(X + Y) = E(X) + E(Y)$;
Variance: $V(X + Y) = V(X) + V(Y) + 2\,Cov(X, Y)$.

If X and Y are two random variables, then the expected value and the variance of X - Y are as follows:

Expected Value: $E(X - Y) = E(X) - E(Y)$;
Variance: $V(X - Y) = V(X) + V(Y) - 2\,Cov(X, Y)$.

Sum of More Than Two Independent Random Variables: If $T = X_1 + X_2 + \cdots + X_s$ is the sum of s independent random variables, then the expected value and the variance of T are as follows:

Expected Value: $E(T) = \sum_{i=1}^{s} E(X_i)$;
Variance: $V(T) = \sum_{i=1}^{s} V(X_i)$.

III(2). Properties of Expectations and Covariances:

(1) Properties of Expectations under Simple Algebraic Operations

\[ E(a + bX) = a + bE(X) \]

This says that a linear transformation is retained after taking an expectation. $X^* = a + bX$ is called rescaling: a is the location parameter, b is the scale parameter.

Special cases are:
For a constant: $E(a) = a$.
For a different scale: $E(bX) = bE(X)$, e.g., transforming the scale of dollars into the scale of cents.

(2) Properties of Variances under Simple Algebraic Operations

\[ V(a + bX) = b^2 V(X) \]

This says two things: (1) Adding a constant to a variable does not change the variance of the variable; reason: the definition of variance controls for the mean of the variable. (2) Multiplying a variable by a constant changes the variance of the variable by a factor of the constant squared; this is easy to prove, and I will leave it to you. This is the reason why we often use standard deviation instead of variance: $\sigma_x = \sqrt{\sigma_x^2}$ is of the same scale as x.

(3) Properties of Covariance under Simple Algebraic Operations

\[ Cov(a + bX, c + dY) = bd\,Cov(X, Y) \]

Again, only scale matters, location does not.

(4) Properties of Correlation under Simple Algebraic Operations

I will leave this as part of your first exercise:

\[ \rho(a + bX, c + dY) = \rho(X, Y) \]

That is, neither scale nor location affects correlation. (This holds when b and d have the same sign; if their signs differ, the sign of the correlation flips.)
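
The properties for expectations, variances, and covariances can be verified numerically before you prove them. The sketch below is only an illustration under stated assumptions. The joint distribution and the constants a, b, c, d are invented for the example, and the helpers `expect` and `cov` are hypothetical names. It is a check on one case, not a proof.

```python
from fractions import Fraction

# Assumed joint distribution P(x_i, y_j) on four points (probabilities sum to 1).
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 3): Fraction(1, 4),
}


def expect(g):
    """E[g(X, Y)]: sum of g(x, y) * P(x, y) over the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())


def cov(f, g):
    """Cov(f, g) = E(f * g) - E(f) * E(g); cov(f, f) is the variance of f."""
    return expect(lambda x, y: f(x, y) * g(x, y)) - expect(f) * expect(g)


a, b, c, d = 3, 100, -2, 5  # arbitrary constants, chosen only for illustration


def X(x, y):
    return x


def Y(x, y):
    return y


def U(x, y):
    return a + b * x  # rescaled X, e.g. dollars to cents when b = 100


def W(x, y):
    return c + d * y  # rescaled Y


# E(a + bX) = a + b E(X)
lhs_mean, rhs_mean = expect(U), a + b * expect(X)
# V(a + bX) = b^2 V(X)
lhs_var, rhs_var = cov(U, U), b ** 2 * cov(X, X)
# Cov(a + bX, c + dY) = bd Cov(X, Y)
lhs_cov, rhs_cov = cov(U, W), b * d * cov(X, Y)

print(lhs_mean == rhs_mean, lhs_var == rhs_var, lhs_cov == rhs_cov)
```

Changing a or c leaves `lhs_var` and `lhs_cov` untouched, which is the "location does not matter" statement in concrete form.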
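IV: Basics of matrix.

1. Definitions

A. Matrices

Today, I would like to introduce the basics of matrix algebra. A matrix is a rectangular array of elements arranged in rows and columns:

\[ X_{(n \times m)} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix} \]

Index: row index, column index.
Dimension: number of rows x number of columns (n x m).
Elements: are denoted in small letters with subscripts.

An example is the spreadsheet that records the grades for your homework in the following way:

Name   1st   2nd   ...   6th
A      7     10    ...   9
B      6     5     ...   8
.      .     .           .
Z      8     9     ...   8

This is a matrix.

Notation: I will use capital letters for matrices.

B. Vectors

Vectors are special cases of matrices. If the dimension of a matrix is n x 1, it is a column vector:

\[ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \]

If the dimension is 1 x m, it is a row vector:

\[ y' = \begin{bmatrix} y_1 & y_2 & \cdots & y_m \end{bmatrix} \]

Notation: small underlined letters for column vectors (in lecture notes).

C. Transpose

The transpose of a matrix is another matrix with positions of rows and columns being exchanged symmetrically.

For example: if

\[ X_{(n \times m)} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{bmatrix}, \quad \text{then} \quad X'_{(m \times n)} = \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{n1} \\ \vdots & \vdots & & \vdots \\ x_{1m} & x_{2m} & \cdots & x_{nm} \end{bmatrix} \]

It is easy to see that a row vector and a column vector are transposes of each other.

2. Matrix Addition and Subtraction

Addition and subtraction of two matrices are possible only when the matrices have the same dimension. In this case, addition or subtraction of matrices forms another matrix whose elements consist of the sum, or difference, of the corresponding elements of the two matrices.

\[ X \pm Y = \begin{bmatrix} x_{11} \pm y_{11} & \cdots & x_{1m} \pm y_{1m} \\ \vdots & & \vdots \\ x_{n1} \pm y_{n1} & \cdots & x_{nm} \pm y_{nm} \end{bmatrix} \]

Examples:

\[ A_{(2 \times 2)} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B_{(2 \times 2)} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad C_{(2 \times 2)} = A + B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \]

3. Matrix Multiplication

A. Multiplication of a scalar and a matrix

Multiplying a matrix by a scalar is equivalent to multiplying each of the elements of the matrix by the scalar.

\[ cX = \begin{bmatrix} cx_{11} & \cdots & cx_{1m} \\ \vdots & & \vdots \\ cx_{n1} & \cdots & cx_{nm} \end{bmatrix} \]

B. Multiplication of a Matrix by a Matrix (Inner Product)

The inner product of matrix X (a x b) and matrix Y (c x d) exists if b is equal to c. The inner product is a new matrix with the dimension (a x d). The element of the new matrix Z is:

\[ z_{ij} = \sum_{k=1}^{c} x_{ik}\, y_{kj} \]

Note that XY and YX are very different. Very often, only one of the inner products (XY and YX) exists.

Example:

\[ A_{(2 \times 2)} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B_{(2 \times 1)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

BA does not exist. AB has the dimension 2 x 1:

\[ AB = \begin{bmatrix} 2 \\ 4 \end{bmatrix} \]

Other examples:
If A is (3 x 5) and B is (5 x 3), what is the dimension of AB? (3 x 3)
If A is (3 x 5) and B is (5 x 3), what is the dimension of BA? (5 x 5)
If A is (1 x 5) and B is (5 x 1), what is the dimension of AB? (1 x 1, scalar)
If the number of columns of B differs from the number of rows of A, what is the dimension of BA? (nonexistent)

4. Special Matrices

A. Square Matrix

A matrix with the same number of rows and columns: $A_{(n \times n)}$.

B. Symmetric Matrix

A special case of square matrix. For $A_{(n \times n)}$, $a_{ij} = a_{ji}$ for all i, j. That is, A' = A.

C. Diagonal Matrix

A special case of symmetric matrix:

\[ X = \begin{bmatrix} x_{11} & 0 & \cdots & 0 \\ 0 & x_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_{nn} \end{bmatrix} \]

D. Scalar Matrix

\[ cI = \begin{bmatrix} c & 0 & \cdots & 0 \\ 0 & c & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & c \end{bmatrix} \]

E. Identity Matrix

A special case of scalar matrix:

\[ I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]

Important: for $A_{(r \times r)}$, AI = IA = A.

F. Null (Zero) Matrix

Another special case of scalar matrix:

\[ O = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix} \]

From A to E or F, cases are nested from being more general towards being more specific.

G. Idempotent Matrix

Let A be a square symmetric matrix. A is idempotent if $A = A^2 = A^3 = \cdots$.

H. Vectors and Matrices with elements being one

A column vector with all elements being 1:

\[ \mathbf{1}_{(r \times 1)} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \]

A matrix with all elements being 1:

\[ J_{(r \times r)} = \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix} \]

Examples: let $\mathbf{1}$ be a vector of n 1s, $\mathbf{1}_{(n \times 1)}$. Then:

$\mathbf{1}'\mathbf{1} = n$, with dimension (1 x 1);
$\mathbf{1}\mathbf{1}' = J$, with dimension (n x n).

I. Zero Vector

A zero vector is

\[ \mathbf{0}_{(r \times 1)} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \]

5. Rank of a Matrix

The maximum number of linearly independent rows is equal to the maximum number of linearly independent columns. This unique number is defined to be the rank of the matrix.

For example, take a 3 x 3 matrix whose third row is the sum of the first two. Because row 3 = row 1 + row 2, the 3rd row is linearly dependent on rows 1 and 2. The maximum number of independent rows is 2.

Let us have a new matrix: [matrix not recoverable]

Singularity: if a square matrix A of dimension (n x n) has rank n, the matrix is nonsingular. If the rank is less than n, the matrix is then singular.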
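
The matrix operations in this section can be tried out in a few lines of code. The sketch below is an illustration only, written in plain Python rather than Stata. It uses A = [1 2; 3 4] and B = [0; 1] as read from the multiplication example. The 3 x 3 matrix M is a made-up matrix whose third row is the sum of the first two, and the helper names `transpose`, `matmul`, and `rank` are hypothetical. Rank is computed by Gaussian elimination with exact fractions, which is one standard way to count linearly independent rows.

```python
from fractions import Fraction


def transpose(m):
    """Exchange rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Product XY; exists only if columns of X equal rows of Y."""
    if len(x[0]) != len(y):
        raise ValueError("inner dimensions differ; the product does not exist")
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def rank(m):
    """Number of linearly independent rows, found by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


A = [[1, 2], [3, 4]]
B = [[0], [1]]
AB = matmul(A, B)  # (2 x 2)(2 x 1) gives a 2 x 1 matrix

ones = [[1], [1], [1]]                      # vector of n = 3 ones
inner = matmul(transpose(ones), ones)       # 1'1, a 1 x 1 matrix
outer = matmul(ones, transpose(ones))       # 11' = J, a 3 x 3 matrix

# Assumed example: row 3 = row 1 + row 2, so only two rows are independent.
M = [[1, 0, 2], [3, 1, 4], [4, 1, 6]]

print(AB, inner, rank(A), rank(M))
```

Here `rank(A)` equals the dimension of A, so A is nonsingular, while `rank(M)` is less than 3, so M is singular. Calling `matmul(B, A)` raises an error, which matches the statement that BA does not exist.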