
In this talk, we consider the estimation problem for the parameters of generalized linear models that may have a large collection of potential predictor variables, some of which may have no influence on the response of interest. In this situation, selecting the statistical model is always a challenging problem. In the context of two competing models, we compare the relative performance of shrinkage and classical estimators based on the asymptotic analysis of quadratic risk functions, and we demonstrate that the shrinkage estimator outperforms the maximum likelihood estimator uniformly. For comparison purposes, we also consider an absolute penalty estimation (APE) approach. This comparison shows that the shrinkage method performs better than the APE-type method when the dimension of the restricted parameter space is large. The talk ends with a real-life example showing the value of the suggested method in practice: South African heart disease data, collected on males in a heart-disease high-risk region of the Western Cape, South Africa.
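As a toy numerical sketch of the shrinkage strategy (not the talk's actual estimator), the code below combines a full-model and a restricted least-squares fit through a positive-part Stein rule; the Gaussian linear model, dimensions and tuning are illustrative stand-ins for the GLM setting:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, p1 = 200, 10, 3          # p1 active predictors, the rest inactive
X = rng.normal(size=(n, p))
beta_true = np.concatenate([[2.0, -1.5, 1.0], np.zeros(p - p1)])
y = X @ beta_true + rng.normal(size=n)

# Full-model least-squares fit (stand-in for the unrestricted MLE).
XtX_inv = np.linalg.inv(X.T @ X)
beta_full = XtX_inv @ X.T @ y

# Restricted fit: the last p - p1 coefficients are forced to zero.
beta_restr = np.zeros(p)
beta_restr[:p1] = np.linalg.lstsq(X[:, :p1], y, rcond=None)[0]

# Wald statistic for H0: beta[p1:] = 0, then positive-part Stein
# shrinkage of the full estimator toward the restricted one.
p2 = p - p1
resid = y - X @ beta_full
sigma2 = resid @ resid / (n - p)
sub = beta_full[p1:]
T = sub @ np.linalg.inv(XtX_inv[p1:, p1:]) @ sub / sigma2
shrink = max(0.0, 1.0 - (p2 - 2) / T)
beta_shrink = beta_restr + shrink * (beta_full - beta_restr)
```

The shrinkage factor adapts to the data: when the Wald statistic T is small (the restriction looks right), the estimator leans on the restricted fit; when T is large, it stays close to the full-model fit.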
Results from matrix algebra and analysis are used to minimize local and global L_{2} criteria of kernel estimates of multivariate distribution functions. Under different matrix structures, optimal bandwidth matrices are derived and their selectors are shown to possess fast rates of convergence.
Let y_{t} be the k×1 vector of excess returns on k assets and let x_{t} be the excess return on the market portfolio at time t. The capital asset pricing model (CAPM) can be associated with the null hypothesis H_{0}: a = 0 in the regression model: y_{t} = a + x_{t} b + e_{t}, 1 ≤ t ≤ n.
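A minimal sketch of the CAPM intercept test for a single asset (k = 1), with simulated returns in place of real data; the sample size and return scales are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 250                                    # e.g. one year of daily data
x = rng.normal(0.0005, 0.01, size=n)       # market excess return
y = 0.8 * x + rng.normal(0, 0.01, size=n)  # asset return, true a = 0

# OLS fit of y_t = a + x_t b + e_t.
Z = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
a_hat, b_hat = coef

# t-statistic for H0: a = 0; compare |t| with the 5% critical value 1.96.
resid = y - Z @ coef
s2 = resid @ resid / (n - 2)
cov = s2 * np.linalg.inv(Z.T @ Z)
t_alpha = a_hat / np.sqrt(cov[0, 0])
```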
In this talk, we are interested in improving the estimation of the mean response parameter by incorporating available information about the intercept parameter vector. In this context, we suggest preliminary test and shrinkage estimation for the parameter matrix. We investigate the asymptotic performance of the suggested estimators relative to the classical estimator. A detailed simulation study illustrates the feasibility of the estimation approach and evaluates its characteristics. The proposed estimation strategies are also applied to stock data for illustrative purposes.
Genomics research often involves a 0-1 response such as "control" and "case" (non-disease and disease) and a number of covariates such as genetic markers (up to 1.2 million at this date), population membership, conventional covariates such as age, sex, body mass index (BMI), and treatment levels. Population geneticists have developed clever procedures for analysing data from experiments involving such genomics data. These procedures involve using what could be called inverse principal component analysis (PCA) and rely on matrix equations that connect regular and inverse PCA, where inverse PCA refers to PCA with variables exchanged with sample values; that is, inverse PCA is PCA with the design matrix transposed. We examine the properties of the procedures proposed based on this inverse PCA. The population geneticists propose intuitive models for their data; however, in their analysis they ignore their own models and use simple Stat 101 nonparametric chi-square tests. We compare these tests with likelihood ratio tests and Bayesian procedures. More generally, we examine statistical inference appropriate for the models proposed by population geneticists.
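The transposed-design-matrix duality is the singular value decomposition at work. A small sketch (dimensions illustrative) showing that the n×n and p×p cross-product matrices share their nonzero eigenvalues, and that a regular-PCA loading vector can be recovered from an inverse-PCA eigenvector:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 500                  # many markers, few samples
X = rng.normal(size=(n, p))
Xc = X - X.mean(axis=0)         # column-centred design matrix

# Regular PCA eigendecomposes the p x p matrix Xc'Xc; "inverse" PCA
# works with the much smaller n x n matrix Xc Xc'.
evals, U = np.linalg.eigh(Xc @ Xc.T)
big_evals = np.linalg.eigvalsh(Xc.T @ Xc)

# The nonzero spectra coincide ...
top_small = np.sort(evals)[-5:]
top_big = np.sort(big_evals)[-5:]

# ... and a regular-PCA loading is recovered from an inverse-PCA
# eigenvector u via v = Xc' u / sqrt(lambda), a unit eigenvector of Xc'Xc.
v1 = Xc.T @ U[:, -1] / np.sqrt(evals[-1])
```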
This is joint work with Kam Tsui and Fan Yang.
Prompted by the recent explosion in the size of the datasets statisticians work with, there is currently renewed interest in the statistics literature in questions concerning the spectral properties of large-dimensional random matrices.
Most of the effort so far has focused on understanding sample covariance matrices in the "large n, large p" setting, where the number of samples, n, and the number of variables in the problem, p, are of the same magnitude, say a few tens or hundreds. A basic message is that in this high-dimensional setting, the sample covariance matrix is a very poor estimator of the population covariance, especially from a spectral point of view. This naturally raises questions about the behavior of spectral methods of multivariate analysis such as the widely used principal component analysis (PCA).
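The spectral distortion is easy to see numerically. In this sketch (dimensions illustrative) the population covariance is the identity, so every true eigenvalue equals 1, yet the sample eigenvalues spread over roughly the Marchenko-Pastur interval [(1-sqrt(p/n))^2, (1+sqrt(p/n))^2]:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 100                 # "large n, large p" of the same magnitude
X = rng.normal(size=(n, p))     # rows i.i.d. with identity covariance
S = X.T @ X / n                 # sample covariance matrix
evals = np.linalg.eigvalsh(S)
# All population eigenvalues equal 1; the sample spectrum fills [0, 4].
```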
The statistical learning literature has a number of "kernel analogs" of classical multivariate techniques, in which the sample covariance matrix is replaced by a kernel matrix, often of the form f(X_{i}′X_{j}) or f(||X_{i} − X_{j}||). It is therefore natural to ask what can be said about the spectral properties of these kernel random matrices, and in particular what the impact of the choice of the function f is, among related questions.
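A quick numerical sketch of the objects in question, building a kernel random matrix of the form f(X_{i}′X_{j}) for an illustrative choice f(t) = exp(t) and inspecting its spectrum (dimensions and scaling are assumptions, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 200, 200
X = rng.normal(size=(n, p)) / np.sqrt(p)   # rows X_i with ||X_i||^2 ~ 1

G = X @ X.T                                # inner products X_i' X_j
K = np.exp(G)                              # kernel matrix, f(t) = exp(t)
evals = np.linalg.eigvalsh(K)              # spectrum of the kernel matrix
```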
In this talk I will discuss these questions when:
This paper is concerned with a partially linear regression model with serially correlated random errors which are unobservable and modelled by a random coefficient autoregressive process. We propose improved weighted semiparametric least-squares estimators for the parametric component using preliminary test and James-Stein estimation methodologies. The asymptotic properties (including asymptotic distributional bias and asymptotic distributional risk) of the proposed estimators are investigated. We show that these improved estimators perform well relative to the benchmark semiparametric weighted least-squares estimators. Our simulation results strongly corroborate our analytical results.
In 1930, A. Oppenheim generalized Hadamard's determinantal inequality, det(A) ≤ a_{11} a_{22} ··· a_{nn} for positive semidefinite A, by showing that

det(A ∘ B) ≥ (a_{11} a_{22} ··· a_{nn}) det(B)

for positive semidefinite A and B, where ∘ denotes the Hadamard (entrywise) product.
After reviewing a modern version of a proof of Oppenheim's inequality, we offer an extension of the class of matrices (beyond positive semidefinite) for which Oppenheim's inequality remains valid. Along the way, we define a new notion of closure under Hadamard multiplication (called duality) and concentrate on a specific kind of perturbation (called retraction).
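Oppenheim's inequality states that det(A∘B) ≥ det(A) · b_{11}···b_{nn} for positive semidefinite A and B, with ∘ the Hadamard product. A quick randomized check (a numerical sanity test under illustrative random matrices, not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(5)

def random_psd(k):
    """Random k x k positive semidefinite matrix."""
    M = rng.normal(size=(k, k))
    return M @ M.T

for _ in range(100):
    A, B = random_psd(4), random_psd(4)
    lhs = np.linalg.det(A * B)             # '*' is entrywise for ndarrays
    rhs = np.linalg.det(A) * np.prod(np.diag(B))
    assert lhs >= rhs - 1e-6 * max(1.0, abs(rhs))   # Oppenheim holds
```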
Bayesian and frequentist methods can give quite different results when applied to the same model and data. If the parameter of interest is linear in the invariant parameterization recommended by Bayes, then the Bayesian and frequentist procedures fully agree. But if the parameter of interest is curved, then a Bayesian divergence appears, and the shift relative to the frequentist answer is in opposite directions depending on whether the curvature is positive or negative. This requires calculating the obvious curvature and then recalibrating it to the Bayesian-type curvature, which is different from the Efron curvature. We give an overview and examples.
Consider the matrix identity

(A + BCD)^{−1} = A^{−1} − A^{−1}B(C^{−1} + DA^{−1}B)^{−1}DA^{−1},

valid whenever the indicated inverses exist.
Ridge regression estimators can be shown to be special cases of mixed and minimax estimators by solving an optimization problem with constraints. They may also be formulated by minimizing the sum of the squares of the regression coefficients on an ellipsoid. These are frequentist arguments.
On the other hand, for a normal prior distribution and normal errors in the regression model, they are special cases of the Bayes estimator.
By appropriate choices of A, B, C and D in the above identity, the estimators obtained by the frequentist and Bayesian arguments can be shown to be algebraically equivalent. In this talk I will explain how to do this and give conditions under which this algebraic equivalence is possible.
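A numerical sketch of the equivalence for ridge regression (dimensions and penalty value are illustrative): the frequentist ridge estimator coincides with the Bayes posterior mean written in its n×n "data-space" form, which is exactly what a Woodbury-type matrix identity delivers.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 30, 5
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
lam, sigma2 = 2.0, 1.0                     # ridge penalty, error variance

# Frequentist ridge estimator: (X'X + lam I)^{-1} X'y.
ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Bayes posterior mean under beta ~ N(0, tau^2 I), tau^2 = sigma2 / lam,
# in its data-space form via a Woodbury-type identity.
tau2 = sigma2 / lam
bayes = tau2 * X.T @ np.linalg.solve(tau2 * (X @ X.T) + sigma2 * np.eye(n), y)
```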
Given a graph G, the adjacency matrix, the standard Laplacian, and the normalized Laplacian have been studied extensively. In this talk, eigenvalue interlacing inequalities are given for each of these three matrices under the two operations of removing an edge or a vertex from G. Examples are provided to show that the inequalities are the best possible of their type.
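As a small illustration of the edge-removal case for the standard Laplacian (the graph and tolerance below are mine, not from the talk): L(G) = L(G−e) + L(e) with L(e) rank-one positive semidefinite, so the eigenvalues of L(G−e) interlace those of L(G).

```python
import numpy as np

def laplacian(edges, nv):
    """Standard graph Laplacian from an edge list on nv vertices."""
    L = np.zeros((nv, nv))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    return L

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]            # path graph P5
mu = np.linalg.eigvalsh(laplacian(edges, 5))        # spectrum of G
nu = np.linalg.eigvalsh(laplacian(edges[:-1], 5))   # spectrum of G - e

# Interlacing (eigenvalues in increasing order): nu_i <= mu_i <= nu_{i+1}.
assert all(nu[i] <= mu[i] + 1e-10 for i in range(5))
assert all(mu[i] <= nu[i + 1] + 1e-10 for i in range(4))
```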
For semipositive A, the infimum and supremum of the entry product of Ax over normalized positive vectors x are discussed, and a theory is developed to facilitate their computation. This is the basis for a theory of diagonal scaling based upon a curious matrix/vector equation. Semipositive spectra are also discussed.
Let {X_{k,i}; i ≥ 1, k ≥ 1} be a double array of nondegenerate i.i.d. random variables and let {p_{n}; n ≥ 1} be a sequence of positive integers such that n/p_{n} is bounded away from 0 and ∞. In this paper we give the necessary and sufficient conditions for the asymptotic distribution of the largest entry L_{n} = max_{1 ≤ i < j ≤ p_n} |r̂^{(n)}_{i,j}| of the sample correlation matrix Γ_{n} = (r̂_{i,j}^{(n)})_{1 ≤ i,j ≤ p_n}, where r̂^{(n)}_{i,j} denotes the Pearson correlation coefficient between (X_{1,i},...,X_{n,i})′ and (X_{1,j},...,X_{n,j})′. Write F(x) = P(|X_{1,1}| ≤ x), x ≥ 0, W_{c,n} = max_{1 ≤ i < j ≤ p_n} |∑_{k=1}^{n} (X_{k,i} − c)(X_{k,j} − c)|, and W_{n} = W_{0,n}, n ≥ 1, c ∈ (−∞,∞). Under the assumption that E|X_{1,1}|^{2+δ} < ∞ for some δ > 0, we show that the following six statements are equivalent:
This talk is based on my recent joint work with Professors Wei-Dong Liu and Andrew Rosalsky.
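The object of study is easy to simulate. The sketch below (with illustrative dimensions) computes the largest off-diagonal entry of a sample correlation matrix; even with fully independent variables, maximizing over p(p−1)/2 pairs makes L_n sizeable:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p = 100, 100                  # n / p_n bounded away from 0 and infinity
X = rng.normal(size=(n, p))      # i.i.d. entries, no true correlation

R = np.corrcoef(X, rowvar=False) # p x p sample correlation matrix
iu = np.triu_indices(p, k=1)
L_n = np.max(np.abs(R[iu]))      # largest off-diagonal entry L_n
```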
The Dantzig variable selector has recently emerged as a powerful tool for fitting regularized regression models. A key advantage is that it does not pertain to a particular likelihood or objective function, as opposed to existing penalized likelihood methods, and hence has the potential for wide applications. To our knowledge, almost all the Dantzig selector work has been performed with fully observed response variables. This talk introduces a new class of adaptive Dantzig variable selectors for linear regression models when the response variable is subject to right censoring. This is motivated by a clinical study of detecting predictive genes for myeloma patients' event-free survival, which is subject to right censoring. We establish the theoretical properties of our procedures, including consistency in model selection (i.e., the right subset model will be identified with a probability tending to 1) and the oracle property of the estimation (i.e., the asymptotic distribution of the estimates is the same as that when the true subset model is known a priori). The practical utility of the proposed adaptive Dantzig selectors is verified via extensive simulations. We apply the new method to the aforementioned myeloma clinical trial and identify important predictive genes for patients' event-free survival.
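For intuition, the plain uncensored Dantzig selector is a linear program: minimize ||b||_1 subject to ||X′(y − Xb)||_∞ ≤ λ. A hedged sketch of that LP with scipy (the split b = u − v, the simulated data and the tuning constant λ are all illustrative, and this is the basic selector, not the adaptive censored-data version from the talk):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(8)
n, p, sigma = 80, 20, 0.5
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + sigma * rng.normal(size=n)

# min sum(u) + sum(v)  s.t.  -lam <= X'(y - X(u - v)) <= lam,  u, v >= 0.
lam = sigma * np.sqrt(2 * n * np.log(p))   # noise-level tuning constant
G, r = X.T @ X, X.T @ y
A_ub = np.block([[G, -G], [-G, G]])
b_ub = np.concatenate([r + lam, lam - r])
res = linprog(np.ones(2 * p), A_ub=A_ub, b_ub=b_ub,
              bounds=(0, None), method="highs")
b_hat = res.x[:p] - res.x[p:]              # sparse Dantzig estimate
```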
Compound magic squares (CMSs) of order mn, whose tiled subsquares of orders m and n are also magic squares (MSs, having constant row, column and diagonal line-sums within each subsquare), date back to the 10th century for the case m=n=3. Interesting results follow if they are considered as matrices.
Frierson gave a simple algebraic form for compounding from the unique pattern of third order to a general n=9 CMS in The Monist in 1907, from which he exhibited 6 fundamental numerical forms using the complete set of integers 1,...,81. We extend Frierson's work, finding an algebraic description of a family of associative (antipodal pairs summing to n^{2}+1) compound magic squares of orders n=3^{l}, l=1,2,.... In doing so we have firmly established two results previously stated by Bellew (1997): 90 fundamental numerical forms for n=27, as well as its generalization for all l.
The present algebra then leads to a general formula for the eigenvalues of this family, which consist of the line-sum eigenvalue and l signed pairs, giving rank 2l+1.
For n=9 the 8 possible orientations of each of the 9 tiled third-order subsquares give rise to 6×8^{9} distinct CMSs, most with increased rank. We resolve the disparate factors of 8 reported by Trigg (1980) and by Bellew for n=27 with a new result that takes account of all orders of tiled subsquares, before generalizing this for all l.
Joint work with Ian Cameron.
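Compounding of this kind is easy to write down and check as a matrix computation. The sketch below uses the standard tiling construction, which may differ in details from Frierson's algebraic form: each tile of the order-9 square is the order-3 square shifted by 9(m−1), where m is the corresponding entry of the parent square.

```python
import numpy as np

M3 = np.array([[2, 7, 6],
               [9, 5, 1],
               [4, 3, 8]])                 # third-order square, line sum 15

n = 9
C = np.zeros((n, n), dtype=int)
for a in range(3):
    for b in range(3):
        for c in range(3):
            for d in range(3):
                # tile (a, c) is M3 shifted by 9 * (M3[a, c] - 1)
                C[3 * a + b, 3 * c + d] = 9 * (M3[a, c] - 1) + M3[b, d]

s = n * (n * n + 1) // 2                   # magic constant 369
assert sorted(C.ravel()) == list(range(1, 82))        # integers 1..81
assert all(C[i, :].sum() == s for i in range(n))      # row sums
assert all(C[:, j].sum() == s for j in range(n))      # column sums
assert np.trace(C) == s == np.trace(np.fliplr(C))     # both diagonals
rank = np.linalg.matrix_rank(C)            # 2l + 1 = 5 for n = 9 (l = 2)
```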
Let X_{1},...,X_{m} be independent and identically distributed (i.i.d.) random variables having unknown distribution function F. In this investigation, we propose a kernel method to estimate the distribution function of g(X_{1},...,X_{m}). We derive the asymptotic mean square error and the asymptotic mean Hellinger distance for the estimator. In addition, we propose a data-based method to obtain the bandwidth based on both the mean square error and the mean Hellinger distance.
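A sketch of the idea for m = 2 with an illustrative g (the bandwidth rule below is a simple placeholder, not the data-based selector proposed in the talk): evaluate g over all m-subsets of the sample and smooth the resulting empirical distribution function with a Gaussian kernel.

```python
import numpy as np
from itertools import combinations
from math import erf

rng = np.random.default_rng(9)
X = rng.normal(size=60)                    # i.i.d. sample
g = lambda a, b: a + b                     # illustrative choice of g

# Values of g over all 2-subsets, then the smoothed EDF
#   F_hat(x) = mean of Phi((x - Y_i) / h).
Y = np.array([g(*pair) for pair in combinations(X, 2)])
h = Y.std() * len(Y) ** (-1 / 3)           # crude placeholder bandwidth
Phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / np.sqrt(2.0))))
F_hat = lambda x: Phi((x - Y) / h).mean()
```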
Nonparametric regression analysis provides a statistical tool for estimating regression curves or surfaces from noisy data. Conventional nonparametric regression methods, however, are only appropriate for estimating continuous regression functions. When an underlying regression function has jumps, functions estimated by the conventional methods are not statistically consistent at the jump positions. Recently, jump regression analysis (JRA) for estimating jump regression functions has been under rapid development, because JRA has broad applications. One important application is image processing, where a monochrome image can be regarded as a surface of the image intensity function that has jumps at the outlines of objects. In this talk, I will give a general introduction to the research area of JRA and to some of its applications in image processing.
High-throughput screening (HTS) is a large-scale process that screens hundreds of thousands to millions of compounds in order to identify potentially leading candidates rapidly and accurately. There are many statistically challenging issues in HTS. In this talk, I will focus on spatial effects in primary HTS. I will discuss the consequences of spatial effects in selecting leading compounds and why the current experimental design fails to eliminate these spatial effects. A new class of designs will be proposed for the elimination of spatial effects.
The new designs have advantages such as the following: all compounds are comparable within each microplate in spite of the existence of spatial effects, and the maximum number of compounds in each microplate is attained. Optimal designs are recommended for HTS experiments with one control and with multiple controls.
In the context of a partially linear regression model, we consider shrinkage semiparametric estimators based on the Stein rule. In our framework the coefficient vector may be partitioned into two subvectors: the first subvector gives the coefficients of interest, i.e., main effects (for example, treatment effects), and the second subvector is for variables that may or may not need to be controlled for. When estimating the first subvector, we may get the best estimate using the full model that includes both subvectors, or using the reduced model that leaves out the second subvector. We demonstrate that under certain conditions shrinkage estimators which combine the two semiparametric estimators computed for the full model and the reduced model outperform the semiparametric estimator for the full model. Using the semiparametric estimator for the reduced model is best when the second subvector is the null vector, but this estimator suffers seriously from bias otherwise. The relative dominance picture of the suggested estimators is investigated. We estimate the nonparametric component using B-splines, exploring the suitability of this choice, and compare its risk performance numerically with that of kernel estimates. Further, the performance of the proposed estimators is compared with an absolute penalty estimator through Monte Carlo simulation.
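A stripped-down numerical sketch of the partially linear setup, using a truncated-power spline basis as a simple stand-in for the B-splines discussed in the talk (data, knots and coefficients are illustrative): fit the parametric and spline parts jointly by least squares and read off the coefficients of interest.

```python
import numpy as np

rng = np.random.default_rng(11)
n = 300
t = np.sort(rng.uniform(0, 1, n))          # covariate of the smooth part
X = rng.normal(size=(n, 2))                # parametric covariates
y = (X @ np.array([1.0, -0.5]) + np.sin(2 * np.pi * t)
     + 0.3 * rng.normal(size=n))

# Cubic truncated-power spline basis for the nonparametric component.
knots = np.linspace(0.1, 0.9, 8)
S = np.column_stack([np.ones(n), t, t**2, t**3] +
                    [np.clip(t - k, 0, None) ** 3 for k in knots])

# Full-model semiparametric least squares; beta_full estimates (1.0, -0.5).
Z = np.hstack([X, S])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_full = coef[:2]
```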
In this talk, I will present a new construction of the quadratic inference functions (QIF) that have received increasing attention in longitudinal data analysis. The new construction is based on covariance matrix shrinkage, which ensures that the QIF are analytically valid and computationally stable when data are sparse or interaction terms are included in the marginal generalized linear models. Some numerical illustrations will be given to demonstrate the proposed method.
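The stabilizing step can be illustrated with plain linear shrinkage of a covariance estimate (the weight α and the identity target are illustrative choices, not the talk's construction):

```python
import numpy as np

rng = np.random.default_rng(12)
n, q = 20, 10                   # few clusters, many moment conditions
V = np.cov(rng.normal(size=(q, n)))   # noisy, ill-conditioned estimate

# Shrink toward a scaled identity: the result stays symmetric and is
# better conditioned, hence safely invertible inside the QIF.
alpha = 0.2
V_shrunk = (1 - alpha) * V + alpha * (np.trace(V) / q) * np.eye(q)
```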
We consider sheetlets of postage stamps with r rows and c columns featuring s distinct stamps (we do not require that rc/s be an integer) and where no particular stamp appears more than once in any single row or column and so the sheetlet defines a "Latin rectangle". The "philatelic Sudoku puzzle" is to find an s×s Latin square in which the Latin rectangle defining the sheetlet is a subregion and some blocking within the subregion is involved as with the popular "regular" 9×9 Sudoku puzzle. We let b denote the block size and so b=9 in regular Sudoku. We identify six philatelic Sudoku puzzles with parameter sets (r,c,s;b) as follows:
This talk is based on Section 6 of the invited paper (with Ka Lok Chu & Simo Puntanen) entitled "Some comments on philatelic Latin squares from Pakistan", to be published in the Special Jubilee Issue of the Pakistan Journal of Statistics.
In this talk, we propose a computationally efficient approach for selecting nonzero partial correlations under the high-dimension/low-sample-size setting. This method assumes the overall sparsity of the partial correlation matrix and employs sparse regression techniques for model fitting. We illustrate the performance of our method by extensive simulation studies. It is shown that our method performs well in both nonzero partial correlation selection and the identification of hub variables, and also outperforms two existing methods. We then apply our method to a microarray breast cancer data set and identify a set of "hub genes" which may provide important insights on genetic regulatory networks. Finally, we prove that, under a set of suitable assumptions, the proposed procedure is asymptotically consistent in terms of model selection and parameter estimation.
This is joint work with Jie Peng, Pei Wang and Nengfeng Zhou.
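A self-contained sketch of the sparse-regression idea (neighbourhood selection with a hand-rolled lasso; the chain-structured precision matrix, penalty level and iteration count are illustrative assumptions, not the authors' procedure): regress each variable on all the others and flag nonzero coefficients as nonzero partial correlations.

```python
import numpy as np

rng = np.random.default_rng(10)
n, p = 100, 15
# Chain-structured precision matrix: only adjacent partial correlations
# are nonzero.
Omega = (np.eye(p) + np.diag(0.4 * np.ones(p - 1), 1)
         + np.diag(0.4 * np.ones(p - 1), -1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Omega), size=n)
X = (X - X.mean(0)) / X.std(0)             # standardize columns

def lasso(A, y, lam, n_iter=200):
    """Plain coordinate-descent lasso; assumes standardized columns."""
    b = np.zeros(A.shape[1])
    for _ in range(n_iter):
        for j in range(A.shape[1]):
            r = y - A @ b + A[:, j] * b[j]          # partial residual
            z = A[:, j] @ r / len(y)
            b[j] = np.sign(z) * max(abs(z) - lam, 0.0)
    return b

# Neighbourhood selection: a nonzero coefficient flags an edge.
support = np.zeros((p, p), dtype=bool)
for j in range(p):
    others = [k for k in range(p) if k != j]
    b = lasso(X[:, others], X[:, j], lam=0.15)
    support[j, others] = b != 0
```

In practice the directed selections would be symmetrized (e.g. by an AND or OR rule over the pair of regressions) before reporting edges and hub variables.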