This project is supported by a grant awarded to Zhiyong Zhang and Ke-Hai Yuan through the grant program on Statistical and Research Methodology in Education from the Institute of Education Sciences of the U.S. Department of Education.
The title of the project is A General Framework for Statistical Power Analysis with Non-normal and Missing Data through Monte Carlo Simulation.
The importance of conducting a statistical power analysis at the beginning of a study is universally accepted (e.g., Cohen, 1988; Hedges & Rhoads, 2009). Without careful planning, a study can easily fail to detect an existing effect by chance. The increasing complexity of education research, driven by education practice, poses great challenges to existing methods of statistical power analysis. For example, education research often involves longitudinal and multilevel designs as well as advanced techniques such as structural equation and multilevel models. Furthermore, real data in education are often non-normally distributed and incomplete. Regarding the nature of real data, Micceri (1989) reported that among 440 large-sample achievement and psychometric measures taken from journal articles, research projects, and tests, all were significantly non-normally distributed. In impact evaluations funded by the National Center for Education Evaluation and Regional Assistance (NCEE), student achievement outcomes are often missing at rates of 10-20 percent (e.g., Puma, 2009). Without careful consideration of the complexity of study designs and the impact of non-normal and missing data, the validity of education research can be harmed.
This project develops a general framework for statistical power analysis with non-normal and missing data through Monte Carlo simulation. It enables data generation from a given target population as well as model fitting using cutting-edge methodology. The simulation is repeated a sufficiently large number of times, and the proportion of replications in which the null hypothesis is rejected is used as an estimate of statistical power. In this project, we address three critical components of such a method. (1) We develop a general method to enable the specification of models, including structural equation models and multilevel models, as well as missing data mechanisms and non-normality of data, through drawing path diagrams. (2) We develop methods for simulating data, including non-normal data and missing data, from a supplied model. (3) We develop methods for statistical power analysis that are robust to non-normal data and missing data. Compared to the commonly used methods in the literature, our method will have better-controlled type I error rates and greater statistical power.
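The simulation loop described above can be illustrated with a minimal Python sketch. It is not the project's software or its method: the function names, the two-group mean comparison, the centered exponential distribution standing in for non-normal data, the completely-at-random deletion standing in for a missing data mechanism, and the large-sample z test are all assumptions chosen to keep the example short and dependent only on the standard library.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_group(n, shift, missing_rate, rng):
    """Draw n skewed scores (exponential, centered at 0, SD 1), add a mean
    shift, then delete each value completely at random with the given rate."""
    values = [rng.expovariate(1.0) - 1.0 + shift for _ in range(n)]
    return [v for v in values if rng.random() >= missing_rate]


def rejects_null(x, y, alpha):
    """Two-sided test of equal means using a large-sample normal
    approximation (an assumption; not a robust method from the project)."""
    if len(x) < 2 or len(y) < 2:
        return False  # too few observed cases to compute a variance
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = (mean(x) - mean(y)) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def estimate_power(n_per_group, effect, missing_rate=0.15, alpha=0.05,
                   replications=2000, seed=1):
    """Estimate power as the proportion of replications that reject H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        control = simulate_group(n_per_group, 0.0, missing_rate, rng)
        treated = simulate_group(n_per_group, effect, missing_rate, rng)
        if rejects_null(control, treated, alpha):
            rejections += 1
    return rejections / replications
```

Calling `estimate_power(100, 0.5)` returns the estimated power for a half-standard-deviation mean difference with 100 cases per group and 15 percent missing values, while `estimate_power(100, 0.0)` returns the empirical type I error rate under the same conditions. Increasing `replications` reduces the Monte Carlo error of either estimate.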
This project also develops the software MCpower (now known as WebPower) to conduct power analysis within the proposed framework. MCpower runs as a web application and, therefore, can be used within a web browser either locally on a personal computer or remotely on a web server. (1) MCpower includes a graphical user interface that allows the specification of structural equation models and multilevel models with non-normal and missing data through drawing path diagrams. (2) MCpower implements different algorithms for data simulation from an arbitrary model. (3) MCpower conducts automatic Monte Carlo simulation to estimate statistical power.
The project is expected to offer the community of education researchers an easy-to-use and general-purpose tool to conduct sophisticated statistical power analysis for structural equation and multilevel models with non-normal and missing data.