Statistics professor takes on big data challenges with multi-faceted research

George Mason University’s outstanding location, available opportunities, and growing reputation combined to produce a winning formula that attracted statistics professor Lily Wang to the Department of Statistics in the College of Engineering and Computing in fall 2021.

“Collaboration is the key for my profession, and the Washington, D.C. area has so many government agencies and top technology companies, and it creates fantastic opportunities. Mason is growing so fast and on an impressive trajectory,” says Wang.

Wang’s primary areas of research are broad and diverse. They include non- and semi-parametric modeling and inference, statistical learning of data objects with complex features, and methodologies for functional data, spatiotemporal data, survey sampling, and data reduction. Working at the interface of statistics, mathematics, and computer science, she is also interested in general issues related to data science and big data analytics. Her methods are widely applied in engineering, neuroimaging, epidemiology, environmental studies, economics, and biomedical science.

For example, she has been heavily involved with the Centers for Disease Control and Prevention (CDC) in work on spatiotemporal data related to the COVID-19 pandemic. Building on its research findings, her team created a dashboard with multiple embedded apps. The dashboard provides a real-time seven-day forecast and a long-term forecast of COVID-19 infection and death counts at the county and state levels, along with the corresponding risk analysis. “We are honored to be one of the teams that the CDC is relying on to better understand and forecast COVID-19 in the United States,” says Wang.

Another area of great interest to Wang is using functional data to support early disease diagnosis and the prediction of disease prognosis. Most existing studies focus on one-dimensional (1D) functions; for example, 1D growth charts are commonly used to screen children’s growth. “Modern technologies produce large volumes of multi-modality imaging data that might be used as biomarkers for diseases,” Wang says. Her research team therefore uses 2D and higher-dimensional medical imaging data, together with other clinical and genetic data, for Alzheimer’s disease research.

The wealth of data presents new opportunities to innovate in science and technology; however, it also requires a parallel effort in statistical method development that enables researchers to draw rigorous inferences. She says, “If you think about a high-resolution image, you can have a million pixels for just the one image, and beyond the image, you also have the patient medical and genetic information, so-called ‘big data squared.’ Analysis of these big data can easily go beyond the capability of the traditional methods. Our state-of-art statistical models and powerful learning tools can help to delineate associations among these data.”

Wang is also teaching a split undergraduate/graduate course in applied regression analysis. She cares deeply about her students and offers her classes in multiple modalities. She says, “I always try my best to accommodate students’ special needs, especially under the pandemic.”

Whether she is teaching students or working with research collaborators, Wang’s excitement about being a new faculty member at Mason comes through. “Statistics is a thriving and fast-developing discipline in the data science era. Our department at Mason is home to renowned researchers in statistics, biostatistics, and data science/analytics with a bright future. I am so happy to be part of it,” she says.