PhD
Associate Professor, Department of Biostatistics
Associate Professor, UNC School of Dentistry
UNC Gillings School of Global Public Health
UNC-Chapel Hill
Cancer Genetics Research Program
Area of Interest
My goal is to better understand disease mechanisms and improve people’s health through genomic data analysis and integration. I am trained as a biostatistician and work in the bioinformatics field. My research interests and contributions span both methodology development and the application of these methods (among others) in cancer-related research. I have extensive experience working with biologists, geneticists, and clinicians, including ongoing collaboration with oncologists. I also develop sophisticated tools to solve new data problems in biomedical research. My current main focus is developing statistical and AI-based computational methods to analyze cancer genomics, microbiome metagenomics/metatranscriptomics, single cell RNAseq (scRNAseq), spatial omics, and electronic health record (EHR) data.
I have developed novel statistical methods for genomics data analysis, including gene set analysis methods (ROAST, CAMERA, and a method for time course data) and miRNA normalization in cancer samples. These methods have been widely cited in transcriptome studies (including cancer studies) for pathway analysis and for relating datasets by similar expression patterns in signature gene sets.
My PhD work on breast cancer aimed to find the cell of origin of different breast cancer molecular subtypes using pathway analysis and gene set tests across multiple datasets. As a postdoc, I developed a bioinformatics data integration framework for drug discovery in lung squamous cell carcinoma (SqCC) and a drug repurposing framework for autoimmune diseases based on GWAS risk SNPs and a public drug target database. In the SqCC study, I characterized the signature genes of SqCC, related them to cell lines, and identified the corresponding potentially effective drugs. Other cancer types I have worked on include gastric, skin, prostate, and head and neck cancer. The GWAS-based drug repurposing project has since been extended to cancer-subtype-specific drug repurposing based on genomic data.
At UNC, I have collaborated with colleagues in the School of Medicine on the relationship between DNA repair pathways and cancer mechanisms, analyzing local mouse WES data and integrating TCGA data. Meanwhile, my group is developing statistical methods for scRNAseq pathway analysis (applied to understanding cancer initiation and progression and to characterizing HIV-infected mice), differential expression analysis, and clustering analysis, as well as methods for metagenomics data and for microbial omics data integration and longitudinal prediction, particularly when microbial DNAseq, RNAseq, and metabolomics data are all available. My group is also currently investigating the connection between patients’ dental Electronic Patient Records (EPR) and their Electronic Medical Records (EMR), and its use in predicting head and neck cancer with machine learning and large language models (LLMs).
Awards and Honors
- Early Career Overseas Fellowship, Australian National Health and Medical Research Council, 2011
