Biostatistics

The Biostatistics faculty include: Paul Auer, Chiang-Ching (Spencer) Huang, Peter Tonellato, and Xuexia (Helen) Wang. The Biostatistics program includes the Laboratory for Public Health Informatics and Genomics, for which Peter Tonellato is PI. Learn more about featured research projects by the Biostatistics faculty:


Biomedical informatic analysis of the RNA of primary cells and tumors arising in conditional XRCC4 and p53 deficient mouse background
Charles Murphy, Erik Gafni, Parameswary Muniandy, Himanshu Sharma, Peter J. Tonellato, Catherine Yan
Combined inactivation of p53 and LIG4, or the LIG4 protein interactor XRCC4, leads to cancer development in virtually every mouse cell type we have tested. Besides studying the fundamental role of non-homologous end-joining (NHEJ) proteins in hematopoietic stem cells and the immune system, we have developed several mouse models of human cancers based on conditional inactivation of the NHEJ DNA repair gene XRCC4 and p53. We hypothesize that, in addition to known mutations, unknown "driver" mutations cooperate with the DNA damage response (DDR) to promote the pathogenesis of the cancers that develop in our models. To test this hypothesis, we will use next-generation sequencing and microarray technology to identify mutations (point mutations, small indels, fusion events) and alterations in expression in the transcriptome of the primary mouse cells of origin and the cancers. We will perform miRNome and lincRNome analysis to investigate non-coding-mRNA regulations, which we anticipate will serve to identify common and tumor cell type-specific driver mutations. We anticipate that comparative analysis of the identified mutations against human datasets will identify new, unknown human driver mutations.

Detection of Colorectal Cancer Susceptibility Loci Using Genome-Wide Sequencing
Ulrike Peters, Li Hsu, Debbie Nickerson, Suzanne Leal, Goncalo Abecasis, Paul Auer
This multidisciplinary project aims to investigate whether different types of genetic variants, including rare and structural variants, influence colorectal cancer risk in humans. Specifically, we will examine variants across the entire genomes of colorectal cancer cases and controls to identify new genetic risk factors for colorectal cancer, and investigate whether known environmental risk factors for colorectal cancer modify genetic susceptibility to this disease.

Development of an EMR-based database for anticoagulation/anticlotting outcome research
Kourosh Ravvaz, Peter J. Tonellato, Michael Michalkiewicz
Warfarin is one of the most frequently prescribed drugs used to prevent blood clotting. However, warfarin's therapeutic window is narrow (2.0 < INR < 3.0, where INR is the international normalized ratio) relative to a large, diverse urban patient population in which therapeutic dosing may vary as much as 15-fold. Consequently, warfarin initiation requires intensive patient monitoring, and achieving “therapeutic” dosing may require frequent dose modification following complex dose-adjustment protocols. Dozens of algorithms exist that integrate clinical and genetic factors into individualized predictive models for warfarin dose. However, there has been little testing of these algorithms across a large, diverse urban population. Consequently, the value of sophisticated algorithms based on genetic and clinical data remains an open question. To address this question, the Aurora Health Care (AHC) Research Institute and the Zilber School of Public Health are working together to develop a retrospective, EMR-based longitudinal anticoagulation clinical database with the goal of simulating and testing warfarin dosing algorithms designed to achieve therapeutic dose more quickly. The initial database contains over 157,000 unique anticoagulated patients from 15 inpatient facilities across southeastern Wisconsin from 2002 to 2012. Preliminary simulations include segmenting the database’s population and identifying the optimal warfarin dosing protocol for any given segment. This is one of the largest warfarin urban patient population databases. Such a resource is invaluable for pursuing the broad objectives of individualized, and thus improved, patient-centric outcome research.
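As a rough illustration of the therapeutic window mentioned above, the following Python sketch computes the share of a patient's INR measurements that fall strictly inside 2.0 < INR < 3.0. It is not the project's method: the function name, the simple per-measurement fraction (rather than any time-weighted measure), and the INR readings are all assumptions made up for this example.

```python
def fraction_in_range(inr_values, low=2.0, high=3.0):
    """Share of INR measurements strictly inside the therapeutic window.

    Hypothetical helper: the bounds default to the 2.0 < INR < 3.0 window
    quoted in the text; the strict inequalities mirror that notation.
    """
    if not inr_values:
        raise ValueError("need at least one INR measurement")
    in_range = sum(1 for inr in inr_values if low < inr < high)
    return in_range / len(inr_values)


# Made-up INR readings for one hypothetical patient, in visit order.
example_inr = [1.4, 1.9, 2.3, 2.8, 3.4, 2.6]
example_fraction = fraction_in_range(example_inr)
print(f"Fraction of readings in range: {example_fraction:.2f}")
```

A dosing-algorithm simulation could compare a summary like this across candidate protocols, although a real evaluation would likely weight by time between visits.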

Dynamic molecular network of immune system in cardiovascular diseases
Chiang-Ching Huang, Taura Bar, Reyna VanGilder
Atherosclerosis is the main cause of cardiovascular disease (CVD), the number one cause of death in the world. Increasing evidence shows that both the innate and adaptive immune systems tightly regulate atherogenesis. Several immune molecules have been suggested to play a critical role in the inflammatory process of atherosclerosis. However, fundamental knowledge of dynamic immune regulation in atherosclerosis is far from complete. This project addresses this gap by investigating the transcriptional network structure of two major innate and adaptive immune pathways, toll-like receptor and T-cell receptor signaling, in atherosclerosis, myocardial infarction (MI), and ischemic stroke (IS). A parallel comparison of transcriptional patterns across these physiopathological conditions will shed light on how these two immune systems interact to influence disease progression and will help identify patients at higher risk of developing MI or IS.

Guiding warfarin clinical trial design using pharmacogenetic simulations
Peter J. Tonellato, Kourosh Ravvaz, Chun-Yuan Huang
Highly sensitive genetic tests that detect variant alleles, combined with increasing genomic knowledge, offer physicians the ability to individualize a patient’s drug treatment. If pharmacogenomic treatment is successful, one anticipates a large reduction in adverse drug reactions, leading to improved patient care, improved outcomes, reduced treatment periods, and lower overall costs. Unfortunately, it is extremely expensive and time-consuming to conduct the clinical trials needed to identify the correct combination of genotypes, phenotypes, and clinical and personal data necessary to accurately model drug response, test treatment options, and produce the 'optimal' protocol. In addition, there are no modeling frameworks that extend the simulations and optimization to population-wide studies capable of guiding public health policy. Here, we propose the extension and confirmation of a clinical trial simulation framework that models warfarin dosing and INR response to guide clinical trial design. Ultimately, we will extend the modeling and simulations to city- and county-wide predictions, which will provide evidence to guide public health policy and help direct limited public health resources to avoid health disparities.

A longitudinal study of gastrointestinal-related ER visits in the United States
Lyndon Hernandez, Peter J. Tonellato, Christina Eldredge, Annie Penlesky, Kimberly Siegler
Previous studies on gastrointestinal-related emergency room (ER) visits have focused only on a limited time period. It is important to determine whether there are time trends in diagnoses, endoscopies, admissions, and charges over a long-term period. Endoscopies have been performed at an increasing rate over the last decade, so it is important to know whether there has been a corresponding increase in the number of emergent endoscopies. We expect that, from 2006 to 2010, there were more severe diagnoses of GI diseases leading to emergent endoscopies and admissions. The year 2007 was the start of the economic recession in the US, which gives us an opportunity to see its effect on ER utilization, such as an upswing in uninsured and/or sicker patients. This is a cross-sectional study using data from the Nationwide Emergency Department Sample (NEDS) from 2006 to 2010. We will look at time trends in the proportion of patients requiring endoscopies and hospitalization, and whether that proportion varies by insurance status, zip code-level income, or other patient demographics such as comorbidity, measured using the Charlson comorbidity index. Yearly trends in charges will also be analyzed and adjusted for inflation.

Metabolomics risk score for near-term CVD events in individuals with PAD
Chiang-Ching Huang, Mary McDermott, Kiang Liu, Jane Tseng
Compared to individuals without peripheral arterial disease (PAD), those with PAD have a nearly two-fold increased risk of all-cause mortality and a two- to three-fold increased rate of acute coronary syndrome (ACS), even after adjusting for cardiovascular disease (CVD) risk factors and comorbidities. To date, there is no robust classification system to discriminate high-risk (e.g., PAD) patients who are more likely to suffer near-term mortality or ACS events from those who are less likely. Since established risk factors discriminate near-term risk poorly, identifying novel pathways that may signal near-term ACS events is expected to improve both our discrimination ability and our understanding of the pathogenesis of ACS events. The objective of this project is to develop a multi-metabolite classification system for near-term ACS events in patients with PAD. This study will use highly sensitive metabolomic/lipidomic techniques to systematically identify metabolic pathways and metabolites associated with near-term ACS events.

Methylmercury induced visual and neurodevelopmental deficits in zebrafish: The role of DNA methylation in the transgenerational inheritance of disease phenotypes
Thomas Achankunju, Michael J Carvan, Peter J. Tonellato
Developmental exposure to environmental pollutants such as pesticides, bisphenol A, dioxin, and hydrocarbon compounds has been associated with the onset of adult diseases and with transgenerational inheritance of those diseases. Our preliminary studies have found that developmental exposure to methylmercury (MeHg) is correlated with a reduced visual startle reflex and an altered response of the potassium ion channels of the retinal bipolar cells in zebrafish. In addition, our preliminary studies have demonstrated that the third generation of the fish population also showed an altered visual response and retinal electrophysiology similar to those of the first generation. The third generation is the first generation that was not directly exposed to MeHg but inherited the altered physiology from the MeHg-exposed first generation. This is the first evidence of transgenerational inheritance of a phenotype due to developmental exposure to MeHg in any species. The transgenerational effect of MeHg has not been well characterized in the zebrafish model. The alterations in the expression of genes involved in vision, and the molecular mechanisms behind the transgenerational inheritance of visual defects due to developmental exposure to MeHg, are unknown. No studies have been conducted to identify the molecular mechanism of transgenerational inheritance of visual defects induced by MeHg exposure. In our study, we are investigating the gene functions altered in the third generation due to developmental exposure to MeHg in the first generation. The role of DNA methylation, an epigenetic change, in this inheritance will also be investigated.

The NHLBI Exome Sequencing Project
The ESP Consortium (Paul Auer, contributing member)
The goal of the Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung, and blood disorders by pioneering the application of next-generation sequencing of the protein-coding regions of the human genome across diverse, richly phenotyped populations, and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management, and treatment of heart, lung, and blood disorders.

Powerful approaches to test rare variants in admixed populations
Xuexia Wang, Qiuying Sha, Mingyao Li, Shuanglin Zhang
Population stratification has long been recognized as an issue in genetic association studies. It emerges when there is a systematic difference in allele frequencies among study subjects due to ancestry differences across individuals. Unrecognized population stratification can lead to both false-positive and false-negative findings and can obscure true association signals if it is not appropriately corrected. For rare variants this problem can be more serious, since the spectrum of rare variation can be very different in diverse populations. In fact, rare variants typically demonstrate different and stronger stratification than common variants, which cannot be corrected by existing methods. In this study, we will develop powerful approaches to test rare variants and control for population stratification in admixed populations.
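The stratification effect described above can be seen in a small worked example. The Python sketch below is illustrative only and is not one of the approaches developed in the project: it builds expected case-control counts for two hypothetical subpopulations in which the variant has no effect on disease, then shows that pooling them produces a spurious association. All sample sizes, carrier frequencies, and disease risks are made-up assumptions.

```python
def expected_counts(n, carrier_freq, disease_risk):
    """Expected 2x2 counts when carrier status and disease are independent.

    Returns (case_carrier, case_noncarrier, control_carrier, control_noncarrier).
    """
    case_carrier = n * carrier_freq * disease_risk
    case_non = n * (1 - carrier_freq) * disease_risk
    ctrl_carrier = n * carrier_freq * (1 - disease_risk)
    ctrl_non = n * (1 - carrier_freq) * (1 - disease_risk)
    return case_carrier, case_non, ctrl_carrier, ctrl_non


def odds_ratio(case_carrier, case_non, ctrl_carrier, ctrl_non):
    """Odds ratio of disease for carriers versus non-carriers."""
    return (case_carrier * ctrl_non) / (case_non * ctrl_carrier)


# Hypothetical subpopulations: they differ in both carrier frequency and
# baseline disease risk, but within each one the variant has no effect.
pop_a = expected_counts(n=1000, carrier_freq=0.10, disease_risk=0.05)
pop_b = expected_counts(n=1000, carrier_freq=0.40, disease_risk=0.20)

# Pooling the two groups while ignoring ancestry.
pooled = tuple(a + b for a, b in zip(pop_a, pop_b))

or_a = odds_ratio(*pop_a)
or_b = odds_ratio(*pop_b)
or_pooled = odds_ratio(*pooled)

print(f"OR within A: {or_a:.2f}, within B: {or_b:.2f}, pooled: {or_pooled:.2f}")
```

Within each subpopulation the odds ratio is 1, yet the pooled odds ratio is well above 1, which is the kind of false-positive signal that stratification correction is meant to remove.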

Powerful statistical tools to test gene-environment interactions in next-generation sequencing data for childhood diseases
Xuexia Wang, Chiang-Ching Huang, Peter J. Tonellato
A number of childhood chronic diseases, such as asthma, autism spectrum disorders (ASD), attention deficit hyperactivity disorder (ADHD), childhood cancer, and obesity, have increased in prevalence and severity over the past few decades in the United States, despite major advances in the recognition and treatment of these diseases. For most of these diseases, several environmental (E) factors and a rapidly increasing number of genetic (G) factors have been identified. However, little is understood about the interplay between genetic and environmental factors, though strong evidence is accumulating that the environment can alter gene expression and influence an individual’s phenotype. Therefore, a deeper knowledge of the complex, dynamic gene-by-environment interactions is required in order to understand the heritability of childhood chronic diseases more clearly. In this study, we will develop powerful statistical tools to test gene-environment interactions in next-generation sequencing data for childhood diseases.

Predicting clinical validity of bladder cancer nomograms
Kourosh Ravvaz, Tracy M Downs, Peter J. Tonellato
Complex early-stage bladder cancer has a growing impact on individual and population health, health care cost, and medical treatment improvements. Bladder cancer is a heterogeneous disease requiring accurate risk group stratification to precisely predict tumor progression and recurrence, and therefore to accurately treat even early-detected, high-risk patients using intravesical therapy. However, current knowledge of risk and treatment is not fully incorporated into commonly used nomograms. This retrospective study, conducted by a multidisciplinary group of researchers from UWM and the UW-Madison Carbone Cancer Center, will create a simulation framework to test and adjust existing nomograms to include recent clinical findings and produce “optimal” predictions of risk and outcomes.

Robust taxonomic development using 16S rRNA pyrosequencing fragments
Charles J Murphy, Ryan Newton, Sandra McLellan, Peter J. Tonellato
Next-generation sequencing technology, such as pyrosequencing, can generate large sequence datasets to estimate bacterial communities in biological samples. Pyrosequencing often uses specific genomic regions, such as the 16S rRNA gene, as stable taxonomic markers. The primary analysis is to estimate bacterial communities in pooled biological samples, but it is complicated by variable-length sequence reads. These pose the technical problem of correlating taxonomies between older technology data (shorter sequence reads) and newer technology data (longer sequence reads), where the longer and shorter sequences have overlapping regions. Methods to correlate bacterial communities between longer and shorter sequences are actively being developed. Presented here is the Hybrid Analysis (HA), which estimates bacterial communities in pooled samples containing variable-length sequencing fragments. Initial testing of the HA algorithm is promising; further testing is required.