Software

Active Projects

LUCID

Latent Unknown Clustering with Integrated Data (LUCID) uses omic data to identify clusters of individuals that differ in the outcome and have similar profiles of risk factors and biomarkers. Rather than performing the analysis in stages, LUCID estimates the latent clusters and the corresponding effects jointly. The model can be used to better identify disease associations or predict an individual’s potential risk, while also suggesting possible biological mechanisms defined by a combination of all factors.

JAM

Joint Analysis of Marginal summary statistics (JAM) unites the ideas of mediation and latent clustering, using summary statistics from multiple omic studies to develop a causal inference framework that identifies mediating effects of biologically relevant factors on outcomes. Because it requires only summary statistics, the approach goes beyond current methods in characterizing pathways and the corresponding intermediates and SNPs that contribute to those associations.

xtune

xtune (Regularized Regression with Feature-Specific Penalties Integrating External Information) extends standard penalized regression (Lasso, Ridge, and Elastic-net) to allow feature-specific shrinkage based on external information, with the goal of improving prediction accuracy and variable selection. Examples of external information include the grouping of predictors, prior knowledge of biological importance, external p-values, and functional annotations. The multiple tuning parameters are chosen using an Empirical Bayes approach, implemented with a majorization-minimization algorithm.
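
To illustrate what feature-specific shrinkage means, the following is a minimal sketch of a Lasso fit by coordinate descent in which each feature has its own penalty. It is not the xtune implementation: the function names (soft_threshold, lasso_feature_penalties) are hypothetical, the penalties are supplied by hand, and the Empirical Bayes step that estimates them from external information is not shown.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_feature_penalties(X, y, penalties, n_iter=100):
    """Coordinate descent for
        (1 / 2n) * ||y - X b||^2 + sum_j penalties[j] * |b_j|
    X is a list of rows, y a list of outcomes, penalties one value per feature.
    The penalties are assumed to be given; they are not estimated here.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, penalties[j]) / z if z > 0 else 0.0
    return beta


# Toy data with two orthogonal features that have the same marginal signal.
X = [[1, 0], [0, 1], [1, 0], [0, 1]]
y = [2.0, 2.0, 2.0, 2.0]

equal = lasso_feature_penalties(X, y, penalties=[0.5, 0.5])
unequal = lasso_feature_penalties(X, y, penalties=[0.0, 2.0])
```

With equal penalties both coefficients are shrunk by the same amount; with unequal penalties the first feature is left unpenalized while the second is shrunk to zero, which is the behavior that feature-specific penalties are meant to allow.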

Completed Projects

PriorityPruner

PriorityPruner is a software program that prunes a list of SNPs by removing those in high linkage disequilibrium (LD) with other SNPs in the list, while preferentially keeping SNPs of higher priority (e.g., the most significant SNPs in a genome-wide association study).
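
The idea of priority-based pruning can be sketched as a simple greedy procedure. The example below is an illustration only, not PriorityPruner's actual algorithm or interface: the function name priority_prune, the use of p-values as the priority, the r² threshold of 0.8, and the toy SNP identifiers are all assumptions made for this sketch.

```python
def priority_prune(p_values, r2, threshold=0.8):
    """Greedy LD pruning sketch.

    p_values: dict mapping SNP id -> p-value (smaller = higher priority).
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2;
        pairs that are missing are treated as r^2 = 0.
    Returns the kept SNP ids in priority order.
    """
    # Visit SNPs from highest to lowest priority (ties broken by id).
    ordered = sorted(p_values, key=lambda snp: (p_values[snp], snp))
    kept = []
    for snp in ordered:
        # Drop the SNP if it is in high LD with any SNP already kept.
        in_high_ld = any(
            r2.get(frozenset((snp, other)), 0.0) >= threshold for other in kept
        )
        if not in_high_ld:
            kept.append(snp)
    return kept


# Hypothetical toy input.
p_values = {"rs1": 1e-8, "rs2": 1e-3, "rs3": 0.04, "rs4": 0.5}
r2 = {
    frozenset(("rs1", "rs2")): 0.95,
    frozenset(("rs2", "rs3")): 0.90,
    frozenset(("rs3", "rs4")): 0.20,
}
kept = priority_prune(p_values, r2)
```

In this toy input rs2 is pruned because it is in high LD with the higher-priority rs1. rs3 is kept even though it is in high LD with rs2, since rs2 itself was not retained.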

Snagger

Snagger is an extension to the existing open-source software, Haploview, which uses pairwise r² linkage disequilibrium between single nucleotide polymorphisms (SNPs) to select tagSNPs.
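
For readers unfamiliar with the r² measure, the sketch below computes the standard pairwise r² between two biallelic SNPs from phased haplotypes, using D = p_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). It only illustrates the statistic; it is not code from Snagger or Haploview, and the function name pairwise_r2 and the 0/1 allele coding are assumptions of this example.

```python
def pairwise_r2(alleles_a, alleles_b):
    """Pairwise r^2 between two biallelic SNPs.

    alleles_a, alleles_b: equal-length lists of 0/1 alleles, one entry per
    phased haplotype, in the same haplotype order for both SNPs.
    Returns 0.0 when either SNP is monomorphic (r^2 is undefined there).
    """
    n = len(alleles_a)
    if n == 0 or n != len(alleles_b):
        raise ValueError("inputs must be non-empty and of equal length")
    p_a = sum(alleles_a) / n                                  # freq. of allele 1 at SNP A
    p_b = sum(alleles_b) / n                                  # freq. of allele 1 at SNP B
    p_ab = sum(a * b for a, b in zip(alleles_a, alleles_b)) / n  # freq. of the 1-1 haplotype
    d = p_ab - p_a * p_b                                      # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0
    return d * d / denom


perfect = pairwise_r2([1, 1, 0, 0], [1, 1, 0, 0])      # alleles always co-occur
independent = pairwise_r2([1, 1, 0, 0], [1, 0, 1, 0])  # no association
```

A pair with r² close to 1 carries nearly redundant information, which is why one SNP of the pair can serve as a tag for the other.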

BVS

Bayesian Variable Selection (BVS) focuses on analyzing case-control association studies involving a group of genetic variants. In particular, we are interested in modeling the outcome variable as a function of a multivariate genetic profile using Bayesian model uncertainty and variable selection techniques.

GitHub Links

dvconti
https://github.com/dvconti

Division of Biostatistics
https://github.com/USCbiostats

USC COVID-19
https://uscbiostats.github.io/COVID19/

Chatzi Lab
https://github.com/chatzilab

Multiethnic Cohort
https://github.com/USCmec