A collection of software tools and applications developed by the Center for Computational Genomics and Data Science (CGDS) to address unmet needs in extracting knowledge from ’omic and associated data. These tools support researchers and physicians who study molecular variation in patients, helping them reach definitive diagnoses and better understand pathogenic mechanisms.

Explore more tools and ongoing projects in our GitHub organization. For an automatically updated, summarized, and logically organized listing of all repositories owned or maintained by CGDS, visit our GitHub CGDS Repo Discovery project.

Rosalution

Supporting data accessibility, integration, curation, interoperability, and reuse for precision animal modeling
Code Paper

A collage of screen captures of content of the Rosalution application

Rosalution is an open-source web application maintained and developed by the University of Alabama at Birmingham (UAB) Center for Computational Genomics and Data Science (CGDS). It supports researchers and clinicians collaborating on the study of molecular variation in patients to generate definitive diagnoses and better understand pathogenic mechanisms.

Initially created to support the UAB Center for Precision Animal Modeling (C-PAM), Rosalution streamlines the selection of candidate animal models to replicate patient-specific genetic variations, accelerating discovery, diagnosis, and therapy development for ultra-rare and understudied diseases. Its role has since expanded to support broader translational biomedical research, including critical studies on ciliopathies and pulmonary arterial hypertension (PAH).

By integrating streamlined data collection, quality control, and standardization of ’omic data with clinical insights, Rosalution enables researchers to identify and prioritize candidate genes and variants relevant to these complex diseases. This effort consolidates both process and data, helping to reduce the cost and increase the scalability of precision medicine.

QuaC

A Pipeline Implementing Quality Control Best Practices for Genome Sequencing and Exome Sequencing Data
Code Paper

QuaC is a Snakemake-based quality control (QC) pipeline developed by CGDS to implement and standardize QC best practices for human whole genome and whole exome sequencing (WGS/WES) data. It automates execution of multiple QC tools using BAM and VCF files and can also incorporate QC metrics from raw FASTQ files. QuaC evaluates these metrics against thresholds using its QuaC-Watch module and summarizes results for efficient review.
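The threshold evaluation described above can be illustrated with a minimal Python sketch. It is not QuaC-Watch's actual code or configuration format: the metric names, the cutoff values, and the `Threshold` and `evaluate_sample` helpers are all hypothetical, chosen only to show the general idea of comparing each QC metric against an allowed range and summarizing the outcome per sample.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Threshold:
    """Allowed range for one QC metric; either bound may be omitted."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def passes(self, value: float) -> bool:
        # A value fails if it falls outside any bound that is defined.
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hypothetical metric names and cutoffs, for illustration only.
EXAMPLE_THRESHOLDS: Dict[str, Threshold] = {
    "mean_coverage": Threshold(minimum=30.0),
    "pct_duplication": Threshold(maximum=20.0),
    "ti_tv_ratio": Threshold(minimum=1.9, maximum=2.2),
}


def evaluate_sample(
    metrics: Dict[str, float], thresholds: Dict[str, Threshold]
) -> Dict[str, str]:
    """Label each configured metric as "pass", "fail", or "missing"."""
    results = {}
    for name, threshold in thresholds.items():
        if name not in metrics:
            # The upstream QC tool did not report this metric.
            results[name] = "missing"
        else:
            results[name] = "pass" if threshold.passes(metrics[name]) else "fail"
    return results


# Made-up metric values for a single sample.
sample_metrics = {"mean_coverage": 34.2, "pct_duplication": 27.5}
summary = evaluate_sample(sample_metrics, EXAMPLE_THRESHOLDS)
print(summary)
```

Reporting an absent metric as "missing" rather than silently passing it is a design choice of this sketch; it keeps a reviewer from mistaking an unreported metric for an acceptable one.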

QuaC compiles all outputs, including QuaC-Watch results, into MultiQC reports at both the sample and project level. These reports provide overview summaries while preserving the detail needed for closer inspection. QuaC also introduces a standardized Sample QC Review System schema to document reviewer assessments and improve communication of QC outcomes across teams and tools.

DITTO

An Explainable Machine-Learning Model for Transcript-Specific Variant Pathogenicity Prediction
Code Paper

DITTO is a transparent, transcript-aware machine learning model that predicts pathogenicity of small genetic variants using VCF data. Unlike existing tools that often overlook transcript variability or require complex pipelines, DITTO integrates interpretation into a single explainable system. This helps address challenges in rare disease diagnostics such as high interpretation cost and limited expert availability.

Secondary analysis tools

In addition to diagnosing diseases and developing primary tools, CGDS creates a wide range of pipelines and software packages to support data generation, analysis, visualization, and interpretation. We use established tools when appropriate but also build new pipelines and frameworks where gaps exist. These tools support researchers and physicians in making informed discoveries and clinical decisions and are openly shared to promote reproducibility and collaboration.