GenAMap Description

What is association mapping?

In genetic association mapping, we look for polymorphisms in the genome (specifically single-nucleotide polymorphisms, or SNPs) that are associated with disease. Finding a SNP that is associated with a disease can lead to greater insight into how the disease progresses and could lead to better treatments or preventative medicine. Although classical studies have considered association mapping with only a disease state as the phenotype, studies that map to multiple phenotypes such as gene expression or clinical records are becoming more common.
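As a rough illustration of the idea, the sketch below scores each SNP by the absolute Pearson correlation between its genotype and a single quantitative phenotype, then ranks the SNPs. This is a deliberately simplified single-SNP scan, not one of GenAMap's algorithms; the function names, the SNP identifiers, and the toy data are all hypothetical, and genotypes are assumed to be coded as minor-allele counts (0, 1, or 2).

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic SNP (or constant phenotype) carries no signal.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_snps(genotypes, phenotype):
    """Rank SNPs by |correlation| with the phenotype, strongest first.

    genotypes: dict mapping a SNP id to a list of minor-allele counts,
               one entry per individual (hypothetical input layout).
    phenotype: list of trait values, one entry per individual.
    """
    scores = {snp: abs(pearson_r(g, phenotype)) for snp, g in genotypes.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data for six individuals (made up for illustration only).
genotypes = {
    "snp_a": [0, 1, 2, 0, 1, 2],  # tracks the phenotype closely
    "snp_b": [0, 0, 1, 1, 2, 2],  # weaker relationship
    "snp_c": [1, 1, 1, 1, 1, 1],  # monomorphic, so no information
}
phenotype = [0.1, 1.0, 2.1, 0.0, 0.9, 1.9]

ranking = rank_snps(genotypes, phenotype)
for snp, score in ranking:
    print(f"{snp}\t{score:.3f}")
```

A real study would use a proper statistical test with multiple-testing correction and would account for confounders such as population structure; the sketch only shows the basic shape of a SNP-by-phenotype scan.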

What is structured association mapping?

Both the genome and the phenome have complex structure. On the genome side, there is linkage disequilibrium and population structure. On the phenome side, we can represent intermediate phenotypes as a network, a tree, or even a dynamic tree. All of this structure tells an important part of the biological mechanism behind the disease. With powerful algorithms, we can take this structure into account when performing biological analyses, reducing false positives and enhancing the discovery of weak signals, especially those due to the structure we are considering.

What structured association mapping algorithms are available?

GFlasso The graphical fused lasso for intermediate phenotypes structured as a network.

TreeLasso The Tree lasso for intermediate phenotypes structured as a tree.

Multi Population GWA mapping for association mapping according to population.

gGFlasso for three-way association analysis with gene expression data and multiple correlated phenotypic traits.

Adaptive Multi-Task Lasso for association analysis when features about the SNPs are known.
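To give a sense of how phenotype structure can enter such a model, one common way to write a graph-guided fused lasso objective is shown below. The notation is assumed for illustration and is not taken from this page: X is the genotype matrix, y_k is the k-th phenotype, \beta_{jk} is the effect of SNP j on phenotype k, E is the edge set of the phenotype network, r_{ml} is the weight (for example, the correlation) on the edge between phenotypes m and l, and \lambda and \gamma are tuning parameters.

```latex
\min_{B}\;
\sum_{k=1}^{K} \lVert y_k - X\beta_k \rVert_2^2
\;+\; \lambda \sum_{k=1}^{K} \sum_{j=1}^{J} \lvert \beta_{jk} \rvert
\;+\; \gamma \sum_{(m,l) \in E} \lvert r_{ml} \rvert
      \sum_{j=1}^{J} \lvert \beta_{jm} - \operatorname{sign}(r_{ml})\, \beta_{jl} \rvert
```

The first penalty encourages sparsity, so that most SNPs have no effect. The second encourages a SNP to have similar effects on phenotypes that are connected in the network, which is how weak signals shared across related phenotypes can be recovered.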

What is GenAMap?

GenAMap brings the power of structured association mapping to a usable, intuitive interface. GenAMap makes GWAS and eQTL studies easier on three fronts:

1) Data management – create subsets, manage, and visualize genomic and phenotype data.

2) Algorithms – run algorithms to generate structure such as a gene network or a population stratification, or run a structured association algorithm. All algorithms run on a remote cluster, using parallelization schemes to maximize run-time efficiency.

3) Visualizations – tools to visualize the structure of the data while exploring the association results. We provide visualization tools to get a feel for the overall associations in the dataset, along with the ability to zoom in and explore specific parts of the data. Tools are interactive and link to online databases for more information.

A) Network visualization, annotation, and exploration
B) The exploration of associations from the genome to complex phenotypic trait networks
C) Population association analysis
D) Three-way association analysis

Who is involved with GenAMap?

PI Eric P Xing

Software Architect, scientist, designer, and development director Ross E Curtis

Maintenance Haohan Wang (Please send bug reports and questions to haohanw at cs dot cmu dot edu)

Algorithm Contribution Seyoung Kim, Kriti Puniyani, Seunghak Lee, Junming Yin, Ross E Curtis, Jun Zhu

Software Development Flavia Grosan, Anuj Goyal, Jorge Vendries, Michael Zuromskis, Sharath Babu, James Moffatt, Kelly Chan

Special thanks to the following open source libraries: JUNG, SSHTools, JHeatChart, JFreeChart, Mysql++, BLAS, ATLAS, CRAN-R project and libraries, Parallel Spectral Clustering in Distributed Systems, BiNGO, javastat, commons math


GenAMap 1.0 – released July 2010. Includes data implementation, algorithm control implementation, and first network visualizations for networks and associations. Supports the GFlasso algorithm.

GenAMap 2.0 – released February 2011. Visualization of networks, trees, and population structure. GFlasso fully supported.

GenAMap 3.0 – released September 2011. This is the final release of GenAMap with population, three-way, and network association analysis supported. See docs for details.