bdg-formats: Schemas for genomic data
The bdg-formats project provides schemas for describing common genomic data, such as reads, reference-oriented data, variants, genotypes, and assemblies. The schemas are written in Apache Avro, which allows them to be used across most common programming languages and platforms. These schemas form the core data structures used in the ADAM project.
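To make the idea of an Avro schema concrete, here is a minimal sketch in Python. Avro schemas are plain JSON documents, so the standard library is enough to inspect one. The record name `ExampleRead`, its namespace, and its fields are hypothetical and chosen only for illustration; they are not the actual bdg-formats definitions. The `conforms` helper is likewise an assumption and performs only a shallow key check, not full Avro validation.

```python
import json

# Hypothetical, simplified schema for illustration only; the real
# bdg-formats record and field names may differ.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "ExampleRead",
  "namespace": "example.formats",
  "fields": [
    {"name": "contigName", "type": ["null", "string"], "default": null},
    {"name": "start", "type": ["null", "long"], "default": null},
    {"name": "sequence", "type": ["null", "string"], "default": null}
  ]
}
"""

schema = json.loads(SCHEMA_JSON)


def field_names(record_schema):
    """Return the field names declared by an Avro record schema."""
    return [field["name"] for field in record_schema["fields"]]


def conforms(record, record_schema):
    """Shallow check: every key in the record is a field declared by the schema."""
    return set(record) <= set(field_names(record_schema))


# A sample record whose keys match the hypothetical schema above.
read = {"contigName": "chr1", "start": 12345, "sequence": "ACGT"}

print(field_names(schema))
print(conforms(read, schema))
```

Because the schema is just JSON, the same definition can be consumed by Avro tooling in other languages, which is the portability the paragraph above refers to.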
bdg-formats is on GitHub.
Currently, the bdg-formats schemas are part of the ADAM 0.6.1 release. ADAM is available for projects using Maven or SBT through the Sonatype OSS repository. We are working to deploy artifacts for non-JVM languages; watch this space for more info!
bdg-formats is developed in parallel with ADAM. For support, please contact the ADAM developer mailing list. Additionally, we track issues and feature enhancement requests through our GitHub issue tracker.
An early version of the bdg-formats schemas was described in the UC Berkeley EECS technical report for the ADAM project. The BibTeX for this reference is:
ADAM is available under the Apache 2 open source software (OSS) license. This OSS license is non-viral, and places few restrictions on users who would like to use or modify the software.