
HAL Format


HAL Format is an open source binary data format created in 2012.

#1182 on PLDB, 12 years old
Download source code:
git clone https://github.com/ComparativeGenomicsToolkit/hal

HAL is a graph-based structure for efficiently storing and indexing multiple genome alignments and ancestral reconstructions. HAL files are represented in HDF5 format, an open standard for storing and indexing large, compressed scientific data sets. Genomes within HAL are organized according to the phylogenetic tree that relates them: each genome is segmented into pairwise DNA alignment blocks with respect to its parent and its children (if present) in the tree. If the phylogeny is unknown, a star tree can be used. The modularity provided by this tree-based decomposition allows sub-alignments to be queried efficiently, and genomes can be added, removed, or updated within the alignment with only local modifications to the structure. Another important feature of HAL is reference independence: alignments in this format can be queried with respect to the coordinates of any genome they contain.
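The tree-based decomposition and reference independence described above can be illustrated with a small in-memory toy model. This is only a sketch under strong simplifying assumptions: it is not the HAL file layout or the HAL API, and the names `Block`, `Genome`, and `map_position` are hypothetical. It ignores strand, duplications, and multiple sequences per genome. Each genome stores ungapped blocks against its parent only, and a position is mapped between any two genomes by climbing to their lowest common ancestor and descending again.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Block:
    """Ungapped alignment block between a genome and its parent (toy model)."""
    child_start: int
    parent_start: int
    length: int


@dataclass(eq=False)
class Genome:
    """A node in the phylogenetic tree; blocks are aligned to the parent."""
    name: str
    parent: Optional["Genome"] = None
    blocks: List[Block] = field(default_factory=list)


def to_parent(genome: Genome, pos: int) -> Optional[int]:
    """Map a position in `genome` to its parent, or None if unaligned."""
    for b in genome.blocks:
        if b.child_start <= pos < b.child_start + b.length:
            return b.parent_start + (pos - b.child_start)
    return None


def from_parent(child: Genome, pos: int) -> Optional[int]:
    """Map a position in the parent of `child` down to `child`."""
    for b in child.blocks:
        if b.parent_start <= pos < b.parent_start + b.length:
            return b.child_start + (pos - b.parent_start)
    return None


def path_to_root(genome: Genome) -> List[Genome]:
    """Return [genome, parent, grandparent, ..., root]."""
    path = [genome]
    while path[-1].parent is not None:
        path.append(path[-1].parent)
    return path


def map_position(src: Genome, pos: int, dst: Genome) -> Optional[int]:
    """Map `pos` in `src` to `dst` via their lowest common ancestor."""
    up = path_to_root(src)
    down = path_to_root(dst)
    src_ancestors = {id(g) for g in up}
    lca = next(g for g in down if id(g) in src_ancestors)

    # Climb from the source genome up to the common ancestor.
    current: Genome = src
    mapped: Optional[int] = pos
    while current is not lca:
        mapped = to_parent(current, mapped)
        if mapped is None:
            return None
        current = current.parent

    # Descend from the common ancestor down to the target genome.
    for child in reversed(down[:down.index(lca)]):
        mapped = from_parent(child, mapped)
        if mapped is None:
            return None
    return mapped


# Hypothetical three-genome tree: one ancestor with two children.
ancestor = Genome("Anc0")
human = Genome("human", parent=ancestor, blocks=[Block(0, 10, 5)])
chimp = Genome("chimp", parent=ancestor, blocks=[Block(100, 10, 5)])

human_to_chimp = map_position(human, 2, chimp)
chimp_to_human = map_position(chimp, 102, human)
```

Because every genome holds only its alignment to its parent, adding or replacing a leaf in this model touches just that leaf's block list, and either `human` or `chimp` can serve as the query coordinate system without restructuring anything.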

