Seq is a programming language created in 2019.

#853 on PLDB, 5 years old

A High-Performance Language for Bioinformatics. Here, we introduce Seq, the first language tailored specifically to bioinformatics, which marries the ease and productivity of Python with C-like performance. Seq is a subset of Python, and in many cases a drop-in replacement, yet also incorporates novel bioinformatics- and computational genomics-oriented data types, language constructs and optimizations. Seq enables users to write high-level, Pythonic code without having to worry about low-level or domain-specific optimizations, and allows for seamless expression of the algorithms, idioms and patterns found in many genomics or bioinformatics applications. On equivalent CPython code, Seq attains a performance improvement of up to two orders of magnitude, and a 175× improvement once domain-specific language features and optimizations are used. With parallelism, we demonstrate up to a 650× improvement. Compared to optimized C++ code, which is already difficult for most biologists to produce, Seq frequently attains up to a 2× improvement, and with shorter, cleaner code. Thus, Seq opens the door to an age of democratization of highly-optimized bioinformatics software.

Example from the web:
from sys import argv
from genomeindex import *

# index and process 20-mers
def process(kmer: k20, index: GenomeIndex[k20]):
    prefetch index[kmer], index[~kmer]
    hits_fwd = index[kmer]
    hits_rev = index[~kmer]
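
For readers who do not have Seq at hand, the lookup pattern in the example above can be approximated in plain Python. This is only a rough sketch under stated assumptions, not Seq code and not the genomeindex module: the helpers `reverse_complement` and `build_index` are hypothetical names introduced here, the index is a simple dictionary from k-mer to start positions, and sequences are assumed to contain only A, C, G and T. In Seq, `~kmer` denotes the reverse complement, and `prefetch` is a performance hint with no plain-Python counterpart, so it is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Complement table for DNA bases (assumption: input contains only A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Plain-Python stand-in for what `~kmer` expresses in the Seq example."""
    return kmer.translate(COMPLEMENT)[::-1]


def build_index(genome: str, k: int) -> Dict[str, List[int]]:
    """Hypothetical toy index: map each k-mer to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return dict(index)


def process(kmer: str, index: Dict[str, List[int]]) -> Tuple[List[int], List[int]]:
    """Look up a k-mer on both strands, mirroring hits_fwd / hits_rev above."""
    hits_fwd = index.get(kmer, [])
    hits_rev = index.get(reverse_complement(kmer), [])
    return hits_fwd, hits_rev
```

For instance, `process("ACG", build_index("ACGTTT", 3))` looks up "ACG" on the forward strand and its reverse complement "CGT" on the reverse strand. A dictionary of Python strings is far slower than Seq's dedicated k-mer types; the sketch is meant to show the logic only.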
