Seq - Programming language

Seq is a programming language created in 2019.

#799on PLDB

6Years Old

Download source code:

git clone https://github.com/seq-lang/seq

A High-Performance Language for Bioinformatics. Here, we introduce Seq, the first language tailored specifically to bioinformatics, which marries the ease and productivity of Python with C-like performance. Seq is a subset of Python—and in many cases a drop-in replacement—yet also incorporates novel bioinformatics- and computational genomics-oriented data types, language constructs and optimizations. Seq enables users to write high-level, Pythonic code without having to worry about low-level or domain-specific optimizations, and allows for seamless expression of the algorithms, idioms and patterns found in many genomics or bioinformatics applications. On equivalent CPython code, Seq attains a performance improvement of up to two orders of magnitude, and a 175× improvement once domain-specific language features and optimizations are used. With parallelism, we demonstrate up to a 650× improvement. Compared to optimized C++ code, which is already difficult for most biologists to produce, Seq frequently attains up to a 2× improvement, and with shorter, cleaner code. Thus, Seq opens the door to an age of democratization of highly-optimized bioinformatics software.

Tags: programming language
Seq is developed on GitHub and has 698 stars
Early development of Seq happened in MIT
Seq is written in C++, Python, reStructuredText, JSON, CMake, Markdown, TypeScript, YAML, Bourne shell, JavaScript, Make, Dockerfile
Read more about Seq on the web: 1.

Example from the web:

from sys import argv
from genomeindex import *

# index and process 20-mers
def process(kmer: k20, index: GenomeIndex[k20]):
 prefetch index[kmer], index[~kmer]
 hits_fwd = index[kmer]
 hits_rev = index[~kmer]