Sr. Biostatistician / Bioinformatics Software Engineer
Rewrote bedtools, a fast, flexible Linux command-line toolset for genomic arithmetic written in C++. The new version is up to 60x faster than previous versions. Used templates and polymorphism to handle ten bioinformatics file formats, four compression types, and four file/stdin input types. bedtools is now also faster than BEDOPS, a competitor whose sole marketing claim had been superior speed.
- Refactored the codebase for modularity, simplifying maintenance and speeding future development
- Created an automated regression-testing tool that compares the correctness, speed, and memory footprint of bedtools against its prior versions across many combinations of simulated dataset size and density
- Added many unit tests to verify that existing features remain correct as new ones are added
- Handled bug reports and enhancement requests on the project's open-source GitHub site