Hello 👋 I find joy in translating ideas into impactful software. I believe my superpower lies in untangling complexity. I've led a variety of 0-to-1 projects and relish opportunities that push me outside of my comfort zone.
I'm currently a Principal Software Engineer with the Allen Institute for AI. My work consists primarily of pre-training data experimentation and the supporting tools, infrastructure, and services that enable building state-of-the-art (SOTA) pre-training data at scale.
Notable Projects
OLMo @ Ai2
Principal SE | Jul '22 - Present
Built data pipelines processing multi-trillion token datasets for LLM pre-training. Constructed specialized data mixtures for two-stage curriculum training: large-scale corpora for primary pre-training and high-quality domain-specific datasets for late-stage patching. Developed data mixing strategies inspired by RegMix and DoReMi approaches, optimizing domain weights through proxy model experiments. Applied quality filtering using perplexity scoring, heuristic rules, and toxicity detection. Created curation pipelines for academic content, code, and synthetic instruction data. Optimized distributed processing with Ray.
Babylon.finance
Cofounder | Feb '21 - Nov '22
I cofounded Babylon.finance and served as its Head of Product; we raised a $2MM seed round in February 2021. During that time I led or contributed significantly to a variety of strategic partnerships, go-to-market initiatives, branding and identity, marketing strategy and execution, and the definition of our overall product roadmap. I was also our primary application developer, shipping code on a daily basis.
Semantic Scholar @ Ai2
Tech Lead | Apr '17 - Feb '21
I led a variety of initiatives with the Semantic Scholar team. Most notably, I was the lead engineer for a redesign of our semantic search technology: I led the reimplementation of our search stack from the ground up and achieved best-in-class relevance outcomes for scientific literature search.
Moz Content
Product Lead | Mar '13 - Jul '16
I led a 0-to-1 product within the marketing automation space. Within our second year of operations we achieved an MRR of more than $55k and nearly 10,000 subscribers to our SaaS product, all with a very lean budget and team.
Google Maps
Web Developer | Jan '11 - Feb '13
At Google I led a small team and developed software for an analytics product focused on improving the productivity tools behind the many data labeling and model training efforts within the Maps organization.