Research Engineer — Post-Training, Alignment & Reasoning Systems

We are building advanced AI systems focused on reasoning, generalization, and controllable behavior. Our work spans large-scale language models, synthetic data generation, post-training pipelines, and human-in-the-loop systems designed to improve model intelligence beyond pretraining alone.

We are seeking a Research Engineer to lead post-training and alignment efforts for advanced reasoning models. This role sits at the intersection of machine learning research, data engineering, and large-scale training systems.

You will work across the full model lifecycle — from data strategy and synthetic data design to supervised fine-tuning, reinforcement learning, and evaluation — with a focus on improving reasoning quality and alignment.

What You’ll Do
Design Synthetic Data and Pretraining Strategies
Develop synthetic data generation pipelines to improve reasoning capability and data efficiency
Design filtering, selection, and curriculum strategies for large-scale training datasets
Improve pretraining efficiency through better data composition and training signal design

Build Post-Training and Alignment Pipelines
Design and optimize post-training workflows including:
Supervised fine-tuning (SFT)
Reinforcement learning (RL)
Improve reasoning quality, alignment, and controllability through training interventions
Work on systems where data, objectives, and model behavior are tightly coupled

Develop Human-in-the-Loop Data Systems
Build scalable human annotation workflows for reasoning-focused datasets
Design labeling protocols and quality control systems for high-signal training data
Coordinate human data operations to support large-scale model development

Lead Evaluation and Model Analysis
Design evaluation frameworks for reasoning and generalization performance
Conduct ablation studies and failure analysis to understand model behavior
Develop automated evaluation methods such as:
LLM-as-a-judge systems
Verifier-based evaluation pipelines
Continuously iterate on data and training strategies based on empirical results

What We’re Looking For
3+ years of experience in NLP, deep learning, or ML engineering
Experience working with large-scale data processing systems such as:
Apache Spark
Ray
Databricks
Similar distributed data frameworks
Strong ability to read, critique, and implement research in:
Synthetic data generation
Data selection and filtering
Reasoning and alignment methods
Experience working across data, training, and evaluation pipelines

Preferred Experience
Experience training LLMs at 7B+ parameters, or training smaller-scale models with full ownership of the pipeline
Experience with:
Synthetic data generation
Dataset pruning or curation
Reasoning or alignment research
Familiarity with automated evaluation methods such as:
LLM-as-a-judge
Verifier-based scoring systems
Contributions via research papers, technical blog posts, or open-source work in relevant areas

Why This Role Matters
Directly shape how models learn reasoning and generalization capabilities
Work across the full stack: data, training, evaluation, and alignment
Improve model intelligence beyond what is achieved through pretraining alone
Enable smaller models to outperform larger systems on reasoning tasks
Work in a high-impact role with tight feedback loops between research and real model behavior

About the Company
We are a research-driven AI company focused on building scalable reasoning systems. By combining advances in machine learning, data systems, and post-training methods, we aim to develop models that are more capable, aligned, and efficient.

We are committed to building an inclusive and diverse workplace and encourage applicants from all backgrounds to apply.