AI Engineer
Accepting applications
Haystack · San Diego, CA
Full-Time · Mid-senior · AI · Python
Posted
1d ago
Category
Test
Experience
Mid-senior
Country
United States
We're hiring on behalf of a Haystack partner!
The Role
Lead accuracy-centric architecture and optimization for LLMs, VLMs, and multimodal AI models.
Drive Day-0 hardware enablement on current and future AI platforms.
Design and implement quantization strategies that balance model quality, performance, and hardware constraints.
Analyze and resolve accuracy regressions and numerical stability issues across the inference stack.
Collaborate with performance engineers to co-optimize kernels and execution strategies.
Define and implement accuracy evaluation metrics and tooling for robust model deployment.
What You'll Need
Extensive experience with LLMs and/or VLMs in production or pre-production environments.
Expert-level understanding of quantization, numerics, and precision tradeoffs.
Deep knowledge of transformer architectures, attention mechanisms, and MoEs.
Proven ability to balance accuracy, performance, and hardware constraints.
Strong Python skills and experience across compiler, kernel, and hardware abstraction layers.
A Bachelor's degree with 4+ years of experience, a Master's with 3+ years, or a PhD with 2+ years, in a related field.
What's On Offer
Opportunity to work at the forefront of Cloud AI and foundation model inference.
A senior technical role with broad cross-functional impact.
Competitive annual discretionary bonus and opportunity for annual RSU grants.
Comprehensive benefits package designed to support work-life balance.
Apply via Haystack today!