US citizens and Green Card holders only

Work on cutting-edge AI infrastructure, contributing to next-generation high-performance compute systems supporting enterprise-scale machine learning and scientific workloads.

Overview
Seeking a Runtime Engineer to build high-performance system software powering large-scale AI and ML workloads. This role focuses on developing low-level infrastructure that maximizes hardware efficiency and enables scalable, distributed compute across enterprise AI platforms.

Key Responsibilities

Core Engineering
Design and develop runtime stack features for high-performance ML training and inference
Build system-level software including drivers, kernel integrations, and OS interfaces
Develop high-performance user-space libraries for optimal hardware utilization

Distributed Systems & Performance
Enable scalable data processing across distributed environments
Optimize networking, communication, and workload orchestration across nodes
Improve system performance, reliability, and observability

Tooling & Ecosystem
Build user-facing tools for profiling, debugging, and system management
Support orchestration, monitoring, and error handling across compute systems
Collaborate cross-functionally with hardware, ML, compiler, and DevOps teams

Profile
3–5 years of experience in systems or infrastructure engineering
Strong programming skills in C/C++ and Python
Experience with operating systems, kernel development, and user-space libraries
Background in distributed systems with a focus on scalability and performance

Preferred Strengths
Familiarity with high-speed interconnects (e.g., PCIe, InfiniBand, RoCE)
Experience with low-latency networking (e.g., RDMA)
Strong debugging, optimization, and collaboration skills

Software Engineer - Runtime
