Software Engineer - Runtime
Accepting applications
Oho Group · Mountain View, CA
Full-Time · Mid-senior · AI · C++ · PCIe · Python
Posted: 23 Apr
Category: Test
Experience: Mid-senior
Country: United States
US Citizens/Green Card only
Work on cutting-edge AI infrastructure, contributing to next-generation high-performance compute systems supporting enterprise-scale machine learning and scientific workloads.
Overview
Seeking a Runtime Engineer to build high-performance system software powering large-scale AI and ML workloads. This role focuses on developing low-level infrastructure that maximizes hardware efficiency and enables scalable, distributed compute across enterprise AI platforms.
Key Responsibilities
Core Engineering
Design and develop runtime stack features for high-performance ML training and inference
Build system-level software including drivers, kernel integrations, and OS interfaces
Develop high-performance user-space libraries for optimal hardware utilization
Distributed Systems & Performance
Enable scalable data processing across distributed environments
Optimize networking, communication, and workload orchestration across nodes
Improve system performance, reliability, and observability
Tooling & Ecosystem
Build user-facing tools for profiling, debugging, and system management
Support orchestration, monitoring, and error handling across compute systems
Collaborate cross-functionally with hardware, ML, compiler, and DevOps teams
Profile
3–5 years of experience in systems or infrastructure engineering
Strong programming skills in C/C++ and Python
Experience with operating systems, kernel development, and user-space libraries
Background in distributed systems with a focus on scalability and performance
Preferred Strengths
Familiarity with high-speed interconnects (e.g., PCIe, InfiniBand, RoCE)
Experience with low-latency networking (e.g., RDMA)
Strong debugging, optimization, and collaboration skills