AI Systems Engineer

Design and optimize high-performance algorithm engines for in-house large model development and real-world AI applications.

About the Role

  • Build the systems core that powers 39 AI's in-house large model development, from training to inference deployment.
  • Partner with research and product teams to turn model capabilities into reliable, high-performance production systems.
  • Shape long-term technical direction across inference engines, distributed training, and model serving.

Responsibilities

  • Design and develop high-performance algorithm engines for in-house large model development and applications, providing foundational primitives for training and inference.
  • Optimize and debug large model training and inference systems, working extensively with and contributing to engines such as SGLang, vLLM, and Megatron-LM.
  • Run performance evaluation and analysis for algorithm deployment, define system-level technical plans and performance standards, and continuously improve stability and efficiency.
  • Participate in the architecture design and implementation of large-model systems, driving efficient deployment and iteration in real-world scenarios.

Requirements

  • Solid C/C++/Python programming skills, familiarity with common data structures and algorithms, and strong motivation to solve complex systems problems.
  • Strong understanding of computer systems, with experience developing and architecting large software systems or low-level engines.
  • Hands-on experience with at least one large model inference or training engine, such as SGLang, vLLM, or Megatron-LM, including practical debugging and optimization.
  • Strong systems debugging skills; able to quickly root-cause and resolve performance and stability issues in distributed environments.
  • Strong communication, collaboration, and ownership, with the ability to drive technical plans into production and iterate continuously.

Nice to Have

  • Experience developing large model training or inference systems, and familiarity with the related architectures and scheduling strategies.
  • Familiarity with the internals of deep learning frameworks such as PyTorch, or with compiler technologies such as Triton and TVM.
  • Experience with distributed systems, high-performance networking, or storage optimization.
  • Awards in programming competitions such as ICPC, IOI, NOI, or equivalent contests, or contributions to relevant open-source projects.
  • Practical experience with dialogue systems, NLP application engines, or large model service deployment.

How to Apply

  • Send your resume along with GitHub, personal project, or technical writing links to the contact below.
  • For open-source contributions or past projects, direct links or short write-ups are welcome.
  • Take-home tasks, if any, will be paid at a reasonable market rate.
  • No minimum years of experience or degree required; we evaluate on technical depth and past work.