Copyright © 2025 Unity Technologies

"Unity", Unity logos, and other Unity trademarks are trademarks or registered trademarks of Unity Technologies or its affiliates in the U.S. and elsewhere (more info here). Other names or brands are trademarks of their respective owners.

Requisition ID: JOBREQ-2616113

Staff Machine Learning Engineer, ML Infrastructure - Online

Shanghai, China, Full-time
ALERT: Unity has received reports of scams where individuals purporting to be Unity HR representatives conduct bogus employment interviews via email or text, and then request payment as a condition for receiving an offer of employment. Please be aware that Unity does not conduct interviews by email or text, and will never request payment as a condition for applying for a position or receiving an offer of employment. These scam operators may also ask for your personal information (name, address, birthdate, social security number, etc.), which you should not provide to them. If you have been a target of such a scam, you should report it by contacting the U.S. Federal Trade Commission (see this FTC posting for further details), the office of your state Attorney General, or the government agency responsible for investigating matters such as this where you reside.

The opportunity

Unity Vector builds ML infrastructure that powers real-time prediction, experimentation, attribution, and AI-driven decision-making across the company.

Our online ML systems serve production models at scale, supporting low-latency inference, large-scale experimentation, model deployment and optimization, feature processing, and business-critical decisioning. As model complexity, traffic volume, and experimentation velocity continue to grow, our inference platform must remain reliable, scalable, observable, and cost-efficient.

To support this growth, we need strong technical ownership to evolve the online ML infrastructure that enables ML teams to safely deploy, validate, and operate production models at scale.

The Role

We are seeking a senior/staff ML engineer to design and evolve Unity Vector’s online model inference platform. This role focuses on building reliable infrastructure for serving machine learning models in production, optimizing inference performance, and enabling safe, efficient experimentation across high-traffic online systems.

You will work closely with ML engineers, platform teams, and product stakeholders to ensure models can be deployed, scaled, monitored, and iterated on efficiently. You will play a key role in shaping how models are packaged, served, validated, monitored, and optimized in production environments.

This role requires strong systems thinking, deep experience with production ML infrastructure, and the ability to drive architectural improvements across teams.

What you'll be doing

  • Design and operate large-scale online inference infrastructure that serves production ML models with low latency and high reliability.
  • Build and improve model serving systems using technologies such as PyTorch, Triton Inference Server, Kubernetes, GKE, Ray, or similar distributed serving frameworks.
  • Optimize inference performance through batching, model compilation, GPU/CPU utilization improvements, request scheduling, and runtime-level tuning.
  • Develop infrastructure for model deployment, canary testing, A/B experimentation, traffic splitting, rollback, and production validation.
  • Improve observability of online ML systems through latency, throughput, error-rate, cost, saturation, and model-health monitoring.
  • Build self-healing and autoscaling capabilities to support dynamic experiment traffic, changing model complexity, and production reliability requirements.
  • Partner closely with ML engineers to support faster model iteration while maintaining production safety, scalability, and cost efficiency.
  • Improve the reliability and reproducibility of model serving workflows, including model packaging, artifact validation, compatibility testing, and deployment automation.
  • Lead architectural improvements that make the online ML platform more robust, user-friendly, scalable, and cost-efficient.

What we're looking for

  • Strong experience building and operating production-grade online ML inference systems.
  • Experience with model serving frameworks such as NVIDIA Triton Inference Server, TorchServe, Ray Serve, TensorFlow Serving, or similar systems.
  • Experience optimizing inference workloads using techniques such as dynamic batching, model compilation, quantization, GPU acceleration, GPU kernel optimization, caching, or runtime tuning.
  • Strong experience with distributed systems, Kubernetes, autoscaling, service reliability, and production observability.
  • Strong programming skills in Python, with practical experience working on production ML systems and high-scale services.
  • Experience with PyTorch and modern model deployment workflows, including model packaging, validation, and serving lifecycle management.
  • Experience designing infrastructure for safe model rollout, canary testing, A/B experimentation, and automated rollback.
  • Strong systems thinking, with the ability to reason about latency, throughput, reliability, scalability, and cost tradeoffs in online systems.
  • Proven ability to lead technical direction and influence architectural decisions across teams without formal authority.

Additional information

  • Relocation support is not available for this position
  • Work visa/immigration sponsorship is not available for this position

Benefits

At Unity, we want our team members to thrive. We offer a wide range of benefits designed to support well-being and work-life balance.

Please note: Benefits eligibility, specific offerings, and coverage vary based on the country and employment status.

While specific benefits vary, here are some of the ways we strive to take care of our eligible team members globally:

  • Comprehensive health, life, and disability insurance
  • Commute subsidy
  • Employee stock ownership
  • Competitive retirement/pension plans
  • Generous vacation and personal days
  • Support for new parents through leave and family-care programs
  • Office food and snacks
  • Mental Health and Wellbeing programs and support
  • Employee Resource Groups
  • Global Employee Assistance Program
  • Training and development programs
  • Volunteering and donation matching program

Life at Unity

Unity [NYSE: U] is the world’s leading game engine, powering play for more than 3 billion consumers each month. The top mobile games in the world, the most played PC indie titles, the most innovative console games, and virtually all of the top XR and Web Games are developed, deployed, and grown in Unity. Unity also enables teams across industries like automotive, manufacturing, and healthcare to design, simulate, and collaborate in 3D — closing the gap between ideas and reality. For more information, please visit www.unity.com.

Unity is an equal opportunity employer committed to fostering an inclusive, innovative environment with the best employees. Therefore, we provide employment opportunities without regard to age, race, color, ancestry, national origin, disability, gender, or any other protected status in accordance with applicable law. If you have a disability and there are preparations or accommodations we can make to help ensure you have a comfortable and positive interview experience, please fill out this form to let us know.

This position requires sufficient knowledge of English for professional verbal and written exchanges, because the duties involve frequent and regular communication with colleagues and partners located worldwide whose common language is English.

Headhunters and recruitment agencies may not submit resumes/CVs through this website or directly to managers. Unity does not accept unsolicited headhunter and agency resumes. Unity will not pay fees to any third-party agency or company that does not have a signed agreement with Unity.

Your privacy is important to us. Please take a moment to review our Prospect and Applicant Privacy Policies. Should you have any concerns about your privacy, please contact us at DPO@unity.com.



Location: Shanghai, China
Department: AI & Machine Learning
Type: Full-time
Requisition ID: JOBREQ-2616113