
Senior Machine Learning Engineer, On-Device & Mobile AI Optimization

San Francisco, CA, USA, Full-time
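Warning: Unity has received reports of scams in which fraudsters impersonate Unity HR representatives, conduct fake job interviews by email or text message, and demand payment as a condition of receiving a job offer. Please note that Unity does not conduct interviews by email or text message and will never require payment as a condition of applying for a position or receiving an offer. These scammers may also ask for your personal information (name, address, date of birth, Social Security number, etc.), which you should not provide to them. If you are targeted by such a scam, you should report it by contacting the U.S. Federal Trade Commission (see the FTC notice for more details), your state attorney general's office, or the government agency responsible for investigating such matters.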

The opportunity

We are building the next generation of AI-driven game experiences, running generative models on-device, right where the players are: on phones, tablets, laptops, and desktops. Our games run inside a modern, browser-native runtime (built on technologies such as WebGPU and WebNN), so the models that power these experiences must be deployed and accelerated entirely within that runtime. As a Senior Machine Learning Engineer for On-Device & Mobile AI, you will take state-of-the-art multi-modal models (transformers, diffusion networks, and vision-language models, or VLMs) and make them run fast, small, and reliably on mobile and constrained hardware.

This is a deeply hands-on role. You will own the optimization and deployment of significant parts of the inference stack — from a trained checkpoint leaving research, through export, quantization, and kernel-level tuning, to a shipped feature running inside the engine at interactive frame rates within a fixed memory and power budget. Your work directly shapes the latency, quality, memory footprint, and battery profile of AI features experienced by billions of players.

This role is for an engineer who is energized by the gap between a research model and a shipping, on-device product. If you enjoy profilers, frame captures, op-fusion, and shaving milliseconds and megabytes, this is your role.

What you'll be doing

Inference & On-Device Optimization

  • Own the optimization pipeline for the models you ship: model export, graph transformation, operator fusion, memory-layout planning, and hardware-specific tuning across NPU, mobile GPU, and desktop/laptop GPU.
  • Apply quantization (INT4/INT8/FP16), weight sharing, structured/unstructured pruning, and knowledge distillation to hit hard latency, memory, and power budgets, and validate them against quality bars.
  • Do low-level performance work: write and tune WebGPU compute shaders (WGSL) and, where relevant, native kernels (Metal, Vulkan/SPIR-V compute, CUDA); profile with browser and platform tools (Chrome/Dawn GPU traces, PIX, Instruments/Metal System Trace, Snapdragon Profiler, Nsight, RenderDoc); and eliminate bottlenecks at the op and memory-bandwidth level.
  • Apply efficiency techniques (dynamic resolution, token reduction, cross-frame caching/reuse, reduced-step diffusion samplers) as engineering levers to meet budgets on target SKUs.

Runtime & Systems Integration

  • Work with WebGPU-targeted inference runtimes (ONNX Runtime Web, Transformers.js, WebLLM, TensorFlow.js) alongside native options (CoreML, ONNX Runtime, TFLite, ExecuTorch), and extend or build glue code where off-the-shelf options fall short of our diffusion and VLM workloads.
  • Build parts of the integration between the ML runtime and the game engine: real-time scheduling, memory pooling, zero-copy buffer sharing between the inference and render paths, and frame-budget management alongside the renderer.
  • Build supporting engineering for your components: model packaging and asset pipelines, on-device fallbacks and SKU-aware capability tiers, crash/quality telemetry, and automated on-device benchmarking in CI.

Research Productionization

  • Partner with research scientists to turn novel CV and multi-modal architectures into implementations that are deployable, debuggable, and fast on device.
  • Provide a feedback loop into research: surface hardware constraints, op-support gaps, and cost models early so model design and deployment converge.
  • Track breakthroughs in efficient inference (efficient attention, distillation, reduced-step diffusion) and assess them pragmatically: what actually moves latency/memory/power on our target devices.

Collaboration & Engineering Quality

  • Contribute to engineering best practices, code-review standards, performance-regression gates, and on-device benchmarking methodology.
  • Support a culture of measurement: track KPIs for latency, quality, memory, and power for the systems you work on, across the device matrix.
  • Partner with platform engineers, product managers, and runtime teams to align your work with device-SKU constraints and product roadmaps.
  • Share knowledge and mentor junior and mid-level engineers through code review, pairing, and design discussion.

What we're looking for

  • 5+ years in software/ML engineering, with meaningful time focused on on-device / edge inference or real-time, performance-critical systems.
  • Production deployment of transformer- and/or diffusion-based models (e.g., ViT, Stable Diffusion, CLIP/SigLIP-style encoders) on mobile, desktop, or embedded hardware — shipped, not just prototyped.
  • Hands-on experience with at least one major inference runtime (ONNX Runtime / ORT Web, CoreML, TFLite, ExecuTorch) and a working understanding of operator fusion, memory layout, and runtime scheduling.
  • Low-level performance engineering: solid command of at least one GPU/compute API (WebGPU/WGSL, Metal, Vulkan, D3D12, or CUDA) and the profiling tools to go with it. You can read a frame capture and a kernel trace and reason about where the time and memory go.
  • Working knowledge of model-optimization techniques — quantization (INT4/INT8/FP16), weight sharing, pruning, and distillation — and the judgment to apply them to hit latency and memory budgets. You use them effectively as engineering tools.
  • Understanding of target hardware: mobile SoCs (Apple Neural Engine, Qualcomm Hexagon/Adreno, ARM Mali) and/or desktop/laptop GPUs (Apple Silicon, NVIDIA, AMD, Intel).
  • Strong Python for export pipelines and training-side tooling; familiarity with the core languages of a browser-native runtime (TypeScript/JavaScript, WGSL) is a plus.
  • Working fluency with the models you deploy — enough to read an architecture, modify it for deployment, and reason about accuracy trade-offs.
  • A collaborative working style: clear communication, reliable delivery, and a willingness to support and learn from teammates.

You might also have

  • Experience shipping world-model, neural-rendering, or real-time generative pipelines (NeRF, 3DGS, real-time diffusion, or similar) on device.
  • Hands-on experience deploying models through WebGPU, e.g., ONNX Runtime Web (WebGPU EP), Transformers.js, WebLLM, or TensorFlow.js, including writing/tuning WGSL compute shaders.
  • Game-engine or real-time-graphics background (Unity, Unreal, or a custom engine; Metal/Vulkan/D3D/OpenGL ES render pipelines) — especially integrating compute workloads alongside a renderer.
  • Contributions to open-source ML inference frameworks, runtimes, or GPU/compute libraries, especially in the WebGPU ecosystem (Dawn, wgpu, ORT Web, Transformers.js, WebLLM).
  • Familiarity with compiler stacks (MLIR, TVM, IREE, XLA) for custom kernel generation and graph optimization.
  • Experience with on-device benchmarking infrastructure, performance-regression CI, and device-farm matrices.
  • Proficiency in C++/Objective-C/Swift for runtime integration.

Additional information

  • Relocation support is not available for this position
  • Work visa/immigration sponsorship is not available for this position

Benefits

At Unity, we want our team members to thrive. We offer a wide range of benefits designed to support well-being and work-life balance.

Please note: Benefits eligibility, specific offerings, and coverage vary based on the country and employment status.

While specific benefits vary, here are some of the ways we strive to take care of our eligible team members globally:

  • Comprehensive health, life, and disability insurance
  • Commute subsidy
  • Employee stock ownership
  • Competitive retirement/pension plans
  • Generous vacation and personal days
  • Support for new parents through leave and family-care programs
  • Office food and snacks
  • Mental health and wellbeing programs and support
  • Employee Resource Groups
  • Global Employee Assistance Program
  • Training and development programs
  • Volunteering and donation matching program

Life at Unity

Unity [NYSE: U] is the world’s leading game engine, powering play for more than 3 billion consumers each month. The top mobile games in the world, the most played PC indie titles, the most innovative console games, and virtually all of the top XR and Web Games are developed, deployed, and grown in Unity. Unity also enables teams across industries like automotive, manufacturing, and healthcare to design, simulate, and collaborate in 3D — closing the gap between ideas and reality. For more information, please visit www.unity.com.

Unity is a proud equal opportunity employer. We are committed to fostering an inclusive, innovative environment and celebrate our employees across age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, or any other protected status in accordance with applicable law. Our differences are strengths that enable us to support the growing and evolving needs of our customers, partners, and collaborators. If you have a disability that means there are preparations or accommodations we can make to help ensure you have a comfortable and positive interview experience, please fill out this form to let us know.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

This position requires the incumbent to have a sufficient knowledge of English to have professional verbal and written exchanges in this language since the performance of the duties related to this position requires frequent and regular communication with colleagues and partners located worldwide and whose common language is English.

Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. Unity does not accept unsolicited headhunter and agency resumes. Unity will not pay fees to any third-party agency or company that does not have a signed agreement with Unity.

Your privacy is important to us. Please take a moment to review our Prospect Privacy Policy and Applicant Privacy Policy. Should you have any concerns about your privacy, please contact us at DPO@unity.com.


Gross pay salary: $188,200 to $282,200 USD*

*Note: This range reflects the anticipated base salary for this position. Beyond base salary, this role may be eligible for equity awards and participation in our company incentive plans (such as annual discretionary bonuses or sales commissions). The final offer amount will depend on several factors, including geographic location and the candidate’s relevant experience, professional background, and skill set.

Location: San Francisco, CA, USA
Requisition number: AI & Machine Learning
Type: Full-time
Unity Careers: JOBREQ-2616041