Staff SWE, Compiler Architect, System Performance Modeling
Google · Washington, US
Job description
In accordance with Washington state law, we are highlighting our comprehensive benefits package, which is available to all eligible US-based employees. Benefits for this role include:
- Health, dental, vision, life, disability insurance
- Retirement Benefits: 401(k) with company match
- Paid Time Off: 20 days of vacation per year, accruing at a rate of 6.15 hours per pay period for the first five years of employment
- Sick Time: 40 hours/year (increased to 69 hours/year for Seattle) including 5 discretionary sick days per instance
- Maternity Leave (Short-Term Disability + Baby Bonding): 28-30 weeks
- Baby Bonding Leave: 18 weeks
- Holidays: 13 paid days per year
Note: By applying to this position you will have an opportunity to share your preferred working location from the following: Sunnyvale, CA, USA; Kirkland, WA, USA; New York, NY, USA; Seattle, WA, USA.
Minimum qualifications:
- Bachelor's degree or equivalent practical experience.
- 8 years of experience programming in C++ or Python.
- 5 years of experience testing and launching software products.
- 5 years of experience with performance, large-scale systems data analysis, visualization tools, or debugging.
- 3 years of experience with software design and architecture.
Preferred qualifications:
- Experience with hardware/software co-design problems, especially performance analysis and bottleneck identification at the pre-silicon stage.
- Experience with ML system architectures, including knowledge of compilers, Intermediate Representations (IRs), and hardware accelerators.
- Experience enabling and optimizing large-scale ML models (e.g., LLMs, large embedding models).
- Ability to lead technical strategy for complex systems, influencing both simulation toolchains and hardware roadmaps.
- Proven expertise in constructing custom IR dialects and leveraging open-source compiler frameworks (MLIR, XLA) to solve system-level analysis problems and explore software-hardware mapping opportunities.
- Expertise in architecting high-confidence, high-velocity system performance modeling and correlation infrastructure.
About the job
Google Cloud’s mission is to make every business successful through AI by combining cutting-edge technology, infrastructure, and talent. AI/ML software engineers in Cloud bridge the gap between pioneering models and a massive product vehicle reaching billions. Our talent density and AI-powered tools drive rapid development, rooted in a culture of empowerment and a bias to action. In this role, you aren’t just building technology; you’re shaping the frontier of enterprise and driving the evolution of advanced models.
Our team is pioneering next-generation performance modeling and simulation technologies that drive multi-year system architecture roadmaps for cutting-edge machine learning accelerators. We are looking for a visionary technical lead to define and own the accuracy and fidelity of our critical co-design simulation platform. You will work on the most complex system-level performance challenges in close collaboration with hardware designers, ML researchers, and product architects, defining the next decade of AI systems at data center scale. If you are excited about building the most powerful ML systems through HW-SW co-design and optimization, please join us.
The AI and Infrastructure team is redefining what’s possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.
We're the driving force behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.
Individual pay is determined by factors including job-related skills, experience, and relevant education or training.
US: $207,000 - $301,000 (USD) + 20% bonus target + bonus + equity + benefits
Learn more about benefits at Google.
Responsibilities
- Establish and maintain high-confidence correlation infrastructure between simulated performance and physical hardware measurements (silicon).
- Architect and evolve the simulation layer to support deep exploration of complex, business-critical workloads (e.g., large language models, advanced kernels) and future system topologies.
- Identify and solve system-level hardware/software bottlenecks and optimization opportunities at the critical pre-silicon stage.
- Provide high-confidence lower-bound performance estimates for future ML systems and architectures.
Google is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. See also Google's EEO Policy and EEO is the Law. If you have a disability or special need that requires accommodation, please let us know by completing our Accommodations for Applicants form.