Cloud / DevOps Engineer, AI Compute Infrastructure

Richtech Creative Displays · Las Vegas, US

Job description

Position Summary
Richtech Robotics is looking for a Cloud / DevOps Engineer to support our AI compute infrastructure services. This role will help deploy, manage, and support cloud-based GPU environments for customers building AI models, robotics applications, simulation workflows, and Physical AI systems. The ideal candidate has strong Linux, networking, cloud infrastructure, and DevOps experience, with a willingness to learn GPU computing, CUDA environments, and AI workload deployment.

Responsibilities

  • Deploy and manage cloud-based GPU compute environments for customer workloads.
  • Configure virtual networks, VPNs, firewalls, security groups, SSH access, storage, and user permissions.
  • Build and maintain Linux-based environments for AI development, including Docker containers, CUDA drivers, Python environments, and Jupyter notebooks.
  • Work with AI engineers to deploy required runtime environments for model training, fine-tuning, simulation, and inference.
  • Monitor GPU usage, system performance, uptime, storage, and network connectivity.
  • Troubleshoot customer issues related to access, environment setup, networking, storage, and compute availability.
  • Create reusable deployment scripts, images, templates, and technical documentation.
  • Coordinate with cloud infrastructure partners and internal teams to resolve technical issues.

Required Qualifications

  • 2+ years of experience in cloud infrastructure, DevOps, systems administration, or network engineering.
  • Strong Linux administration skills.
  • Solid understanding of networking, including TCP/IP, VPN, DNS, firewalls, routing, security groups, and private networks.
  • Experience with Docker and containerized environments.
  • Experience with at least one major cloud platform or private cloud environment.
  • Familiarity with monitoring, logging, automation, and scripting.
  • Ability to troubleshoot infrastructure issues independently.
  • Strong communication skills and willingness to support customer-facing technical requests.
  • Interest in learning GPU computing, CUDA environments, and AI infrastructure.

Preferred Qualifications

  • Experience deploying NVIDIA GPU drivers, CUDA, cuDNN, or NVIDIA Container Toolkit.
  • Familiarity with PyTorch, TensorFlow, Hugging Face, Jupyter, or vLLM.
  • Experience with Slurm or distributed compute environments.
  • Experience with Prometheus, Grafana, ELK, or similar monitoring tools.
  • Prior experience supporting AI/ML, data science, robotics, or simulation workloads.

Pay: $80,000.00 - $120,000.00 per year

Benefits:

  • Dental insurance
  • Health insurance
  • Paid time off
  • Vision insurance

Work Location: In person
