Senior SW Engineer – AI Infrastructure & Optimization
NeuReality · Katowice, PL
Job description
We are looking for a Senior Software Engineer to help build and optimize large-scale, high-performance GenAI infrastructure and inference systems on Kubernetes.
As AI workloads increasingly move toward Kubernetes-native infrastructure, we are building systems that support distributed inference, performance optimization, reliability, observability, and production-grade deployment at scale.
This role is ideal for an engineer who can reason deeply about systems, performance, tradeoffs, and reliability, and who is comfortable owning difficult technical decisions end-to-end.
You will work across inference serving, distributed systems, optimization, and Kubernetes-native AI infrastructure.
What You’ll Do
- Build and optimize high-performance Kubernetes-native GenAI inference systems
- Work with modern inference stacks such as vLLM, SGLang, TensorRT-LLM, and related tooling
- Work with Kubernetes-native distributed LLM inference frameworks such as llm-d and NVIDIA Dynamo
- Design and implement optimization algorithms and performance improvements
- Improve reliability, observability, deployment, and operational maturity of AI systems
- Make architectural decisions and take ownership of technical outcomes
- Collaborate with a small, senior engineering team focused on performance and production quality
Required Qualifications
- Minimum 5 years of experience as a Software Engineer, with strong software engineering and system design skills
- Programming experience in Go and Python
- Hands-on experience with the Kubernetes ecosystem, including Operators, service meshes, GitOps, Gateway API, and OpenTelemetry
- Experience with cloud platforms
- Strong understanding of optimization algorithms and performance engineering
- Ability to independently drive technical initiatives from concept to production
- Strong systems thinking and debugging skills
- Comfort operating in environments with high autonomy and responsibility
Nice to Have
- Experience with modern LLM inference frameworks such as vLLM, SGLang, or TensorRT-LLM
- Experience with distributed LLM inference frameworks such as llm-d or NVIDIA Dynamo
- Contributions to open-source Kubernetes or ML infrastructure projects
- GPU performance optimization and profiling experience
- Familiarity with CUDA, NCCL, or Triton kernels
- Experience running GenAI systems at scale in production