Strategic Healthcare Programs (SHP) is a leading provider of analytics and performance management solutions for the post-acute healthcare market. We are an industry leader in helping Home Health, Hospice, and Skilled Nursing providers improve their financial and quality performance while complying with many regulatory requirements. Additionally, we connect the post-acute world to the broader provider markets to allow for optimal management across the continuum of care.

Role Overview

We're hiring a strong Python engineer to build and operate our production ML platform end-to-end. You'll productionalize data science work by building robust on-premises infrastructure, establishing software engineering best practices, and creating the tooling that enables our data scientists to ship faster. All infrastructure is self-hosted.

This is a remote or hybrid position within the United States. Employees living within 75 miles of the Santa Barbara office are required to work in-person in the office every Wednesday.

ML experience is welcome but not required. We care most about your software engineering foundation: production Python, OOP, testing, and async/parallel performance. Our existing ML engineers will get you up to speed on the ML side — frameworks, LLMs, vector stores, vLLM, and the rest.

Team: You'll join a small, tight-knit ML team where every engineer owns meaningful surface area and their code end-to-end. We value people who deeply understand the systems they build, not just that they run.

What You'll Do Day-to-Day

Production ML Systems (40%)

Build automated ML pipelines: data ingestion → training → evaluation → deployment → retraining
Deploy and serve models (batch + real-time) via FastAPI/Flask APIs with auto-scaling and rollback
Implement CI/CD for ML: model packaging, versioning, automated deployments
Optimize workflows using async, parallelism, Ray, and Dask

ML Platform & Tooling (35%)

Design reusable internal Python packages for preprocessing, training, inference, and evaluation
Refactor data science notebooks into maintainable OOP modules
Build workflow orchestration for training and inference pipelines
Create standardized templates for model development

Observability & Reliability (15%)

Monitor latency, drift, data quality, and model performance
Build alerting for degradation and anomalies (Prometheus, Grafana)
Create dashboards for production model health
Set up automated retraining triggers

Code Quality & Collaboration (10%)

Coach data scientists on production-grade Python: testing, OOP, async/parallel patterns
Establish and enforce software best practices across the ML codebase
Partner with data scientists to translate pain points into engineering solutions

Required Skills

Must Have:

5+ years of production Python engineering
Strong OOP fundamentals: classes, inheritance, composition, design patterns
Testing discipline: unit, integration, fixtures, mocking
Demonstrated async and parallel optimization (asyncio, multiprocessing, threading)
Building and operating production Python services (APIs, workers, background jobs)
Familiarity with FastAPI or Flask
Experience deploying to self-hosted/on-prem environments

Soft Skills:

Translate engineering needs into clean, maintainable code
Comfortable coaching peers on production engineering practices
Curious about ML and motivated to ramp into it

Nice-to-Have

Prior MLOps or ML platform experience
ML frameworks: scikit-learn, XGBoost, PyTorch
Observability stack: Prometheus, Grafana, structured logging/tracing
RAG pipelines: vector stores, semantic search
LLM serving: vLLM, Text Generation Inference
GenAI/agentic frameworks: LangChain, LlamaIndex, DSPy
Orchestration: Prefect, Kubeflow, Airflow, or similar
Kubernetes and containerization in on-prem environments
Experiment tracking: MLflow
LLM observability: Phoenix, Langfuse, OpenLIT
On-prem GPU infrastructure management

Pay

$140,000 - $175,000 annually, depending on experience.

Benefits

We value work/life balance. We offer comprehensive health benefits, a 401(k) plan with a company match, an employee stock purchase plan, vacation time, sick time, and paid holidays.

This position is not eligible for immigration sponsorship.

Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities
This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights (https://www.eeoc.gov/poster) notice from the Department of Labor.

MLOps Engineer