Staff Machine Learning Engineer (REMOTE)

The Home Depot · Remote · Atlanta

Job description

Position Purpose:

The Staff Software Engineer is responsible for leading a team of engineers building and designing a product that our customers and associates love. As a Staff Software Engineer, you will be part of a dynamic team with engineers of all experience levels who help each other build and grow technical and leadership skills while creating, deploying, and supporting production applications. In this role, you will also provide technical leadership on machine learning systems, including model development, production deployment, monitoring, and lifecycle management of ML solutions operating at scale. In addition, Staff Software Engineers will assist in product and tool selection, configuration, security, resilience, performance tuning, and production monitoring.

Staff Software Engineers contribute to foundational code elements that can be reused as well as architectural diagrams and other product-related documentation. You will help define best practices for building reliable, explainable, and maintainable ML systems that integrate seamlessly with broader software platforms.

As a Staff Software Engineer, you will be a core player on the product team and are expected to build and grow the skillsets of the more junior Engineers.

Key Responsibilities:

45% Delivery and Execution - Collaborates and pairs with other product team members (UX, engineering, and product management) to create secure, reliable, scalable machine learning solutions; Works with Product Team to ensure user stories are developer-ready, easy to understand, and testable; Configures commercial off-the-shelf solutions to align with evolving business needs; Creates meaningful dashboards, logging, alerting, and responses to ensure that issues are captured and addressed proactively

15% Learning - Participates in learning activities around modern software design, machine learning, and development core practices (communities of practice); Proactively views articles, tutorials, and videos to learn about new technologies and best practices being used within other technology organizations; Attends conferences and learns how to apply innovations and technologies where appropriate

20% Strategy and Planning - Researches and analyzes business trends and behavioral data to identify opportunities for improvement and new initiatives; Leads the evaluation, development, and recommendation of specific technology products and platforms to provide cost-effective solutions that meet business and technology requirements; Researches and designs best-fit infrastructure, network, database, security, and machine learning architectures for products; Proactively creates and maintains tools for monitoring and support; Participates in project planning and management across multiple efforts; Develops formal training courses

20% Support and Enablement - Fields questions from other product teams or support teams; Monitors tools and participates in conversations to encourage collaboration across product teams; Provides application support for software running in production; Proactively monitors production Service Level Objectives for products; Proactively reviews the Performance and Capacity of all aspects of production: code, infrastructure, data, message processing, and prediction quality

Direct Manager/Direct Reports:

This Position typically reports to the Software Engineer Manager or Sr Software Engineer Manager

This Position has 0 Direct Reports

Travel Requirements:

Typically requires overnight travel 5% to 20% of the time.

Physical Requirements:

Most of the time is spent sitting in a comfortable position, and there is frequent opportunity to move about. On rare occasions, there may be a need to move or lift light articles.

Working Conditions:

Located in a comfortable indoor area. Any unpleasant conditions would be infrequent and not objectionable.

Minimum Qualifications:

Must be eighteen years of age or older.

Must be legally permitted to work in the United States.

Preferred Qualifications:

3 - 6 years of relevant work experience

Strong experience designing, training, evaluating, and deploying machine learning models in production environments, including batch and real-time inference systems

Experience with ML lifecycle management, including feature engineering, model versioning, experimentation, validation, and monitoring for data drift and model performance degradation

Experience building and operating ML pipelines using cloud-native services, data platforms, and CI/CD practices for reproducible and reliable model deployment

Strong understanding of applied statistics, model evaluation metrics, and tradeoffs between model accuracy, interpretability, latency, and operational cost

Experience with algorithms such as clustering, forecasting, anomaly detection, and neural networks

Experience with basic statistics and regression algorithms

Experience in advanced machine learning techniques such as NLP, convolutional neural networks, autoencoders, and embedding generation and utilization

Experience in training machine learning models with extremely large datasets

Experience with Data Analysis and Machine Learning Tools and Libraries like Jupyter Notebooks, Pandas, SciPy, Scikit-learn, Gensim, TensorFlow, PyTorch, etc. - and experience integrating them into scalable software systems

Experience in Google Cloud Platform and AI/ML-related components such as Vertex AI, BigQueryML, and AutoML

Experience in effective data engineering practices and big data platforms such as BigQuery, Data Store, etc.

Experience in a modern scripting language (preferably Python)

Experience in writing SQL queries against a relational database

Experience in version control systems (preferably Git)

Experience in a Linux or Unix-based environment

Experience in a CI/CD toolchain

Experience in REST and effective web service design

Experience in production systems design, including High Availability, Disaster Recovery, Performance, Efficiency, and Security

Experience in cloud computing platforms and associated automation patterns, and the machine learning services they provide

Experience in defensive coding practices and patterns for high availability

Experience in A/B testing and effective REST design for scalable web services architecture

Familiarity with advanced machine learning architectures and techniques such as GANs, GRUs, LSTMs, RNNs, CNNs, and style transfer

Minimum Education:

The knowledge, skills, and abilities are typically acquired through the completion of a high school diploma and/or GED.

Preferred Education:

No additional education

Minimum Years of Work Experience:

3

Preferred Years of Work Experience:

No additional years of experience

Minimum Leadership Experience:

None

Preferred Leadership Experience:

None

Certifications:

None

Competencies:

Global Perspective

Manages Ambiguity

Nimble Learning

Self-Development

Collaborates

Cultivates Innovation

Situational Adaptability

Communicates Effectively

Drives Results

Interpersonal Savvy

$120,000 – $190,000/yr