Senior Data Engineer
Kohl's · Milwaukee, US
Job description
About the Role
As Senior Data Engineer, you will lead the development and ownership of domain data products, including batch, streaming and artificial intelligence/machine learning (AI/ML) feature pipelines. You will drive design decisions that improve data reliability, performance and governance maturity while standardizing patterns that scale across teams. You will partner cross-functionally to enable analytics, ML and GenAI use cases with trusted data.
What You’ll Do
- Design, build and maintain batch, streaming and real-time AI feature pipelines that extract data from diverse source systems and producers (Application Programming Interfaces (APIs), events, databases, files), ensuring efficient ingestion, transformation and publishing
- Design, refine and implement scalable data models, semantic layers and data contracts to promote consistency, reuse and accessibility
- Own the end-to-end data product lifecycle for the domain. Define and maintain data contracts, including service level agreements (SLAs), schema expectations, quality metrics and consumer ownership, to ensure a reliable and trustworthy experience
- Partner with cross-functional teams to co-design scalable data solutions that meet business needs and clearly define the boundaries between data pipeline responsibilities and model-building activities
- Develop automated workflows and Continuous Integration / Continuous Deployment (CI/CD) pipelines using tools such as Airflow, Apache Spark and Python to drive reliability and faster delivery
- Implement validation, observability and evaluation frameworks that ensure accuracy, lineage and timeliness across data pipelines and large language model (LLM) outputs
- Apply and enforce governance, privacy and compliance standards (GDPR, PCI DSS, CCPA), ensuring data security and traceability
- Partner with cross-functional teams to translate business needs into technical data solutions that scale across domains
- Drive performance tuning, automation and adoption of AI-powered data tools to enhance data platform efficiency
- Mentor data engineers and champion best practices for maintainable, governed and reusable data assets
- Own cost and performance tradeoffs for domain data products and monitor compute usage, storage growth and unit cost to implement optimizations that reduce spend while meeting SLAs
- Additional tasks may be assigned
What Skills You Have
Required
- 4+ years designing, building and optimizing data pipelines and models in production, ideally within large-scale cloud environments
- Proficiency in SQL and Python (or Scala) for data development, testing and automation
Preferred
- Bachelor’s or Master’s degree in Computer Science, Information Systems, Data Engineering or a related field
- Experience with Apache Spark (or equivalent) for large-scale data processing and performance optimization
- Experience using Airflow/Cloud Composer/Dagster for orchestration, transformation and CI/CD pipelines
- Experience with cloud warehouses/lakes (BigQuery, Redshift, Snowflake) and object storage
- Experience designing and optimizing streaming pipelines using Kafka, Pub/Sub or Spark
- Strong understanding of dimensional modeling, normalization and schema design for analytics and GenAI integration into data products
- Experience with data testing, lineage, monitoring and observability frameworks to ensure data integrity and reliability