Data Scientist

Neomax · Baltimore, US

Job description

We are seeking a Data Scientist who can work in a fast-paced, dynamic, agile software development environment.

You will collaborate with a team on multiple projects that include automating the processing of large forensic images, extracting and enriching metadata, and displaying the resulting information in meaningful ways so that analysts can conduct assessments.

Required Skills

Demonstrated experience building production data pipelines and ETL/ELT workflows at scale

Demonstrated experience with Apache Spark and PySpark for distributed data processing

Demonstrated advanced Python programming skills, including data manipulation libraries (Pandas, NumPy) and data engineering best practices

Demonstrated understanding of data security, privacy, governance, and compliance principles

Demonstrated experience with workflow orchestration tools such as Step Functions and Airflow

Demonstrated experience with containerization such as Docker or Podman, and deploying data applications in cloud environments

Demonstrated experience with AWS services (in particular S3, Lambda, and Step Functions)

Demonstrated experience with PostgreSQL and MySQL in production environments, including performance tuning and schema design

Demonstrated experience with SQL and query optimization for complex analytical workloads

Demonstrated experience with version control (Git) and CI/CD practices for data pipelines

Demonstrated experience working with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight

Demonstrated strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks

Desired Skills

Demonstrated experience with data lakehouse architectures using Apache Iceberg

Demonstrated experience configuring, deploying, and integrating data platform components: Apache Ranger (access control and data governance), Trino (distributed SQL query engine), data catalogs (Unity Catalog OSS, Apache Polaris, etc.), and Apache Superset (data visualization and dashboarding)

Demonstrated experience with Bash scripting for automation and data processing tasks

Demonstrated experience with Infrastructure as Code (Terraform or CloudFormation) for data infrastructure

Demonstrated experience with tracking data lineage and associated tooling such as OpenLineage

Demonstrated experience with Java

Demonstrated experience with data quality frameworks, testing methodologies, and validation strategies

Demonstrated experience or background with large-scale data migrations or platform modernization efforts

Demonstrated experience integrating AI/ML services and models (translation, OCR, speech-to-text, NLP, language detection, topic modeling), LLMs, and RAG (retrieval-augmented generation) pipelines

Demonstrated experience with geospatial data processing (H3, PostGIS, or similar)

Demonstrated experience contributing to data engineering documentation, best practices, or design patterns

Demonstrated experience with NoSQL databases (DynamoDB, etc.)

Demonstrated excellent written and verbal communication skills with both technical and non-technical audiences

Demonstrated experience with Linux operating systems

Demonstrated experience with Agile/Scrum development methodologies in a fast-paced, collaborative team environment

Demonstrated experience working effectively in high-performing, cross-functional teams with multiple concurrent projects

Demonstrated experience working directly with stakeholders to gather requirements, understand needs, and translate them into technical solutions with minimal oversight

Demonstrated experience in self-directed work with a strong ownership mentality and commitment to code quality, testing, and documentation

Demonstrated experience context-switching between projects and systems as priorities demand

About NeoMax

NeoMax is a minority-owned small business that specializes in DevOps, Data Science, and Cybersecurity IT solutions. With over ten years of experience in research, engineering, and development, we take a “big picture” approach to meeting our business needs, prioritizing both technical and soft skills in the workplace for improved performance.

We create a thriving work environment that enables us to provide excellent support to our customers. We encourage and respect each other, just as we are encouraged and respected by our leadership. We are committed to contributing to the health of the organization by ensuring our words and actions align with our core values of Integrity, Diversity, Empowerment, Ambition and Service. We recognize that we represent the company in all aspects of our professional interactions, and we treat individuals external to our organization with the same respect that we treat each other.
