Data Scientist
NeoMax · Baltimore, US
Job description
We are seeking a Data Scientist who can work in a fast-paced, dynamic, agile software development environment.
You will collaborate with a team on multiple projects that include automating the processing of large forensic images, extracting and enriching metadata, and displaying the resulting information in meaningful ways so that analysts can conduct assessments.
Required Skills
Demonstrated experience building production data pipelines and ETL/ELT workflows at scale
Demonstrated experience with Apache Spark and PySpark for distributed data processing
Demonstrated advanced Python programming skills, including data manipulation libraries (Pandas, NumPy) and data engineering best practices
Demonstrated understanding of data security, privacy, governance, and compliance principles
Demonstrated experience with workflow orchestration tools such as Step Functions and Airflow
Demonstrated experience with containerization such as Docker or Podman, and deploying data applications in cloud environments
Demonstrated experience with AWS services (in particular S3, Lambda, and Step Functions)
Demonstrated experience with PostgreSQL and MySQL in production environments, including performance tuning and schema design
Demonstrated experience with SQL and query optimization for complex analytical workloads
Demonstrated experience with version control (Git) and CI/CD practices for data pipelines
Demonstrated experience working with stakeholders to understand data requirements, assess feasibility, and design appropriate solutions with minimal oversight
Demonstrated strong problem-solving and debugging skills for data quality issues, pipeline failures, and performance bottlenecks
Desired Skills
Demonstrated experience with data lakehouse architectures using Apache Iceberg
Demonstrated experience configuring, deploying, and integrating data platform components: Apache Ranger (access control and data governance), Trino (distributed SQL query engine), data catalogs (Unity Catalog OSS, Apache Polaris, etc.), and Apache Superset (data visualization and dashboarding)
Demonstrated experience with Bash scripting for automation and data processing tasks
Demonstrated experience with Infrastructure as Code (Terraform or CloudFormation) for data infrastructure
Demonstrated experience with tracking data lineage and associated tooling such as OpenLineage
Demonstrated experience with Java
Demonstrated experience with data quality frameworks, testing methodologies, and validation strategies
Demonstrated experience or background with large-scale data migrations or platform modernization efforts
Demonstrated experience integrating AI/ML services and models (translation, OCR, speech-to-text, NLP, language detection, topic modeling), LLMs, and RAG (retrieval-augmented generation) pipelines
Demonstrated experience with geospatial data processing (H3, PostGIS, or similar)
Demonstrated experience contributing to data engineering documentation, best practices, or design patterns
Demonstrated experience with NoSQL databases (DynamoDB, etc.)
Demonstrated excellent written and verbal communication skills with both technical and non-technical audiences
Demonstrated experience with Linux operating systems
Demonstrated experience with Agile/Scrum development methodologies in a fast-paced, collaborative team environment
Demonstrated experience working effectively in high-performing, cross-functional teams with multiple concurrent projects
Demonstrated experience working directly with stakeholders to gather requirements, understand needs, and translate them into technical solutions with minimal oversight
Demonstrated experience in self-directed work with a strong ownership mentality and commitment to code quality, testing, and documentation
Demonstrated experience context-switching between projects and systems as priorities demand
About NeoMax
NeoMax is a minority-owned small business that specializes in DevOps, Data Science, and Cybersecurity IT solutions. With over ten years of experience in research, engineering, and development, we take a “big picture” approach to meeting our business needs, prioritizing both technical and soft skills in the workplace for improved performance.
We create a thriving work environment that enables us to provide excellent support to our customers. We encourage and respect each other, just as we are encouraged and respected by our leadership. We are committed to contributing to the health of the organization by ensuring our words and actions align with our core values of Integrity, Diversity, Empowerment, Ambition and Service. We recognize that we represent the company in all aspects of our professional interactions, and we treat individuals external to our organization with the same respect that we treat each other.