ML Research Scientist - Deep Learning & Transformer Architectures

Please direct all resume submissions to QuantTalentUS@mlp.com and reference REQ-29605 in the subject.

Overview
As part of a long-term research agenda within a newly formed systematic equities pod, we are building a proprietary Transformer-based model trained on tokenized intraday market data for next-token prediction of price movements.

We are seeking an exceptional ML research scientist with deep expertise in Transformer architectures and large-scale model training. You will design, implement, and train a custom decoder-only Transformer from scratch: the goal is not to fine-tune an existing LLM, but to build a purpose-built architecture for financial time-series.

This is a long-term research project with significant computational resources. The successful candidate will have a PhD in machine learning or a related field and demonstrated ability to implement Transformer architectures from first principles.

Principal Responsibilities

Design and implement a custom decoder-only Transformer architecture optimized for tokenized financial time-series data
Develop a novel tokenization scheme for intraday market data: price movements, volume, order flow, and cross-sectional features
Implement efficient training pipelines using PyTorch with mixed-precision training, gradient checkpointing, and multi-GPU parallelism
Design attention mechanisms adapted to financial data: temporal attention patterns, cross-asset attention, and multi-scale representations
Build evaluation frameworks for next-token prediction accuracy, signal quality, and trading performance
Implement inference optimization for low-latency production deployment: model quantization, KV-cache, speculative decoding
Conduct rigorous ablation studies to validate architecture choices and training methodology
Collaborate with the team to integrate model predictions into the live trading pipeline
Document research methodology, experimental results, and architectural decisions

Required Skills / Qualifications

PhD in Machine Learning, Computer Science, Statistics, Applied Mathematics, or a related field with a focus on deep learning
Demonstrated ability to implement Transformer architectures from scratch (not just fine-tuning pre-trained models)
Deep understanding of attention mechanisms, positional encodings, tokenization strategies, and training dynamics
Expert-level PyTorch skills including custom modules, training loops, mixed-precision, and multi-GPU training
Strong mathematical foundations: linear algebra, probability theory, optimization, information theory
Experience training models at scale (100M+ parameters)
Strong programming skills in Python and C++ for performance-critical components
Self-directed researcher capable of defining and executing a multi-month research agenda
Familiarity with AI-assisted development tools (Cursor, Claude Code)

Preferred Skills / Experience

Experience applying deep learning to financial data or time-series forecasting
Familiarity with tokenization approaches for continuous or non-text data
Published research in top ML venues (NeurIPS, ICML, ICLR) or equivalent industry experience
Knowledge of market microstructure and intraday trading dynamics
Experience with model compression, quantization, and inference optimization

Millennium offers a total compensation package which includes a base salary, discretionary performance bonus, and comprehensive benefits. The estimated base salary range for this position is $150,000 to $200,000, which is specific to New York and may change in the future. When finalizing an offer, we take into consideration an individual’s experience level and the qualifications they bring to the role to formulate a competitive total compensation package.
