Platform Engineering builds the core platforms, tooling, and paved roads that Bloomberg engineers rely on to ship reliable, secure, and high-performing systems at scale.

The AI App Enablement & Observability team accelerates how AI products are built across Bloomberg Industry Group. Our mission is to make AI systems reliable, performant, cost-efficient, and continuously improving through platform tooling, deep observability, and automated feedback loops.

We build developer-facing platforms and workflows that enable teams to experiment with, deploy, and operate AI and agent-based systems with confidence. This includes LLM gateways, agent platforms, benchmarking systems, telemetry pipelines, and self-improving infrastructure that closes the loop between observability and action. We emphasize strong developer experience, intuitive APIs/SDKs, and end-to-end ownership.

What’s in it for you?

You will help define how Bloomberg Industry Group builds and operates AI systems at scale by working on platforms that:

Accelerate AI product development through reusable tooling and paved roads
Provide end-to-end observability across AI systems (models, agents, pipelines, applications)
Enable self-improving systems through telemetry-driven feedback loops
Optimize cost, performance, and reliability of AI workloads
Support both production AI systems and internal engineering agents

You’ll collaborate across AI product, infrastructure, and platform teams to deliver foundational systems.

We’ll trust you to:

Platform & Enablement

Build and evolve AI platform tooling (e.g., developer workflows, benchmarking systems)
Design developer-friendly APIs, SDKs, and interfaces
Contribute to systems across the Model Development Lifecycle (experimentation, deployment, evaluation)

Observability & Telemetry

Build and operate observability platforms and telemetry pipelines (logs, metrics, traces, events)
Provide visibility into latency, token usage, cost, quality, drift, and reliability
Define instrumentation standards, schemas, and conventions
Implement distributed tracing using modern approaches (e.g., OpenTelemetry)

AI System Insights & Debugging

Enable end-to-end debugging of AI and agent workflows (model calls, tool usage, retrieval, orchestration)
Build benchmarking, regression detection, and performance analysis capabilities
Support observability for both production systems and internal engineering agents

Closed-loop Optimization & Automation

Develop systems that turn telemetry into action (automated experimentation, regression detection, alerting)
Build feedback loops that continuously improve model quality and system behavior
Enable self-healing and self-optimizing workflows

Cost, Performance & Reliability

Build tooling for cost visibility, forecasting, and optimization
Define SLOs, alerting, and performance tuning practices
Improve reliability and scalability of AI infrastructure

Ownership & Collaboration

Own projects end-to-end (RFCs, architecture, implementation, rollout, production support)
Partner with AI teams to drive adoption of platform tooling and standards
Produce high-quality documentation and improve developer experience

You’ll need to have:

Demonstrated experience building production software or platform systems
Strong engineering fundamentals with distributed systems or backend platforms
Experience or strong interest in observability and debugging complex systems
Experience or strong interest in AI/ML systems, LLMs, or agent-based architectures
Strong ownership mindset and ability to drive ambiguous problems to production
Hands-on experience with modern agentic coding tools and multi-model workflows
Working knowledge of agent architecture internals (context engineering, tool loops, sub-agent orchestration)

We’d love to see:

Experience with OpenTelemetry and modern observability ecosystems, including instrumentation, collectors, exporters, and tools like Prometheus, Grafana, and tracing/log systems
Experience designing and operating telemetry pipelines, including sampling, retention, cardinality, and cost tradeoffs, as well as integrating observability into CI/CD and developer workflows
Familiarity with AI/agent frameworks, including instrumentation of LLM calls, tool usage, workflows, and evaluation signals (quality metrics, benchmarking, regression detection)
Experience building cost monitoring, forecasting, and optimization systems for AI workloads
Familiarity with cloud and infrastructure tooling (e.g., AWS, Azure, Kubernetes, Terraform)
Experience with agentic infrastructure concepts such as MCP servers, hooks, skills, subagents, sandboxing, and persistent memory patterns
Active engagement with the agentic engineering frontier, including emerging patterns (e.g., harness vs. model, review debt, feedback loops)
Demonstrated agent-native development practices (iterating with agents using testing, verification, and feedback loops)
Strong security awareness for autonomous systems, including sandboxing, prompt injection risks, credential exposure, and guardrails

If indicated, please note that years of experience are a guide; we will consider applications from all candidates who can demonstrate the skills necessary for the role.

Discover what makes Bloomberg unique - watch our podcast series for an inside look at our culture, values, and the people behind our success.

Accommodations

Bloomberg provides reasonable adjustment/accommodation to individuals with disabilities. Please tell us if you require a reasonable adjustment/accommodation to apply for a job. Examples of reasonable adjustment/accommodation include but are not limited to making a change to the application process or work procedures, providing documents in an alternate format or using specialized equipment. To request an adjustment/accommodation to apply for a job, please email AMER_recruit@bloomberg.net (Americas), EMEA_recruit@bloomberg.net (Europe, the Middle East and Africa), or APAC_recruit@bloomberg.net (Asia-Pacific), based on the region you are submitting an application for. We may share your information with a third party provider of accommodations services who may use this information to reach out to you for the purposes of accommodating your application.

Equal Opportunity

Bloomberg is an equal opportunity employer and prohibits discrimination in employment. It is Bloomberg’s policy to provide equal opportunity and access for all persons, and the Company is committed to attracting, retaining, developing, and promoting the most qualified individuals without regard to age, ancestry, color, gender identity or expression, genetic predisposition or carrier status, marital status, national or ethnic origin, race, religion or belief, sex, sexual orientation, self-identified or perceived sex, sexual and other reproductive health decisions, parental or caring status, physical or mental disability, pregnancy, childbirth or related medical conditions, or parental leave, protected veteran status, status as a victim of domestic violence, or any other classification protected by applicable law (each, a “Protected Characteristic”). Bloomberg prohibits treating applicants or employees less favorably in connection with the terms and conditions of employment, in all phases of the employment process, because of one or more Protected Characteristics.


Senior Software Engineer - AI App Enablement & Observability
