Software Developer
Oracle · San Francisco, US
Job description
In this role, you will lead the design and development team building advanced applications powered by AI models. You will use AI/ML to automate, optimize, and secure networks, focusing on self-provisioning, auto-ingesting, and auto-qualifying systems and on self-healing networks. The role requires skills in Python, ML frameworks, and AI model training, along with an understanding of networking protocols, data center design, infrastructure as a service, network monitoring, and network automation.
As a Principal AI Developer in the Networking Org, you will be responsible for building and optimizing large-scale AI systems, ensuring scalability, reliability, and performance. You should be able to work collaboratively with cross-functional teams to drive the development and deployment of AI solutions. Strong problem-solving skills, attention to detail, and excellent communication skills are essential for this role. If you have a passion for building cutting-edge AI applications and are looking for a challenging role, we encourage you to apply.
- Design and implement scalable orchestration for serving and training AI/ML models.
- Explore and incorporate contemporary research on AI, agents, and inference systems into the software stack for designing, monitoring, troubleshooting and deploying networks.
- Evaluate, integrate, and optimize technologies across the stack for latency, throughput, and resource utilization in training and inference workloads.
- Lead initiatives in AI systems design, including Retrieval-Augmented Generation (RAG) and LLM fine-tuning.
- Design and develop scalable services and tools to support GPU-accelerated AI pipelines, Python/Go, and observability frameworks.
Required/Preferred experience:
- Strong Python and ML frameworks (PyTorch, TensorFlow)
- LLMs, embeddings, vector search, RAG pipelines, and fine-tuning
- Data engineering: Spark, Kafka, Flink, OCI Streaming/Data Flow
- Distributed systems and large-scale training/inference
- Handling network telemetry (NetFlow, packet captures, streaming telemetry)
- Network automation frameworks (Terraform, Ansible, NAPALM; Batfish is a plus)
- Containerization, model serving, GPU workflows, CI/CD, and MLOps tools
- Writing design docs, scoping features, and owning delivery end-to-end
Required Education and Work Experience:
BSEE, BSCS, BSCE, or equivalent. MSEE, MSCS, or MSCE is a plus. At least 7 years of experience building software systems, plus prior experience building AI applications and training models.