Senior Principal AI Infrastructure Architect
NTT DATA · Remote · Milan
Job description
Make an impact with NTT DATA
Join a company that is pushing the boundaries of what is possible. We are renowned for our technical excellence and leading innovations, and for making a difference to our clients and society. Our workplace embraces diversity and inclusion – it’s a place where you can grow, belong and thrive.
Your day at NTT DATA
The Senior Principal AI Infrastructure Architect is a highly skilled and advanced subject matter expert, responsible for leading the design of complex AI platform and managed-service solutions and driving the strategic vision and direction for the company's largest enterprise clients. The role sits at the centre of NTT DATA's AI Factories practice and is focused on the hardware foundations (GPU and accelerator compute, host CPU platforms, high-performance storage and AI fabric) that underpin enterprise-scale training, fine-tuning and inference workloads.
Key Responsibilities:
- Lead the end-to-end design of large, complex AI infrastructure solutions for enterprise, sovereign AI and AI Factory clients, covering accelerated compute (NVIDIA H100/H200/B200 and GB200 NVL72, AMD Instinct MI300X/MI325X, Intel Gaudi 3), CPU host platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), high-throughput storage tiers and lossless AI fabric.
- Architect reference designs built on NVIDIA DGX/HGX SuperPOD, Dell AI Factory with NVIDIA, Cisco Nexus HyperFabric AI, HPE / Lenovo / Supermicro accelerated compute and equivalent platforms, balancing single-node performance with cluster-scale efficiency.
- Size and validate GPU clusters against real workloads (foundation-model pre-training, distributed fine-tuning, RAG, real-time and batch inference) using the right combination of NVLink/NVSwitch domains, InfiniBand NDR/XDR or Ultra Ethernet / NVIDIA Spectrum-X fabrics, and tiered NVMe and parallel storage (VAST, WEKA, DDN, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale).
- Define the supporting datacenter design: high-density power (50–140 kW/rack), direct-to-chip and rear-door liquid cooling, structured cabling for AI fabrics and modular deployment models across on-prem, colo and sovereign-cloud footprints.
- Work closely with the sales team to drive the presales process for AI infrastructure pursuits: client discovery, technical workshops, proposal writing, executive presentations and bid defence.
- Translate clients' AI ambitions and business outcomes into a hardware and platform roadmap, positioning NTT DATA's end-to-end portfolio (silicon, systems, storage, fabric, MLOps stack and managed services) to land service-led AI solutions.
- Lead integration of compute, storage, networking, the AI software stack (CUDA, ROCm, Triton, NIM, NVIDIA AI Enterprise, Run:ai, Slurm, Kubernetes / Kubeflow) and managed-service operating models across multiple domains, delivery units and geographies.
- Build business cases, TCO and unit-economics models (cost per token, cost per training run, GPU-hour economics) and end-to-end transition roadmaps for cloud-to-private AI migrations and sovereign AI deployments.
- Define architectural principles for AI infrastructure (accelerator utilisation, data gravity, multi-tenancy, model lifecycle, energy efficiency) and apply them to influence architectural outcomes and governance.
- Develop As-Is, Vision, FMO and To-Be AI platform architectures, identify gaps and develop transition roadmaps.
- Synthesise current and future trends in AI silicon, memory hierarchies (HBM3e, CXL), interconnects and AI software stacks with client strategic imperatives to create compelling, evidence-based solutions.
- Contribute to NTT DATA's AI Factories knowledge base by sharing reference architectures, sizing tools and lessons learned with internal teams and clients.
Knowledge and Attributes:
- Deep, hands-on knowledge of AI hardware: GPU and accelerator portfolios (NVIDIA Hopper / Blackwell, AMD MI300/MI325, Intel Gaudi 3, emerging custom silicon), host CPU platforms (Intel Xeon, AMD EPYC, NVIDIA Grace), system topologies (HGX, DGX, MGX, OAM) and how each choice maps to specific AI workloads.
- Strong understanding of AI-class storage: parallel filesystems, all-flash NVMe platforms, S3-class object stores, checkpoint and dataset pipelines and the I/O patterns of large-scale training and inference (VAST, WEKA, DDN EXAScaler, Pure FlashBlade, NetApp ONTAP AI, Dell PowerScale).
- Solid command of AI networking: InfiniBand NDR/XDR, RoCEv2, NVIDIA Spectrum-X, Ultra Ethernet, NVLink/NVSwitch fabrics, congestion control and fabric design for rail-optimised and fat-tree topologies.
- Working knowledge of the AI software and orchestration stack: CUDA, cuDNN, NCCL, ROCm, Triton Inference Server, NIM, vLLM, TensorRT-LLM, Slurm, Kubernetes (with GPU Operator), Kubeflow, Run:ai, MLflow and NVIDIA AI Enterprise.
- Familiarity with datacenter facilities engineering for AI workloads: high-density power, liquid cooling (DLC, rear-door, immersion), PUE/WUE optimisation and the practical constraints of retrofitting existing colo space for accelerated compute.
- Excellent written and oral communication skills, with the ability to translate complex technical concepts for technical and non-technical executive audiences.
- Strong systems-thinking and strategic-thinking skills, with the ability to capture the key elements of a system in a simple abstraction that empowers good decisions.
- Strong business financial skills, with the demonstrable ability to perform a cost-benefit analysis, build CAPEX vs OPEX comparisons and manage budgets.
- Knowledge of cloud, hybrid and sovereign AI deployment patterns, plus architectural governance for Agile, DevSecOps and MLOps.
- Significant knowledge of core Managed Service portfolio artefacts, techniques, demos, tools and deliverables, applied to AI platform operations.
Academic Qualifications and Certifications:
- Bachelor's degree or equivalent in Information Technology, Engineering, Computer Science or a related field. Master's or PhD advantageous.
- Vendor and technology certifications in AI infrastructure highly desirable, for example NVIDIA-Certified Associate / Professional (AI Infrastructure, AI Operations), Dell Technologies AI Factory, Cisco / Nutanix / HPE accelerated compute, Red Hat OpenShift AI and Run:ai, plus relevant storage and networking certifications.
- Scaled Agile certification advantageous.
Required experience:
- Significant experience in a consulting, presales or architecture role within a large-scale (preferably multi-national) technology services environment, with a track record of leading AI infrastructure pursuits.
- Demonstrable experience designing and delivering production AI platforms, from single multi-GPU servers through to multi-rack training clusters and inference factories.
- Strong working knowledge of the AI hardware vendor landscape (NVIDIA, AMD, Intel, Dell, HPE, Lenovo, Supermicro, Cisco, Pure, VAST, WEKA, DDN, NetApp) and how to position partner ecosystems competitively.
- Proven ability to translate AI workload requirements (model size, parameter count, sequence length, throughput SLOs, latency targets) into accurate hardware bills of materials and sizing justifications.
- Significant client engagement and consulting experience, including client needs assessment, change management and the ability to identify whitespace for follow-on AI infrastructure and managed-services work.
- Significant business development and presales experience on infrastructure-led deals, ideally including sovereign AI, AI Factory or regulated-industry GenAI programmes.
- Strong understanding of how AI infrastructure integrates with business processes, applications, data platforms and existing enterprise architecture.
Workplace type:
Remote Working
About NTT DATA
NTT DATA is a $30+ billion business and technology services leader, serving 75% of the Fortune Global 100. We are committed to accelerating client success and positively impacting society through responsible innovation. We are one of the world’s leading AI and digital infrastructure providers, with unmatched capabilities in enterprise-scale AI, cloud, security, connectivity, data centers and application services. Our consulting and industry solutions help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have experts in more than 50 countries. We also offer clients access to a robust ecosystem of innovation centers as well as established and start-up partners. NTT DATA is part of NTT Group, which invests over $3 billion each year in R&D.
Equal Opportunity Employer
NTT DATA is proud to be an Equal Opportunity Employer with a global culture that embraces diversity. We are committed to providing an environment free of unfair discrimination and harassment. We do not discriminate based on age, race, colour, gender, sexual orientation, religion, nationality, disability, pregnancy, marital status, veteran status, or any other protected category. Join our growing global team and accelerate your career with us. Apply today.
Third parties fraudulently posing as NTT DATA recruiters
NTT DATA recruiters will never ask job seekers or candidates for payment or banking information during the recruitment process, for any reason. Please remain vigilant of third parties who may attempt to impersonate NTT DATA recruiters—whether in writing or by phone—in order to deceptively obtain personal data or money from you. All email communications from an NTT DATA recruiter will come from an @nttdata.com email address. If you suspect any fraudulent activity, please contact us.