Mastercard is a leading global payments and technology company that connects consumers, businesses, merchants, issuers, and governments around the world. As Lead DevOps Engineer for the Foundry R&D team, you will drive platform infrastructure for MLOps and agentic AI systems. You will establish reusable patterns for CI/CD, scalable inference, orchestration, and observability while ensuring robust cost control and security.
Key Responsibilities
- Drive Platform Infrastructure: Own DevOps and infrastructure for MLOps and agentic AI systems. Design secure, scalable, and repeatable systems using Infrastructure as Code (IaC) to support R&D workloads.
- Build Secure CI/CD & Automation: Enable secure tool access and workload isolation for LLM-backed APIs and MCP servers. Partner with security and compliance teams on access control, infrastructure governance, and auditability.
- Ensure Reliability & Observability: Implement comprehensive monitoring, logging, and alerting systems. Optimize observability for ML-specific workloads to ensure operational insight and high performance.
- Technical Leadership: Provide hands-on leadership across DevOps initiatives. Review code, enforce best practices, improve tooling, and promote clean, well-tested infrastructure.
- Cross-Functional Collaboration: Partner with ML researchers, software engineers, and platform teams to design deployment strategies and meet agile milestones.
Requirements and Skills
- Education: Bachelor's degree in Computer Science, Engineering, or a related field.
- Experience: 8–12+ years in DevOps, SRE, or platform engineering, including experience in senior or lead roles.
- Cloud Expertise: Strong proficiency in cloud platforms (AWS, Azure, or GCP) and AI/ML components such as Databricks, Azure ML, and MLflow.
- Infrastructure as Code (IaC): Expertise in Terraform and orchestration tools like Terragrunt, along with GitOps practices.
- Containerization: Deep expertise in Kubernetes and Docker, specifically for optimizing ML development workflows and managing clusters at scale.
- AI/ML Platform Knowledge: Understanding of model registries, feature stores, AI agents, RAG techniques, and frameworks like LangChain or LlamaIndex.
- Programming: Advanced skills in Bash and Python. Familiarity with Go or other systems programming languages is considered a plus.
How to Apply
Interested and qualified candidates should apply by visiting https://www.myjobmag.co.ke/apply-now/1161830 or through the official Mastercard careers website.