AI/ML Engineer
Job Description:
Purpose:
Serve as the technical backbone of the AI Hub. Design, build, productionise and maintain core AI/ML infrastructure and reusable components that will power the DT2.0 platform build, priority use cases, and the Group's intelligent transformation.
Key Responsibilities:
- Design and implement core AI infrastructure — model serving, MLOps pipelines, RAG architectures, vector databases, DT Data Lake integration, rules engine and microservices.
- Lead technical delivery of high-impact use cases — AI-Assisted Legacy Modernisation, Intelligent Operations, and AI-native components of DT2.0.
- Build and maintain the foundational AI Agent Layer and reusable AI services across onboarding, operations and sales enablement, including multi-agent orchestration.
- Apply AI/ML techniques to accelerate the DT2.0 platform build (automated code analysis, intelligent configuration generation, predictive monitoring).
- Implement production-grade evaluation, observability and monitoring — eval harnesses, regression testing, A/B testing of models, continuous quality measurement.
- Own LLM cost and latency optimisation — model selection, caching, prompt engineering, token economics, and routing logic.
- Establish AI security standards that meet fintech-grade requirements, covering prompt injection defence, output validation, secrets management and model access controls.
- Ensure all solutions are modular, scalable, cloud-native, secure, and fully compliant with the Responsible AI Policy, PCI-DSS and POPIA.
- Mentor squad members and contribute to AI standards and best practices across the organisation.
- Track and report technical KPIs and ROI of AI initiatives.
Requirements & Qualifications:
Essential
- BSc in Computer Science or equivalent technical degree (non-negotiable).
- 3–6 years of software engineering experience, including AI/ML in production.
- Strong C#/.NET proficiency (primary API stack), with a track record of shipping production APIs.
- Python proficiency for AI/ML workloads.
- Modern AI/ML frameworks: LangChain, LlamaIndex, Hugging Face, OpenAI/Anthropic APIs, scikit-learn, PyTorch/TensorFlow.
- RAG systems with vector DBs (Pinecone, Weaviate, pgvector, Qdrant).
- Agentic / multi-agent systems (LangGraph, CrewAI, AutoGen, custom orchestration).
- MLOps, cloud platforms (Azure preferred), Docker/Kubernetes and data pipelines.
- Azure DevOps (primary) for CI/CD.
- Passion for clean, production-grade code.
Bonus / Advantageous
- FastAPI for building AI-specific services.
- Eval / observability tooling — LangSmith, Langfuse, Arize, W&B.
- Fine-tuning approaches — PEFT, LoRA, QLoRA, and trade-off judgement.
- GitHub Actions or other modern CI/CD tooling.
- Modern data engineering — dbt, Spark/Databricks, Kafka.
- Fintech/payments or large-scale system modernisation background.
What Success Looks Like in 12 Months:
- Core AI infrastructure is live and reused across multiple DT2.0 squads.
- Two priority use cases (Legacy Modernisation Toolkit and Operations layer) are in production.
- Measurable acceleration of DT2.0 delivery velocity, visible to EXCO.
- AI services in production with defined SLAs, eval coverage, and cost/latency baselines.