Job Description
Are you ready to architect the future of Artificial Intelligence? At Zai Core Systems, we are building the scalable, secure, and efficient infrastructure required for the AI revolution of 2026 and beyond. We are seeking a visionary AI Infrastructure Architect to lead the design of our next-generation compute stack.
The ideal candidate is a technical leader who thrives in ambiguity and is passionate about optimizing performance at scale. You will own the end-to-end lifecycle of our data centers, cloud migrations, and AI model deployment pipelines, ensuring our platforms can meet the demands of the coming decade.
Responsibilities
- Design and implement high-availability, distributed systems architecture for next-generation Large Language Models (LLMs).
- Optimize inference latency and throughput for 2026-scale workloads using edge computing and specialized hardware.
- Drive cloud migration strategies and manage hybrid and multi-cloud environments (AWS, Azure, GCP).
- Implement robust security controls and maintain compliance with standards such as SOC 2 and HIPAA for sensitive AI data.
- Collaborate with data scientists to streamline the MLOps pipeline and automate model deployment workflows.
- Lead technical strategy for GPU clusters and high-performance computing (HPC) infrastructure.
Qualifications
- 10+ years of experience in DevOps, SRE, or Systems Engineering, with at least 3 years focused on AI/ML infrastructure.
- Deep expertise in container orchestration (Kubernetes), microservices, and serverless architectures.
- Proficiency in Python, Go, or Rust for infrastructure automation.
- Experience with GPU clusters, large-scale data processing (Spark, Ray), and distributed databases.
- Strong understanding of networking fundamentals, load balancing, and cloud networking (VPCs, VPC peering).