Principal Infrastructure Engineer

 

Description:

Our client is a scaling Cloud Service Provider specialising in delivering High-Performance Computing as a Service (HPCaaS) to enterprises globally. Their platform supports cutting-edge AI-native workloads and HPC environments, leveraging modern cloud-native technologies to drive innovation. This is an opportunity to work on infrastructure that powers the future of AI, ML, and advanced computational workloads.

 

The Role:

Infrastructure Design & Virtualisation

  • Architect and implement virtualisation solutions optimised for AI and HPC workloads, with a focus on hypervisor performance tuning.
  • Design dynamic, scalable infrastructure that meets evolving customer demands for storage and networking.

 

Bare-Metal and Operating System Management

  • Lead provisioning, orchestration, and optimisation of bare-metal systems across global deployments.
  • Ensure secure, high-performance configurations for Unix/Linux environments at scale.

 

Networking and High-Performance Storage

  • Design and deploy cloud-native, high-performance storage and networking solutions tailored to demanding workloads.
  • Leverage expertise in networking protocols (TCP, UDP, DNS, BGP) and software-defined networking (SDN) technologies.

 

Kubernetes and Cloud-Native Platforms

  • Manage Kubernetes clusters across hybrid and multi-cloud environments, including Container Network Interface (CNI) plugins and service meshes.
  • Develop CI/CD pipelines to automate infrastructure delivery and enhance operational reliability.

 

Observability and Automation

  • Build observability pipelines integrating logging, metrics, and distributed tracing tools.
  • Automate deployments and streamline operations with tools and languages such as Terraform, Ansible, Python, and Go.

 

Architecture & Solution Design

  • Evaluate emerging technologies for scalability, security, and performance within the client’s platform.
  • Create detailed technical and business-aligned architectural proposals.
  • Collaborate with cross-functional teams to ensure successful solution delivery.

 

Collaboration and Leadership

  • Foster a solution-driven mindset, championing innovative approaches to challenges.
  • Mentor team members in infrastructure best practices and emerging technologies.
  • Align infrastructure projects with broader organisational goals in partnership with engineering leaders.

 

Skills and Experience

  • Expertise in AI/ML workloads, GPU-accelerated systems, or HPC infrastructure.
  • Proven experience leading infrastructure architecture initiatives in agile environments.
  • Advanced proficiency in Kubernetes, container networking (CNI), and service mesh technologies.
  • Strong background in virtualisation technologies and hypervisor optimisation.
  • Extensive experience in large-scale global deployments, especially in HPC or AI-native environments.
  • In-depth knowledge of Infrastructure as Code (IaC) tools such as Terraform and Ansible.
  • Proficiency in a programming language such as Go or Python for automation.
  • Familiarity with observability tools (e.g., Prometheus, Grafana) and distributed tracing systems.

 

Preferred Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • 8+ years of experience in global-scale infrastructure design and deployment.
  • Strong communication skills with the ability to convey complex concepts to diverse teams.
  • Commitment to creating clear documentation for infrastructure processes and designs.

Organization: asobbi
Industry: Engineering Jobs
Occupational Category: Principal Infrastructure Engineer
Job Location: Dubai, UAE
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Experienced Professional
Experience: 8 Years
Posted at: 2025-01-07 3:53 pm
Expires on: 2025-04-07