Description:
We are seeking an experienced and driven Technical Project Manager to lead complex engineering programs focused on improving system reliability and AI efficiency, including the implementation of AI co-pilot solutions. As a key member of our engineering team, you will work at the intersection of cutting-edge technology, reliability engineering, and AI development, ensuring seamless execution of projects that deliver scalable, high-performance solutions.
This role demands a solid technical background, exceptional project management skills, and the ability to collaborate across multidisciplinary teams to drive key initiatives to success. The ideal candidate will have experience in managing engineering programs within high-availability systems and AI-based solutions, with a strong focus on performance optimization, quality assurance, and risk mitigation.
Key Responsibilities:
- Program Management: Lead multiple engineering programs focused on system reliability, performance optimization, and AI efficiency, ensuring that projects are delivered within scope, on budget, and on schedule.
- AI Co-Pilot Integration: Oversee the implementation and enhancement of AI-driven co-pilot systems, ensuring they are efficient, reliable, and aligned with the company’s overall strategic objectives for automation and user experience.
- Cross-functional Collaboration: Work closely with engineering, data science, AI/ML, QA, and operations teams to define project goals, technical requirements, and timelines, fostering effective communication and collaboration.
- Risk Management: Identify and mitigate potential risks to program success, including reliability issues, technical debt, and resource constraints. Proactively resolve project bottlenecks and technical challenges.
- Stakeholder Communication: Serve as the primary point of contact for stakeholders, providing regular updates on project progress, risks, milestones, and key performance metrics. Translate technical complexities into clear and actionable communication for non-technical audiences.
- Reliability Engineering: Implement best practices for reliability engineering (e.g., monitoring, failure analysis, redundancy), and drive the continuous improvement of the system’s performance and uptime.
- Process Improvement: Establish and maintain project management processes, ensuring alignment with engineering and organizational standards. Advocate for continuous improvement practices within the teams you manage.
- Budget & Resource Management: Oversee program budgets, allocation of resources, and team workload management, ensuring the efficient utilization of both human and technical resources.
- AI Efficiency Metrics: Define and track success metrics for AI efficiency programs (e.g., co-pilot models), including system performance, user experience improvement, resource consumption, and computational efficiency.
Required Skills and Qualifications:
- Education: Bachelor’s degree in Computer Science, Engineering, or a related technical field. Advanced degrees or certifications (e.g., PMP, Scrum Master) are a plus.
- Experience: 5+ years of experience in technical project management, with a strong focus on engineering projects related to system reliability and AI optimization.
- Technical Expertise: Strong understanding of AI/ML technologies, reliability engineering principles, cloud architectures, and AI co-pilot systems. Familiarity with programming languages (e.g., Python, Java, C++) and AI frameworks is a plus.
- Project Management: Proven experience leading complex engineering projects with multiple stakeholders and cross-functional teams. Strong command of Agile, Scrum, or similar methodologies.
- Problem-Solving: Excellent analytical and troubleshooting skills with the ability to make data-driven decisions and solve complex technical problems related to performance and reliability.
- Leadership & Communication: Exceptional leadership skills with the ability to manage, motivate, and mentor diverse teams. Strong written and verbal communication skills to effectively report progress to senior management and stakeholders.
- AI & Reliability Focus: Practical experience in deploying solutions focused on AI model optimization, system reliability, scalability, and fault tolerance.