Job Description
Are you ready to redefine operational excellence in the fintech space? Fiuu is seeking a highly skilled Senior AIOps Engineer to join our innovative team in i-City, Shah Alam. In this pivotal role, you will bridge the gap between AI/ML development and IT operations, transforming how we monitor, manage, and optimize our large-scale fintech infrastructure.
As a Senior AIOps Engineer, you will lead the implementation of predictive intelligence, automated incident remediation, and big data analytics to ensure 99.999% system availability. We are looking for a visionary technologist who is passionate about reducing noise in monitoring systems and driving high-velocity automation through RPA and custom-built intelligence engines. If you thrive in a fast-paced environment and care deeply about data-driven operational efficiency, this is the perfect challenge for you.
Responsibilities
- Architect and implement advanced AIOps platforms to provide real-time observability and predictive fault detection across complex microservices architectures.
- Design and deploy automated incident remediation workflows using RPA tools and custom Python/Go scripts to minimize Mean Time to Resolution (MTTR).
- Analyze massive datasets from distributed systems to uncover patterns, anomalies, and performance bottlenecks.
- Collaborate with SRE and DevOps teams to integrate machine learning models into the CI/CD pipeline for automated canary deployments and performance regression testing.
- Lead the migration from reactive monitoring to proactive, self-healing infrastructure patterns.
- Develop and maintain internal dashboards and automated reporting for system health, resource utilization, and cost optimization.
- Mentor junior engineers on best practices in infrastructure automation, data modeling, and cloud-native observability.
Qualifications
- Bachelor’s degree in Computer Science, Data Engineering, or a related technical field.
- 5+ years of experience in SRE, DevOps, or Data Engineering roles, with at least 2 years specifically focused on AIOps or observability.
- Deep proficiency in Python, Go, or Java for automation and tool development.
- Hands-on experience with ML frameworks (e.g., TensorFlow, PyTorch, or Scikit-learn) for anomaly detection.
- Strong background in big data and observability ecosystems (e.g., ELK Stack, Splunk, Prometheus, Grafana, or Datadog).
- Expertise in cloud platforms (AWS, Azure, or GCP) and container orchestration tools like Kubernetes.
- Strong understanding of fintech security standards, operational compliance, and high-availability architecture.
- Excellent analytical, communication, and problem-solving skills, with a focus on delivering scalable, production-grade solutions.