Information & Communication Technology 🏢 Full Time ⭐️ Verified

Lead Software Engineer (Observability)

Great Eastern
Kuala Lumpur
Estimated Salary
MYR 12,000 – MYR 16,000
Live Update
6 May 2026
Deadline
6 May 2027

Job Description

About the Role

Great Eastern, a leading insurance group in Asia, is on a mission to transform our digital landscape. We are looking for a visionary Lead Software Engineer (Observability) to own our reliability strategy from our hub in Kuala Lumpur. In this high-impact role, you will lead the design, implementation, and operational support of our observability services and infrastructure. You will be the technical authority on monitoring, alerting, and incident response, driving a culture of operational excellence.

You will be responsible for building a world-class observability platform that provides deep insights into our microservices ecosystem. By championing OpenTelemetry, you will enable distributed tracing and unified telemetry. Your work will directly impact the stability and performance of the systems that serve millions of users.

If you are passionate about site reliability, automation, and building high-quality engineering cultures, this is the perfect next step in your career. You will mentor a talented team of engineers and collaborate closely with development, infrastructure, and security teams to ensure our digital services are always available, fast, and reliable.

Responsibilities

  • Lead the architecture and implementation of a unified observability platform integrating metrics, logs, and traces.
  • Provide hands-on operational support, tuning, and maintenance of observability infrastructure to ensure high uptime.
  • Establish and champion Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets.
  • Design and manage sophisticated monitoring, alerting, and dashboarding solutions using modern observability stacks (e.g., Prometheus, Grafana, Elasticsearch).
  • Champion the adoption of OpenTelemetry standards for distributed tracing across all engineering teams.
  • Lead incident management workflows, conduct blameless post-mortems, and drive sustainable reliability improvements.
  • Mentor and coach engineers on SRE principles and operational best practices.
  • Collaborate with software engineering, DevOps, and infrastructure teams to build resilient and scalable systems.

Qualifications

  • Bachelor's degree in Computer Science, Information Technology, or a related technical field.
  • 7+ years of experience in software engineering, site reliability engineering (SRE), or platform engineering.
  • At least 2 years of experience leading technical projects or mentoring engineering teams.
  • Deep hands-on expertise with observability tools such as Prometheus, Grafana, Loki, Tempo, OpenTelemetry, Datadog, or New Relic.
  • Strong proficiency in one or more programming/scripting languages (Go, Python, Java, or Shell).
  • Extensive experience with cloud platforms (AWS, Azure) and with containerization and orchestration (Docker, Kubernetes).
  • Solid understanding of networking concepts, microservices architecture, and distributed systems.
  • Excellent communication skills with the ability to influence technical strategy and advocate for reliability.

Required Skills

Observability, OpenTelemetry, Prometheus, Grafana, SRE, DevOps, Python, Go, Java, Kubernetes, Docker, AWS, Azure, Elasticsearch, Incident Management, Distributed Tracing, Microservices

Ready to Take On This Challenge?

Make sure your resume is ready. Submit your application now, before the deadline.
