Job Description
Join ByteDance’s Networking team, the critical backbone of our global ecosystem. As a Software Development Engineer specializing in Network Monitoring & Alerts, you will be at the forefront of building ultra-scale monitoring solutions that ensure the reliability and performance of one of the world's largest network infrastructures. Our team leverages innovative ideas in network architecture and Software Defined Networking (SDN) to support millions of concurrent users across platforms like TikTok.
In this role, you will design and implement high-performance systems to collect, process, and analyze massive volumes of network telemetry data in real time. You will work at the leading edge of network technology, transforming raw data into actionable insights and intelligent alerting systems. Your work will directly affect the stability and efficiency of our global data centers, ensuring a seamless experience for our users worldwide.
We are looking for passionate engineers who thrive in fast-paced environments and are excited by the challenge of managing complex, distributed systems. You will collaborate with world-class engineers to define monitoring strategies, optimize system performance, and drive the next generation of our global network infrastructure.
Responsibilities
- Design and develop large-scale, distributed network monitoring systems and automated alerting frameworks.
- Implement high-throughput data collection pipelines for network telemetry using gNMI, SNMP, and flow data.
- Build real-time analysis engines to detect network anomalies, traffic spikes, and performance bottlenecks.
- Optimize the performance and scalability of SDN controllers and monitoring agents across a global footprint.
- Collaborate with Network Architects and SREs to define critical SLIs/SLOs and automated incident remediation workflows.
- Drive software engineering best practices, including robust code reviews, CI/CD, and comprehensive automated testing.
- Research and integrate emerging technologies in AIOps and network visibility to enhance predictive maintenance.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field.
- Proficiency in at least one programming language, preferably Go, Python, or C++.
- Strong foundational knowledge of TCP/IP networking, routing protocols (e.g., BGP), and network virtualization.
- Hands-on experience with Software-Defined Networking (SDN) or programmable data planes.
- Experience building or maintaining large-scale distributed systems or time-series databases (e.g., Prometheus, ClickHouse).
- Familiarity with container orchestration using Kubernetes and cloud-native infrastructure.
- Strong analytical and problem-solving skills with a data-driven mindset.