Job Description
Join ByteDance’s Infrastructure Engineering team, the backbone of our global platform’s rapid growth. We are seeking a highly motivated Production System Engineer to help design, build, and operate our hyperscale datacenter environments. In this role, you will be instrumental in ensuring the stability, performance, and scalability of systems that power millions of users worldwide.
You will work at the intersection of software engineering and systems operations, tackling complex challenges in distributed systems, network architecture, and resource management. If you are passionate about automation, site reliability, and solving large-scale engineering puzzles, we want to hear from you.
Responsibilities
- Design, deploy, and maintain large-scale infrastructure systems that support rapidly growing services.
- Automate operational tasks, configuration management, and deployment pipelines using modern CI/CD tools.
- Analyze system performance metrics to identify bottlenecks and implement optimization strategies.
- Collaborate with cross-functional teams to resolve complex production incidents and minimize downtime.
- Manage capacity planning for compute, storage, and networking resources in our global datacenters.
- Develop internal tools and scripts to enhance system reliability and observability.
- Participate in on-call rotations to ensure 24/7 service availability and stability.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
- At least 3 years of experience in SRE, DevOps, or Systems Engineering within a large-scale production environment.
- Proficiency in at least one scripting/programming language such as Python, Go, or Bash.
- Deep understanding of Linux system internals, network protocols (TCP/IP), and storage technologies.
- Experience with containers and container orchestration (Docker, Kubernetes) and with cloud infrastructure.
- Strong problem-solving skills and the ability to thrive in a fast-paced, high-pressure environment.
- Excellent communication and collaboration skills to work effectively across global time zones.