Job Description
Unlock the Future of High-Performance Computing at NSCC Singapore!
Are you a visionary HPC Storage Engineer passionate about pushing the boundaries of scientific discovery and technological innovation? The National Supercomputing Centre (NSCC) Singapore, an initiative by the Agency for Science, Technology and Research (A*STAR), invites you to join our elite team. We are seeking a highly skilled and motivated HPC Storage Engineer (System) to manage, optimize, and evolve our cutting-edge high-performance computing storage infrastructure.
In this critical role, you will be at the forefront of enabling groundbreaking research across various scientific and engineering domains. You will work with petabytes of data, ensuring the seamless operation, exceptional performance, and robust reliability of our enterprise-grade storage systems. Your expertise will directly contribute to accelerating computational science, artificial intelligence, and big data analytics projects that have a tangible impact on Singapore's economic growth and societal well-being.
If you thrive in a challenging yet rewarding environment, possess a deep understanding of parallel file systems, and are committed to maintaining a world-class HPC environment, we want to hear from you. This is a unique opportunity to shape the future of supercomputing infrastructure and collaborate with leading experts in the field. Join us and make your mark on the next generation of scientific breakthroughs!
Responsibilities
- Design, implement, and manage petabyte-scale high-performance storage solutions, including parallel file systems (e.g., Lustre, IBM Spectrum Scale (GPFS)).
- Monitor, analyze, and optimize HPC storage performance to meet stringent computational demands.
- Ensure the high availability, reliability, and data integrity of all storage systems through proactive maintenance and troubleshooting.
- Develop and implement robust backup, recovery, and disaster recovery strategies for critical HPC data.
- Collaborate with HPC system administrators, network engineers, and researchers to integrate storage solutions effectively into the overall HPC ecosystem.
- Evaluate and recommend new storage technologies and best practices to enhance infrastructure capabilities and efficiency.
- Provide expert-level support and incident resolution for complex storage-related issues.
- Document storage configurations, operational procedures, and best practices for system management and knowledge transfer.
Qualifications
- Bachelor's or Master's degree in Computer Science, Computer Engineering, Information Technology, or a related field.
- 3 to 5 years of experience managing and optimizing large-scale HPC storage environments.
- Proven expertise with parallel file systems such as Lustre, IBM Spectrum Scale (GPFS), or BeeGFS.
- Strong proficiency in Linux operating systems, including system administration, scripting (Bash, Python), and troubleshooting.
- Familiarity with storage hardware platforms (e.g., NetApp, Dell EMC, Pure Storage) and SAN/NAS technologies.
- Experience with network interconnects and protocols relevant to HPC storage (e.g., InfiniBand, high-speed Ethernet).
- Solid understanding of data protection strategies, including RAID, snapshots, and replication.
- Excellent problem-solving skills, with the ability to diagnose and resolve complex technical issues efficiently.