Job Description
Join A*STAR's pioneering High-Performance Computing (HPC) team and shape the future of scientific discovery. As an HPC System Engineer, you'll architect, optimize, and secure cutting-edge supercomputing infrastructure that accelerates breakthrough research across computational biology, materials science, and artificial intelligence. You'll collaborate with world-class researchers to implement scalable solutions, ensure robust security protocols, and drive performance optimizations for mission-critical workloads. The role offers opportunities to work with emerging technologies such as GPU acceleration, containerization, and distributed computing frameworks while contributing to Singapore's national research initiatives.
Responsibilities
- Design, implement, and maintain scalable HPC infrastructure, including compute clusters, storage systems, and high-speed networks
- Optimize system performance through workload analysis, resource allocation, and software tuning
- Develop and enforce security frameworks for HPC environments, including access control and data protection
- Collaborate with researchers to deploy and support scientific applications and computational workflows
- Monitor system health, troubleshoot hardware/software issues, and implement preventive maintenance
- Document system architecture, configurations, and operational procedures
- Research and evaluate emerging HPC technologies for potential integration
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related technical field (Master's preferred)
- 3+ years of experience in HPC system administration, Linux environments, and cluster management
- Expertise in parallel computing frameworks (MPI, OpenMP) and job schedulers (Slurm, PBS)
- Strong knowledge of network architectures (InfiniBand, Ethernet) and storage solutions (Lustre, GPFS)
- Experience with security hardening and compliance for high-performance systems
- Proficiency in scripting languages (Python, Bash) and automation tools
- Excellent problem-solving skills and the ability to work collaboratively in cross-functional teams