Facility Management

The Supercomputer Education and Research Center (SERC) provides and supports one of the largest computational facilities, serving not only the students and faculty of the Indian Institute of Science (IISc) but also researchers outside the Institute. As a computer center, SERC is considered to be one of the biggest in all of Southeast Asia. As a facility, SERC strives to be state-of-the-art, providing the latest available technology to the fraternity of the Institute and the country. A support team keeps the facility running without interruption, 24 hours a day, 7 days a week, 365 days a year.

The computational facility at SERC is conceived as a functionally distributed supercomputing environment that houses leading-edge computing systems with sophisticated software packages, connected by a high-speed network accessible from all departments of IISc. The systems at SERC can be functionally classified as the supercomputing facility, the advanced graphics facility, compute servers, access workstations and file servers. Each system is integrated into the existing facility so that users get single sign-on through NIS and NFS services, and batch scheduling software ensures fair workload management on the compute servers. In addition, each category of system is installed and configured with appropriate software tools, libraries and applications to assist users in their specific domains.

As part of the support team, I am in charge of a host of Unix workstations, servers and supercomputers, which form the backbone of the SERC computing facility. I am responsible for establishing and training a team to manage and administer these systems on a day-to-day basis. SERC currently hosts close to 300 systems, predominantly Unix based, from a variety of vendors such as IBM, Sun, SGI and HP. These systems are housed inside the SERC building and are available around the clock, 365 days a year, for local as well as remote access by all users at IISc. A team of system administrators manages the center and its systems and ensures that the systems are available to users in accordance with SERC usage policies. Every day, these administrators monitor the systems for hardware, software and service failures and proactively take corrective measures to keep the compute facility running without interruption. I started with a single project assistant and have to date trained more than 50 people in Unix system administration. The team is typically a floating population: fresh graduates, mostly engineers, are recruited and trained to support the facility, and each stays for about 8 to 18 months. This training is an ongoing activity.

I am also a member of various teams that establish SERC facility usage policies. These policies enable optimal use of the SERC computing facility. Since the facility is shared by more than 2000 users across IISc, adequate mechanisms are put in place to ensure maximum throughput and optimal resource usage on the systems. The policies are based on each system's capabilities, its targeted use and user demand for the resource. They are reviewed periodically and changed if necessary. Apart from this, I have mentored and guided the development and deployment of system and software usage tools at SERC. Usage reports are generated monthly and help us understand usage patterns, which in turn inform decisions on the upgrade and procurement requests raised by SERC users.

I play a vital role in the user-support team at SERC, which guides users in using the SERC facility. As part of this team, I constantly interact with users to help them understand the facility and its usage. This involves helping users analyze their problems in terms of resource requirements, identify the system or systems that can meet their application demands, and obtain the information they need to develop, debug, profile and, if need be, optimize their programs and run them in accordance with the usage policies. This is an ongoing activity that requires a lot of interaction with users and is aided by the SERC helpdesk facility. The helpdesk is monitored by a team of trainees who look into the complaints registered and alert the corresponding administrator for action. The helpdesk is available around the clock throughout the year, and the team aims to address and resolve user problems within 48 hours. SERC also allows external users from other academic, government and industrial organizations to use the computing facility for academic or research purposes. I am the interface between SERC and such organizations: I work to understand their requirements, help map those requirements to the existing facility, enable their access, and ensure that appropriate charging mechanisms are in place to keep track of such usage.

SERC also supports a variety of general-purpose software for the benefit of its users. I have been part of various technical committees for software procurement and deployment at SERC, covering diverse applications such as molecular modeling and analysis, engineering modeling and analysis, software modeling and development (CASE tools and IDEs), multilingual word processors, system software, security software and workload managers. In this capacity, I was involved in deployment feasibility assessment, procurement, installation, configuration and user hand-holding for software such as Fluent, Tecplot and GridPro (used for computational fluid dynamics applications); the Accelrys software suite, Tripos, BioSuite and GROMOS (used for life and materials sciences applications); PBS Pro for workload management on the high-performance computing systems; and Synopsys and Mentor Graphics for processor design evaluation and synthesis.