ANEESUR

ANEESUR CLUSTER DETAILS

Name of the HPC cluster: Aneesur
Fully Qualified Domain Name: aneesur.iitgn.ac.in
IP Address of HPC cluster: 10.0.138.36
Make: Lenovo
Usable Storage: ~52 TB
Total CPU: 520 cores, Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Total GPU: 1 x NVIDIA Tesla V100-PCIE-32GB in each GPU node
Total Compute nodes: 9
Total GPU nodes: 4
Job Scheduler: SLURM 18.08
User level Quota: 100 GB per user in the home directory
To check your Quota: lfs quota -hu <your username> /home/<Supervisor_grp>/<username>
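
For example, a hypothetical user alice in the group smith_grp (both names are placeholders) would check her quota with:

    $ lfs quota -hu alice /home/smith_grp/alice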

Usage Guidelines
  • Users must submit jobs only through the scheduler.
  • Users must not run any job on the master node.
  • Users are not allowed to run a job by logging in directly to any compute node.
  • Users should run their jobs and write output under /scratch/<username>/<foldername> only; jobs should NOT be run from or write output to /home/<username> (see the workflow sketch after this list).
  • Users must understand that the HPC is a central facility shared by all members of the institute. Users should therefore use an optimum number of processing cores, determined by testing for scale-up.
  • A sample script is provided in the home directory of each user.
  • The quota for each user is 100 GB in the home directory. There is no per-user limit for the scratch folder.
  • An automatic email will be sent to the HPC user community once 75% of the scratch space is used. Another automatic email will be sent once 85% of the scratch space is used. Deletion of files by the administrator will commence within 24 hours of the second email.
  • It is strongly recommended that users back up their files and folders periodically, as ISTF does not have a mechanism to back up users' data.
  • Files in the scratch directory are deleted automatically 21 days after their last timestamp/update.
  • Users are strictly NOT ALLOWED to run any jobs on the Master Nodes of the HPC cluster.
  • A priority-based queueing system is implemented so that all users get a fair share of available resources. Priority is decided by multiple factors, including job size, queue priority, past and present usage, and time spent in the queue.
  • Note that there is an incentive to optimize your usage: you will get higher priority.
  • It is strongly recommended that users request the scheduler to pick cores from the same node whenever possible. If the cores are not available on the same node, cores from other nodes can be requested.
  • There is no limit on the number of jobs per user. The maximum number of cores per user is set as 128. The maximum number of cores per job is currently set as 64.
  • For any issue or request pertaining to Aneesur, please send an email containing your working path, error logs, error screenshots, and submit script to helpdesk.istf@iitgn.ac.in only.
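
A minimal sketch of the recommended workflow under these guidelines (the username, folder, and file names are placeholders; the submit script referred to is the sample provided in your home directory):

    mkdir -p /scratch/<username>/<foldername>    # create a working directory on scratch
    cd /scratch/<username>/<foldername>
    cp ~/submit_script.sh .                      # copy the sample script (placeholder name) and your input files here
    sbatch submit_script.sh                      # submit only through the scheduler
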
Software
  • We use CentOS 7.5 (64-bit) as the operating system on the cluster.
  • Installed application and module details are given in the table below.
  • The specific module(s) must be loaded explicitly in the submit script for running the corresponding job(s). Example: module load <modulename>
  • Applications – the latest stable versions of the software installed on ANEESUR.
  • Name of Executables – first initialize the respective module:

    module load <modulename>

    Gromacs (CPU, GPU, or patched with PLUMED): gmx_mpi
    Lammps: lmp_mps
    Gaussian: g09
    NAMD (CPU, GPU, or patched with PLUMED): namd2
    QE (CPU or GPU): pw.x
    Uintah (CPU or GPU): sus
    Charmm: charmm
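
For example, assuming the standard Environment Modules commands are available in your login shell, a module from the table below can be listed, loaded, and verified as follows:

    module avail                                                            # list all installed modules
    module load apps/gromacs/parallel_studio_xe_2020.0.166/gromacs-2019.6   # load the Intel build of GROMACS from the table
    module list                                                             # confirm which modules are currently loaded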

Application Name – modulename

gromacs
    apps/gromacs/gcc-4.8.5/ompi-3.1.1rc1/gromacs-2019.6-gpu
    apps/gromacs/parallel_studio_xe_2020.0.166/gromacs-2019.6
    apps/gromacs/gcc-4.8.5/ompi-3.1.1rc1/gromacs-2019.6-gpu-plumed
    apps/gromacs/parallel_studio_xe_2020.0.166/gromacs-2019.6-plumed

quantum espresso
    apps/quantum_espresso/parallel_studio_xe_2020.0.166/6.5
    apps/quantum_espresso/pgi-19.10/6.5-gpu
    apps/quantum_espresso/parallel_studio_xe_2020.0.166/6.5-plumed
    apps/quantum_espresso/pgi-19.10/6.5-gpu-plumed

NAMD
    apps/NAMD/parallel_studio_xe_2020.0.166/2.13
    apps/NAMD/parallel_studio_xe_2020.0.166/2.13-gpu
    apps/NAMD/parallel_studio_xe_2020.0.166/2.13-gpu_plumed
    apps/NAMD/parallel_studio_xe_2020.0.166/2.13_plumed

lammps
    apps/lammps/parallel_studio_xe_2020.0.166/22_mar_2020
    apps/lammps/parallel_studio_xe_2020.0.166/22_mar_2020-gpu
    apps/lammps/parallel_studio_xe_2020.0.166/22_mar_2020-gpu_plumed
    apps/lammps/parallel_studio_xe_2020.0.166/22_mar_2020_plumed

Uintah
    apps/uintah/gcc-4.8.5/2.1.0
    apps/uintah/gcc-4.8.5/2.1.0-gpu

Plumed
    libs/plumed/gcc-4.8.5/parallel-2.5.4
    libs/plumed/parallel_studio_xe_2020.0.166/2.5.4
    libs/plumed/parallel_studio_xe_2020.0.166/parallel-2.5.4
    libs/plumed/pgi-19.10/2.5.4

Charmm
    apps/charmm/parallel_studio_xe_2020.0.166/c39b2/parallel
    apps/charmm/parallel_studio_xe_2020.0.166/c39b2/parallel_mkl
    apps/charmm/parallel_studio_xe_2020.0.166/c39b2/parallel_mkl_fftw
    apps/charmm/parallel_studio_xe_2020.0.166/c39b2/serial
    apps/charmm/parallel_studio_xe_2020.0.166/c39b2/serial_mkl
    apps/charmm/parallel_studio_xe_2020.0.166/c39b2/serial_mkl_fftw

Gaussian
    /etc/profile.d/gaussian.sh

Cassandra
    apps/cassandra/parallel_studio_xe_2020.0.166/v1.2

StarCCM+

Create a file named .flexlmrc under your home directory:
$ vim .flexlmrc

Put the following content in the file and save it:

CDLMD_LICENSE_FILE=2000@10.0.137.114:2000@172.22.0.1

Queuing Systems & Scheduler
Queuing Systems

When a job is submitted, it is placed in a queue. Different queues are available for different purposes, and the user must select the one from the list below that is appropriate for their computational needs. The queue is selected via the --partition option in the submit script; a minimal example follows the queue details.

Queue Details
  • Debug Queue: This queue is available to all the HPC users to quickly run a small test job to check whether it converges successfully or not.
Name of Queue = debug
No of nodes = 9
Max number of cores per job = 64
Walltime = 60 minutes
  • Main Queue: This queue is available to all the HPC users to run multicore parallel jobs.
Name of Queue = main
No of nodes = 9
Max number of cores per job = 64
Walltime = 48 hours
  • GPU Queue: This queue is available to all the HPC users, and it is encouraged that jobs utilizing GPU cards use this queue.
Name of Queue = gpu
No of nodes = 4
Max number of cores per job = 64
Walltime = 48 hours
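
To select one of these queues, name the corresponding partition in your submit script. A minimal sketch (the task count and walltime shown are placeholders and must respect the limits of the chosen queue):

    #SBATCH --partition=debug     # or: main, gpu
    #SBATCH --ntasks=16           # must not exceed the queue's per-job core limit (64)
    #SBATCH --time=01:00:00       # must not exceed the queue's walltime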

Node Configuration

Based on the queuing system given above, the node configurations can be summarized as follows:

Queue Name    Max Walltime    Max cores per job    Priority
debug         60 minutes      64                   1
main          48 hours        64                   2
gpu           48 hours        64                   2

Sample scripts for submitting jobs to the various queues:

Quantum Espresso

#!/bin/bash
#SBATCH --job-name=qespresso # Job name
#SBATCH --ntasks=16 # Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # Number of cores per MPI task
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --time=24:00:00 # walltime
#SBATCH --partition=<queue-name>
#SBATCH -v

module load apps/quantum_espresso/parallel_studio_xe_2020.0.166/6.5
cd /scratch/<username>/<foldername>
mpirun -np 16 pw.x -inp ./ausurf.in |& tee single_node-$(date +%s).log

Gromacs

#!/bin/bash
#SBATCH --job-name=gromacs # Job name
#SBATCH --ntasks=16 # Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # Number of cores per MPI task
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --time=24:00:00 # walltime
#SBATCH --partition=<queue-name>
#SBATCH -v

module load apps/gromacs/parallel_studio_xe_2020.0.166/gromacs-2020.1
cd /scratch/<username>/<foldername>
mpirun -np 16 gmx_mpi mdrun -deffnm pme_test -v -ntomp 2 -nsteps 1000 -noconfout -pin on -noddcheck |& tee two_node-$(date +%s).log

 

NAMD with GPU

#!/bin/bash
#SBATCH --job-name=namd # Job name
#SBATCH --ntasks=16 # Number of MPI tasks (i.e. processes)
#SBATCH --cpus-per-task=1 # Number of cores per MPI task
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --time=24:00:00 # walltime
#SBATCH --partition=<queue-name>
#SBATCH -v

module load apps/NAMD/parallel_studio_xe_2020.0.166/2.13-gpu
cd /scratch/<username>/<foldername>
for a in 1 2
do
mkdir ./$a
cd ./$a
cp ../npt1.namd .
mpirun -np 1 namd2 ++ppn 16 +ignoresharing npt1.namd > logfile1.log &
#rm npt.namd
cd "$OLDPWD"
done
wait

Lammps

#!/bin/bash
#SBATCH --job-name=<jobname>
#SBATCH --ntasks=16
#SBATCH --gres=gpu:1
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --partition=gpu_new
#SBATCH -v

cd ~/
MACHINEFILE=machinefile
scontrol show hostname $SLURM_JOB_NODELIST > $MACHINEFILE
mpirun -batch -np 32 -machinefile $MACHINEFILE -rsh /usr/bin/ssh ~//input-file

Useful Commands

• For submitting a job: sbatch submit_script.sh
• For checking queue status: squeue -l
• For checking node status: sinfo
• For cancelling a job: scancel <job-id>
• For checking whether the job is running on the GPU: nvidia-smi
• For checking the generation of output at runtime: tail -f output.log
• For copying a file/folder from the cluster to your machine with scp: scp -r files/folders username@your-machine-IP-Address:
• For copying a file/folder from your machine to the cluster with scp: scp -r files/folders username@10.0.138.36:
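
A typical submit-and-monitor cycle using these commands might look as follows (the script name and the job ID 12345 are placeholders):

    $ sbatch submit_script.sh        # submit the job script prepared under /scratch
    $ squeue -l                      # check the queue and note the JOBID
    $ tail -f myjob.12345.out        # watch the output file grow at runtime
    $ scancel 12345                  # cancel the job if needed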

 

Useful Link

• User guide to SLURM: https://slurm.schedmd.com/pdfs/summary.pdf

How-To's

How to Obtain an Account in Aneesur: Please send an email to helpdesk.istf@iitgn.ac.in with a copy to your supervisor. Also, please let us know the duration for which the account is required and the list of software you wish to run.

Name of the cluster: Aneesur
IP of the cluster: 10.0.138.36
Hostname of the cluster: aneesur.iitgn.ac.in

Login from Linux:
To log in from Linux, simply open a terminal (installed with the base OS of any Linux flavor) and connect over SSH, as shown below.
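
For example (replace <username> with your cluster username):

    $ ssh <username>@aneesur.iitgn.ac.in
    # or, equivalently, using the IP address
    $ ssh <username>@10.0.138.36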

Login from Windows:
If you use Windows, you can use PuTTY, which can be downloaded from the official PuTTY website.

When PuTTY warns about the server's host key on first connection, click "Yes" to continue.

External Usage

Computational Resources for External Usage @HPCLab in IITGN

Access to our High Performance Computing (HPC) Facility is granted to external users (Academic/Research organizations and Industry only) through a Committee.

The proposal from the user should include the following:

  • Technical details of the specific facility needed and its duration
  • A brief scientific description of the proposed work

Please send your detailed proposal to support.hpc@iitgn.ac.in

Based on the review outcome and the feasibility for our facility, we will allocate compute resources.

Obtaining the HPC Account

  • Once the proposal is reviewed, accepted, and approved by the committee, the user may download the HPC application form, fill it up, ink-sign it, scan it, and email it to us.
  • A unique group/user name will then be created for the external user and associated user(s), and the user credentials will be sent in a reply email.

Contact

  • Email: support.hpc@iitgn.ac.in
Funding

Funding Agencies
Publications

Coming Soon…!

FAQs

How do I request an HPC account?

Please send an email to helpdesk.istf@iitgn.ac.in with a copy to your supervisor. Also, please let us know the duration for which the account is required and the list of software you wish to run.

What is my quota and how do I check it?

The quota for each user is 100 GB in the home directory. There is no per-user limit for the scratch folder. To check your quota:
lfs quota -hu <your username> /home/<Supervisor_grp>/<username>

How many jobs can I run, and with how many cores?

There is no limit on the number of jobs per user. The maximum number of cores per user is set as 96. The maximum number of cores per job is currently set as 64.

How has the scheduling been implemented?

The SLURM scheduler will automatically find the required number of processing cores from nodes (even if a node is partially used). Please do not explicitly specify the node number/number of nodes in the script. A priority based queueing system is implemented so that all users get a fair share of available resources. The priority will be decided on multiple factors including job size, queue priority, past and present usage, time spent on queue etc.
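
To inspect how these factors are being weighted for your pending jobs, the standard SLURM sprio utility can be used, assuming the multifactor priority plugin is enabled on the cluster:

    $ sprio -l       # long listing of priority factors (age, fair-share, job size, partition) for pending jobs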

Where do I run my jobs?

Users should run their jobs and write output under /scratch/<username>/<foldername> only; jobs should NOT be run from or write output to /home/<username>.

User Data Backup

It is strongly recommended that users back up their files and folders periodically, as ISTF does not have a mechanism to back up users' data. An automatic email will be sent to the HPC user community once 75% of the scratch space is used, and another once 85% is used; deletion of files by the administrator will commence within 24 hours of the second email. Files in the scratch directory are deleted automatically 21 days after their last timestamp/update.
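
A minimal sketch of such a periodic backup, run from your own machine with scp (the username, folder, and destination path are placeholders):

    $ scp -r <username>@aneesur.iitgn.ac.in:/scratch/<username>/<foldername> /path/to/local/backup/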

Can I run jobs on the master node or on any other node interactively, without using a script and bypassing the scheduler?

Users are strictly NOT ALLOWED to run any jobs on the Master Nodes or any other node in an interactive manner. Users must run jobs only using the scripts through the job scheduler.

Whom should I contact for any issue?

For any issue or request pertaining to Aneesur, please send an email containing your working path, error logs, error screenshots, and submit script to helpdesk.istf@iitgn.ac.in only.