VASP performance issue
-
- Newbie
- Posts: 20
- Joined: Thu Jun 14, 2007 3:23 am
Hi,
We have installed VASP in the HPC cluster of our center with the following specifications :
Kernel : Linux 2.6.18-53.el5
Architecture : x86_64
Each node : two quad-core processors (8 cores per node)
(Hyperthreading disabled)
RAM used : 16 GB per node
Swap for each node : 8 GB
Interconnect : InfiniBand, 20 Gbps
MPI : Intel MPI, version 3.1, 64 bit
In the Makefile we used MPI_BLOCK = 8000, CACHE_SIZE = 4000
and the O3 level of optimization.
NSIM=4 was specified in INCAR.
We ran a job with 54 atoms. We submitted the first job with 40 processors, taking 4 processors from each node. It took about 4 hours and 33 minutes.
The very same job was then submitted on the remaining 4 processors of each of the same 5 nodes. This time the job took 12 hours and 20 minutes to complete, which is surprising.
We also ran a job with 128 atoms on 40 processors, once submitted through Sun Grid Engine (SGE) and once submitted directly with mpirun, without SGE. The job submitted without SGE was about 2.5 times faster than the job submitted through SGE.
Is there anything wrong with our installation? Can you please suggest whether we are missing anything?
Regards.
Prithwish
Last edited by prithwish on Sat Aug 22, 2009 10:31 am, edited 1 time in total.
-
- Hero Member
- Posts: 586
- Joined: Tue Nov 16, 2004 2:21 pm
- License Nr.: 5-67
- Location: Germany
Prithwish,
You did not specify the vendor and model of your CPUs. This can have quite some influence on the performance under full load (speed of data transfer from memory to CPU).
About the 54-atom job:
I'm not sure if I fully understand: You have two jobs, one with 40 cores and one with 20, the former taking 4.5h and the latter 12.3h to complete?! Were they started at the same time? What happened to the timings per step after the 4.5h job had finished?
Reliable benchmarking is hard. ;-)
No ideas about the big job, sorry.
cheers
alex
Last edited by alex on Sun Aug 30, 2009 10:44 am, edited 1 time in total.
-
- Newbie
- Posts: 20
- Joined: Thu Jun 14, 2007 3:23 am
Hi Alex,
I am sorry for the mistake. The first job was also submitted with 20 processors. Both jobs were started at the same time.
Prithwish
Last edited by prithwish on Mon Aug 31, 2009 8:39 am, edited 1 time in total.
-
- Hero Member
- Posts: 586
- Joined: Tue Nov 16, 2004 2:21 pm
- License Nr.: 5-67
- Location: Germany
Go back and try to reproduce. Check whether others are using the machine. Look at the LOOP timings. And try to answer all questions (for the CPU model, cat /proc/cpuinfo helps).
alex
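For anyone who wants to follow the suggestion about LOOP timings, here is a rough sketch of how the per-step timings could be pulled out of an OUTCAR. It assumes each electronic step is reported on a line of the form "LOOP:  <label> time  <number>: <label> time  <number>"; the labels of the two columns differ between VASP versions, so check in your own OUTCAR which column is which. The names loop_timings, slow_steps and SAMPLE are made up for this illustration, and the sample numbers are invented.

```python
import re

# Assumed line shape (labels vary by VASP version):
#   LOOP:  <label> time  <number>: <label> time  <number>
# "LOOP+:" lines (whole ionic steps) are deliberately not matched.
LOOP_RE = re.compile(r"^\s*LOOP:\s+.*?time\s+([\d.]+):\s+.*?time\s+([\d.]+)")


def loop_timings(outcar_text):
    """Return one (first_column, second_column) pair per LOOP line."""
    timings = []
    for line in outcar_text.splitlines():
        match = LOOP_RE.match(line)
        if match:
            timings.append((float(match.group(1)), float(match.group(2))))
    return timings


def slow_steps(timings, ratio):
    """Indices of steps whose second column exceeds `ratio` times the first."""
    return [i for i, (first, second) in enumerate(timings)
            if first > 0 and second / first > ratio]


# Invented sample in the assumed format, for illustration only.
SAMPLE = """\
      LOOP:  cpu time   10.50: real time   11.00
     LOOP+:  cpu time   50.00: real time   52.00
      LOOP:  cpu time    9.50: real time   30.00
"""

timings = loop_timings(SAMPLE)
print(timings)
print(slow_steps(timings, 2.0))
```

For a real run, replace SAMPLE with the contents of the file, e.g. open("OUTCAR").read(). Steps where the two columns diverge strongly are the ones worth a closer look, since the process was probably waiting on something rather than computing.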
Last edited by alex on Mon Aug 31, 2009 9:05 am, edited 1 time in total.
-
- Newbie
- Posts: 20
- Joined: Thu Jun 14, 2007 3:23 am
Hi,
No, others were not using the machine at that time.
The details are as follows:
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU X5460 @ 3.16GHz
stepping : 6
cpu MHz : 3166.851
cache size : 6144 KB
Regards.
Prithwish
Last edited by prithwish on Mon Aug 31, 2009 9:38 am, edited 1 time in total.
-
- Administrator
- Posts: 2921
- Joined: Tue Aug 03, 2004 8:18 am
- License Nr.: 458
Please note that a small part of VASP (ca. 3%) cannot be parallelized, so the performance will never scale linearly with the number of processors.
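To put a number on this: the usual way to estimate the limit is Amdahl's law, speedup = 1 / (s + (1 - s) / N), where s is the serial fraction and N the number of processes. The sketch below simply evaluates that formula with s = 0.03 taken from the figure above; the function name amdahl_speedup is made up for this illustration, and a real run will stay below this bound because communication costs are ignored.

```python
def amdahl_speedup(serial_fraction, n_procs):
    """Upper bound on speedup when `serial_fraction` of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


# Serial fraction of about 3%, as quoted in the post above.
for n in (1, 8, 20, 40):
    bound = amdahl_speedup(0.03, n)
    print(f"{n:3d} processes: speedup <= {bound:5.2f}, efficiency <= {bound / n:.0%}")
```

With a 3% serial part the bound is roughly 12.7 on 20 processes and roughly 18.4 on 40, so doubling the processor count from 20 to 40 cannot halve the run time even in the ideal case.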
Please also check how much of the CPU time your jobs spend swapping memory, whether the switches are comparably fast, ...
Last but not least: I hope you are not comparing wallclock times, are you?
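One possible way to check for swapping on a Linux node is to compare the pswpin and pswpout counters in /proc/vmstat (pages swapped in and out since boot) before and after a job. This is only a sketch under that assumption; the helper names swap_counters and swap_delta are made up here, and the two sample snapshots contain invented numbers.

```python
def swap_counters(vmstat_text):
    """Extract the pswpin/pswpout counters from the text of /proc/vmstat."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in/out between two snapshots."""
    return {key: after[key] - before[key] for key in before}


# Invented snapshots, standing in for /proc/vmstat read before and after a job.
BEFORE = "nr_dirty 12\npswpin 100\npswpout 250\n"
AFTER = "nr_dirty 40\npswpin 100\npswpout 9250\n"

print(swap_delta(swap_counters(BEFORE), swap_counters(AFTER)))
```

On a compute node the snapshots would come from open("/proc/vmstat").read(). Counters that grow noticeably while the job runs suggest that the job does not fit into the 16 GB of RAM on that node.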
Last edited by admin on Mon Oct 12, 2009 2:00 pm, edited 1 time in total.