VASP performance issue

prithwish
Newbie
Posts: 20
Joined: Thu Jun 14, 2007 3:23 am

#1 Post by prithwish » Sat Aug 22, 2009 10:31 am

Hi,
We have installed VASP on the HPC cluster at our center, which has the following specifications:

Kernel : Linux 2.6.18-53.el5
Architecture : x86_64
Each node : two quad-core processors (8 cores per node)
(Hyperthreading disabled)
RAM : 16 GB per node
Swap for each node : 8 GB
Interconnect : InfiniBand, 20 Gbps
MPI : Intel MPI, version 3.1, 64 bit

In the Makefile we used MPI_BLOCK = 8000 and CACHE_SIZE = 4000 with optimization level O3, and NSIM = 4 was specified in the INCAR.

We ran a job with 54 atoms. The first job was submitted with 40 processors, taking 4 processors from each node. It took about 4 hours and 33 minutes.
The very same job was then submitted on the remaining 4 processors of each of the same 5 nodes. This time the job completed in 12 hours and 20 minutes, which is surprising.

In addition, we submitted a job with 128 atoms on 40 processors, once through Sun Grid Engine (SGE) and once without SGE (i.e. directly with mpirun). The job submitted without SGE was about 2.5 times faster than the job submitted through SGE.

Is there anything wrong with our installation? Can you please suggest what we might be missing?

Regards.
Prithwish

alex
Hero Member
Posts: 586
Joined: Tue Nov 16, 2004 2:21 pm
License Nr.: 5-67
Location: Germany

#2 Post by alex » Sun Aug 30, 2009 10:44 am

Prithwish,

You did not specify the vendor and model of your CPUs. This can have a considerable influence on performance under full load (the speed of data transfer from memory to the CPU).

about the 54 atoms job:
I'm not sure if I fully understand: You have two jobs, one with 40 cores and one with 20, the former taking 4.5h the latter 12.3h to complete?! Have they been started at the same time? What happened to the timings per step after the 4.5h job was finished?

Reliable benchmarking is hard. ;-)

No ideas about the big job, sorry.

cheers

alex

prithwish
Newbie
Posts: 20
Joined: Thu Jun 14, 2007 3:23 am

#3 Post by prithwish » Mon Aug 31, 2009 8:39 am

Hi Alex,
I am sorry for the mistake. The first job was also submitted with 20 processors. Both jobs were started at the same time.
Prithwish

alex
Hero Member
Posts: 586
Joined: Tue Nov 16, 2004 2:21 pm
License Nr.: 5-67
Location: Germany

#4 Post by alex » Mon Aug 31, 2009 9:05 am

Go back and try to reproduce. Check whether others are using the machine. Look at the LOOP timings. And try to answer all questions (CPU model; cat /proc/cpuinfo helps).

alex
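
For the LOOP timings, a small script can pull them out of the OUTCAR and compare CPU time with elapsed time. This is only a sketch: it assumes the usual OUTCAR line format "LOOP:  cpu time ...: real time ...", the names LOOP_RE, loop_timings and summarize are made up here, and the numbers in SAMPLE are invented for illustration (they are not from the jobs in this thread).

```python
import re

# Assumed OUTCAR format for one electronic step:
#   "      LOOP:  cpu time   40.00: real time   41.00"
# "LOOP+:" lines (whole ionic step) are deliberately not matched.
LOOP_RE = re.compile(
    r"^\s*LOOP:\s+cpu time\s*([0-9.]+):\s*real time\s*([0-9.]+)",
    re.MULTILINE,
)


def loop_timings(outcar_text):
    """Return a list of (cpu_seconds, real_seconds), one per LOOP line."""
    return [(float(cpu), float(real)) for cpu, real in LOOP_RE.findall(outcar_text)]


def summarize(timings):
    """Total the timings; a cpu/real ratio well below 1 suggests waiting
    (other jobs on the node, swapping, slow communication)."""
    cpu = sum(c for c, _ in timings)
    real = sum(r for _, r in timings)
    ratio = cpu / real if real else float("nan")
    return {"steps": len(timings), "cpu": cpu, "real": real, "cpu_over_real": ratio}


# Invented sample lines, only to show the expected shape of the input.
SAMPLE = """\
      LOOP:  cpu time   40.00: real time   41.00
      LOOP:  cpu time   40.00: real time   79.00
     LOOP+:  cpu time   85.00: real time  125.00
"""

if __name__ == "__main__":
    print(summarize(loop_timings(SAMPLE)))
```

To use it on a real run, read the file first, e.g. loop_timings(open("OUTCAR").read()), and compare the per-step values of the fast and the slow job side by side.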

prithwish
Newbie
Posts: 20
Joined: Thu Jun 14, 2007 3:23 am

#5 Post by prithwish » Mon Aug 31, 2009 9:38 am

Hi,
No, others were not using the machine at that time.
The details are as follows :

vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU X5460 @ 3.16GHz
stepping : 6
cpu MHz : 3166.851
cache size : 6144 KB

Regards.
Prithwish

admin
Administrator
Posts: 2921
Joined: Tue Aug 03, 2004 8:18 am
License Nr.: 458

#6 Post by admin » Mon Oct 12, 2009 2:00 pm

Please note that a small part of VASP (ca. 3%) cannot be parallelized, so the performance will never scale linearly with the number of processors.
Please also check how much of the CPU time your jobs spend swapping memory, whether the switches are comparably fast, ...
Last but not least: I hope you are not comparing wallclock times, are you?
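
To put the roughly 3% non-parallelizable part into numbers, the standard Amdahl's law estimate can be used. This is a rough sketch, not a VASP benchmark: it takes the 3% figure above as the serial fraction, ignores communication and memory-bandwidth costs, and the function name amdahl_speedup is made up here.

```python
def amdahl_speedup(n_procs, serial_fraction=0.03):
    """Upper bound on speedup with n_procs processes when serial_fraction
    of the work cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)


if __name__ == "__main__":
    for n in (1, 8, 20, 40):
        print(f"{n:3d} processes: speedup at most {amdahl_speedup(n):.2f}")
```

With a 3% serial part the bound is about 12.7 for 20 processes and about 18.4 for 40, and it can never exceed 1/0.03, i.e. about 33, however many processes are added. Real runs will fall below this bound.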
