NEC-LIST: 64 bit NEC

From: Paul Carlier <pcarlier_at_email.domain.hidden>
Date: Wed, 14 Mar 2007 16:37:26 -0400

In some posts I made in August 2006 about getting NEC to utilise more than
2GB RAM for large models, I promised to provide an update when I had the
64-bit software necessary to do this. Well, I have now made it work and I am
impressed with the speed, having just run a 20,000 segment model with NEC4
in 51 minutes.

I am using Windows XP x64 on a 3GHz P4 630 (EM64T) processor with 4GB RAM.
NEC4 was compiled with Intel Visual Fortran V9.1 running within MS Visual
Studio 2005 (all 64-bit). The Intel Math Kernel Library V9.0 was used for
the matrix factor and solve routines.

Some changes are needed to the NEC source code, as even 64-bit Windows
limits static variables to a maximum of 2GB. The array CM therefore has to
be removed from COMMON (which is static) and made ALLOCATABLE: it is taken
out of the 8 COMMON statements (and the matching COMPLEX declarations) in
which it appears, declared COMPLEX, ALLOCATABLE in a module, and then
ALLOCATEd in the main program. I also commented out the code for the BLAS
CSWAP and CSCAL subroutines, as these are contained within the MKL library
and this was the easiest way to avoid a double definition error.

At the moment, XP is reporting that I have only 3.25GB RAM available, so I
am trying to find out what it is doing with the other 0.75GB. (I have a
video card which can take up to 128MB from the system RAM, but I can't see
what is responsible for the rest.) This is the reason for the current
20,000 segment limit in single precision. The DP version also works well,
but obviously with a lower segment limit.
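
For anyone who wants to check the arithmetic behind these limits, here is a
rough sketch in Python. It assumes that memory use is dominated by a full
N x N complex interaction matrix (8 bytes per element in single precision,
16 in double), that no symmetry is used, and that the "GB" figures reported
by Windows are binary (2**30 bytes). The function names are made up for
illustration and are not part of NEC.

```python
import math

GIB = 2**30          # assumption: Windows reports binary gigabytes
SINGLE_COMPLEX = 8   # bytes per single-precision complex element
DOUBLE_COMPLEX = 16  # bytes per double-precision complex element


def matrix_bytes(n_segments, bytes_per_element):
    """Storage for a full N x N complex matrix (no symmetry assumed)."""
    return n_segments * n_segments * bytes_per_element


def max_segments(available_bytes, bytes_per_element):
    """Largest N whose N x N matrix fits in available_bytes."""
    return math.isqrt(int(available_bytes) // bytes_per_element)


# A 20,000 segment single-precision matrix is well over the 2GB static limit.
size = matrix_bytes(20_000, SINGLE_COMPLEX)
print(f"20,000 segments, single precision: {size / GIB:.2f} GiB")

# Upper bounds if all 3.25GB could be given to the matrix alone.
available = 3.25 * GIB
print("single precision limit:", max_segments(available, SINGLE_COMPLEX))
print("double precision limit:", max_segments(available, DOUBLE_COMPLEX))
```

These are upper bounds only, since the OS and the rest of the program also
need memory, so the practical limit sits somewhat below them.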

As a benchmark I ran an 1800 segment test file on a 1.2GHz P3 with 2GB RAM,
running Windows 2000 Professional, with NEC4S compiled with Compaq Visual
Fortran V6.1a with the Intel MKL v3.2.1 for the factor and solve routines.
This took 28 secs total run time.

Running the same executable on the 3GHz P4 64-bit PC, but prior to the
64-bit upgrade when it was running Windows 2000 Pro, took about 22 secs. So
the improvement was not pro-rata to the clock speed, presumably because the
64-bit processor and 32-bit OS handle 32-bit code inefficiently.

With the 64-bit IVF compiled executable running on Windows XPx64, the run
time came down to 6 secs.
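
To put the three timings side by side, the ratios can be worked out as in
the short Python sketch below. It uses only the figures quoted above (the
22 secs figure is approximate), and the helper name is just for
illustration.

```python
def speedup(t_before, t_after):
    """How many times faster the second run is than the first."""
    return t_before / t_after


clock_ratio = 3.0 / 1.2  # 3GHz P4 versus 1.2GHz P3

t_p3_32bit = 28.0  # secs, 1.2GHz P3, Windows 2000, 32-bit executable
t_p4_32bit = 22.0  # secs (approx.), 3GHz P4, Windows 2000, same executable
t_p4_64bit = 6.0   # secs, 3GHz P4, Windows XP x64, 64-bit executable

print(f"clock speed ratio:           {clock_ratio:.2f}")
print(f"P3 -> P4, same 32-bit code:  {speedup(t_p3_32bit, t_p4_32bit):.2f}x")
print(f"P4, 32-bit -> 64-bit build:  {speedup(t_p4_32bit, t_p4_64bit):.2f}x")
print(f"P3 32-bit -> P4 64-bit:      {speedup(t_p3_32bit, t_p4_64bit):.2f}x")
```

Note that the 64-bit figure also includes the change of compiler and MKL
version, so it is not a measure of 64-bit mode alone.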

If anyone would like further information, please let me know.

Paul Carlier
FanField Ltd
England

-- 
The NEC-List mailing list
NEC-List_at_robomod.net
http://www.robomod.net/mailman/listinfo/nec-list
Received on Wed Mar 14 2007 - 20:38:08 EDT