Re: NEC-LIST: CEM question

From: Jos R Bergervoet <Jos.Bergervoet_at_email.domain.hidden>
Date: Thu, 17 Feb 2000 14:11:28 +0100

Juergen von Hagen wrote:

> many times commercial codes are not really better in the
> computational core, but rather in the GUI: that's what sells and
> that's what is seen.
>
> On a C240+, our in-house FDTD code runs a grid of
> 79 x 94 x 184 cells at about 1 s / iteration, or
> 731 ns / iteration / cell (pretty close to Jos' value
> for psufdtd, if that was also on a C240+).

My value of 250 ns/cell/iteration was for a PA8500/440, whatever that
may be. But I would actually think that it can be made substantially
faster by hand-optimizing the internal FDTD loops. Counting 36
floating point operations per cell for the two curl equations in
vacuum, 250 ns/cell/iteration amounts to 144 Mflops, which is only
8 percent of maximum machine speed. My findings (all for large jobs):

  NEC2D+Lapack:      1200 Mflops   71% of max. machine speed
  plain NEC2D:        400 Mflops   24% of max. machine speed
  psuFDTD (L&K):      144 Mflops    8% of max. machine speed
  commercial FDTD:     35 Mflops    2% of max. machine speed

The maximum speed of 1700 Mflops can really be obtained on these
machines by certain test programs. My NEC2D+Lapack uses the optimized
LAPACK/BLAS veclib from Convex corp.
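
As a rough check of the figures above, the conversion from time per cell
to Mflops and to a fraction of peak can be sketched as follows. The
function names are made up for illustration. The 36 flops per cell and
the 1700 Mflops peak are the values quoted in this message.

```python
PEAK_MFLOPS = 1700.0   # peak machine speed quoted in the message
FLOPS_PER_CELL = 36    # flop count quoted for the two curl equations


def mflops_from_ns_per_cell(ns_per_cell, flops_per_cell=FLOPS_PER_CELL):
    """Convert time per cell per iteration (in ns) into a Mflops rate."""
    # flops per ns is Gflops, so scale by 1000 to get Mflops
    return flops_per_cell / ns_per_cell * 1000.0


def percent_of_peak(mflops, peak=PEAK_MFLOPS):
    """Express a Mflops rate as a percentage of the peak rate."""
    return 100.0 * mflops / peak


rate = mflops_from_ns_per_cell(250.0)
print(f"{rate:.0f} Mflops, {percent_of_peak(rate):.0f}% of peak")
```

The same two functions reproduce the other percentages in the list when
given the Mflops values directly.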

Since there are no BLAS routines that can actually do the internal
FDTD loops, one would have to design them by hand. It would be
interesting to see how far the speed could be improved. Has anyone
heard of such an approach?
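
For readers unfamiliar with the loops in question, here is a minimal
sketch of one vacuum FDTD time step on a Yee grid. It is not taken from
psuFDTD or any other code mentioned here: all names (zeros, fdtd_step,
ce, ch) are hypothetical, the boundary handling is a simplification,
and Python is used only to show the structure, not for speed. One way
to arrive at 36 flops per cell is six operations for each of the six
field components, assuming separate coefficients per axis.

```python
FLOPS_PER_COMPONENT = 6                     # 2 differences, 2 multiplies, 1 subtract, 1 add
FLOPS_PER_CELL = 6 * FLOPS_PER_COMPONENT    # six field components give 36


def zeros(nx, ny, nz):
    """Return an nx x ny x nz nested list filled with 0.0."""
    return [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]


def fdtd_step(E, H, ce, ch):
    """Advance H, then E, by one time step in vacuum.

    E = (Ex, Ey, Ez) and H = (Hx, Hy, Hz) are nested lists of equal size.
    ce = (dt/(eps0*dx), dt/(eps0*dy), dt/(eps0*dz)); ch is the same with mu0.
    One layer of cells per axis is skipped to keep indices in range
    (a simplification, not a real boundary condition).
    Returns the number of floating point operations performed.
    """
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    cex, cey, cez = ce
    chx, chy, chz = ch
    nx, ny, nz = len(Ex), len(Ex[0]), len(Ex[0][0])

    # Faraday's law: H is updated from the curl of E (forward differences)
    for i in range(nx - 1):
        for j in range(ny - 1):
            for k in range(nz - 1):
                Hx[i][j][k] += (chz * (Ey[i][j][k + 1] - Ey[i][j][k])
                                - chy * (Ez[i][j + 1][k] - Ez[i][j][k]))
                Hy[i][j][k] += (chx * (Ez[i + 1][j][k] - Ez[i][j][k])
                                - chz * (Ex[i][j][k + 1] - Ex[i][j][k]))
                Hz[i][j][k] += (chy * (Ex[i][j + 1][k] - Ex[i][j][k])
                                - chx * (Ey[i + 1][j][k] - Ey[i][j][k]))

    # Ampere's law: E is updated from the curl of H (backward differences)
    for i in range(1, nx):
        for j in range(1, ny):
            for k in range(1, nz):
                Ex[i][j][k] += (cey * (Hz[i][j][k] - Hz[i][j - 1][k])
                                - cez * (Hy[i][j][k] - Hy[i][j][k - 1]))
                Ey[i][j][k] += (cez * (Hx[i][j][k] - Hx[i][j][k - 1])
                                - cex * (Hz[i][j][k] - Hz[i - 1][j][k]))
                Ez[i][j][k] += (cex * (Hy[i][j][k] - Hy[i - 1][j][k])
                                - cey * (Hx[i][j][k] - Hx[i][j - 1][k]))

    return FLOPS_PER_CELL * (nx - 1) * (ny - 1) * (nz - 1)


# Usage: a 5 x 5 x 5 grid with a single Ez sample set to 1
E = tuple(zeros(5, 5, 5) for _ in range(3))
H = tuple(zeros(5, 5, 5) for _ in range(3))
E[2][2][2][2] = 1.0
flops = fdtd_step(E, H, ce=(0.25, 0.25, 0.25), ch=(0.25, 0.25, 0.25))
print(flops, E[2][2][2][2])
```

Each update reads neighbouring samples of two other field components at
shifted indices, which is a stencil access pattern rather than a
matrix or vector operation of the kind BLAS provides.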

(Jos)

--
  Dr. Jozef R. Bergervoet                      Electromagnetism and EMC
  Philips Research Laboratories,             Eindhoven, The Netherlands
  Building WS01                                     FAX: +31-40-2742224
  E-mail: bergervo_at_natlab.research.philips.com    Phone: +31-40-2742403
Received on Thu Feb 17 2000 - 10:02:44 EST
