Re: NEC-LIST: PC NEC4.1 Benchmarks...

From: <LAITINEN_at_email.domain.hidden>
Date: Wed, 14 Oct 1998 01:47:10 -0700 (PDT)

In response to questions from Paul Elliot of AJK Technology and Keith
Lysiak of SWRI regarding my recent posting of the PC NEC4.1
Benchmarks:

1. Of the compiled NEC4.1 codes tested, the DEC Visual Fortran
    Compiler produced the fastest executing NEC4.1 code. This can be
    seen in Table-11 and Table-12. The DEC VFC NEC4.1 executable is a
    true 32-bit Windows application, so it will NOT run under DOS.

    The Lahey LF90 V3.5 produced NEC4.1 code can be executed as an
    application under real DOS, from an NT Command Prompt or from NT's
    "START/RUN" menu.

    The Lahey LF77 V5.10 produced NEC4.1 code could only be executed
    under real DOS (and presumably Windows-95). It could not be
    executed under Windows-NT at all. It appears that the version of
    the Phar Lap DOS Extender in that version of LF77 is incompatible
    with NT.

    Note that in Table-12 there is the annotation NT/CP to designate
    those tests that were started from the NT (not DOS!) Command
    Prompt, rather than from the Windows-NT "START/RUN." Everything
    annotated "NT" was started from the NT "START/RUN" menu item. I
    have seen significantly different execution times under NT for the
    two different methods of starting a Lahey LF90 compiled executable
    image. E.g., look at Table-12 and compare the Lahey LF90 entries
    for TEST299.NEC, TEST300.NEC, etc. on the GA-686KX motherboard at
    266-MHz. Although the matrix factor times are close, the matrix
    fill times differ considerably. In the TEST600.NEC test the
    fill time was 13.418 seconds for the START/RUN method and 19.059
    seconds for the Command Prompt method, about 42% longer.

    Note that Lahey LF90 V3.5 still uses the Phar Lap DOS extender,
    whereas the DEC DVF compiler produces true 32-bit code without a
    DOS extender. I believe (but have not verified) that Lahey also
    dropped the DOS extender in their LF90 Version-4 compiler, which
    I don't have. The DOS extender in LF90 V3.5 probably has
    something to do
    with the squirrelly execution times under NT.
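
    As a rough illustration, the two TEST600.NEC fill times quoted
    above can be compared with a few lines of Python. This is only a
    sketch: the function and variable names are made up for this
    example and are not part of NEC4.1 or of either compiler.

```python
def slowdown(baseline_s: float, other_s: float) -> float:
    """Return how many times longer other_s is than baseline_s."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return other_s / baseline_s


# TEST600.NEC matrix fill times quoted in this post (Lahey LF90 under NT).
FILL_START_RUN_S = 13.418       # started from the NT START/RUN menu
FILL_COMMAND_PROMPT_S = 19.059  # started from the NT Command Prompt

ratio = slowdown(FILL_START_RUN_S, FILL_COMMAND_PROMPT_S)
print(f"Command Prompt fill time is {ratio:.2f}x the START/RUN fill time")
```

    The same helper could be applied to any pair of fill or factor
    times taken from Table-12.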

2. I did not know anything about the NEC4.1 Numerical Green's
    Function problem under Windows-95/NT. Thanks for bringing this to
    my attention. I wonder if caching affects this?
    
3. Sorry for the lack of a summary write up of the test results.
    This is a background project and I wanted to get the data out
    sooner rather than later... Here are a few words:

    PC performance on 80x86 chips has certainly exploded over the last
    few years. I originally started using NEC2 and then NEC4.1 a few
    years ago as a PC reliability, compatibility and performance test.
    Also, radio engineering friends were frequently asking me what is
    the best PC and/or motherboard to buy for their radio and antenna
    engineering work.

    Several years ago there was a greater difference in reliability,
    compatibility and performance among commercially manufactured name
    brand PCs, manufactured clones and homebrewed clones using buyer
    selected motherboards. You will note from the evolution of my
    testing over time that I'm biased towards Gigabyte motherboards,
    though they have not always been trouble-free, particularly with
    the Triton-II chipset in the Pentiums and the Klamath (KX) chipset
    in the Pentium-IIs. So far I've not had any problems with the LX
    and BX chipset based Gigabyte Pentium-II motherboards.
 
    The problems with the Triton-II and KX chipset based motherboards
    usually involved memory data errors, particularly under
    Windows-NT, likely due to very tight (marginally reliable) memory
    timing. Over time BIOS upgrades (and a downgrade in the case of
    the KX chipset) and selected memory resolved these problems.

    Twelve years ago I bought a Convex C-1 vector mini-supercomputer
    for use in a physics professor's research group. That C-1, if my
    feeble memory is any good, had 40-MFLOPS single-precision and
    20-MFLOPS double-precision theoretical performance. It came
    with a vectorizing Fortran compiler, 16-MB (2-MW of 64-bit) ECC
    RAM, 500-MByte hard disk, Ethernet interface, 9-track tape drive,
    RS-232 mux, etc. in two cabinets for $400K! Now you can see from
    Table-1 that a Pentium 200-MHz/MMX PC has about the same
    MFLOPS performance in DP and yet the cost was only about $2K a
    year or so ago! Wow!
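
    The price/performance comparison above works out as follows. This
    is a back-of-the-envelope sketch using only the figures recalled
    in this post; it assumes both machines deliver roughly 20 MFLOPS
    in double precision, and the function name is invented for the
    example.

```python
def dollars_per_mflops(price_usd: float, mflops: float) -> float:
    """Return the purchase price per MFLOPS of theoretical performance."""
    if mflops <= 0:
        raise ValueError("mflops must be positive")
    return price_usd / mflops


# Assumption: "about the same" DP performance is taken as 20 MFLOPS for both.
DP_MFLOPS = 20.0

convex_c1 = dollars_per_mflops(400_000, DP_MFLOPS)  # Convex C-1 at $400K
pentium_pc = dollars_per_mflops(2_000, DP_MFLOPS)   # Pentium 200-MHz/MMX PC at $2K
price_ratio = convex_c1 / pentium_pc

# Memory check: 2 MW of 64-bit words expressed in MB (binary units).
c1_ram_mb = 2 * 2**20 * (64 // 8) / 2**20

print(f"Convex C-1: ${convex_c1:,.0f} per DP MFLOPS")
print(f"Pentium PC: ${pentium_pc:,.0f} per DP MFLOPS")
print(f"Price ratio: {price_ratio:.0f}x; C-1 RAM: {c1_ram_mb:.0f} MB")
```

    The ratio ignores everything else that came in the two cabinets
    (tape drive, mux, ECC RAM), so it is only a rough indication.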

    In Table-6 you can see the effects of three different types of
    memory and Intel support chips on Pentium-II performance. The
    reference GA-686LX motherboard with the Intel LX chipset has the
    10-nsec SDRAM memory. The older Intel KX chipset supporting FPM
    type memory (e.g., 60-nsec) has much poorer performance. Thus the
    10-nsec SDRAM memory and supporting LX chipset provide reasonable
    performance with the 266 and 300-MHz Pentium-IIs shown. Note how
    the Dell D300 at 300-MHz is not significantly better than a good
    266-MHz Gigabyte GA-686LX motherboard! This is an important point
    when making a PC purchasing decision. Just because the CPU clock
    speed is higher doesn't necessarily mean that you are going to get
    significantly higher performance. Intel keeps raising the CPU
    clock speeds but the supporting chipset and memory often are not
    fast enough to get the performance out of the higher CPU clock
    speed. And the primary L1 CPU cache size often seems inadequate.
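
    As a sanity check on the clock-speed point above, one can compute
    the speedup that would be expected if NEC4.1 run time scaled
    purely with CPU clock. This is a naive sketch: the function name
    is made up for illustration, and real results depend on the
    chipset, memory and cache as described above.

```python
def clock_only_speedup(old_mhz: float, new_mhz: float) -> float:
    """Speedup expected if run time scaled purely with CPU clock."""
    if old_mhz <= 0 or new_mhz <= 0:
        raise ValueError("clock speeds must be positive")
    return new_mhz / old_mhz


# Clock speeds of the two Pentium-II systems compared in this paragraph.
speedup_300_vs_266 = clock_only_speedup(266, 300)
print(f"300-MHz vs 266-MHz, clock only: {speedup_300_vs_266:.3f}x")
```

    Even under this idealized assumption the 300-MHz part gains only
    about 13%, so slower memory or a weaker chipset can easily erase
    the advantage.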

    Now look at the performance of the 350-MHz BX chipset motherboard,
    the Gigabyte GA-686BA, in Table-6. The PC100 memory bus with
    8-nsec SDRAM and the BX chipset appear to make quite an
    improvement in NEC4.1 performance. Although I do not have
    333-MHz LX chipset data in Table-6 for a single CPU motherboard,
    the 300-MHz and DL2 motherboard data suggest that the LX chipset
    and memory at 333-MHz would fall far short of the desired
    performance, so it is worth the jump to the BX chipset with the
    PC100 memory bus.

    The differences between compilers are even more dramatic. From
    Table-11 and Table-12 you can see quite a difference in compiled
    code performance. I have older data on the Microsoft Fortran
    PowerStation compiler that shows it to be rather poor. Microsoft
    sold the compiler product line to DEC. DEC apparently improved it
    with their own technology, producing a superior product. The
    Lahey LF90 Version-4 compiler is claimed to produce significantly
    faster executing code than its V3.5 predecessor, but I've no
    experience with it nor have I received any user reports on it.
    The two products have comparable educational pricing. Note that
    the matrix factor times are often close for the two compilers --
    but the matrix fill times are often 1.5 to 3 times slower for the
    Lahey LF90 under NT.

    As usual, Caveat Emptor! Hopefully you will find the performance
    data presented here useful in your purchasing decisions...

--Larry, W7JYJ
Received on Wed Oct 14 1998 - 18:13:03 EDT
