NEC-LIST: PC/NEC4.1 Benchmarks updated...

From: <LAITINEN_at_email.domain.hidden>
Date: Tue, 16 Feb 1999 14:05:19 -0800 (PST)

                                                         Rev: 11-FEB-99

                        PC NEC4.1 PERFORMANCE DATA

11-FEB-99 Update:

    Tested the Dell Dimension XPS R450 450-MHz PC with Pentium-II CPU,
    256-MB SDRAM, and Windows-NT V4 SP3. The memory speed is unknown.

    The 1200-segment matrix factorization time ran 2.1-percent slower than
    the 400-MHz Gigabyte GA-686BA based PC. However, the matrix fill time
    ran 10-percent faster. Total execution time was 0.4-percent faster.

    The matrix fill time almost scales with the increased clock frequency,
    but the matrix factorization time was actually worse than on the slower
    400-MHz Gigabyte PC. This was probably due to inadequate main memory
    bandwidth; i.e., matrix fill calculations probably have a higher primary
    cache hit rate than matrix factorization and thus scale better with the
    clock frequency, while the larger factorization operations are limited
    by the main memory system bandwidth. Perhaps Dell used slow main
    memory...

    See Table-6 for a comparison of the matrix factorization "Effective MHz"
    ratings.

    Tests were conducted using the DEC VF, Lahey LF90 V3.5 and Lahey F77 V5.10
    compiled NEC4.1 code. The detailed execution times are contained in
    Table-12.

    Lahey was contacted about two months ago for review and comment of their
    compiler's NEC performance, but they have not yet provided a response.

28-NOV-98 Update:

    Tested Gigabyte GA-686BA motherboard with 400-MHz Pentium-II CPU, 128-MB
    8-nsec ECC/parity SDRAM, and multi-boot for DOS, Windows-NT and Windows-98.

    The 1200-segment NEC4.1 benchmark factorization ran 6.4-percent faster
    than at 350-MHz, yet the CPU clock speed was approximately 14.3-percent
    faster. Thus only about 45-percent of the increased clock frequency is
    usable. Not very good. The matrix fill time did somewhat better, running
    8.6-percent faster than in the 350-MHz test. This improved the
    utilization of the increased clock speed to about 60-percent for filling
    the matrix.
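
    The clock utilization figures can be reproduced from the Table-1 times
    with a short calculation. This is only a sketch; the function name
    clock_utilization is made up for illustration, and "utilization" is taken
    to mean the fractional speedup divided by the fractional clock increase.

```python
def clock_utilization(t_old, t_new, mhz_old, mhz_new):
    """Fraction of a clock-frequency increase that shows up as speedup."""
    speedup = t_old / t_new - 1.0          # about 0.064 for 6.4 percent faster
    clock_gain = mhz_new / mhz_old - 1.0   # about 0.143 for 350 -> 400 MHz
    return speedup / clock_gain

# 1200-segment DEC VF times on the GA-686BA, from Table-1 (seconds)
factor_util = clock_utilization(76.150, 71.543, 350, 400)
fill_util = clock_utilization(17.985, 16.563, 350, 400)
print(f"factor: {factor_util:.0%}, fill: {fill_util:.0%}")
```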

    It appears that memory bandwidth, cache speed and size, etc. are again
    taking their toll on increased CPU speeds. One can speculate on the
    performance improvement for the 450-MHz CPU with the BX-chipset: is
    it really worth the increased cost?

    The DEC VF compiled NEC-4.1 code was executed under both Windows-NT and
    Windows-98, with Windows-98 being only marginally slower than NT. See
    Table-12 for the comparisons.

The following NEC/PC performance tables are provided for subscribers to the
NEC list interested in benchmarking their PC hardware and software. These
performance tables were compiled over the last three years or so, starting
with 80486 PCs in 1994. Benchmark data for newer hardware (e.g.,
Pentium-IIs) has displaced the older data over time, particularly the data
for PCs older than the 90-MHz Pentiums.

More recently, compiler performance data has been incorporated (see tables 11
and 12). Further, benchmark emphasis has shifted from the small 299-segment
test file to the 1200-segment test file (provided by Jerry Burke) and NEC4.1
code compiled with the DEC Visual Fortran Compiler (see tables 1, 3, 5 and 6).
Some benchmark data is also provided for the original 299-segment test file to
provide continuity with the older Pentium hardware and Lahey F77 compiled
code (see tables 2 and 4).

Table-1 provides performance data using the 1200-segment test file and
DEC Visual Fortran compiled NEC4.1 code on processors from the 90-MHz Pentium
to the 450-MHz Pentium-II, including the systems with the PC100 memory bus.
This table also includes the MFLOPS/sec ratings estimated from
(8/3)*N^3/(Matrix Factor Time).
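
As a rough check, the estimate can be computed directly from a matrix factor
time. The sketch below simply evaluates (8/3)*N^3/(Matrix Factor Time) and
scales to millions; the function name estimated_mflops is hypothetical, and
the Table-1 caption limits the approximation to N > 1000.

```python
def estimated_mflops(n_segments, factor_seconds):
    """Estimate MFLOPS from the matrix factor time using (8/3)*N^3/time."""
    flops = (8.0 / 3.0) * n_segments ** 3
    return flops / factor_seconds / 1.0e6

# 400-MHz Pentium-II on the GA-686BA: 71.543 s factor time, 1200 segments
mflops_400 = estimated_mflops(1200, 71.543)
print(round(mflops_400, 2))
```

The result should agree with the 64.41 listed for that system in Table-1.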

The DEC Visual Fortran Compiler has produced the fastest executing NEC4.1
code for the Pentium/Pentium-II platforms relative to the other compilers
tested. Table-11 provides a quick review of compiler performance. Table-12
has more detailed hardware and software performance data. Note the wide range
(factor of 1.5 to 4) of performance between the older Lahey F77 and DEC VF
compilers. Even the newer Lahey F90 V3.5 compiler falls short of the DEC VF
compiler.
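
The spread between compilers can be seen by dividing each total execution
time by the DEC VF total for the same input file. The sketch below does this
for the Table-11 totals; the dictionary layout and the function name
slowdown_vs_dec are illustrative only. Note that the 1200-segment Lahey runs
were out-of-core solutions, so those ratios include disk overhead.

```python
# Total execution times in seconds from Table-11 (GA-686BA, 350-MHz Pentium-II)
totals = {
    "TEST299": {"DEC": 2.904, "LF90": 5.687, "LF77": 8.775},
    "TEST300": {"DEC": 2.284, "LF90": 3.988, "LF77": 8.975},
    "TEST600": {"DEC": 14.621, "LF90": 21.612, "LF77": 48.646},
    "TEST1200": {"DEC": 96.078, "LF90": 203.425, "LF77": 309.357},
}

def slowdown_vs_dec(times):
    """Return each compiler's total time divided by the DEC VF total time."""
    return {name: t / times["DEC"] for name, t in times.items()}

ratios = {case: slowdown_vs_dec(times) for case, times in totals.items()}
for case, r in ratios.items():
    print(f"{case}: LF90 {r['LF90']:.2f}x, LF77 {r['LF77']:.2f}x")
```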

Tables 3 and 4 provide performance ratios for the various CPUs and
motherboards tested, relative to a reference Intel Neptune chipset 90-MHz
Pentium motherboard. Further, these tables include clock normalized ratios
to better compare the performance of the tested hardware.
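
A clock normalized ratio is the performance ratio divided by the clock ratio,
so a value near 1.0 means performance scaled with the clock. A minimal
sketch, assuming the 90-MHz reference clock used in Tables 3 and 4 (the
function name clock_normalized is hypothetical):

```python
def clock_normalized(perf_ratio, cpu_mhz, ref_mhz=90.0):
    """Divide a performance ratio by the CPU clock ratio to the reference."""
    clock_ratio = cpu_mhz / ref_mhz
    return perf_ratio / clock_ratio

# Table-3 matrix factor ratios: 4.854 at 450 MHz and 4.956 at 400 MHz
norm_450 = clock_normalized(4.854, 450)
norm_400 = clock_normalized(4.956, 400)
print(round(norm_450, 3), round(norm_400, 3))
```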

There are three cache performance tables (8, 9 and 10) that show the
performance of various Pentium and Pentium-II processors with their
L1 and L2 caches enabled and disabled. This provides some guidance on the
relative cache and memory limitations of these systems. Because of the
severe performance degradation with the primary L1 cache disabled, the
test scenario was limited to the 299-segment case to keep execution times
reasonable.
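
To put the cache penalty in perspective, the total execution times can be
expressed as slowdown factors relative to the run with both caches enabled.
The sketch below uses the Table-8 totals (350-MHz Pentium-II); the variable
names are illustrative.

```python
# Total execution times in seconds from Table-8, keyed by (L1, L2) state
table8_total = {
    ("on", "on"): 8.875,
    ("on", "off"): 9.529,
    ("off", "on"): 384.257,
    ("off", "off"): 384.148,
}

baseline = table8_total[("on", "on")]
slowdown = {state: t / baseline for state, t in table8_total.items()}
for (l1, l2), factor in slowdown.items():
    print(f"L1 {l1:3s} L2 {l2:3s}: {factor:6.2f}x")
```

Disabling only L2 costs a few percent on this small test case, while
disabling L1 slows the run by more than a factor of 40.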

During the past year the Pentium-II has displaced the Pentium-Pro and
Pentium CPU chips. Thus a more modern reference CPU/motherboard is
appropriate. In Table-5 the Gigabyte GA-686LX motherboard with a 266-MHz
Pentium-II CPU is used as the new baseline reference. The Intel "LX"
chipset was the first to support 10-nsec SDRAM memory. It replaced the
"KX" (Klamath) chipset that supported the traditional 60-nsec
FPM memory. The "LX" chipset performance improvement for the same 266-MHz
CPU speed is clearly shown in this table. Further, this table includes
data on the latest "BX" chipset that supports the faster PC100 memory
bus (that typically uses 7 or 8-nsec SDRAM). Again, the performance
improvement is quite evident. The performance ratios for the 300 and
1200-segment test cases were used to arrive at the "effective MHz" speed
ratings shown in Table-6. This table clearly shows that a faster CPU clock
speed sometimes does NOT significantly improve NEC performance due to chipset
and memory bandwidth limitations.
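
The "effective MHz" ratings appear to be the Table-5 matrix factor ratios
multiplied by the reference clock. The sketch below assumes a reference of
266.4 MHz, which is the value shown in the Table-6 reference row and which
reproduces most of the Table-6 entries to within rounding; the function name
effective_mhz is hypothetical.

```python
REF_MHZ = 266.4  # assumed reference clock, taken from the Table-6 (ref) row

def effective_mhz(mfact_ratio, ref_mhz=REF_MHZ):
    """Scale the reference clock by a matrix factor performance ratio."""
    return mfact_ratio * ref_mhz

# Table-5 1200-segment ratio for the 450-MHz Dell R450
dell_1200 = effective_mhz(1.4239)
# Table-5 300-segment ratio for the 350-MHz GA-686BA
ga_350_300 = effective_mhz(1.3939)
print(round(dell_1200, 1), round(ga_350_300, 1))
```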

Comments, questions and suggestions may be sent to laitinen_at_oregon.uoregon.edu.

--Larry, W7JYJ

Table-1. Execution times in seconds for the TEST1200.NEC 1200-segment NEC
input file run in double-precision NEC4.1 on various processors. NEC4.1 was
compiled with the DEC Visual Fortran compiler. The OS is Windows-NT V4 SP3.
The MFLOPS/sec ratings were estimated from (8/3)*N^3/(Matrix Factor Time),
for N > 1000, where N is the number of segments. This approximation was
provided by Jozef R. Bergervoet for NEC2. NEC4.1 is believed to behave
similarly, but this has not been verified.

   
    CPU/Motherboard          L2       RAM      Matrix   Matrix    Total   ESTIMATED
                             Cache                Fill   Factor    Exec.   MFLOPS/sec

 1. Pentium-II 450-MHz       512KB    256MB    14.901   73.045   89.488   63.08
    Dell Dimension           pburst   ??
    XPS R450
 2. Pentium-II 400-MHz       512KB    128MB    16.563   71.543   89.839   64.41
    Gigabyte GA-686BA        pburst   8-ns
 3. Pentium-II 350-MHz       512KB    128MB    17.985   76.150   96.078   60.51
    Gigabyte GA-686BA        pburst   8-ns
 4. Pentium-II 333-MHz       512KB    256MB    18.967   92.743  113.814   49.69
    Gigabyte GA-686DL2       pburst   10-ns
    dual-CPU motherboard
    with 1 CPU installed
 5. Pentium-II 300-MHz       512KB    128MB    21.101   95.867  119.251   48.07
    Gigabyte GA-686LX        pburst   10-ns
 6. Pentium-II 300-MHz       512KB    96MB     21.251  102.778  126.271   44.83
    Dell Dimension           pburst   ??
    XPS D300
 7. Pentium-II 266-MHz       512KB    64MB     23.574  104.009  130.107   44.30
    Gigabyte GA-686LX        pburst   10-ns
 8. Pentium-II 266-MHz       512KB    64MB     24.895  119.302  146.972   38.62
    Gigabyte GA-686KX        pburst   60-ns
                                      FPM
 9. Pentium 200-MHz MMX      512KB    64MB     38.205  202.311  244.982   22.78
    Gigabyte GA-586HX        pburst   60-ns
                                      FPM
10. Pentium 133-MHz          512KB    64MB     66.636  275.587  348.941   16.72
    Gigabyte GA-586HX        pburst   60-ns
                                      FPM
11. Pentium 133-MHz          256KB    80MB     79.965  307.412  394.998   14.99
    Hitachi MX-133T          pburst   60-ns
    Notebook                          EDO
12. Pentium 100-MHz          256KB    64MB     89.869  319.900  418.512   14.40
    Intel P54C-PCI           pburst   60-ns
    Neptune                           FPM
13. Pentium 90-MHz           256KB    64MB     99.333  353.708  462.865   13.03
    Intel P54C-PCI           pburst   60-ns
    Neptune                           FPM
14. Pentium 90-MHz           256KB    96MB     99.543  354.560  463.757   13.00
    Intel P54C-PCI           pburst   60-ns
    Neptune                           FPM

     Note: Jozef reports that he has observed significant matrix factor time
            improvement, in other applications, by using the Intel Math Kernel
            Library with the LAPACK linear algebra library. From this he has
            deduced that substituting the MKL/LAPACK codes in NEC4.1 may
            improve the NEC4.1 matrix factor time by up to 3.7 times. The
            Intel MKL routines contain assembly language code optimized for
            the Pentium-II CPU/FPU. This implies 238-MFLOPS/sec NEC4.1 DP
            performance for the 400-MHz Pentium-II.

Table-2. Execution times in seconds for the TEST299.NEC 299-segment input
file run in double-precision NEC4.1 on various processors. NEC4.1 was
compiled with Lahey Fortran-77 V5.10, an older compiler that is now
obsolete. The OS is DOS V6.22 with HIMEM and EMM386 memory managers
installed.

   
   CPU/Motherboard L2 RAM Matrix Matrix Total
                                Cache Fill Factor Exec.

 1. Pentium-II 450-MHz 512KB 256MB 6.086 1.299 7.631
     Dell Dimension
     XPS R450
 2. Pentium-II 400-MHz 512KB 128MB 6.840 1.344 8.385
     Gigabyte GA-686BA pburst 8-nsec
 3. Pentium-II 350-MHz 512KB 128MB 7.131 1.499 8.775
     Gigabyte GA-686BA 8-nsec
 4. Pentium-II 333-MHz 512KB 256MB 7.286 1.699 9.629
     Gigabyte GA-686DL2 pburst 10-nsec
     dual-CPU motherboard SDRAM
     with one CPU installed
 5. Pentium-II 300-MHz 512KB 64MB 7.785 1.799 9.829
     Gigabyte GA-686LX pburst SDRAM
 6. Pentium-II 300-MHz 512KB 96MB 7.885 1.844 10.13
     Dell Dimension XPS D300 pburst SDRAM
 7. Pentium-II 266-MHz 512KB 64MB 8.430 1.944 10.63
     Gigabyte GA-686LX pburst SDRAM
 8. Pentium-II 266-MHz 512KB 64MB 8.576 2.053 10.93
     Gigabyte GA-686KX pburst FPM
 9. Pentium 200-MHz MMX 512KB 64MB 8.430 4.342 13.07
     Gigabyte 586HX pburst FPM
10. Pentium 200-MHz MMX 512KB 64MB 8.430 4.297 13.13
     Asus 430TX pburst ??
11. Pentium-Pro 200-MHz 256KB 128MB 11.12 2.753 14.17
     Gateway-2000 CPU ??
12. Pentium-Pro 200-MHz 256KB 64MB 11.53 2.843 14.67
     HP Vectra XU 6/200 CPU EDO
13. Pentium 166-MHz MMX 512KB 64MB 9.738 5.087 15.22
     Gigabyte 586HX pburst FPM
14. Pentium 166-MHz MMX 256KB 80MB 9.729 5.242 15.37
     Hitachi MX-166T Notebook pburst EDO
15. Pentium 166-MHz 512KB 64MB 10.83 6.086 17.36
     Gigabyte 586HX pburst FPM
16. Pentium 133-MHz 512KB 32MB 12.63 6.731 20.10
     Gigabyte 586HX pburst FPM
17. Pentium 133-MHz 512KB 32MB 12.73 6.986 20.46
     Shuttle pburst FPM
18. Pentium 133-MHz 256KB 48MB 12.573 7.286 20.46
     Hitachi MX-133T Notebook pburst EDO
19. Pentium 133-MHz 512KB 32MB 12.73 8.530 22.10
     Epox pburst FPM
20. Pentium 100-MHz 512KB 64MB 15.86 8.385 24.96
     Gigabyte 586HX pburst FPM
21. Pentium 133-MHz none 32MB 15.47 10.97 27.20
     Hitachi M-134T NoteBook EDO
22. Pentium 100-MHz 256KB 64MB 16.860 9.629 27.244
     Intel Neptune P54C-PCI SRAM FPM
23. Pentium 90-MHz 512KB 64MB 17.61 9.384 27.74
     Gigabyte 586HX pburst FPM
24. Pentium 100-MHz 256KB 16MB 18.66 8.685 28.89
     Dell Optiplex GXM-5100 ?? EDO
25. Pentium 90-MHz 256KB 16MB 18.80 10.60 30.23
     Intel Neptune P54C-PCI SRAM FPM
26. Pentium 100-MHz none 40MB 18.26 11.23 30.29
     Hitachi M-100T Notebook EDO

Table-3. Comparison of performance ratios for NEC4.1 on various CPU chip and
motherboard configurations. Source data is from Table-1 using the
1200-segment TEST1200.NEC input file. The reference is the 90-MHz Pentium
CPU on the Intel Neptune P54C-PCI motherboard with 64-MB 60-nsec FPM parity
RAM.

    CPU/Motherboard Clock M-Fill M-Fact Exec Normalized-by-CPU-Clock
                         Ratio Ratio Ratio Ratio M-Fill M-Fact T-Exec

 1. Pentium-II 450-MHz 5.00 6.680 4.854 5.182 1.336 0.971 1.036
     Dell Dimension
     XPS R450
 2. Pentium-II 400-MHz 4.44 6.010 4.956 5.162 1.352 1.115 1.161
     Gigabyte GA-686BA
 3. Pentium-II 350-MHz 3.89 5.523 4.645 4.816 1.420 1.194 1.238
     Gigabyte GA-686BA
 4. Pentium-II 333-MHz 3.70 5.237 3.814 4.065 1.415 1.031 1.099
     Gigabyte GA-686DL2
     dual-CPU motherboard
     with 1 CPU installed
 5. Pentium-II 300-MHz 3.33 4.708 3.690 3.880 1.412 1.107 1.164
     Gigabyte GA-686LX
 6. Pentium-II 300-MHz 3.33 4.674 3.441 3.664 1.402 1.032 1.099
     Dell Dimension
     XPS D300
 7. Pentium-II 266-MHz 2.96 4.214 3.401 3.556 1.424 1.149 1.201
     Gigabyte GA-686LX
 8. Pentium-II 266-MHz 2.96 3.990 2.965 3.148 1.348 1.002 1.064
     Gigabyte GA-686KX
 9. Pentium 200-MHz MMX 2.22 2.600 1.748 1.889 1.171 0.788 0.851
     Gigabyte GA-586HX
10. Pentium 133-MHz 1.48 1.491 1.283 1.326 1.007 0.867 0.896
     Gigabyte GA-586HX
11. Pentium 133-MHz 1.48 1.242 1.151 1.172 0.839 0.777 0.792
     Hitachi MX-133T
     Notebook
12. Pentium 100-MHz 1.11 1.105 1.106 1.106 0.996 0.996 0.996
     Intel P54C-PCI
     Neptune
13. Pentium 90-MHz 1.00 1.000 1.000 1.000 1.000 1.000 1.000
     Intel P54C-PCI
     Neptune (64-MB)
14. Pentium 90-MHz 1.00 0.998 1.002 1.002 0.998 1.002 1.002
     Intel P54C-PCI
     Neptune (96-MB)
 

Table-4. Comparison of performance ratios for NEC4.1 on various CPU chip and
motherboard configurations. Source data is from Table-2 using TEST299.NEC.

    CPU/Motherboard Clock M-Fill M-Fact Exec Normalized-by-CPU-Clock
                         Ratio Ratio Ratio Ratio M-Fill M-Fact T-Exec

 1. Pentium-II 450-MHz 5.00 3.089 8.160 3.961 0.618 1.632 0.792
     Dell Dimension
     XPS R450
 2. Pentium-II 400-MHz 4.44 2.749 7.887 3.605 0.618 1.775 0.811
     Gigabyte GA-686BA
 3. Pentium-II 350-MHz 3.89 2.636 7.071 3.445 0.678 1.818 0.886
     Gigabyte GA-686BA
 4. Pentium-II 333-MHz 3.7 2.580 6.239 3.139 0.697 1.686 0.848
     Gigabyte GA-686DL2
     with one CPU
 5. Pentium-II 300-MHz 3.33 2.415 5.892 3.075 0.725 1.768 0.923
     Gigabyte GA-686LX
 6. Pentium-II 300-MHz 3.33 2.384 5.748 2.985 0.715 1.725 0.895
     Dell Dim XPS D300
 7. Pentium-II 266-MHz 2.96 2.23 5.453 2.844 0.755 1.845 0.963
     Gigabyte GA-686LX
 8. Pentium-II 266-MHz 2.96 2.192 5.163 2.766 0.742 1.747 0.936
     Gigabyte GA-686KX
 9. Pentium 200-MHz MMX 2.22 2.230 2.440 2.313 1.004 1.099 1.041
     Gigabyte 586HX
10. Pentium 200-MHz MMX 2.22 2.230 2.466 2.303 1.004 1.110 1.036
     Asus 430TX
11. Pentium-Pro 200-MHz 2.22 1.691 3.850 2.133 0.761 1.733 0.960
     Gateway-2000
12. Pentium-Pro 200-MHz 2.22 1.631 3.728 2.061 0.735 1.679 0.928
     HP Vectra XU 6/200
13. Pentium 166-MHz MMX 1.84 1.931 2.084 1.986 1.049 1.132 1.079
     Gigabyte 586HX
14. Pentium 166-MHz MMX 1.84 1.932 2.022 1.967 1.048 1.096 1.066
     Hitachi MX-166T
     Notebook PC
15. Pentium 166-MHz 1.84 1.736 1.742 1.741 0.943 0.947 0.946
     Gigabyte 586HX
16. Pentium 133-MHz 1.48 1.489 1.575 1.504 1.006 1.064 1.016
     Gigabyte 586HX
17. Pentium 133-MHz 1.48 1.477 1.517 1.478 0.998 1.025 0.998
     Shuttle
18. Pentium 133-MHz 1.48 1.495 1.455 1.478 1.012 0.984 1.000
     Hitachi MX-133T
     Notebook
19. Pentium 133-MHz 1.48 1.477 1.243 1.368 0.998 0.840 0.924
     Epox
20. Pentium 100-MHz 1.11 1.186 1.264 1.211 1.067 1.138 1.090
     Gigabyte 586HX
21. Pentium 133-MHz 1.48 1.215 0.966 1.111 0.821 0.653 0.751
     Hitachi M-134T
     Notebook PC
22. Pentium 100-MHz 1.11 1.115 1.101 1.110 1.005 0.992 0.996
     Intel Neptune
     P54C-PCI
23. Pentium 90-MHz 1.00 1.067 1.130 1.090 1.067 1.130 1.090
     Gigabyte 586HX
24. Pentium 100-MHz 1.11 1.008 1.221 1.046 0.907 1.098 0.942
     Dell Optiplex
     GXM-5100
25. Pentium 90-MHz 1.00 1.000 1.000 1.000 1.000 1.000 1.000
     Intel Neptune
     P54C-PCI
26. Pentium 100-MHz 1.11 1.030 0.944 0.998 0.927 0.850 0.898
     Hitachi M-100T
     Notebook PC

Table-5. Performance ratios of the Pentium-II motherboards and CPU chips
for NEC4.1 matrix factorization. This table includes the 300 and 1200-segment
cases. The reference motherboard is the Gigabyte GA-686LX with 266-MHz CPU
chip and 64-MB ECC SDRAM memory. The GA-686BA has 128-MB ECC SDRAM memory.
The GA-686KX has 64-MB of ECC FPM memory. The GA-686DL2 has 256-MB of
ECC SDRAM memory.

    PENTIUM-II MOTHER     RAM     CLOCK     MATRIX FACTOR       CLOCK NORMALIZED
    CPU SPEED  BOARD      SPEED   RATIO     RATIO               M-FACT RATIO
                                            300-SEG  1200-SEG   300-SEG  1200-SEG

    450-MHz    Dell R450  ??      1.689     1.4098   1.4239     0.8346   0.8430
    400-MHz    GA-686BA   8-ns    1.500     1.4946   1.4538     0.9964   0.9692
    350-MHz    GA-686BA   8-ns    1.3138    1.3939   1.3658     1.0610   1.0396
    333-MHz    GA-686DL2  10-ns   1.250     1.0981   1.1215     0.8785   0.8972
    300-MHz    GA-686LX   10-ns   1.125     1.0688   1.0849     0.9501   0.9644
    300-MHz    Dell D300  ??      1.125     1.0081   1.0120     0.8961   0.8995
    266-MHz    GA-686LX   10-ns   1.00      1.0000   1.0000     1.0000   1.0000
    266-MHz    GA-686KX   60-ns   1.00      0.8107   0.8718     0.8107   0.8718

Table-6. Effective MHz ratings of the Pentium-II CPUs and motherboards for
the 300 and 1200-segment matrix factorization tests. The 266-MHz Pentium-II
in a Gigabyte GA-686LX motherboard is the reference. Effective speed
ratings are relative to this reference.

    PENTIUM-II MOTHERBOARD  RAM     EFFECTIVE CPU
    CPU CLOCK               SPEED   CLOCK SPEED (MHz)
                                    300-SEG  1200-SEG

    450-MHz    Dell R450    ??      375.6    379.3
    400-MHz    GA-686BA     8-ns    398.6    387.7
    350-MHz    GA-686BA     8-ns    371.3    363.8
    333-MHz    GA-686DL2    10-ns   292.5    298.8
    300-MHz    GA-686LX     10-ns   284.7    289.0
    300-MHz    Dell D300    ??      268.6    269.6
    266-MHz    GA-686LX     10-ns   266.4    266.4  (ref)
    266-MHz    GA-686KX     60-ns   216.0    232.2

Table-7. Performance ratios of the Gigabyte motherboards and CPU chips for
NEC4.1 matrix factorization. The reference is the Gigabyte GA-586HX motherboard
with 512KB pipeline burst cache, 64-MB of ECC FPM RAM and Intel 90-MHz Pentium
CPU chip. Source data is from Table-2 using TEST299.NEC.

    CPU & SPEED        MOTHER     PRI L1  CLOCK   M-FACT   CLOCK NORMALIZED
                       BOARD      CACHE   RATIO   RATIO    M-FACT RATIO

    400-MHz Pent-II    GA-686BA   32KB    4.44    6.982    1.571
    350-MHz Pent-II    GA-686BA   32KB    3.89    6.260    1.610
    333-MHz Pent-II    GA-686DL2  32KB    3.70    5.523    1.493
    300-MHz Pent-II    GA-686LX   32KB    3.33    5.216    1.565
    266-MHz Pent-II    GA-686LX   32KB    2.96    4.827    1.631
    266-MHz Pent-II    GA-686KX   32KB    2.96    4.571    1.544
    200-MHz MMX        GA-586HX   32KB    2.22    2.161    0.973
    166-MHz MMX        GA-586HX   32KB    1.84    1.845    1.003
    166-MHz            GA-586HX   16KB    1.84    1.542    0.838
    133-MHz            GA-586HX   16KB    1.48    1.394    0.942
    100-MHz            GA-586HX   16KB    1.11    1.119    1.007
     90-MHz            GA-586HX   16KB    1.00    1.000    1.000

Table-8. Effects of the CPU L1 (primary) and L2 (secondary) caches on
performance for the Pentium-II 350-MHz CPU on the Gigabyte GA-686BA
motherboard with 128-MB 8-nsec ECC SDRAM. NEC4.1 was compiled with the
Lahey LF77 compiler and the input file was TEST299.NEC. Execution times
are in seconds.

   CPU L1   L2         FILL     FACTOR      TOTAL

    on      on        7.140      1.490      8.875
    on      off       7.386      1.890      9.529
    off     on      213.863    156.622    384.257
    off     off     213.908    156.568    384.148

Table-9. Effects of the CPU L1 (primary) and L2 (secondary) caches on
performance for the Pentium-II 300-MHz CPU on the Gigabyte GA-686LX
motherboard. NEC4.1 was compiled with the Lahey LF77 compiler and the input
file was TEST299.NEC. Execution times are in seconds.

   CPU L1   L2         FILL     FACTOR      TOTAL

    on      on        7.785      1.790      9.875
    on      off       8.285      2.289     10.874
    off     on      257.059    184.766    458.894
    off     off     257.113    184.766    458.894

Table-10. Effects of the CPU L1 (primary) and L2 (secondary) caches on
performance for the P-166 MMX and P-166 standard CPUs on the Gigabyte 586HX
motherboard. NEC4.1 was compiled with Lahey F77 and the input data file was
TEST299.NEC. Execution times are in seconds.

   CPU L1   L2           FILL              FACTOR              TOTAL
   cache    cache     MMX      STD      MMX      STD       MMX      STD

    on      on        9.74    10.84     5.087    6.086    15.22    17.36
    on      off      11.03    16.03     8.130   10.62     19.72    27.19
    off     on       44.76    51.20    30.93    39.66     78.29    93.86
    off     off     151.38   158.97   107.98   129.63    268.29   298.03

Table-11. Compiler comparisons for 299, 300, 600 and 1200 segments on the
Gigabyte GA-686BA Pentium-II 350-MHz PC with 128-MB 8-nsec ECC SDRAM.
Execution times are in seconds.

INPUT FILE/SEGMENTS COMPILER OS FILL FACTOR TOTAL

TEST299.NEC:
  NEC4D300 DEC NT 1.723 0.901 2.904
  NEC4D LF90 NT 4.497 1.090 5.687
  DNEC4 LF77 DOS 7.131 1.499 8.775

TEST300.NEC:
  NEC4D300 DEC NT 1.242 0.891 2.284
  NEC4D LF90 NT 2.843 1.099 3.988
  DNEC4 LF77 DOS 7.140 1.644 8.975

TEST600.NEC:
  NEC4D600 DEC NT 4.676 9.314 14.621
  NEC4D LF90 NT 11.028 10.129 21.612
  DNEC4 LF77 DOS 28.997 18.759 48.646

TEST1200.NEC:
  NEC4D120 DEC NT 17.985 76.150 96.078
  NEC4D* LF90 NT 49.346 150.881 203.425
  DNEC4* LF77 DOS 112.564 188.263 309.357

*Out of core solutions, MAXMAT=1000. See notes following Table-12.

Table-12. Detailed compiler, motherboard and number of segments test data.
Execution times are in seconds.

              COMP/OS MOTHERBOARD CLOCK FILL FACTOR TOTAL
                                       MHz
TEST299.NEC:

  NEC4D300 DEC NT Dell R450 450 1.392 0.882 2.444
  NEC4D LF90 NT Dell R450 450 3.643 0.990 4.742
  DNEC4 LF77 DOS Dell R450 450 6.086 1.299 7.631

  NEC4D300 DEC NT GA-686BA 400 1.522 0.831 2.614
  NEC4D300 DEC W98 GA-686BA 400 1.480 0.880 2.690
  DNEC4 LF77 DOS GA-686BA 400 6.840 1.344 8.385
  
  NEC4D300 DEC NT GA-686BA 350 1.723 0.901 2.904
  NEC4D LF90 NT GA-686BA 350 4.497 1.090 5.687
  DNEC4 LF77 DOS GA-686BA 350 7.131 1.499 8.775
  
  NEC4D300 DEC NT GA-686DL2 333 1.793 1.121 3.064
  NEC4D LF90 DOS GA-686DL2 333 3.643 1.245 5.142
  NEC4D LF90 NT/CP GA-686DL2 333 4.742 1.245 6.132
  DNEC4 LF77 DOS GA-686DL2 333 7.331 1.699 9.629

  NEC4D300 DEC NT GA-686LX 300 2.003 1.162 3.475
  DNEC4 LF77 DOS GA-686LX 300 7.785 1.799 9.829

  NEC4D300 DEC NT DELL D300 300 2.013 1.222 3.535
  DNEC4 LF77 DOS DELL D300 300 7.885 1.844 10.129

  NEC4D300 DEC NT GA-686LX 266 2.243 1.242 3.835
  NEC4D LF90 DOS GA-686LX 266 4.987 1.399 6.532
  DNEC4 LF77 DOS GA-686LX 266 8.430 1.944 10.629

  NEC4D300 DEC NT GA-686KX 266 2.303 1.442 4.126
  NEC4D LF90 DOS GA-686KX 266 4.942 1.499 6.686
  NEC4D LF90 NT GA-686KX 266 4.987 1.599 6.731
  NEC4D LF90 NT/CP GA-686KX 266 6.286 1.544 8.130
  DNEC4 LF77 DOS GA-686KX 266 8.630 1.999 10.883

  NEC4D300 DEC NT GA-586HX 200 MMX 3.615 2.233 6.159
  DNEC4 LF77 DOS GA-586HX 200 MMX 8.376 4.397 13.172

  NEC4D300 DEC NT GA-586HX 133 5.958 3.525 10.215
  DNEC4 LF77 DOS GA-586HX 133 12.673 6.986 20.358

  NEC4D300 DEC NT Neptune 100 7.891 4.336 13.139
  DNEC4 LF77 DOS Neptune 100 16.860 9.584 27.244

  NEC4D300 DEC NT Neptune 90 8.722 4.797 14.300
  DNEC4 LF77 DOS Neptune 90 18.805 10.583 30.287

    
TEST300.NEC:

  NEC4D300 DEC NT Dell R450 450 1.022 0.881 2.013
  NEC4D LF90 NT Dell R450 450 2.298 0.945 3.343
  DNEC4 LF77 DOS Dell R450 450 6.132 1.453 7.731

  NEC4D300 DEC NT GA-686BA 400 1.092 0.831 2.043
  NEC4D300 DEC W98 GA-686BA 400 1.090 0.880 2.080
  DNEC4 LF77 DOS GA-686BA 400 6.786 1.553 8.485

  NEC4D300 DEC NT GA-686BA 350 1.242 0.891 2.284
  NEC4D LF90 NT GA-686BA 350 2.843 1.099 3.988
  DNEC4 LF77 DOS GA-686BA 350 7.140 1.644 8.975
  
  NEC4D300 DEC NT GA-686DL2 333 1.362 1.131 2.824
  NEC4D LF90 NT GA-686DL2 333 2.989 1.254 4.388
  NEC4D LF90 DOS GA-686DL2 333 2.898 2.389 5.487
  DNEC4 LF77 DOS GA-686DL2 333 7.286 1.890 9.784

  NEC4D300 DEC NT GA-686LX 300 1.442 1.162 3.475
  DNEC4 LF77 DOS GA-686LX 300 7.776 1.999 9.984

  NEC4D300 DEC NT DELL D300 300 1.452 1.232 2.854
  DNEC4 LF77 DOS DELL D300 300 7.885 1.989 10.283

  
  NEC4D300 DEC NT GA-686LX 266 1.642 1.242 3.255
  NEC4D LF90 DOS GA-686LX 266 3.343 1.390 4.942
  DNEC4 LF77 DOS GA-686LX 266 8.439 2.089 10.774
  
  NEC4D300 DEC NT GA-686KX 266 1.643 1.532 3.375
  NEC4D LF90 DOS GA-686KX 266 3.398 1.499 5.087
  NEC4D LF90 NT GA-686KX 266 3.443 1.499 5.087
  NEC4D LF90 NT/DCP GA-686KX 266 4.896 1.544 6.786
  DNEC4 LF77 DOS GA-686KX 266 8.536 2.198 10.983

  NEC4D300 DEC NT GA-586HX 200 MMX 2.604 2.293 5.408
  DNEC4 LF77 DOS GA-586HX 200 MMX 7.931 4.688 12.972

  NEC4D300 DEC NT GA-586HX 133 4.437 3.615 8.602
  DNEC4 LF77 DOS GA-586HX 133 11.728 7.086 19.559

  NEC4D300 DEC NT Neptune 100 5.869 4.396 10.986
  DNEC4 LF77 DOS Neptune 100 15.561 9.729 26.090
 
  NEC4D300 DEC NT Neptune 90 6.540 4.827 12.418
  DNEC4 LF77 DOS Neptune 90 17.315 10.729 28.934

  

TEST600.NEC:

  NEC4D600 DEC NT Dell R450 450 3.846 8.933 13.239
  NEC4D LF90 NT Dell R450 450 8.975 9.330 18.659
  DNEC4 LF77 DOS Dell R450 450 25.300 17.015 43.014

  NEC4D600 DEC NT GA-686BA 400 4.126 8.672 13.389
  NEC4D600 DEC W98 GA-686BA 400 4.120 8.790 13.510
  DNEC4 LF77 DOS GA-686BA 400 22.689 17.669 46.148
  
  NEC4D600 DEC NT GA-686BA 350 4.676 9.314 14.621
  NEC4D LF90 NT GA-686BA 350 11.028 10.129 21.612
  DNEC4 LF77 DOS GA-686BA 350 28.997 18.759 48.646
 
  NEC4D600 DEC NT GA-686DL2 333 4.947 11.537 17.155
  NEC4D LF90 NT/CP GA-686DL2 333 11.773 11.728 23.955
  NEC4D LF90 DOS GA-686DL2 333 11.219 21.711 33.63
  DNEC4 LF77 DOS GA-686DL2 333 34.675 42.560 84.975

  NEC4D DEC NT GA-686LX 300 5.448 11.847 18.006
  DNEC4 LF77 DOS GA-686LX 300 31.486 21.957 54.542
  
  NEC4D600 DEC NT DELL D300 300 5.498 12.688 18.898
  DNEC4 LF77 DOS DELL D300 300 32.831 25.454 59.720

  NEC4D600 DEC NT GA-686LX 266 6.109 12.878 19.819
  NEC4D LF90 DOS GA-686LX 266 13.218 12.827 26.590
  DNEC4 LF77 DOS GA-686LX 266 34.684 31.032 68.214

  NEC4D600 DEC NT GA-686KX 266 6.299 14.932 22.122
  NEC4D LF90 NT GA-686KX 266 13.418 14.326 28.289
  NEC4D LF90 DOS GA-686KX 266 13.926 14.317 28.888
  NEC4D LF90 NT/CP GA-686KX 266 19.059 14.871 34.475
  DNEC4 LF77 DOS GA-686KX 266 35.374 27.589 64.117
  
  NEC4D600 DEC NT GA-586HX 200 MMX 9.764 22.372 33.469
  DNEC4 LF77 DOS GA-586HX 200 MMX 33.884 45.649 83.676

  NEC4D600 DEC NT GA-586HX 133 18.657 33.838 54.478
  DNEC4 LF77 DOS GA-586HX 133 50.045 66.860 121.094

  NEC4D600 DEC NT Neptune 100 22.492 39.047 64.152
  DNEC4 LF77 DOS Neptune 100 64.662 85.074 157.467

  NEC4D600 DEC NT Neptune 90 24.976 43.323 71.132
  DNEC4 LF77 DOS Neptune 90 72.093 92.406 172.793

TEST1200.NEC:

  NEC4D1200 DEC NT Dell R450 450 14.901 73.045 89.488
  NEC4D* LF90 NT Dell R450 450 36.673 140.053 179.079
  DNEC4* LF77 DOS Dell R450 450 97.892 177.580 281.414

  NEC4D1200 DEC NT GA-686BA 400 16.563 71.543 89.839
  NEC4D1200 DEC W98 GA-686BA 400 15.920 72.290 90.080
  DNEC4* LF77 DOS GA-686BA 400 107.376 180.323 295.531

  NEC4D1200 DEC NT GA-686BA 350 17.985 76.150 96.078
  NEC4D* LF90 NT GA-686BA 350 49.346 150.881 203.425
  DNEC4* LF77 DOS GA-686BA 350 112.564 188.263 309.357

  NEC4D1200 DEC NT GA-686DL2 333 18.967 92.743 113.814
  NEC4D* LF90 NT GA-686DL2 333 49.246 156.477 208.621
  NEC4D* LF90 DOS GA-686DL2 333 76.145 321.975 415.534
  DNEC4* LF77 DOS GA-686DL2 333 134.82 461.837 633.130

  NEC4D DEC NT GA-686LX 300 21.101 95.867 119.251
  DNEC4* LF77 DOS GA-686LX 300 122.048 214.553 345.621

  NEC4D1200 DEC NT DELL D300 300 21.251 102.778 126.271
  DNEC4* LF77 DOS DELL D300 300 127.689 253.915 391.088

  NEC4D1200 DEC NT GA-686LX 266 23.574 104.009 130.107
  NEC4D* LF90 DOS GA-686LX 266 73.001 276.372 358.557
  DNEC4* LF77 DOS GA-686LX 266 134.566 247.484 392.533

  NEC4D1200 DEC NT GA-686KX 266 24.895 119.302 146.972
  NEC4D* LF90 NT GA-686KX 266 56.386 171.239 231.922
  NEC4D* LF90 NT/DCP GA-686KX 266 70.449 228.025 303.815
  NEC4D* LF90 DOS GA-686KX 266 70.004 240.498 318.786
  DNEC4* LF77 DOS GA-686KX 266 137.109 270.240 418.723

  NEC4D1200 DEC NT GA-586HX 200 MMX 38.205 202.311 244.982
  DNEC4* LF77 DOS GA-586HX 200 MMX 132.676 433.694 584.875

  NEC4D1200 DEC NT GA-586HX 133 66.636 275.587 348.941
  DNEC4* LF77 DOS GA-586HX 133 192.551 586.673 800.182

  NEC4D1200 DEC NT Neptune 100 89.869 319.900 418.512
  DNEC4* LF77 DOS Neptune 100 249.782 756.423 1036.137
 
  NEC4D1200 DEC NT Neptune 90 99.333 353.708 462.865
  DNEC4* LF77 DOS Neptune 90 278.770 815.852 1127.598

  Notes:
       *Out of core solutions, MAXMAT=1000.
       LF90 - Lahey Fortran-90 V3.5 with Phar Lap DOS Extender
       LF77 - Lahey Fortran-77 V5.10 with Phar Lap DOS Extender
       DEC - DEC Visual Fortran-90 Compiler
       NT - Windows-NT V4 SP3
       NT/CP - Executing NEC4D from NT's command line prompt.
       W98 - Windows-98
       GA-686BA - Gigabyte Pentium-II motherboard with the "BX" chipset,
                   128-MB 8-nsec ECC SDRAM.
       GA-686DL2 - Gigabyte dual-processor PII motherboard equipped with a
                   single CPU, 256-MB 10-nsec ECC SDRAM.
       GA-686LX - Gigabyte PII motherboard, using the "LX" chipset. The
                   266-MHz tests were with 64-MB and the 300-MHz tests were
                   with 128-MB, all 10-nsec ECC SDRAM.
       GA-686KX - Gigabyte PII motherboard with 64-MB 60-nsec ECC FPM RAM,
                   using the "KX" chipset.
       GA-586HX - Gigabyte Pentium motherboard with 64-MB 60-nsec FPM
                   ECC/parity RAM, 512-KB secondary pburst cache.
       Neptune - Intel Neptune P54C-PCI/Pentium motherboard with 64-MB
                   60-nsec parity FPM RAM, 256-KB secondary cache.

INPUT DATA FOR TIMING TESTS

David Pinion, P.E., submitted the following TEST299.NEC input file for
the 299 segment tests:

CE CENTER FED HORIZONTAL HALF-WAVE DIPOLE OVER EXCELLENT GROUND.
GW 1,299,-139.,0, 6.,+139.,0, 6., .001,
GE 0,
GN 1,
FR 0,0,0,0, 0.54,
EX 0, 1,150,0,1., 0.,
RP 1, 1, 1,0000, 1.5, 0., 0., 0., 1000.,
EN

Jerry Burke, LLNL, provided the following NEC input file for the
TEST300, TEST600 and TEST1200 tests. Note that for the 600 and 1200
segment tests the parameter 2 in the second GM0 statement becomes 5 and
11, respectively.

CE NEC timing test - 300 segments
GW0,10,0.,0.,0.,0.,0.,1.,.001,
GM0,9,0.,0.,0.,.2,0.,0.,
GM0,2,0.,0.,0.,0.,.2,0.,
GE
EX0,0,5,0,1.,
XQ
EN
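
The segment counts for the three timing files follow from the GM cards. The
sketch below assumes, as an interpretation not stated in this post, that the
second field of each GM card is the number of new copies generated from the
whole structure built so far, so each card multiplies the segment count by
(copies + 1). The function name total_segments is hypothetical.

```python
def total_segments(wire_segments, gm_repeats):
    """Segment count after GM cards that each add `repeats` new copies."""
    total = wire_segments
    for repeats in gm_repeats:
        total *= repeats + 1   # original structure plus the new copies
    return total

# GW card: one wire with 10 segments; first GM adds 9 copies;
# second GM adds 2, 5 or 11 copies for the 300, 600 and 1200 segment tests.
counts = {n: total_segments(10, [9, n]) for n in (2, 5, 11)}
print(counts)
```

Under that assumption the three values of the second GM parameter give 300,
600 and 1200 segments, matching the test file names.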

---------------------------------------------------------------
Received on Wed Feb 17 1999 - 02:54:05 EST
