.Version 8.0.3 of FFTPROF
.(sequential version, prepared for a x86_64_linux_gnu4.6 computer)

.Copyright (C) 1998-2024 ABINIT group .
 FFTPROF comes with ABSOLUTELY NO WARRANTY.
 It is free software, and you are welcome to redistribute it
 under certain conditions (GNU General Public License,
 see ~abinit/COPYING or http://www.gnu.org/copyleft/gpl.txt).

 ABINIT is a project of the Universite Catholique de Louvain,
 Corning Inc. and other collaborators, see ~abinit/doc/developers/contributors.txt .
 Please read https://docs.abinit.org/theory/acknowledgments for suggested
 acknowledgments of the ABINIT effort.
 For more information, see https://www.abinit.org .

.Starting date : Mon 4 Apr 2016.
- ( at 22h45 )

 Tool for profiling and testing the FFT libraries used in ABINIT.
 Allowed options are:
   fourdp --> Test FFT transforms of density and potentials on the full box.
   fourwf --> Test FFT transforms of wavefunctions using the zero-pad algorithm.
   gw_fft --> Test the FFT transforms used in the GW code.
   all    --> Test all FFT routines.

 ==== OpenMP parallelism is ON ====
- Max_threads:  2
- Num_threads:  2
- Num_procs:    4
- Dynamic:      F
- Nested:       F

 Real(R)+Recip(G) space primitive vectors, cartesian coordinates (Bohr,Bohr^-1):
 R(1)= 20.0000000  0.0000000  0.0000000  G(1)=  0.0500000  0.0000000  0.0000000
 R(2)=  0.0000000 20.0000000  0.0000000  G(2)=  0.0000000  0.0500000  0.0000000
 R(3)=  0.0000000  0.0000000 20.0000000  G(3)=  0.0000000  0.0000000  0.0500000
 Unit cell volume ucvol=  8.0000000E+03 bohr^3
 Unit cell volume ucvol=  8.0000000E+03 bohr^3
 Angles (23,13,12)=  9.00000000E+01  9.00000000E+01  9.00000000E+01 degrees
 Angles (23,13,12)=  9.00000000E+01  9.00000000E+01  9.00000000E+01 degrees

 ==== FFT setup for fftalg 110 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 110
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 111 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 111
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 112 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 112
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 410 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 410
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 411 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 411
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 412 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 412
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 312 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 312
  FFT cache size ............................ 16

 ==== FFT setup for fftalg 512 ====
  FFT mesh divisions ........................ 100 100 100
  Augmented FFT divisions ................... 101 101 100
  FFT algorithm ............................. 512
  FFT cache size ............................ 16

==============================================================
==== fourwf with option 0, cplex 0, ndat 1, istwf_k 1 ====
==============================================================
  Library                CPU-time  WALL-time  nthreads  ncalls  Max_|Err|   <|Err|>
- Goedecker (110)          0.0870     0.0870  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1090     0.0665  2 ( 65%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1325     0.0590  3 ( 49%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1450     0.0710  4 ( 31%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.0810     0.0810  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.0925     0.0665  2 ( 61%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.1115     0.0650  3 ( 42%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.1295     0.0735  4 ( 28%)       2   0.00E+00  0.00E+00
- Goedecker (112)          0.0385     0.0390  1 (100%)       2   5.86E-14  1.94E-15
- Goedecker (112)          0.0480     0.0260  2 ( 75%)       2   5.86E-14  1.94E-15
- Goedecker (112)          0.0560     0.0220  3 ( 59%)       2   5.86E-14  1.94E-15
- Goedecker (112)          0.1075     0.0430  4 ( 23%)       2   5.86E-14  1.94E-15
- Goedecker2002 (410)      0.0925     0.0925  1 (100%)       2   6.08E-14  1.96E-15
- Goedecker2002 (410)      0.1010     0.1010  2 ( 46%)       2   6.08E-14  1.96E-15
- Goedecker2002 (410)      0.0925     0.0925  3 ( 33%)       2   6.08E-14  1.96E-15
- Goedecker2002 (410)      0.0975     0.0975  4 ( 24%)       2   6.08E-14  1.96E-15
- Goedecker2002 (411)      0.0345     0.0345  1 (100%)       2   6.08E-14  1.96E-15
- Goedecker2002 (411)      0.0340     0.0340  2 ( 51%)       2   6.08E-14  1.96E-15
- Goedecker2002 (411)      0.0335     0.0335  3 ( 34%)       2   6.08E-14  1.96E-15
- Goedecker2002 (411)      0.0305     0.0305  4 ( 28%)       2   6.08E-14  1.96E-15
- Goedecker2002 (412)      0.0310     0.0310  1 (100%)       2   6.08E-14  1.96E-15
- Goedecker2002 (412)      0.0305     0.0305  2 ( 51%)       2   6.08E-14  1.96E-15
- Goedecker2002 (412)      0.0315     0.0315  3 ( 33%)       2   6.08E-14  1.96E-15
- Goedecker2002 (412)      0.0360     0.0360  4 ( 22%)       2   6.08E-14  1.96E-15
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A

 Consistency check: MAX(Max_|Err|) = 6.08E-14, Max(<|Err|>) = 1.96E-15, reference_lib: Goedecker (110)

==============================================================
==== fourwf with option 1, cplex 1, ndat 1, istwf_k 1 ====
==============================================================
  Library                CPU-time  WALL-time  nthreads  ncalls  Max_|Err|   <|Err|>
- Goedecker (110)          0.1045     0.1045  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1240     0.0705  2 ( 74%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1510     0.0670  3 ( 52%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1515     0.0645  4 ( 41%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.0605     0.0605  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.0985     0.0655  2 ( 46%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.1195     0.0700  3 ( 29%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.1495     0.0805  4 ( 19%)       2   0.00E+00  0.00E+00
- Goedecker (112)          0.0385     0.0385  1 (100%)       2   2.18E-11  1.42E-14
- Goedecker (112)          0.0385     0.0210  2 ( 92%)       2   2.18E-11  1.42E-14
- Goedecker (112)          0.0575     0.0290  3 ( 44%)       2   2.18E-11  1.42E-14
- Goedecker (112)          0.0970     0.0455  4 ( 21%)       2   2.18E-11  1.42E-14
- Goedecker2002 (410)      0.1065     0.1065  1 (100%)       2   2.18E-11  1.44E-14
- Goedecker2002 (410)      0.1100     0.1100  2 ( 48%)       2   2.18E-11  1.44E-14
- Goedecker2002 (410)      0.0985     0.0990  3 ( 36%)       2   2.18E-11  1.44E-14
- Goedecker2002 (410)      0.0830     0.0830  4 ( 32%)       2   2.18E-11  1.44E-14
- Goedecker2002 (411)      0.0375     0.0370  1 (100%)       2   2.18E-11  1.44E-14
- Goedecker2002 (411)      0.0605     0.0605  2 ( 31%)       2   2.18E-11  1.44E-14
- Goedecker2002 (411)      0.0420     0.0415  3 ( 30%)       2   2.18E-11  1.44E-14
- Goedecker2002 (411)      0.0445     0.0445  4 ( 21%)       2   2.18E-11  1.44E-14
- Goedecker2002 (412)      0.0380     0.0380  1 (100%)       2   2.18E-11  1.44E-14
- Goedecker2002 (412)      0.0320     0.0320  2 ( 59%)       2   2.18E-11  1.44E-14
- Goedecker2002 (412)      0.0365     0.0370  3 ( 34%)       2   2.18E-11  1.44E-14
- Goedecker2002 (412)      0.0400     0.0400  4 ( 24%)       2   2.18E-11  1.44E-14
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A

 Consistency check: MAX(Max_|Err|) = 2.18E-11, Max(<|Err|>) = 1.44E-14, reference_lib: Goedecker (110)

==============================================================
==== fourwf with option 2, cplex 1, ndat 1, istwf_k 1 ====
==============================================================
  Library                CPU-time  WALL-time  nthreads  ncalls  Max_|Err|   <|Err|>
- Goedecker (110)          0.1600     0.1600  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1810     0.0990  2 ( 81%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.2665     0.1035  3 ( 52%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.3130     0.1315  4 ( 30%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.1140     0.1140  1 (100%)       2   2.22E-16  1.86E-19
- Goedecker (111)          0.1550     0.0950  2 ( 60%)       2   2.22E-16  1.86E-19
- Goedecker (111)          0.1935     0.0920  3 ( 41%)       2   2.22E-16  1.86E-19
- Goedecker (111)          0.2335     0.1095  4 ( 26%)       2   2.22E-16  1.86E-19
- Goedecker (112)          0.0505     0.0505  1 (100%)       2   2.23E-16  2.39E-19
- Goedecker (112)          0.0570     0.0285  2 ( 89%)       2   2.23E-16  2.39E-19
- Goedecker (112)          0.0700     0.0235  3 ( 72%)       2   2.23E-16  2.39E-19
- Goedecker (112)          0.1355     0.0520  4 ( 24%)       2   2.23E-16  2.39E-19
- Goedecker2002 (410)      0.1605     0.1610  1 (100%)       2   3.33E-16  2.53E-19
- Goedecker2002 (410)      0.1730     0.1690  2 ( 48%)       2   3.33E-16  2.53E-19
- Goedecker2002 (410)      0.1845     0.1750  3 ( 31%)       2   3.33E-16  2.53E-19
- Goedecker2002 (410)      0.1820     0.1690  4 ( 24%)       2   3.33E-16  2.53E-19
- Goedecker2002 (411)      0.0700     0.0705  1 (100%)       2   3.33E-16  2.53E-19
- Goedecker2002 (411)      0.0585     0.0580  2 ( 61%)       2   3.33E-16  2.53E-19
- Goedecker2002 (411)      0.0575     0.0575  3 ( 41%)       2   3.33E-16  2.53E-19
- Goedecker2002 (411)      0.0725     0.0730  4 ( 24%)       2   3.33E-16  2.53E-19
- Goedecker2002 (412)      0.0570     0.0570  1 (100%)       2   3.33E-16  2.53E-19
- Goedecker2002 (412)      0.0510     0.0510  2 ( 56%)       2   3.33E-16  2.53E-19
- Goedecker2002 (412)      0.0505     0.0510  3 ( 37%)       2   3.33E-16  2.53E-19
- Goedecker2002 (412)      0.0565     0.0565  4 ( 25%)       2   3.33E-16  2.53E-19
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A

 Consistency check: MAX(Max_|Err|) = 3.33E-16, Max(<|Err|>) = 2.53E-19, reference_lib: Goedecker (110)

==============================================================
==== fourwf with option 3, cplex 0, ndat 1, istwf_k 1 ====
==============================================================
  Library                CPU-time  WALL-time  nthreads  ncalls  Max_|Err|   <|Err|>
- Goedecker (110)          0.0645     0.0645  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.0955     0.0480  2 ( 67%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1215     0.0565  3 ( 38%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.1430     0.0710  4 ( 23%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.0510     0.0515  1 (100%)       2   1.12E-16  6.13E-20
- Goedecker (111)          0.0900     0.0620  2 ( 42%)       2   1.12E-16  6.13E-20
- Goedecker (111)          0.0870     0.0415  3 ( 41%)       2   1.12E-16  6.13E-20
- Goedecker (111)          0.1150     0.0675  4 ( 19%)       2   1.12E-16  6.13E-20
- Goedecker (112)          0.0530     0.0530  1 (100%)       2   1.12E-16  6.13E-20
- Goedecker (112)          0.0650     0.0425  2 ( 62%)       2   1.12E-16  6.13E-20
- Goedecker (112)          0.0900     0.0380  3 ( 46%)       2   1.12E-16  6.13E-20
- Goedecker (112)          0.1115     0.0545  4 ( 24%)       2   1.12E-16  6.13E-20
- Goedecker2002 (410)      0.0500     0.0500  1 (100%)       2   2.22E-16  5.34E-20
- Goedecker2002 (410)      0.0510     0.0495  2 ( 51%)       2   2.22E-16  5.34E-20
- Goedecker2002 (410)      0.0635     0.0595  3 ( 28%)       2   2.22E-16  5.34E-20
- Goedecker2002 (410)      0.0650     0.0600  4 ( 21%)       2   2.22E-16  5.34E-20
- Goedecker2002 (411)      0.0260     0.0260  1 (100%)       2   2.22E-16  5.34E-20
- Goedecker2002 (411)      0.0280     0.0280  2 ( 46%)       2   2.22E-16  5.34E-20
- Goedecker2002 (411)      0.0265     0.0265  3 ( 33%)       2   2.22E-16  5.34E-20
- Goedecker2002 (411)      0.0280     0.0280  4 ( 23%)       2   2.22E-16  5.34E-20
- Goedecker2002 (412)      0.0260     0.0260  1 (100%)       2   2.22E-16  5.34E-20
- Goedecker2002 (412)      0.0280     0.0280  2 ( 46%)       2   2.22E-16  5.34E-20
- Goedecker2002 (412)      0.0260     0.0255  3 ( 34%)       2   2.22E-16  5.34E-20
- Goedecker2002 (412)      0.0235     0.0235  4 ( 28%)       2   2.22E-16  5.34E-20
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A

 Consistency check: MAX(Max_|Err|) = 2.22E-16, Max(<|Err|>) = 6.13E-20, reference_lib: Goedecker (110)

==============================================================
==== fourwf with option 2, cplex 2, ndat 1, istwf_k 1 ====
==============================================================
  Library                CPU-time  WALL-time  nthreads  ncalls  Max_|Err|   <|Err|>
- Goedecker (110)          0.1900     0.1900  1 (100%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.2580     0.1535  2 ( 62%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.2340     0.0920  3 ( 69%)       2   0.00E+00  0.00E+00
- Goedecker (110)          0.3080     0.1580  4 ( 30%)       2   0.00E+00  0.00E+00
- Goedecker (111)          0.1755     0.1760  1 (100%)       2   3.33E-16  2.19E-19
- Goedecker (111)          0.2015     0.1480  2 ( 59%)       2   3.33E-16  2.19E-19
- Goedecker (111)          0.1815     0.0890  3 ( 66%)       2   3.33E-16  2.19E-19
- Goedecker (111)          0.2175     0.1150  4 ( 38%)       2   3.33E-16  2.19E-19
- Goedecker (112)          0.0640     0.0640  1 (100%)       2   3.33E-16  3.05E-19
- Goedecker (112)          0.0715     0.0405  2 ( 79%)       2   3.33E-16  3.05E-19
- Goedecker (112)          0.0720     0.0325  3 ( 66%)       2   3.33E-16  3.05E-19
- Goedecker (112)          0.0955     0.0430  4 ( 37%)       2   3.33E-16  3.05E-19
- Goedecker2002 (410)      0.1830     0.1830  1 (100%)       2   3.34E-16  3.19E-19
- Goedecker2002 (410)      0.1480     0.1430  2 ( 64%)       2   3.34E-16  3.19E-19
- Goedecker2002 (410)      0.1730     0.1625  3 ( 38%)       2   3.34E-16  3.19E-19
- Goedecker2002 (410)      0.1735     0.1595  4 ( 29%)       2   3.34E-16  3.19E-19
- Goedecker2002 (411)      0.0715     0.0715  1 (100%)       2   3.34E-16  3.19E-19
- Goedecker2002 (411)      0.0690     0.0685  2 ( 52%)       2   3.34E-16  3.19E-19
- Goedecker2002 (411)      0.0615     0.0615  3 ( 39%)       2   3.34E-16  3.19E-19
- Goedecker2002 (411)      0.0635     0.0630  4 ( 28%)       2   3.34E-16  3.19E-19
- Goedecker2002 (412)      0.0495     0.0495  1 (100%)       2   3.34E-16  3.19E-19
- Goedecker2002 (412)      0.0505     0.0505  2 ( 49%)       2   3.34E-16  3.19E-19
- Goedecker2002 (412)      0.0495     0.0495  3 ( 33%)       2   3.34E-16  3.19E-19
- Goedecker2002 (412)      0.0510     0.0510  4 ( 24%)       2   3.34E-16  3.19E-19
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- FFTW3 (312)                 N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A
- DFTI (512)                  N/A        N/A       N/A     N/A        N/A       N/A

 Consistency check: MAX(Max_|Err|) = 3.34E-16, Max(<|Err|>) = 3.19E-19, reference_lib: Goedecker (110)

 Analysis completed.