WRFblog: March 2008

Here are my recent trials of WRF v.2.2.1 on a dual quad-core Mac Pro with Intel Fortran (10.1.012) and gcc. The 64-bit versions of the compilers were used. The machine has 8 GB of memory. Trials include OMP and MPI builds, the latter based on mpich2.

--------------------------------------------------------------
netcdf compilation for 64 bit (based on netcdf-3.6.0-p1)
--------------------------------------------------------------
export CC=/usr/bin/gcc
export CPPFLAGS="-O -DNDEBUG -DpgiFortran"
export CFLAGS="-O -m64"
export CXX=/usr/bin/c++
export CXXFLAGS="-O -m64"
export FC=ifort
export F77=ifort
export F90=ifort
export FFLAGS="-O3 -mp"
export F90FLAGS="-O3 -mp"
./configure --prefix=/usr/local/netcdf
make
make test
sudo mkdir /usr/local/netcdf
sudo make install

--------------------------------------------------------------
jasper compilation for 64 bit (based on jasper-1.701.0)
--------------------------------------------------------------
setenv CC /usr/bin/gcc
setenv CFLAGS "-O -m64"
setenv CXX /usr/bin/c++
setenv CXXFLAGS "-O -m64"
./configure --prefix=/usr/local/jasper
make
sudo make install

--------------------------------------------------------------
mpich2 compilation for 64 bit (based on mpich2-1.0.5)
--------------------------------------------------------------
setenv FC ifort
setenv F90 ifort
setenv CC "gcc -m64"
setenv RSHCOMMAND "/usr/bin/ssh"
setenv CXX "/usr/bin/c++ -m64"
setenv FFLAGS "-xP -vec- -fp-model precise"
setenv F90FLAGS "-xP -vec- -fp-model precise"
./configure --with-comm=shared

--------------------------------------------------------------
test run
--------------------------------------------------------------

The test run is a short simulation with three telescoping, two-way nested domains (142x100, 100x100 and 100x100, with 35 vertical levels). Compiler flags were chosen so that the output files reproduce the cksums of a completely unoptimized simulation built with strict arithmetic. The OMP version occasionally produces different results, apparently at random.
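A minimal sketch of how such a checksum comparison could be scripted, assuming the reference (unoptimized) and optimized runs each wrote their output into a separate directory. The function name compare_cksums, the two-directory layout and the wrfout_* file pattern are assumptions for illustration, not taken from the runs described here.

```shell
#!/bin/sh
# compare_cksums REF_DIR TEST_DIR   (hypothetical helper)
# Compares the POSIX cksum (CRC and byte count) of every wrfout_* file in
# REF_DIR against the file of the same name in TEST_DIR.
# Returns 0 if all match, 1 if any differ or are missing, 2 if REF_DIR has none.
compare_cksums() {
    ref_dir=$1
    test_dir=$2
    status=0
    for ref in "$ref_dir"/wrfout_*; do
        # An unmatched glob stays literal, so check that the file really exists.
        [ -e "$ref" ] || { echo "no wrfout files in $ref_dir" >&2; return 2; }
        name=$(basename "$ref")
        if [ ! -e "$test_dir/$name" ]; then
            echo "MISSING $name"
            status=1
            continue
        fi
        # Reading from stdin makes cksum print only "<crc> <size>", no file name.
        a=$(cksum < "$ref")
        b=$(cksum < "$test_dir/$name")
        if [ "$a" = "$b" ]; then
            echo "SAME    $name"
        else
            echo "DIFFER  $name"
            status=1
        fi
    done
    return $status
}
```

After sourcing the file, something like compare_cksums ref_run opt_run (directory names hypothetical) prints one line per output file. Running it over several repeated OMP runs would be one way to see how often the occasional mismatch shows up.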

Timing plot:

The plot below adds results from a single quad-core 2.4 GHz machine running Mandriva Linux, for OMP runs built using the same configuration file as linked below, with the Mac-specific references removed. For the four-thread run, the 2.4 GHz machine is 33% slower, though the clock speed difference is only 17%. Checksums were the same for all the runs.
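The arithmetic behind those two percentages can be sketched as follows. The Mac Pro clock of 2.8 GHz is an assumption (it is not stated above, only implied by the 17% figure), "33% slower" is read here as 1.33 times the wall-clock time, and pct_more is a hypothetical helper name.

```shell
#!/bin/sh
# pct_more A B: print how much larger A is than B, as a whole-number percentage.
pct_more() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a / b - 1) * 100 }'
}

# Clock-speed gap, assuming a 2.8 GHz Mac Pro against the 2.4 GHz Linux box.
pct_more 2.8 2.4        # prints 17

# Slowdown left over once the clock ratio is divided out of the 1.33x run time.
awk 'BEGIN { printf "%.0f\n", (1.33 / (2.8 / 2.4) - 1) * 100 }'   # prints 14
```

Under these assumptions roughly 14% of the slowdown is not accounted for by clock speed alone, so something other than the CPU frequency is contributing.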

Configuration files are here: OMP version, MPI version.

[edited 1 April 2008 to include the compilation for netcdf, jasper and mpich2 and to clarify this is for 64 bit compilers.]

Wednesday, March 26, 2008

WRFV221 on dual quad Mac Pro
