Here are some notes on a recent attempt to run WRF on a dual quad-core Intel Mac that I received as a loaner from Apple. In the two weeks I had the machine I made progress, but as the notes below make clear, I didn't cross the finish line. For one thing, I was unable to get even medium-sized jobs to run in the 64-bit environment, and the tricks for accessing memory that worked with 32 bits fail there. In 32-bit land there is still a limit on the job size, which is apparently due to Apple's internal memory allocation restrictions.
Executive summary: I was able to get all 8 CPUs working for me, though the scaling wasn't the best. The key turned out to be (a) moving to MPICH-2 AND (b) using --with-comm=shared. An important goal for me is getting results that do not vary with the number of processors used. With OpenMP (OMP), that required the ifort flag '-fp-model precise'. I also used this flag in the WRF code and when compiling MPICH-2.
** These notes assume the 32-bit ifort compiler; I used ifort 10.0.017. They also presume the WRF model changes documented in previous posts. Compilation and execution took place on an HFSX-formatted disk.
(1) netcdf-3.6.2
export CC=/usr/bin/gcc
export CPPFLAGS="-O -DNDEBUG -DpgiFortran"
export CFLAGS="-O"
export CXX=/usr/bin/c++
export CXXFLAGS="-O"
export FC=ifort
export F77=ifort
export F90=ifort
export FFLAGS="-O3"
export F90FLAGS=
./configure --prefix=/usr/local/netcdf
make
make test
make check
sudo mkdir /usr/local/netcdf
sudo make install
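Before moving on, it may be worth confirming that the install landed where WRF will look for it. WRF's configure script finds netCDF through the NETCDF environment variable, so a minimal sketch of a sanity check could look like the following. The function name check_netcdf is my own invention, and the two file names are what I would expect a netcdf-3.6.2 build with Fortran support to install; treat both as assumptions.

```shell
# Hypothetical helper: succeed only if the given prefix holds the
# static library and the Fortran include file that a WRF build needs.
check_netcdf() {
    prefix="$1"
    [ -f "$prefix/lib/libnetcdf.a" ] || { echo "missing $prefix/lib/libnetcdf.a"; return 1; }
    [ -f "$prefix/include/netcdf.inc" ] || { echo "missing $prefix/include/netcdf.inc"; return 1; }
    echo "netCDF looks usable under $prefix"
}

# Typical use (bash syntax, matching the exports above):
#   check_netcdf /usr/local/netcdf && export NETCDF=/usr/local/netcdf
```

If the check fails, the make install output is the first place to look, since a failed Fortran interface build would leave netcdf.inc out.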
(2) MPICH-2 (mpich2-1.0.5p4), with the environment set in csh/tcsh syntax
setenv FC ifort
setenv F90 ifort
setenv CC "gcc"
setenv RSHCOMMAND "/usr/bin/ssh"
setenv CXX "/usr/bin/c++"
setenv FFLAGS "-xP -vec- -fp-model precise"
setenv F90FLAGS "-xP -vec- -fp-model precise"
./configure --with-comm=shared
make
(3) Modify external/RSL_LITE/buf_for_proc.c to add "extern" before "char mess" (resolves a problem that crops up specifically with mpich-2)
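The edit in step (3) is a one-word change, so it can be scripted. The sketch below is only an illustration: the function name add_extern_to_mess is hypothetical, and the sed pattern assumes the declaration starts a line with "char mess" (possibly indented), which I have not checked against every WRF release. It keeps a .orig copy and is safe to run twice.

```shell
# Hypothetical helper: prefix "extern" to any line that begins with
# "char mess" in the given C file, keeping the original as FILE.orig.
add_extern_to_mess() {
    src="$1"
    # Only take the backup on the first run so a rerun cannot clobber it.
    [ -f "$src.orig" ] || cp "$src" "$src.orig" || return 1
    # Portable sed (no -i, whose syntax differs between BSD and GNU).
    sed 's/^\([[:space:]]*\)char mess/\1extern char mess/' "$src" > "$src.tmp" \
        && mv "$src.tmp" "$src"
}

# Typical use, from the top of the WRF source tree:
#   add_extern_to_mess external/RSL_LITE/buf_for_proc.c
```

A line that already reads "extern char mess" no longer starts with "char mess", so a second run leaves the file unchanged.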
(4) configure.wrf files I used for these tests: APPLE_LOANER_WRF.zip.
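Since the goal stated in the summary is output that does not change with the processor count, here is a rough sketch of one way to check it. It is not the procedure I used for these tests. It assumes you have already turned two runs into text with ncdump, and it compares only the part of each dump from the "data:" line onward, because the header (file name, global attributes) can differ between runs for harmless reasons. The function names and file names are hypothetical.

```shell
# Print only the data section of an ncdump text dump
# (everything from the "data:" line to the end of the file).
data_section() {
    sed -n '/^data:/,$p' "$1"
}

# Succeed only if both dumps contain a data section and the two
# sections have the same checksum and length.
same_data() {
    grep -q '^data:' "$1" || return 2
    grep -q '^data:' "$2" || return 2
    a=$(data_section "$1" | cksum)
    b=$(data_section "$2" | cksum)
    [ "$a" = "$b" ]
}

# Typical use (hypothetical file names; T2 is one WRF output field):
#   ncdump -p 9,17 -v T2 wrfout_1cpu > one.txt
#   ncdump -p 9,17 -v T2 wrfout_8cpu > eight.txt
#   same_data one.txt eight.txt && echo "identical" || echo "differs"
```

The -p 9,17 option asks ncdump for enough digits that a last-bit difference in a float or double shows up in the text; with the default precision, small differences could be hidden.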
Sunday, September 16, 2007