
Revision №57638

branch: master (№57638)
Committed by: Vikram K. Mulligan
GitHub commit link: e8b0389a486134ee (pull request №333)
Difference from previous tested commit: code diff
Commit date: 2015-02-25 20:46:40

Merge pull request #333 from RosettaCommons/vmullig/bluegene_mpi_problem

Vmullig/bluegene mpi problem

On the Argonne "Mira" Blue Gene/Q machine, I was having a problem with the MPI version of the RosettaScripts app. Every time a slave process asked for a job, the master process would send it a nonzero job number, but the slave process would interpret it as "0". This fixes that problem, at least with the MPIWorkPoolJobDistributor.

In MPIWorkPoolJobDistributor, core::Size values were being sent and received as MPI_INTs. On little-endian systems (e.g. x86 processors), the data truncation preserved the value for any reasonably small number, which is why, I think, no one had seen the problem anywhere else; but on big-endian systems (like Blue Gene), for a small value like a job number, only the leading zero bytes were sent and received. I switched all the MPI_INTs to MPI_UNSIGNED_LONGs (which should match core::Size on most systems -- core::Size is actually of type std::size_t, which is usually unsigned long), and adjusted a few ints defined locally to be core::Sizes. This got RosettaScripts working properly in MPI mode on Blue Gene.

Note that MPI_Send and MPI_Recv take a void pointer to the datum being sent or received, so there is no check that its type matches the MPI datatype given in the call. Data truncation errors are easy to make with MPI.
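
The following is a minimal standalone sketch of the hazard described above, not the actual MPIWorkPoolJobDistributor code; the job-number value and ranks are illustrative, and it assumes core::Size is std::size_t (8 bytes on an LP64 system).

    #include <mpi.h>
    #include <cstddef>
    #include <iostream>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        std::size_t job_number = 7;  // stands in for a core::Size job index

        if (rank == 0) {
            // BUG (pre-fix): passing an 8-byte std::size_t but declaring it as
            // MPI_INT transfers only sizeof(int) bytes.  On a big-endian machine
            // those are the leading zero bytes, so the receiver sees 0.
            // MPI_Send(&job_number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

            // FIX: declare the datatype as MPI_UNSIGNED_LONG, which matches
            // std::size_t on most LP64 systems.
            MPI_Send(&job_number, 1, MPI_UNSIGNED_LONG, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            std::size_t received = 0;
            MPI_Recv(&received, 1, MPI_UNSIGNED_LONG, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            std::cout << "rank 1 received job number " << received << std::endl;
        }

        MPI_Finalize();
        return 0;
    }

Because the buffer argument is a void pointer, the compiler cannot catch the mismatch between the C++ type and the MPI datatype; the commented-out MPI_INT line compiles cleanly, which is why the bug only surfaced on a big-endian platform.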

...