Opened 10 years ago
Last modified 5 years ago
#383 new defect
Accelerate multiprofile processing on HPC
Reported by: | Ian Culverwell | Owned by: | idculv,cburrows |
---|---|---|---|
Priority: | normal | Milestone: | Whenever |
Component: | ROPP (all) | Version: | 7.1 |
Keywords: | HPC | Cc: | |
Description
Some of the multifile tests in the Test Folder, like MT-IO-03, IT-FM-07 and IT-PP-01, take ages to run on the HPC. Why? It shouldn't take hours to process ~500 profiles on a supercomputer, especially when it takes minutes on a Linux box. I/O?
Look into it. Possible solutions: compiler options, netCDF 'chunking' options?
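For reference, netCDF-4 chunking is specified per variable when the file is created. The following is a minimal sketch of what explicitly chunked output could look like, written with the Python netCDF4 bindings rather than ROPP's own Fortran I/O layer; the file, dimension and variable names and the sizes are purely illustrative.

```python
# Illustrative only: setting explicit netCDF-4 chunk sizes with the
# netCDF4-python bindings (ROPP itself uses Fortran); file, dimension
# and variable names here are hypothetical.
import numpy as np
from netCDF4 import Dataset

NLEV = 247                                    # levels per profile (example value)

with Dataset("multifile_example.nc", "w", format="NETCDF4") as nc:
    nc.createDimension("dim_unlim", None)     # one record per profile
    nc.createDimension("dim_lev", NLEV)
    # Chunk one whole profile at a time, rather than relying on the
    # library's default chunking along the unlimited dimension.
    bangle = nc.createVariable("bangle", "f8", ("dim_unlim", "dim_lev"),
                               chunksizes=(1, NLEV))
    for i in range(500):                      # ~500 profiles, as in the slow tests
        bangle[i, :] = np.zeros(NLEV)         # stand-in for one processed profile
```

Whether tuning the chunk sizes actually helps here is exactly the open question in this ticket.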
Change history (2)
comment:1, 8 years ago
Milestone: 9.0 → 10.0

comment:2, 5 years ago
Milestone: 10.0 → Whenever
Although this was originally a problem on the Met Office IBM supercomputer, it is still an issue on its replacement, a Cray.
One consideration was that the jobs were not being submitted via the 'PBS' submission system but were being run via ssh, which may not be the most efficient route. A test showed the same behaviour whichever way the job was submitted. Furthermore, in the example of IT-1DVAR-OP, the first ~60 profiles were processed very quickly, but processing of subsequent profiles suddenly became extremely slow. The file being appended to is very small (~1 MB), so it is unlikely that chunking would help. Reviewing the Test Folder timings for the various integration tests at version 9.0, the HPC appears to be roughly as fast as Linux when the input multifiles contain fewer than 50-100 profiles, but much slower for files with more profiles. Perhaps the Lustre file system is penalising multiple file open/close commands in quick succession? The Met Office HPC optimisation team thinks not:
This would need some restructuring of the ROPP tools to modify the calls to the low-level dependency (netCDF) routines. As it is quite unlikely that ROPP would be used on a supercomputer to process large multifiles in this way (users are more likely to embed the subroutines in larger software packages, thus avoiding the I/O issue), this ticket is being deferred to ROPP10.
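To make the suspected I/O pattern concrete, below is a hypothetical timing probe (again in Python with the netCDF4 bindings, not ROPP's actual Fortran code) that contrasts reopening the output file for every appended profile with keeping a single handle open for the whole multifile, which is essentially the restructuring described above. Running both loops on the Lustre file system and on a Linux box would show whether per-profile open/close is the bottleneck; all names and sizes are illustrative.

```python
# Hypothetical timing probe, using the netCDF4-python bindings: compare
# "reopen per profile" with "open once, append all profiles".
import time
import numpy as np
from netCDF4 import Dataset

NPROF, NLEV = 500, 247                        # example sizes

def make_file(name):
    """Create an empty multifile with an unlimited profile dimension."""
    with Dataset(name, "w", format="NETCDF4") as nc:
        nc.createDimension("dim_unlim", None)
        nc.createDimension("dim_lev", NLEV)
        nc.createVariable("bangle", "f8", ("dim_unlim", "dim_lev"))

def append_reopening(name):
    """Open/close the file once per profile (the pattern under suspicion)."""
    for i in range(NPROF):
        with Dataset(name, "a") as nc:
            nc.variables["bangle"][i, :] = np.zeros(NLEV)

def append_single_open(name):
    """Keep one handle open for the whole multifile (the proposed restructuring)."""
    with Dataset(name, "a") as nc:
        for i in range(NPROF):
            nc.variables["bangle"][i, :] = np.zeros(NLEV)

for label, writer in [("reopen per profile", append_reopening),
                      ("single open", append_single_open)]:
    fname = label.replace(" ", "_") + ".nc"
    make_file(fname)
    t0 = time.time()
    writer(fname)
    print(f"{label}: {time.time() - t0:.2f} s for {NPROF} profiles")
```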