HPC Storage – Getting Started with I/O Profiling

What Do I Do with This Information?

Now that you have all of this statistical information about your applications, what do you do with it? I'm glad you asked. The answer is that you can do quite a bit.

The first thing I always look for is how many processes in the MPI application actually do I/O, primarily write(). You might be surprised to learn that many applications have only a single MPI process, typically the rank-0 process, doing all of the I/O for the entire application. A second class of applications has a fixed number of MPI processes performing I/O, where this number is less than the total number of MPI processes. Finally, a third class of applications has all, or virtually all, processes writing to the same file at the same time (MPI-IO). If you don't know which of these three classes your applications fall into, mpi_strace_analyzer can help you determine that.

Just knowing whether your application has a single process doing I/O or whether it uses MPI-IO is a huge step toward making an informed decision about HPC storage. The simple reason is that running MPI-IO codes on NFS storage is not recommended, although it is possible. Instead, the general rule of thumb is to run MPI-IO codes on parallel distributed storage that has a tuned MPI-IO implementation. (Please note that these are general rules of thumb; it is possible to run MPI-IO codes on NFS storage and non-MPI-IO codes on parallel distributed storage.)

Other useful information obtained from examining the application I/O pattern, such as
can be used to determine not only what kind of HPC storage you need, but also how much performance you need from the storage. Is your application dominated by throughput performance or IOPS? What is the peak IOPS obtained from the strace output? How much time is spent doing I/O versus total run time? The specific example in the Appendix shows the kind of observations you can make from the HTML report.
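The questions above (how many ranks write, what the peak IOPS is, and how much time goes to I/O) can also be roughed out by hand from raw strace output. The following is a minimal sketch, not the method used by strace_analyzer or mpi_strace_analyzer. It assumes one strace file per MPI rank, captured with `strace -ttt -T` so that each line carries an epoch timestamp and a call duration. The function names, the set of system calls counted as I/O, and the choice to ignore writes to stdin/stdout/stderr are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed line format (strace -ttt -T), for example:
#   1269925389.847294 write(3, "abc"..., 4096) = 4096 <0.000035>
# Lines that do not fit (PID prefixes from -f, "<unfinished ...>") are skipped.
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+)\s+(?P<call>\w+)\((?P<args>.*)\)\s+=\s+"
    r"(?P<ret>-?\d+).*<(?P<dur>\d+\.\d+)>$"
)

# Which system calls count as "I/O" is a choice made for this sketch.
IO_CALLS = {"open", "openat", "close", "read", "write",
            "pread64", "pwrite64", "lseek", "fsync"}
WRITE_CALLS = {"write", "pwrite64"}


def summarize_strace(lines):
    """Summarize the I/O activity of one process from its strace lines."""
    ops_per_second = Counter()   # I/O calls bucketed by whole second
    io_time = 0.0                # total seconds spent inside I/O calls
    file_writes = 0              # successful writes to descriptors above 2
    bytes_written = 0
    start = end = None           # span of the trace, all system calls

    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        ts, dur = float(m["ts"]), float(m["dur"])
        start = ts if start is None else min(start, ts)
        end = ts + dur if end is None else max(end, ts + dur)

        call = m["call"]
        if call not in IO_CALLS:
            continue
        ops_per_second[int(ts)] += 1
        io_time += dur

        if call in WRITE_CALLS:
            fd = int(m["args"].split(",", 1)[0])
            ret = int(m["ret"])
            # Skip stdout/stderr chatter and failed calls.
            if fd > 2 and ret > 0:
                file_writes += 1
                bytes_written += ret

    elapsed = (end - start) if start is not None else 0.0
    return {
        "peak_iops": max(ops_per_second.values(), default=0),
        "io_time": io_time,
        "elapsed": elapsed,
        "io_fraction": io_time / elapsed if elapsed > 0 else 0.0,
        "file_writes": file_writes,
        "bytes_written": bytes_written,
    }


def classify_writers(file_writes_per_rank):
    """Map per-rank file-write counts onto the three classes in the text."""
    writers = sum(1 for n in file_writes_per_rank if n > 0)
    if writers == 0:
        return "no file writes"
    if writers == 1:
        return "single writer"
    if writers < len(file_writes_per_rank):
        return "subset of ranks"
    return "all ranks"
```

To use it, call `summarize_strace(open(path))` on each rank's file and pass the list of `file_writes` values to `classify_writers`. Note that an "all ranks" result is only consistent with the third class of application. It does not show that the ranks write to the same file, so the file names in the open() calls still need to be checked. The sketch also treats "virtually all" ranks writing as a subset, which you may want to relax.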
Summary

HPC storage is definitely a difficult problem for the industry right now. Designing systems to meet our storage needs has become a headache that is difficult to cure. However, you can make this headache easier to manage if you understand the I/O patterns of your applications.

In this article, I talked about different ways to measure the performance of your current HPC storage system, your applications, or both, but this requires a great deal of coordination to capture all of the relevant information and then piece it together. Some of the tools, such as iotop, iostat, nfsiostat, collectl, collectd, and blktrace, can be used to understand what is happening with your storage and your applications. However, they don't really give you the details of what is going on from the perspective of the application. These tools all focus on what is happening on a particular server (compute node). For HPC, you would have to gather this information for all nodes and then coordinate it to understand what is happening at the application (MPI) level.

Using strace can give you more information from the perspective of the application, although it also requires you to gather all of this information on each node in the job run and coordinate it. To help with this process, two applications – strace_analyzer and mpi_strace_analyzer – have been written to sort through the mounds of strace data and produce some useful statistical information.

The tools were applied to an LS-Dyna run over eight cores that used an NFS filesystem (NFS over GigE). Portions of the strace analysis of a single process were presented, and the entire MPI strace analysis was presented in an Appendix, showing the sort of information produced by the analysis tools to help you better understand the I/O pattern of your application.

I hope this article has presented some ideas about how to analyze your I/O needs from the perspective of an application.
After all, making your applications run more efficiently, and hopefully faster, is the whole point of HPC.
