" Cluster 2004 Abstract: NWPerf: A System Wide Performance Monitoring Tool for Large Linux Clusters

NWPerf: A System Wide Performance Monitoring Tool for Large Linux Clusters

Ryan Mooney, et al.


This paper describes NWPerf, a system for analyzing fine-grained performance metric data on large-scale supercomputing clusters. The system addresses the problem of measuring application efficiency system-wide, providing both a global view of the whole system and a detailed view of individual applications. NWPerf provides this service with minimal impact on user applications. We show examples of the types of information that can be derived from the system and demonstrate how it was used to improve the performance of some applications by up to several thousand percent.
