
Optimizing Strided Remote Memory Access Operations on the Quadrics QsNetII Network Interconnect

Nieplocha, Jarek ; Tipparaju, Vinod ; Krishnan, Manoj Kumar ; Pacific Northwest National Lab. (PNNL), Richland, WA (United States) (Corporate Author)

02 November 2006

DOI: 10.1109/HPCASIA.2005.62

Full text not available

  • Title:
    Optimizing Strided Remote Memory Access Operations on the Quadrics QsNetII Network Interconnect
  • Authors: Nieplocha, Jarek ; Tipparaju, Vinod ; Krishnan, Manoj Kumar
  • Pacific Northwest National Lab. (PNNL), Richland, WA (United States) (Corporate Author)
  • Subjects: General And Miscellaneous//Mathematics, Computing, And Information Science ; Computer Networks ; Data Transmission ; Memory Management ; Optimization ; Remote Memory Access ; Quadrics ; Non-Contiguous Data Transfers
  • Is part of: 02 November 2006
  • Description: This paper describes and evaluates protocols for optimizing strided non-contiguous communication on the Quadrics QsNetII high-performance network interconnect. Most previous related studies focused primarily on NIC-based or host-based protocols. This paper discusses the merits of using both approaches and tries to determine for which types and data sizes of communication operations each protocol should be used. We focus on the Quadrics QsNetII network, which offers powerful communication processors on the network interface card (NIC) and practical, flexible opportunities for exploiting them in the context of user. Furthermore, the paper focuses on non-contiguous remote memory access (RMA) data transfers and performs the evaluation in the context of standalone communication and application microbenchmarks. In comparison to the vendor-provided non-contiguous interfaces, the proposed approach achieved very significant performance improvements in microbenchmarks as well as in application kernels: dense matrix multiplication and the Co-Array Fortran version of the NAS BT parallel benchmark. For example, for 64 processes, a 54% improvement in overall communication time for NAS BT Class B and a 42% improvement in matrix multiplication were achieved.
  • Language: English
  • Identifier: DOI: 10.1109/HPCASIA.2005.62
