TY - GEN
T1 - Efficient access to many small files in a filesystem for grid computing
AU - Thain, Douglas
AU - Moretti, Christopher
PY - 2007
Y1 - 2007
N2 - Many potential users of grid computing systems have a need to manage large numbers of small files. However, computing and storage grids are generally optimized for the management of large files. As a result, users with small files achieve performance several orders of magnitude worse than possible. Archival tools and custom storage structures can be used to improve small-file performance, but this requires the end user to change the behavior of the application, which is not always practical. To address this problem, we augment the protocol of the Chirp filesystem for grid computing to improve small-file performance. We describe in detail how this protocol compares to FTP and NFS, which are widely used in similar situations. In addition, we observe that changes to the system call interface are necessary to invoke the protocol properly. We demonstrate an order-of-magnitude performance improvement over existing protocols for copying files and manipulating large directory trees.
AB - Many potential users of grid computing systems have a need to manage large numbers of small files. However, computing and storage grids are generally optimized for the management of large files. As a result, users with small files achieve performance several orders of magnitude worse than possible. Archival tools and custom storage structures can be used to improve small-file performance, but this requires the end user to change the behavior of the application, which is not always practical. To address this problem, we augment the protocol of the Chirp filesystem for grid computing to improve small-file performance. We describe in detail how this protocol compares to FTP and NFS, which are widely used in similar situations. In addition, we observe that changes to the system call interface are necessary to invoke the protocol properly. We demonstrate an order-of-magnitude performance improvement over existing protocols for copying files and manipulating large directory trees.
UR - http://www.scopus.com/inward/record.url?scp=47249140783&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47249140783&partnerID=8YFLogxK
U2 - 10.1109/GRID.2007.4354139
DO - 10.1109/GRID.2007.4354139
M3 - Conference contribution
AN - SCOPUS:47249140783
SN - 1424415608
SN - 9781424415601
T3 - Proceedings - IEEE/ACM International Workshop on Grid Computing
SP - 243
EP - 250
BT - Proceedings - 8th IEEE/ACM International Conference on Grid Computing, GRID 2007
T2 - 8th IEEE/ACM International Conference on Grid Computing, GRID 2007
Y2 - 19 September 2007 through 21 September 2007
ER -