Efficient access to many small files in a filesystem for grid computing

Douglas Thain, Christopher Moretti

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

16 Scopus citations

Abstract

Many potential users of grid computing systems have a need to manage large numbers of small files. However, computing and storage grids are generally optimized for the management of large files. As a result, users with small files achieve performance several orders of magnitude worse than possible. Archival tools and custom storage structures can be used to improve small-file performance, but this requires the end user to change the behavior of the application, which is not always practical. To address this problem, we augment the protocol of the Chirp filesystem for grid computing to improve small file performance. We describe in detail how this protocol compares to FTP and NFS, which are widely used in similar situations. In addition, we observe that changes to the system call interface are necessary to invoke the protocol properly. We demonstrate an order-of-magnitude performance improvement over existing protocols for copying files and manipulating large directory trees.

Original language: English (US)
Title of host publication: Proceedings - 8th IEEE/ACM International Conference on Grid Computing, GRID 2007
Pages: 243-250
Number of pages: 8
DOIs: https://doi.org/10.1109/GRID.2007.4354139
State: Published - Dec 1 2007
Externally published: Yes
Event: 8th IEEE/ACM International Conference on Grid Computing, GRID 2007 - Austin, TX, United States
Duration: Sep 19 2007 to Sep 21 2007

Publication series

Name: Proceedings - IEEE/ACM International Workshop on Grid Computing
ISSN (Print): 1550-5510

Conference

Conference: 8th IEEE/ACM International Conference on Grid Computing, GRID 2007
Country: United States
City: Austin, TX
Period: 9/19/07 to 9/21/07

All Science Journal Classification (ASJC) codes

  • Engineering(all)

  • Cite this

    Thain, D., & Moretti, C. (2007). Efficient access to many small files in a filesystem for grid computing. In Proceedings - 8th IEEE/ACM International Conference on Grid Computing, GRID 2007 (pp. 243-250). [4354139] (Proceedings - IEEE/ACM International Workshop on Grid Computing). https://doi.org/10.1109/GRID.2007.4354139