Can I keep source code in MooseFS? Why do small files occupy more space than I would have expected?
The system was initially designed for keeping a large number (several thousand) of very big files (tens of gigabytes each) and has a hard-coded chunk size of 64MiB and block size of 64KiB. Using a consistent block size helps improve networking performance and efficiency, as all nodes in the system are able to work with a single ‘bucket’ size. That’s why even a small file will occupy 64KiB, plus an additional 4KiB of checksums and 1KiB for the header. All data transfer in the system is done in blocks of 64KiB. However, this doesn’t have any impact on performance. (A normal file system will typically also use some degree of block read-ahead, which sometimes fetches superfluous data.) The space occupied by a small file stored inside a MooseFS chunk is the more significant issue, but in our opinion it is still negligible. Let’s take 25 million files with a goal set to 2. Counting the storage overhead, this could create about 50 million 69
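
As a rough check on the figures above, here is a minimal sketch of the arithmetic. It assumes every file fits in a single 64KiB block and that each copy costs exactly the 64KiB + 4KiB + 1KiB stated in the answer. The function name `small_file_footprint_kib` is made up for illustration and is not part of MooseFS.

```python
# Per-copy cost of one small file, using the figures quoted in the answer.
BLOCK_KIB = 64      # one data block
CHECKSUM_KIB = 4    # checksums
HEADER_KIB = 1      # chunk header


def small_file_footprint_kib(num_files: int, goal: int) -> int:
    """Estimate disk usage in KiB for files that each fit in one block.

    Assumption: one chunk copy per file per goal, each costing 69 KiB.
    """
    per_copy_kib = BLOCK_KIB + CHECKSUM_KIB + HEADER_KIB
    return num_files * goal * per_copy_kib


if __name__ == "__main__":
    total_kib = small_file_footprint_kib(25_000_000, goal=2)
    # KiB -> TiB is a division by 1024**3.
    print(f"{total_kib:,} KiB (~{total_kib / 1024**3:.2f} TiB)")
```

Under these assumptions, 25 million small files at goal 2 come to 3,450,000,000 KiB, or roughly 3.2 TiB, regardless of how little data each file actually holds.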