Paperclip reprocess attachments: Too many open files
November 06, 2016 • Need to reprocess a bunch of paperclip attachments, but running into the 'Too Many Open Files' exception? Here's a workaround using find_in_batches to quickly reprocess each group in a background job.
Recently I ran into a problem reprocessing thousands of records with attachments using rake paperclip:refresh CLASS=User (Thumbnail Generation). After exactly 995 files had been reprocessed, the exception "Too many open files" was thrown.
- https://github.com/thoughtbot/paperclip/issues/1980, and many more
I didn't want to go down the path of raising the ulimit (the maximum number of open files your OS permits per process), so I decided to create a workaround rake task.
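For context, the limit in question can be inspected from Ruby itself. This is a minimal sketch using the standard library's Process.getrlimit. A soft limit of 1024 is a common default on Linux; that figure is an assumption rather than something measured for this post, but it would be consistent with a failure shortly before the 1000th file.

```ruby
# Read the per-process limit on open file descriptors.
# The soft value is what `ulimit -n` reports in a shell.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

puts "soft limit: #{soft_limit}"
puts "hard limit: #{hard_limit}"
```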
Before coming to a final solution, I tried using find_in_batches (documentation) on the models I wanted to reprocess, even manually triggering GC in an attempt to clean up the tempfiles. It turns out that find_in_batches with a sufficiently small batch size avoided the "Too many open files" exception, but took almost 2 hours to complete for 2000 records. I needed to reprocess ~80,000 attachments, so this task needed to be faster.
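The batched attempt described above might look roughly like this. It is a sketch, not the exact task from this post: reprocess_in_batches is a hypothetical helper, the :avatar attachment name in the usage comment is an assumption, and the batch size of 50 is arbitrary.

```ruby
# Hypothetical helper: reprocess attachments one small batch at a time.
# `scope` is anything that responds to find_in_batches (e.g. User.all).
# `attachment_name` is the Paperclip attachment to reprocess (assumed: :avatar).
def reprocess_in_batches(scope, attachment_name, batch_size: 50)
  processed = 0

  scope.find_in_batches(batch_size: batch_size) do |batch|
    batch.each do |record|
      # Paperclip regenerates the styles (thumbnails) from the original file.
      record.public_send(attachment_name).reprocess!
      processed += 1
    end

    # Manually trigger GC between batches in an attempt to release tempfiles.
    GC.start
  end

  processed
end

# Usage inside a rake task that loads the Rails environment:
#   reprocess_in_batches(User.all, :avatar, batch_size: 50)
```

Everything here runs sequentially in one process, which is why it stays under the open-file limit but is slow.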
In order to speed up reprocessing thousands of images, I decided to throw each batch into a background job. Instead of almost 2 hours, it took under 10 minutes.
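One way to sketch the "one job per batch" idea is shown below. The names are assumptions for illustration: enqueue_reprocess_jobs is a hypothetical helper, and job_class stands in for any ActiveJob-style class that responds to perform_later. Only the record ids are passed to the job, so each worker loads and reprocesses its own small group.

```ruby
# Hypothetical helper: enqueue one background job per batch of record ids.
# `scope` is anything that responds to find_in_batches (e.g. User.all).
# `job_class` is anything that responds to perform_later (e.g. an ActiveJob class).
def enqueue_reprocess_jobs(scope, job_class, batch_size: 100)
  enqueued = 0

  scope.find_in_batches(batch_size: batch_size) do |batch|
    # Pass plain ids rather than records so the job arguments stay small.
    job_class.perform_later(batch.map(&:id))
    enqueued += 1
  end

  enqueued
end

# The job itself (assumed shape, not shown in runnable form here) would do
# something like this in its perform(ids) method:
#   User.where(id: ids).find_each { |user| user.avatar.reprocess! }
```

Because each job only ever touches one batch, no single process opens more than a batch's worth of files at a time, and the queue's workers can run batches in parallel.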