ImageRouter used to have its own caching mechanism, and it worked quite well! The only thing it was missing was cleaning up old cached images, so disk usage went up pretty quickly. We tried mitigating that with the nginx cache, but it's notoriously tricky to set up and clears its cache seemingly without reason, which slows the website down as many images get re-generated.
This task is about bringing back the caching functionality to the way it was, meaning:
1. If someone requests an image that is already being converted, don't start a new conversion thread; wait for the running conversion to finish and return its result
2. Store all the resized and converted images on the HDD and use them instead of resizing and converting a new image on each request
The above can be relatively easily achieved by reverting some of the changes from the commit that took away the caching functionality: https://hub.sealcode.org/rRIMAGEROUTER872653e19b9f5f9c9f6ec9d6e8dd5500e094a804
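Point 1 of the list above (sharing a conversion that is already running) could look roughly like the sketch below. It is not the code from the reverted commit: `convertOnce`, `inFlight` and the `convert` callback are hypothetical names, and the cache key is assumed to be any string that uniquely identifies the requested variant.

```typescript
// Hypothetical helper: conversions that are currently running, by cache key.
const inFlight = new Map<string, Promise<string>>();

// `convert` stands in for the real resize/convert routine and is assumed to
// resolve with the path of the generated file.
function convertOnce(
  key: string,
  convert: () => Promise<string>
): Promise<string> {
  const running = inFlight.get(key);
  if (running) {
    // Someone is already converting this image: share that result.
    return running;
  }
  const promise = (async () => {
    try {
      return await convert();
    } finally {
      // Forget the entry once the conversion settles, whether it succeeded
      // or failed, so a later request can start a fresh conversion.
      inFlight.delete(key);
    }
  })();
  inFlight.set(key, promise);
  return promise;
}
```

Every caller that arrives while a conversion is running gets the same promise, so a failed conversion rejects for all of them and is not retried until the next request.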
Additionally, the caching now needs to be a little bit more sophisticated:
1. There needs to be a parameter to the ImageRouter that sets how many gigabytes of HDD space the cache can take up
2. Every time a new cached image is added to the cache storage, a cleanup action must run that checks whether the total amount of HDD space used by the cache exceeds the limit set up in point 1. If it does, the action removes the least recently accessed files (see https://www.howtogeek.com/517098/linux-file-timestamps-explained-atime-mtime-and-ctime/) until the total amount of used cache storage comes down to just below the limit
3. The action from point 2 never runs concurrently with itself. If someone adds a new image to the cache while the action is running, it is not started a second time, so there are never two or more instances of it running in parallel
4. The action described in point 2 does not block the HTTP response for the request that triggered it. It starts running only after that HTTP request has been closed
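The cleanup action from points 2 to 4 could be sketched as below. This is only an illustration under assumptions: the cache is taken to be a single flat directory, and `pickFilesToEvict`, `cleanupCache`, `scheduleCleanup` and `cleanupRunning` are hypothetical names, not existing ImageRouter API. It also relies on `atime`, which many Linux filesystems update lazily (`relatime`) or not at all (`noatime`), so the real implementation may need to track access times itself.

```typescript
import { promises as fs } from "fs";
import * as path from "path";

interface CachedFile {
  path: string;
  size: number; // bytes
  atimeMs: number; // last access time, in milliseconds
}

// Pure selection step: picks least recently accessed files until the
// remaining total is at or below `limitBytes`.
function pickFilesToEvict(files: CachedFile[], limitBytes: number): CachedFile[] {
  let total = files.reduce((sum, file) => sum + file.size, 0);
  const evicted: CachedFile[] = [];
  const oldestFirst = [...files].sort((a, b) => a.atimeMs - b.atimeMs);
  for (const file of oldestFirst) {
    if (total <= limitBytes) break;
    evicted.push(file);
    total -= file.size;
  }
  return evicted;
}

let cleanupRunning = false;

async function cleanupCache(cacheDir: string, limitBytes: number): Promise<void> {
  if (cleanupRunning) return; // point 3: never run two cleanups in parallel
  cleanupRunning = true;
  try {
    const files: CachedFile[] = [];
    for (const name of await fs.readdir(cacheDir)) {
      const filePath = path.join(cacheDir, name);
      const stat = await fs.stat(filePath);
      if (stat.isFile()) {
        files.push({ path: filePath, size: stat.size, atimeMs: stat.atimeMs });
      }
    }
    for (const file of pickFilesToEvict(files, limitBytes)) {
      await fs.unlink(file.path);
    }
  } finally {
    cleanupRunning = false;
  }
}

// Point 4: fire-and-forget wrapper, the caller never awaits the cleanup.
function scheduleCleanup(cacheDir: string, limitBytes: number): void {
  cleanupCache(cacheDir, limitBytes).catch((error) => {
    console.error("cache cleanup failed", error);
  });
}
```

With the limit given in gigabytes (point 1), `limitBytes` would be `gigabytes * 1024 * 1024 * 1024`. To satisfy point 4, `scheduleCleanup` would be called from a handler that fires once the response is done; Node's `http.ServerResponse` emits a `finish` event that may be suitable for this.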
Make sure to also cache the results of cropping, to avoid running smart-crop multiple times unnecessarily.
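One way to make cropped results cacheable is to include the crop parameters in the cache key, so each cropped variant gets its own file and smart-crop runs at most once per variant. The sketch below assumes a hypothetical `ImageVariant` shape and `cacheFileName` helper; the actual parameters ImageRouter passes around may differ.

```typescript
import { createHash } from "crypto";

// Hypothetical description of one requested image variant.
interface ImageVariant {
  sourcePath: string;
  width: number;
  format: string; // e.g. "webp"
  crop?: { width: number; height: number };
}

// Derives a stable file name from everything that influences the output,
// including the crop size, so cropped and uncropped variants never collide.
function cacheFileName(variant: ImageVariant): string {
  const crop = variant.crop
    ? `${variant.crop.width}x${variant.crop.height}`
    : "nocrop";
  const hash = createHash("sha256")
    .update([variant.sourcePath, variant.width, variant.format, crop].join("|"))
    .digest("hex");
  return `${hash}.${variant.format}`;
}
```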