TLDR: CHECKING YOUR WEB/PUMA LOGS AND SIDEKIQ DEAD QUEUE FOR S3 ReadTimeout ERRORS FOR FUN AND PROFIT!!!
I recently started investigating the high bandwidth being used by my Mastodon instance, and noticed a lot of errors (similar to the one below) showing failed file uploads to my S3 provider, Wasabi.
Aws::S3::MultipartUploadError (multipart upload failed: Net::ReadTimeout with #<TCPSocket:(closed)>):
lib/paperclip/attachment_extensions.rb:87:in `block in save'
lib/paperclip/attachment_extensions.rb:93:in `save'
app/controllers/api/v2/media_controller.rb:5:in `create'
app/controllers/concerns/localized.rb:11:in `set_locale'
lib/mastodon/rack_middleware.rb:9:in `call'
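As the post title suggests, a quick way to gauge how widespread this is on your own instance is to grep your web/Puma logs for the timeout. A minimal sketch, assuming a standard Rails log location (adjust the path to wherever your instance writes its production log):

```shell
# Count S3 ReadTimeout failures in the web/Puma log.
# (log/production.log is an assumption; adjust for your setup, or pipe
# "journalctl -u mastodon-web" into grep if you log to the journal.)
LOG="${LOG:-log/production.log}"
grep -c 'Net::ReadTimeout' "$LOG" 2>/dev/null || true
```

A non-trivial count here (or a pile of matching jobs in the Sidekiq dead queue) is the signal that the tweaks below are worth trying.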
To resolve the ReadTimeout issue, I modified the default timeouts from 5 seconds to 15, and added the option to retry a failed upload.
S3_MULTIPART_THRESHOLD=52428800 # 50MB (I believe this one isn't required)
S3_OPEN_TIMEOUT=15
S3_READ_TIMEOUT=15
S3_RETRY_LIMIT=1
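These go in .env.production, and they don't take effect until the services are restarted. A minimal sketch, assuming the systemd unit names from the standard Mastodon install guide (yours may differ):

```shell
# Restart the web and Sidekiq services so the new S3 timeout/retry
# settings are picked up. (Unit names are an assumption based on the
# standard Mastodon setup; check with "systemctl list-units 'mastodon*'".)
sudo systemctl restart mastodon-web mastodon-sidekiq
```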
Daily bandwidth for Aus.Social (Before and After)
As you can see, my daily bandwidth usage dropped by 50%.

Wasabi Usage for Aus.Social (Before and After)
I’ve also added a bucket lifecycle rule to delete failed multipart uploads after 1 day, and this has resulted in my Wasabi bucket dropping from 12TB to 7.45TB (roughly a 38% drop).
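For anyone wanting to do the same: since Wasabi speaks the S3 API, the rule can be applied with the standard AWS CLI. A sketch, assuming the us-east-1 Wasabi endpoint and a placeholder bucket name:

```shell
# Lifecycle rule: abort (and clean up) incomplete multipart uploads
# after 1 day. YOUR_BUCKET is a placeholder; the endpoint URL assumes
# a bucket in Wasabi's us-east-1 region.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "abort-incomplete-multipart",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 1}
    }
  ]
}
EOF

aws s3api put-bucket-lifecycle-configuration \
  --endpoint-url https://s3.wasabisys.com \
  --bucket YOUR_BUCKET \
  --lifecycle-configuration file://lifecycle.json
```

The orphaned parts from every failed multipart upload otherwise sit in the bucket invisibly, which is where those missing terabytes were hiding.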

Plus: this makes the usage reported by Mastodon's tootctl media usage command look much closer to what Wasabi actually invoices!

Outcome
My understanding is that Mastodon was spending half its bandwidth stuck in a loop:
- Download remote media from another mastodon instance
- Try to upload the media and fail.
- Download the media again and fail the upload again (repeat multiple times)
Now my instance is:
- downloading the remote media
- uploading it once without issue.
This has dropped my bandwidth and storage costs dramatically, and lowered my CPU usage, because it’s no longer testing/transcoding the same remote media over and over.
win win win