
Call to Action: Fediverse Media Server

Edit: So, to put my money where my mouth is – right now storage is costing me around $150 AUD a month – if somebody can build a media server we can share, deploy and offer to all of the Fediverse instances… I will give them a minimum of $3600 AUD (two years of Wasabi money) as a bounty once it’s in production!

Preamble

I am the “proud” owner of an overpowered Proxmox cluster hosted in colocation at an Australian data center. My cluster is two 2U AMD EPYC servers with a 1U IBM for backup (Only two of those servers have the blessing of the machine god).

  • Unit 0: AMD EPYC 7642 (Rome, Zen 2) with 512GB of DDR4-3200 – https://en.wikichip.org/wiki/amd/epyc/7642
  • Unit 1: AMD EPYC 7713 (Milan, Zen 3) with 256GB of DDR4-3200 – https://en.wikichip.org/wiki/amd/epyc/7713
Notice the unused memory and low CPU usage.

For networking, I’m using a MikroTik RB5009UG+S+IN Ethernet router – https://mikrotik.com/product/rb5009ug_s_in


The Situation

I’ve been hosting Fediverse services such as Aus.Social and Pixelfed.au for the last five years, and I’ve been working hard to maintain the quality of service while lowering the total cost of ownership. During this time, I’ve seen hundreds of other administrators shut down their instances due to the high stress and high cost of hosting, and I want to offer support to these admins in my region (including taking ownership of abandoned instances).

After migrating all of my Fediverse services to my dedicated servers, I became hyper-aware of having excess compute/memory/storage capacity, and so I started to consider whether I could offer to migrate other Mastodon instances to my colo at a massive discount compared to other cloud providers. (Notice the 500GB of unused memory in the screenshots above.)

Sadly, I’m currently unable to offer this to anybody because of my primary Mastodon instance’s remote media caching policy requirements. A.S has been eating more than 300GB of data every day since go-live. The bandwidth shown below comes from hosting just Aus.Social, and it alone is costing me half of my bill (around $330 a month), because Mastodon downloads, tests and uploads to my S3 bucket every piece of media from every single toot it sees.

If I can lower my usage with shared services, CDNs and de-duplication (or, hopefully, Jortage Rivet-style APIs), I can start to offer VMs on my cluster to other administrators without the additional bandwidth overhead of their server downloading and uploading every media attachment it sees (including from my server, likely hosted on the same box). That is my mission with this post!

(I will move to another colocation facility with cheaper bandwidth in 12 months once my contract is over, but that’s beside the point… sadly I live in Australia, which is a fake country, and bandwidth is just expensive everywhere.)


Economics of the Fediverse 101

The current model is wasteful, so there is a motivating factor for administrators to lower their costs and be more environmentally friendly! Win win.

Storage: Let’s consider an example where there are 1000 active Mastodon instances, each with at least one user following a user on my instance, and all of them using Wasabi S3. Each of those 1001 Wasabi buckets costs a minimum of $5 USD per TB, so that’s at least $5005 USD a month to store copies of the same media.

My actual instance sits at around 10TB, costing $69 USD a month… so I’d imagine those 1000 imaginary instances would each be somewhere between 1TB and 10TB, and that’s still a massive amount of expense just on S3 buckets.

If we could break up the one-instance-one-bucket model, the total cost of ownership for the entire Fediverse would drop dramatically. Considering there are almost 10,000 Mastodon instances (excluding Pixelfed and the other Fediverse servers)… I’d hate to calculate how much the Fediverse costs in storage alone (local or S3).
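
For anyone who wants to poke at the numbers, here’s a minimal sketch of the arithmetic above in Python. It only uses the figures quoted in this post ($5 USD per TB and the hypothetical 1000-instance scenario), not anybody’s real billing data:

```python
# The post's own example numbers, not Wasabi's current price list.
usd_per_tb = 5
buckets = 1_000 + 1                      # 1000 remote instances + mine

separate = buckets * usd_per_tb          # every instance keeps its own copy
shared = 1 * usd_per_tb                  # one de-duplicated copy, shared

print(f"one bucket each: ${separate}/month")   # $5005/month
print(f"one shared copy: ${shared}/month")     # $5/month
```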

Compute: If validation/transcoding were offloaded from the instances, it would lower the CPU/memory Sidekiq needs to maintain an instance. This would allow people to use smaller instance types and lower the total cost of ownership globally once again.

Update: I remembered that Claire opened a PR a long time ago to address a similar problem: “Add optional cross-instance media processing synchronization mechanism” – https://github.com/mastodon/mastodon/pull/14371


Fediverse Media Model V1

Bandwidth/Storage Calculator (Based on S3 remote storage)

  • A.S account @shlee posts a 1MB image
  • A.S transcodes and uploads the 1MB image to my S3
  • 1000 instances download the 1MB file
  • 1000 instances waste compute resources transcoding/validating the file.
  • 1000 instances upload the 1MB file to their S3
  • Total: 2002MB of bandwidth and 1001MB of storage used across 1001 instances.
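
A tiny sketch to reproduce that tally (the “2” per instance is one download plus one upload of the 1MB file; for A.S the download is the original post itself):

```python
# V1 model: every instance that sees the toot downloads the attachment and
# uploads it to its own bucket.
media_mb = 1
instances = 1 + 1_000                    # A.S plus the 1000 remote instances

bandwidth_mb = instances * media_mb * 2  # each instance downloads + uploads once
storage_mb = instances * media_mb        # each instance keeps its own copy

print(f"bandwidth: {bandwidth_mb} MB")   # 2002 MB
print(f"storage:   {storage_mb} MB")     # 1001 MB
```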

or as explained again by the Jortage team at https://jortage.com/

The current Mastodon media model requires every single instance that ingests a toot with media to download a copy of that media, validate/transcode it if required, and upload it to local storage or remote storage (an S3 bucket).

As shown in my diagram, this requirement means my instance is downloading 150GB and uploading 150GB of media every day! If I were hosting five other Mastodon instances, they would all be downloading and uploading files without any coordination or awareness of each other. My media bucket is used exclusively by my instance.

This is bad because it wastes bandwidth, it wastes storage (no de-duplication) and it wastes compute… so I’m after a new model which shares media between my primary instance A.S and any other instances hosted on my service.

The current model also doesn’t provide high availability (or backup), because every instance holds all of its media in one location, a single point of failure.


What I need/What I want

Pushing media

A possible media workflow looks something like this – with a combination of “shared media servers” or Fediverse Delivery Networks (FDNs) and original V1-style instances running their own buckets.

  • An A.S user posts a toot with an original piece of media. (The media is validated/transcoded/signed as such by A.S.)
  • A.S uploads the new toot’s media to the media server.
  • Media server 1 confirms the file is valid/signed and stores it locally (transcoding if required).
  • The toot is replicated to the other Fediverse servers.
  • Local instances (sharing the same media server) ask “media server 1” if the file exists. It does, so nothing is downloaded.
  • External instances ask their “media server 2” if the file exists. It isn’t in the cache, so media server 2 downloads it from the origin media server.
  • Servers using the V1 model download the media as usual.
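
Here’s a minimal sketch of what the instance-to-media-server “push” side of this could look like, assuming a content-addressed HTTP API. Nothing here is a real Jortage/Rivet endpoint – the base URL, the /objects/<hash> path and the headers are all hypothetical, just to make the flow above concrete:

```python
import hashlib
import urllib.error
import urllib.request

MEDIA_SERVER = "https://fdn.example.net"        # hypothetical shared media server (FDN)

def push_media(path: str) -> str:
    """Offer a locally validated/transcoded file to the media server.

    Returns the canonical URL the instance stores in its database instead of
    re-uploading the bytes to its own S3 bucket.
    """
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    object_url = f"{MEDIA_SERVER}/objects/{digest}"

    # 1. Ask whether the object already exists (the de-duplication step).
    head = urllib.request.Request(object_url, method="HEAD")
    try:
        urllib.request.urlopen(head)
        return object_url                       # already cached, nothing uploaded
    except urllib.error.HTTPError as err:
        if err.code != 404:
            raise

    # 2. Not cached yet: upload it once, signed by the origin instance.
    put = urllib.request.Request(
        object_url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": "application/octet-stream",
            "Signature": "…HTTP signature from the origin instance…",
        },
    )
    urllib.request.urlopen(put)
    return object_url
```

Every other instance sharing the same media server that later sees the toot would hit the HEAD branch and transfer nothing.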

Pulling media

Grabbing external media is a similar flow, with the media server downloading and validating/transcoding the media on behalf of the instance.

Plus, I believe this would cover the same ground as Claire’s “cross-instance media processing synchronization” PR, because the processing would be completed on the media server.
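
Continuing the hypothetical API from the push sketch, the “pull” side could be as small as asking the media server to fetch and process a remote attachment, and handing back the canonical URL (the /fetch endpoint is, again, invented for illustration):

```python
import urllib.parse
import urllib.request

MEDIA_SERVER = "https://fdn.example.net"        # hypothetical shared media server (FDN)

def pull_media(remote_url: str) -> str:
    """Ask the media server to fetch, validate and transcode a remote attachment."""
    req = urllib.request.Request(
        f"{MEDIA_SERVER}/fetch?src={urllib.parse.quote(remote_url, safe='')}",
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The media server replies with the canonical /objects/<hash> URL,
        # which it may already have cached for a neighbouring instance.
        return resp.read().decode().strip()
```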

As for the end-user experience: the media servers would likely have CDNs like Bunny or Fastly in front of them. The CDN could download a copy of the original media once and then cache it on its edge servers globally, limiting the traffic from the FDN media server to end users around the world while still allowing for fast ingestion.


Bandwidth Calculator Redux

(Based on S3 remote storage, with 100 media servers supporting 700 instances, and 300 current-style V1 instances with their own S3 buckets)

  • A.S account @shlee posts a 1MB image
  • A.S transcodes and uploads the 1MB image to my Media server
  • 100 Media Servers download the 1MB file
  • 100 media servers validate the signature on the file (no transcode required)
  • 300 instances download/transcode/upload the 1MB file to their S3
  • Total: 702MB of bandwidth and 401MB of storage used across 1001 instances
  • Bonus: If all 1000 instances’ media were stored in 100 media servers… 102MB of bandwidth and 101MB of storage (compared to 2002MB of bandwidth and 1001MB of storage under the current design)
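
Here’s the same tally as a small parameterised sketch (counting each transfer of the 1MB file as one unit, with A.S’s own media server on top of the 100 shared ones, which is how I read the totals above):

```python
def tally(media_servers: int, v1_instances: int, media_mb: int = 1):
    """Bandwidth and storage for one piece of media under the mixed model."""
    bandwidth = (
        media_mb                       # A.S ingests the original post
        + media_mb                     # A.S pushes it to its own media server
        + media_servers * media_mb     # the shared media servers fetch one copy each
        + v1_instances * media_mb * 2  # V1 instances still download + upload
    )
    storage = (
        media_mb                       # the copy on A.S's own media server
        + media_servers * media_mb     # one copy per shared media server
        + v1_instances * media_mb      # one copy per V1 bucket
    )
    return bandwidth, storage

print(tally(100, 300))   # (702, 401)  – the mixed scenario above
print(tally(100, 0))     # (102, 101)  – the all-shared "bonus" scenario
```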

Instances using media servers would also have lower CPU/memory utilisation, allowing admins to use smaller instance types because the transcoding is offloaded… this would also lower the cost of the Fediverse.

Note: This concept should work for single-user instances, standard instances and mass-hosted instances. Even for single-user instances, moving all of the media verification/transcoding out of the instance’s Sidekiq jobs into a dedicated process/VM/container (similar to the streaming server being a dedicated task) would make sense, and it makes the ecosystem a little more “microservice based”.


Prior Art from Team Jortage

Jortage developer Una mentions “Rivet” in their breakdown, and this is the API concept that would enable my example above.

I don’t believe Rivet is actually “in the works atm” due to circumstances.


Decentralised Media Servers for all!

Mastodon, Pixelfed and most of the other Fediverse servers operate on ActivityPub, which serves as a protocol for enabling actors to communicate with each other. This is the type of framework that should influence the media server. It could speak AP.

The beauty of this concept lies in the flexibility it offers. Only the API between the Mastodon Instance and the Media server (FDN) needs to be firmly established, while the design of the media server can vary greatly.

From regional ingestion points to globally distributed anycast FDNs, the infrastructure can be built in numerous ways. It could rely solely on local storage for hot caching and utilize S3 for warm storage, or incorporate services like Fastly/Bunny as a CDN layered atop the media server. There are no rigid requirements for the actual design; it simply needs to communicate securely with the instances it serves and provide files on demand.

I’d like to believe all of the media servers should be open source and deployable by anybody with the will to do so. This follows the decentralised ethos of the Fediverse. It means friendly instances can band together into groups to pool resources and save money. Win win.

Jortage has a GitHub, but it doesn’t have any instructions on how to deploy it… I’ve contacted the Jortage admin and told them I’m more than happy to give them the $100 USD a month I’m currently giving Wasabi if they build a Jortage AU ingestion node, but I need the Rivet API to take full advantage of that and solve my problem completely.


Finally, if the model is built to separate the instances from the media, the total bandwidth of my currently hosted instances on my colo VMs could drop from 300GB a day, every single day, to just serving the web, API and streaming traffic (a handful of GB a day)… with the media downloaded/uploaded/cached on my friendly neighborhood (FDN) media server.

Once my monthly data usage per instance is minimal (TBs down to GBs), I could host a lot of Fediverse instances (DB/web/streaming/Sidekiq) while halving my bill… these cost savings would allow me to give the Fediverse developers money instead of Wasabi (and my colo provider)… and I bet I’m not the only one!

Remember, if we could go from 10,000 S3 buckets to 5000 S3 buckets, the cost of the entire ecosystem drops dramatically… and the per-instance savings of a shared media server only increase as more people join it.


but wait, there’s more.

These are some brainstorming concepts that V2 and beyond could offer:

  • Motivation for single instances: I could put the media server on a cheap Linode instance (free bandwidth) and not have to worry about media bandwidth on my cluster. Right now, there is a tight connection between the instance and its media via Sidekiq and the DB/Redis… we could decouple this relationship and turn it into an API between the Mastodon instance and the media server that could tolerate higher latencies. (This would solve my problem by simply moving my media ingestion/processing/caching to a cloud provider.)
  • High Availability: An instance could list a primary and multiple secondary FDN media servers as its media sources, protecting against outages or failures of a single source.

  • Failure recovery: If Fediverse media is transcoded/validated and then hashed/signed correctly, an instance or media server that suffers a failure without a backup could authoritatively query other media servers and recover its local media, as long as that media was shared with at least one known media server.

  • AP Ingestion: If the V2 model speaks AP, it can be attached to relays to download media ahead of time, giving a better end-user experience by having the media ready before the toot reaches the instance (this is a bit of a weird use case, but worth discussing).

  • AP Deletion: If the V2 model speaks AP, it can receive a signed delete from an actor and delete the media for all instances at the same time, offering people a higher level of safety.

  • Toot de-duplication: Bots are known to upload the exact same picture/video over and over on a schedule, and de-duplication would mean it only uses the space of one upload (or possibly one image URL, since it’s the same hash) – the Weekend bot, for example. See the sketch after this list.

  • Edge-case accounts: There are a lot of accounts just uploading endless large videos on a schedule – @flameReactor is using 1.5% of my total storage for a single account. (If I were sharing my storage with more than one instance, this kind of thing might be less annoying.)
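
On the de-duplication point, here’s a toy illustration (not Jortage’s actual scheme) of why content addressing collapses the “same picture every week” case into a single stored object:

```python
import hashlib

store: dict[str, bytes] = {}          # stand-in for the media server's object storage

def put(data: bytes) -> str:
    """Store an object under its content hash; identical uploads are a no-op."""
    key = hashlib.sha256(data).hexdigest()
    store.setdefault(key, data)       # a second upload of the same bytes changes nothing
    return key

weekly_post = b"the same 1MB picture, week after week"
assert put(weekly_post) == put(weekly_post)
assert len(store) == 1                # stored once, referenced many times
```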


6 replies

  1. > A.S is currently eating more than 300GB of data usage every day since go-live.

    😳

    1. You’re kidding me – I was working on *exactly* this pretty recently, because it seemed crazy to me that this exact solution doesn’t exist. Yes, storage and bandwidth requirements for some types of fedi servers are absurd, and it seems like this would be a big help.

      It’s not yet working or even real close to it, but the basic design that this *will* have once it’s reached alpha status is:

      * Media volumes are exposed as a FUSE mount
      * Multiple nodes can share the same volume in a swarm; it’s stored via content hash in a Merkle tree, which means deduplication, cache coherency, and trustworthiness of the data from random nodes. You don’t need to have the whole volume in your local cache in order to have the mount (like IPFS)
      * You can choose to pin volumes on your node (requesting replication in full to local storage – this obviously must be done from at least one connected node to guarantee that data isn’t lost)
      * There’s a service worker which can cut out the middleman and grab stuff directly from the swarm nodes, to cut down on bandwidth and storage requirements on the central server

      I had shelved it for some other things, but if you want to give a bounty, I’m psyched to pick it up and start working on it again, because I thought it was a great idea. I left my email – reach out to me and let’s talk more; if you give me a week or so to clean up the code I can probably give you a little demo of the (very, very small) part of it that’s actually working now and we can see if it can fit what you were thinking, and talk more about timelines and features.

  2. Pingback: Last Week in Fediverse – ep 70 – The Fediverse Report
      1. Now, each server has its own administrators, and those administrators are accountable for ensuring their server stays within the law in the territory where the server resides. This does indeed become a problem if servers in multiple countries are sharing a common media server.

        That doesn’t invalidate the idea at all, but it adds some constraints. One approach is for servers sharing a media server to share the same moderation policy too, and that moderation policy should forbid anything that’s illegal in any of the countries where the servers reside. Probably a better approach is that all servers can fall back to storing content locally. So when one server reports to the media server that an item is unacceptable, all other servers are notified that if they want that content they should cache it locally, and then it’s deleted.

        It’s interesting to compare this to Bluesky’s Relay server, which does a similar job and has similar issues. But with Bluesky, everything must go through the Relay. This means Relays must be huge, and only large territories could afford to run them. In practice, that leaves everyone stuck with America’s censorship policies. This media server idea is much more flexible.
