Yes, I’m certain I could find answers to all these questions via research, but I’m coming here as part of the Reddit diaspora. My guess is that there’s a benefit to others like me to have this discussion.
I can vaguely understand the federation concept, the idea that my account is hosted at an individual Lemmy server and that other servers trust that one to validate my account. What’s the network flow like? I’m posting this to the lemmy.ml /asklemmy community, but I’m composing it on the sh.itjust.works interface. I’m assuming sh.itjust.works hands this over to lemmy.ml. How does my browsing work? Is all of my traffic routed through sh.itjust.works?
Assuming there’s a mass influx of redditors, what does it look like as things fail? I’m assuming some servers can keep up under the load and some can’t. If sh.itjust.works goes down under the load, can I still browse other servers? Or, do those servers think I should have some token from sh.itjust.works, because my cookies say I’m still logged in, and I can’t even do that?
Are there easy mechanisms to allow me to grab my post history?
I’m assuming most (all?) Lemmy servers are hosted in home labs? The idea of Lemmy excites me, but the growth pain that could be coming scares me. Anybody using a CDN in front of their servers? That could be good, but with unconstrained growth, that could be costly, which is very bad.
I can imagine lots of different worst-case scenarios, but I’m curious what those of you who run servers imagine for the best-case scenario? A manageable growth that just gets more vibrant communities, which can’t ever lead to the breadth and variety of Reddit?
Also, for those running servers, have any of you experienced issues during this growth? What scares you?
Your home instance is sh.itjust.works, and that’s where all the info you care about resides. Your list of subscribed communities resides there. When you read a post, it gets fetched out of the db on sh.itjust.works (irrespective of where the home instance for that post’s community is… when you read it, it comes out of the database on your home instance), and when you comment on a post, that gets written to the db on your home instance. Your home instance is a standalone, fully functioning thing.
When other users on sh.itjust.works subscribe to the same community… there’s no incremental overhead. Y’all’s instance is ALREADY subscribed to that sub. So other users on your instance can sub to it for free, it’s already in the instance’s database.
If lemmy.ml (where this community is homed) falls over from being overloaded or just is broken for whatever reason, your instance is unaffected. You can still read posts and make comments. This community however… is affected. New posts and comments for this community might come through intermittently or not at all for you (and everyone in the lemmyverse) because the community’s home server isn’t working well enough to reliably deliver them over federated replication. You can still read older posts and comments that have already been synced to your home instance, but new ones might not arrive. You might also see weird stuff like being able to see new comments from other sh.itjust.works users on this community, since those get written to your db before getting federated back to the community’s home server. But mostly updates from other instances stop or get unreliable.
If sh.itjust.works falls over for some reason… well… that sucks for you. You can’t log in or browse anything on it. You can still visit this sub at https://lemmy.ml/c/asklemmy/ as long as lemmy.ml is working and you’ll be able to see the posts and comments that other accounts make. But you’ll be an anonymous read-only browser, you won’t be able to post or comment until sh.itjust.works comes back online (or you make a new account elsewhere and lose all your comment history and subscription list).
There’s a GitHub issue for this, but it’s not done yet: https://github.com/LemmyNet/lemmy/issues/506.
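For anyone who thinks better in code, here is a toy sketch of the home-instance behavior described above. It is not Lemmy's actual implementation: the `Instance` class and the `subscribe` and `comment` functions are made up for illustration, and copying a community's whole history on subscribe is a simplifying assumption.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare instances by identity, not by field values
class Instance:
    name: str
    online: bool = True
    # Local database: community name -> list of posts/comments stored here.
    db: dict = field(default_factory=dict)
    # For communities homed here: community name -> subscribed instances.
    followers: dict = field(default_factory=dict)

    def read(self, community):
        """Browsing is always served from this instance's own database."""
        if not self.online:
            raise ConnectionError(f"{self.name} is down")
        return list(self.db.get(community, []))


def subscribe(local, home, community):
    """The first subscriber on an instance triggers the sync; later ones are free."""
    subs = home.followers.setdefault(community, [])
    if local in subs:
        return False  # instance already subscribed, nothing new to store
    subs.append(local)
    # Simplification (assumption): copy everything the home instance has.
    local.db[community] = list(home.db.get(community, []))
    return True


def comment(local, home, community, text):
    """Write to the local db first, then federate via the community's home."""
    if not local.online:
        raise ConnectionError(f"{local.name} is down")
    local.db.setdefault(community, []).append(text)
    if not home.online:
        return False  # visible on the local instance only, for now
    home.db.setdefault(community, []).append(text)
    for follower in home.followers.get(community, []):
        if follower is not local and follower.online:
            follower.db.setdefault(community, []).append(text)
    return True
```

In this toy, a comment made while the community's home instance is down stays visible to users on the commenter's own instance but never reaches anyone else, since there is no retry logic. That matches the "weird stuff" case above.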
I don’t think that’s a good assumption.
lemmy.ml is hosted on OVH, a cloud provider. My home instance on lemmy.world is hosted by admins that run something like a 32 CPU Mastodon instance. Most instances with over 100 users are running on some kind of probably modest but “real” cloud instance. The admins are volunteers, but often smart technical folks paying for small but real compute infrastructure.
Anticipating growing pains isn’t wrong, it’s probably gonna happen. But the devs are gonna find and work on the biggest performance problems so that people can viably run bigger instances, and instance admins are gonna run bigger hardware and ask for donations or run patreons to cover the cost. In my opinion, the bigger worry is that Lemmy will fizzle… not that it will spectacularly explode. As long as people join and contribute and are interested, we’ll find a way to improve scalability and performance. The death knell would be if people get bored and leave, but compute capacity won’t be the problem in that scenario.
Thanks. That was an incredibly detailed response that answers the questions I was asking.
Doesn’t the fact that every Lemmy server has a copy of every federated post mean that if Lemmy takes off, only a few people with strong donation feeds can afford to survive?
If there’s an active forum (sub-lemmy?) on a server that has to spin down, the history stays on the remaining active ones, but I assume the only option is forking?
Moderation can only happen on the server hosting a forum, or each server can moderate posts in that server’s db?
It’s not precisely true that every Lemmy instance stores every post. A given Lemmy instance will store a given post if and only if:
The first of these is most important though, because it means that posts and comments that no one is interested in don’t get shipped around the federated network. And this leads to the property that the size/cost of a Lemmy instance is going to depend on the size of the “active” usage. A single user Lemmy instance subscribing to a handful of communities will always be small and cheap, because it doesn’t subscribe to much content. A bigger Lemmy instance need not scale to the entirety of content in the lemmyverse, but rather to the “active set” of posts and comments its users interact with this month. That could get big, but what the Lemmy devs are saying (sorry no link, I’ve read too many posts lately to remember all my sources) is that user-traffic browsing the local DB of the Lemmy instance is dwarfing the replication load, which is great news because user browsing is much easier to optimize than federated replication.
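A rough way to picture the “active set” point, as a sketch only: assume an instance keeps one copy of each community that at least one local user follows. The function name, the community names, and the post counts below are all hypothetical.

```python
def stored_posts(subscriptions, posts_per_community):
    """Posts an instance stores: one copy per community any local user follows."""
    # Union of every local user's subscriptions; duplicates cost nothing extra.
    followed = set().union(*subscriptions.values())
    return sum(posts_per_community[c] for c in followed)


# Hypothetical post counts for three communities.
posts = {"asklemmy": 1000, "linux": 500, "memes": 20000}

one_user = {"alice": {"asklemmy"}}
two_users = {"alice": {"asklemmy"}, "bob": {"asklemmy", "linux"}}

small = stored_posts(one_user, posts)
bigger = stored_posts(two_users, posts)
```

Under these assumptions the instance never pays for the community nobody follows locally, and bob's overlapping subscription to the first community adds no storage.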
(FYI, the thing you subscribe to is called a community in Lemmy. Some folks say sublemmy, but this is a redditism that isn’t used in the code or official docs. It’s a “community”, which is why the url for a community is hxxp://my.lemmy.social/c/mycommunity. The “c” in the middle stands for community.)
Well, we’ve already talked about caching and expiry. It’s not clear to me that any Lemmy instance other than the one that hosts the community is required to keep the ENTIRE post/comment history (though yeah the active/recent ones will be all over the federated network).
I haven’t lived through a major instance shutdown, maybe an old-timer can weigh in here. Speculating, I’d think there would be 2 options:
!lemmy@lemmy.ml to coordinate), make a new community on a different server and the mods post telling everyone to subscribe there. The new community would be… well… new. It wouldn’t have the old posts, it would be made from scratch. The only things that would bind it to the old community are the mods that come over, the users that follow them, and the culture.
Yes I’ve seen something like that on mastodon already. Though the caching is scaled by time so you can just say to cache only the last 24 hours (or less) which will scale down storage requirements.
Also, didn’t know Ruud was running a .world lemmy instance. Cool!
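The time-window caching mentioned above could look something like this. It is a minimal sketch, not how Mastodon or Lemmy actually do it: `prune_cache` and the shape of the cache (item id mapped to the time it was fetched) are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def prune_cache(cached, now, max_age=timedelta(hours=24)):
    """Keep only remote items fetched within max_age of now."""
    cutoff = now - max_age
    return {item: fetched for item, fetched in cached.items() if fetched >= cutoff}


now = datetime(2023, 6, 10, 12, 0, tzinfo=timezone.utc)
cache = {
    "post-1": now - timedelta(hours=2),   # recent, kept
    "post-2": now - timedelta(hours=30),  # older than 24 hours, dropped
}
fresh = prune_cache(cache, now)
```

Shrinking `max_age` trades storage for more refetching from the remote instance when someone opens an older item.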
I’m wondering if Lemmy (and maybe the fediverse generally) has the potential to offer incoming Reddit mods something Reddit can’t: compensation.
Obviously it won’t be huge, but I feel like there’s a greater chance that a Lemmy user will make at least small one-off or even regular donations to help keep their communities and/or instances running.
Like if I’m running a server and voluntarily paying $50/month (as, say, the mod of a 20M+ user subreddit might), even if just 20 users contribute $2.50 each, that’s a full month’s worth of server time paid for.
As I say, the scale would be quite low, but wonder if it could be an interesting idea to try out, even if just as a proof-of-concept.
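The arithmetic behind that example, as a tiny sketch. `donors_needed` is a made-up helper, and amounts are in whole cents to avoid floating-point rounding.

```python
def donors_needed(monthly_cost_cents, donation_cents):
    """Smallest number of equal donations that covers the monthly cost."""
    # Ceiling division on integers: round up if the cost isn't an exact multiple.
    return -(-monthly_cost_cents // donation_cents)


# $50/month server, $2.50 per donor.
needed = donors_needed(5000, 250)
```

With the numbers from the comment above, `needed` comes out to 20 donors.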
There’s definitely scope for this.
I know I would never pay for facebook / twitter / reddit. Not 1 cent, but I’m happily contributing to fosstodon on patreon each month - and I don’t think I’m alone.
I think there’s a not-insignificant segment of the population that are weary enough of the advertising revenue model that we’re happy to pay to avoid it. The expectation that everything on the net should be free has ruined the net IMO.
In a maybe ironic twist, I’m more than happy to pay for sites that are truly free but completely unwilling to pay for “free” sites backed by ads, tracking, and corporate bullshit. I subscribed to the Lemmy dev’s Patreon and I sponsor a few open source projects on GitHub but I would never pay for Twitter Blue, Reddit Gold, Discord Nitro, etc. I want to reward good behavior, not support bad. I also supported Mastodon for a while on Patreon but when they changed the Mastodon onboarding process to make it more centralized I pulled back on that. I don’t want to reward restriction of the openness the Fediverse provides, even if some subset of users can’t figure it out.
Hero response right here!