• 1 Post
  • 20 Comments
Joined 1 year ago
Cake day: June 11th, 2023



  • It clearly reads as an autogenerated reply. It still seems ambiguous to me whether it thinks you’re trying to move your domains to Squarespace and wondering if Google will still keep your data, or whether it’s about them moving domains to Squarespace.

    Though in general I’d assume that if you move all your domains out of Google Domains before the transition, there shouldn’t be anything for them to transfer to Squarespace.


  • I’m not sure that’s true. Most private trackers accept donations. Some even require you to buy some seedbox plan they get commission from (even though that’s generally frowned upon).

    All the high-profile trackers I can think of that were shut down through legal notice (Mininova, isoHunt, KickassTorrents, ThePirateBay, etc.) were public trackers. Maybe they had ads or something on their websites, but their shutdown had nothing to do with them making money. They were shut down for piracy even though they never “hosted” any content. They were just trackers.

    Hell, even Popcorn Time, software that just let you easily search torrents and stream them, was shut down by legal notice too. It hosted nothing; it just connected you to trackers that had movies.

    Trackers that survive are usually hosted behind VPNs and are physically located in Russia or China.




  • The search engine market isn’t quite as diverse as it may appear https://www.searchenginemap.com/

    There are maybe 4 or so ‘crawlers’, and the rest buy access to whatever part of that data the crawler operators are willing to sell to others.

    Running a crawler at the current size and complexity of the internet is expensive and complicated. Then there is sifting and sorting the data into a reasonably searchable format, and then there is the quality problem, etc.

    It’s much easier to license data access from a provider (usually Bing or Google or both) and just offer some added features on top, like no tracking, a different result UI, or custom filtering values passed to Bing’s or Google’s APIs that make up your own “secret sauce”, etc.


  • I think you sort of answer your own question when you wonder what the value of having tons of smaller instances versus a smaller number of larger ones as you discuss the stress of federation from the larger instances soon after.

    I’m not sure I have. I was saying you must have/will end up with a few large instances, PERIOD. There is just no way around it. Otherwise you have a discovery/usability problem. Absolutely 0 non-techies will set up their own Lemmy instance. It’s not even reasonable to assume they could. All I do for work is set up cloud deployments of applications and build tools to maintain and monitor them, and I have 0 interest in setting up a Lemmy or Mastodon instance. I like Mastodon in theory: if the main instance maintainers shit the bed, create an account on another instance. But I’m not setting up my own instance.

    For example, I’m into woodworking, selfhosting, programming, and cooking. I’m not setting up an instance for any of that. I just want to know which popular instances have woodworking, programming, and cooking communities and subscribe to them. I’m not gonna go looking for hundreds of small instances all over the internet, nor am I gonna host my own.

    As for moderation, everything does not federate everywhere at once, there needs to be some interaction between the instances first (like subscriptions to communities) for things to really start transferring over. Yes, moderation will still take work, but you don’t have to address the entire fediverse at once.

    Yes, that makes sense if the problematic aspect of another instance is just its content or culture. Like if there is, say, an ultra-religious instance, or a toxic incel instance, or a Nazi instance, etc. Sure, you never federate with them and you never have to worry about their culture.

    This doesn’t work for instances run by spammers or malicious actors, though. As a spammer or malicious actor, all I have to do is set up dozens and dozens of random instances that constantly spam every instance that’s on a list somewhere, forcing each target to federate with my spam instances and putting load on it. As a bad actor, I actively come to you, not the other way around.

    Then you’re back to the allowlist, which breaks the “federation” idea in general, or introduces a lot more mod/support cost because you need to manually allow each instance that you “think” is not spam/malicious.

    So finally, in a world of a few inevitable large instances and hundreds of thousands of small ones that are all polling/pushing activity to the big ones, that’s just significantly more load on the system as a whole. There is no magic around cache, latency, inconsistency, slowness, etc. except to be on the instance where you want to converse with people.
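
    To put rough numbers on that, here is a toy model. It assumes (my simplification, not a measurement of Lemmy or Mastodon) that a post costs the home instance one delivery per remote instance that has at least one follower, which is roughly what a shared inbox gives you. The helper names and follower counts are made up for illustration.

```python
def spread_followers(total_followers, instance_count):
    """Spread followers as evenly as possible across hypothetical instances."""
    base, extra = divmod(total_followers, instance_count)
    return {
        f"instance-{i}": base + (1 if i < extra else 0)
        for i in range(instance_count)
    }


def deliveries_per_post(followers_by_instance):
    """Count one delivery per remote instance with at least one follower.

    Assumption: shared-inbox style delivery, so the cost scales with the
    number of instances, not with the number of followers.
    """
    return sum(1 for count in followers_by_instance.values() if count > 0)


# Same 10,000 followers, two very different topologies.
few_large = spread_followers(10_000, 20)
many_small = spread_followers(10_000, 5_000)

few = deliveries_per_post(few_large)
many = deliveries_per_post(many_small)
print(f"few large instances: {few} deliveries per post")
print(f"many small instances: {many} deliveries per post")
```

    Under this model the audience is identical in both cases; only the number of servers the popular instance has to push to changes.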

    I tried googling for how ActivityPub scales to see if there is some write up and this came up for example https://lucumr.pocoo.org/2022/11/14/scaling-mastodon/

    from that article:

    One thing seems relatively certain: if Mastodon wants to host a sizable community where some people have followers from most other instances, then the size of an individual instance will matter a lot and I’m pretty sure that the only sensible approach will be to either not permit small instances to participate at all, or for those to come with some other restrictions that will require special handling.

    Many developers don’t want to accept the problem of back-pressure. (A topic I wrote about quite a bit incidentally). Unfortunately some bad servers can really break you, and you will have to avoid federating to them. In general too many small servers will likely cause issues for very popular accounts on popular servers.



  • Thanks for the reply. If you don’t mind, I still have a few questions.

    I understand the value of a distributed architecture and federation. What I wasn’t sure about is the value of tons (thousands? hundreds of thousands? millions?) of small instances vs a few hundred or a few thousand large ones.

    This spread out architecture allows for lesser hosting costs per instance and if an instance goes down it does not mean the entire service goes down as a whole.

    It seems that federation would put more pressure on all popular instances, no? The more popular an instance, the more likely others are to want to federate with it, and the more work it needs to do to push data, the more calls it has to handle, etc. I understand that relays could spread out the load, but you’re just pushing the problem one more level. I already see wildly different numbers of comments on the same thread between instances, depending on whether I view it on the home instance or a federated one, and that’s with low usage (talking about <100 comments). It seems to take a long time for things to sync, and some comments don’t seem to sync at all.

    And while sure, your own personal instance of Lemmy might be up and fine, if the popular instances you federate with are down, you’re essentially cut off still, right?

    Additionally, it allows for easier moderation as moderators (admins?) are instance specific. You don’t have to moderate the whole of Lemmy, keeping your own house clean is enough.

    You still have to moderate any instance you allow to federate with you, right? Either you lock down which instances you want to federate with (have an allowlist) or you block abusive instances (have a denylist); either way it’s still a lot of management. More flexible, for sure, but not exactly a walk in the park, right?
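
    The two policies I mean can be sketched like this. It is a generic illustration, not Lemmy’s actual admin settings or code; the function name and the rule that a non-empty allowlist takes precedence over the denylist are my assumptions.

```python
def may_federate(domain, allowlist=None, denylist=None):
    """Decide whether to accept federation traffic from `domain`.

    Assumed policy for this sketch: a non-empty allowlist is strict
    (only listed domains pass); otherwise everything passes unless it
    appears on the denylist.
    """
    domain = domain.lower()
    if allowlist:
        return domain in allowlist
    return domain not in (denylist or set())


# Denylist mode: open by default, so every new spam domain gets through
# until a moderator notices it and adds it to the list.
deny = {"spam-1.example"}
print(may_federate("spam-1.example", denylist=deny))
print(may_federate("spam-2.example", denylist=deny))

# Allowlist mode: closed by default, so every legitimate new instance
# needs a manual decision before it can federate.
allow = {"big-instance.example"}
print(may_federate("tiny-hobby.example", allowlist=allow))
```

    Either way someone has to maintain the list by hand, which is the management cost I’m asking about.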


  • well, there was a long thread about this on /r/selfhosted where @[email protected] @[email protected] was saying pretty much what I said, but with a tad more mental gymnastics mostly about EU laws regarding reverse engineering and lack of a formal agreement between them and YouTube.

    Unfortunately (or fortunately?), /r/selfhosted is private atm due to the blackout, so I’m unable to find and share the thread link.

    The facts are:

    • Invidious (as an OSS project) calls undocumented internal YouTube APIs (they call it InnerTube).
    • Anyone can host an Invidious instance.
    • The main Invidious instance, i.e. https://invidious.io/, received a cease and desist from YouTube.

    @[email protected] @[email protected] posted all about this on GitHub, reddit, their personal blog, and contacted random media outlets like the one linked here, to complain about how “we have nothing to do with YouTube, why is YouTube bullying us”. And since everyone obviously wants to give the little guy the benefit of the doubt, everyone starts wondering how it could be that a project that’s all about providing an alternative UI for YouTube, doesn’t call YouTube.

    It’s like if a movie pirating website is trying to argue

    “Endgame.mp4” is just a file name. It has nothing to do with Marvel or Disney. What the hell do those greedy companies have to do with us??

    I’m all for Invidious, piracy, etc. But seriously?







  • PGP email has nothing to do with the email protocol. All your message metadata and headers are still not encrypted/can’t be encrypted. You can only encrypt some payload with a PGP key, and it’s up to the receiver to figure out whether or not they want to trust any of the message metadata. The entire envelope is still plaintext everywhere. PGP email is just email, but you’re sending some random encrypted text in it.
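
    A minimal sketch of what that looks like on the wire, using only Python’s standard `email` package. The armored block is a made-up placeholder, not real PGP output, and the addresses and subject are hypothetical; the point is only that the headers are serialized as readable text next to the opaque body.

```python
from email.message import EmailMessage

# Placeholder standing in for real ciphertext. Producing actual PGP
# output would need GnuPG or similar, which this sketch does not use.
FAKE_ARMORED = (
    "-----BEGIN PGP MESSAGE-----\n"
    "\n"
    "hQEMA0placeholderciphertextonly\n"
    "-----END PGP MESSAGE-----\n"
)


def build_pgp_style_message(sender, recipient, subject, armored_payload):
    """Build an ordinary email whose body is an already-encrypted blob."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(armored_payload)
    return msg


msg = build_pgp_style_message(
    "alice@example.org", "bob@example.org", "Meet at noon", FAKE_ARMORED
)
wire = msg.as_string()             # what gets handed to the mail server
visible_headers = dict(msg.items())

# Sender, recipient, and subject are all readable without any key.
print(visible_headers["From"], "->", visible_headers["To"])
print(visible_headers["Subject"])
```

    Anyone relaying or storing that message can read who wrote to whom and the subject line; only the blob between the BEGIN/END markers needs the key.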



  • the best thing to do is to migrate by deleting your content from Reddit and moving it elsewhere

    That’s not really realistic for the type of content that is on Reddit. It’s not blogs or videos or photos that the majority of people have on Reddit. Most people’s “content” on Reddit is bookmarks/links or comments in discussion threads.

    It doesn’t make sense to just re-share a dump of all the links you once shared on Reddit, even if you have a list of them.

    It also doesn’t make sense to re-share comments elsewhere, out of their discussion context.


  • I don’t disagree, but I think it’s a bit of an oversimplification to attribute it all to capital. There is a failure in how the original internet (and traditional FOSS for that matter) envisioned the world.

    The original vision was that everything would be distributed. There are protocols, there are implementations, and there are “users”, where the term “user” encapsulated everyone from the person developing/contributing/maintaining the code, to the person deploying and operating it, all the way to the grandparent or child or otherwise absolutely non-technical end user.

    The idea was sound. If you are a technical user, you could run an email server for a set of people you know. Others could do the same. Small companies could start offering paid services, etc.

    But the devil is always in the details. Who is maintaining it? Who is keeping everything secure and updated? How does it scale? How frequently do you need to migrate everything because the operator is going out of business, has come down with health issues, or has died? How much trust do you have to put in every operator? People don’t want downtime. People don’t want frequent migrations. People don’t want to have to trust hundreds of small providers and have churn all the time in services they rely on day to day.

    The rise of large, centralized, popular operators for each type of service is inevitable in that case. A couple of large email providers were always destined to happen. Same with storage, messaging, etc. It’s difficult to selfhost everything yourself, and it’s incredibly burdensome to do it for free for a large number of people.


  • Can someone explain to me what the point is of having a lot of small instances of something like Lemmy?

    I’m very familiar with Azure, and looking at the docker-compose file and AWS setup, it’s very straightforward to set up a simple instance on Azure Container Apps. How much it costs you will depend heavily on what you want to do with it and how you expect it to be used.

    Like, how much traffic are you expecting?