• Strawberry@lemmy.blahaj.zone · 18 hours ago

      The bots scrape costly endpoints, like the entire edit history of every page on a wiki. You can't realistically cache every possible generated page at once.
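      Rough numbers to show why "just cache it" breaks down for generated pages like edit histories (every figure below is invented, purely for illustration):

      ```python
      import math

      # Back-of-the-envelope: how many distinct generated pages a wiki's
      # history endpoints can expose. All numbers are hypothetical.
      pages = 50_000              # articles on a mid-sized wiki
      revisions_per_page = 40     # average edit history length
      revisions_per_view = 20     # revisions shown per history page
      avg_page_size_kb = 60       # rendered HTML size per response

      history_views = pages * math.ceil(revisions_per_page / revisions_per_view)
      diff_views = pages * (revisions_per_page - 1)   # one diff per adjacent revision pair

      total_views = pages + history_views + diff_views
      cache_size_gb = total_views * avg_page_size_kb / 1024 / 1024

      print(f"distinct cacheable URLs: {total_views:,}")
      print(f"cache needed to hold them all: ~{cache_size_gb:.0f} GB")
      ```

      Even with those modest assumptions you end up with millions of distinct URLs and on the order of a hundred gigabytes of rendered output, and every miss is an expensive render, which is exactly what the bots keep hitting.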

    • nutomic@lemmy.ml · 1 day ago

      A cache is limited in size and usually holds only the most recently viewed pages. But these bots go through every single page on the website, even old ones that no user ever views. Since they send only one request per page, caching doesn't really help.
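      A tiny simulation of that effect (cache size, URL names, and request counts are all made up; the cache here is a plain LRU, not any particular server's implementation):

      ```python
      from collections import OrderedDict

      class LRUCache:
          """Minimal LRU cache: evicts the least recently used entry when full."""
          def __init__(self, capacity):
              self.capacity = capacity
              self.store = OrderedDict()
              self.hits = 0
              self.misses = 0

          def get_or_render(self, url):
              if url in self.store:
                  self.store.move_to_end(url)   # mark as recently used
                  self.hits += 1
              else:
                  self.misses += 1              # the expensive render happens here
                  self.store[url] = "rendered page"
                  if len(self.store) > self.capacity:
                      self.store.popitem(last=False)  # evict least recently used

      cache = LRUCache(capacity=1_000)          # hypothetical: cache holds 1,000 pages

      # Normal users: repeated requests to a small set of popular pages -> mostly hits.
      for i in range(10_000):
          cache.get_or_render(f"/wiki/popular-{i % 200}")

      # Crawler: one request each to 100,000 distinct old pages -> every one is a miss,
      # and each insert evicts a page real users actually revisit.
      for i in range(100_000):
          cache.get_or_render(f"/wiki/old-page-{i}")

      print(f"hits: {cache.hits:,}  misses: {cache.misses:,}")
      ```

      The crawler's hit rate is zero by construction: every URL is new, so every request is a full render, and the churn pushes the genuinely popular pages out of the cache as well.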

      • jagged_circle@feddit.nl · 22 hours ago

        Cache size is definitely not an issue, especially for these companies using Cloudflare.

        • nutomic@lemmy.ml · 20 hours ago

          It is an issue for the open source projects discussed in the article.

    • LiveLM@lemmy.zip · 2 days ago

      I’m sure that if it were that simple, people would be doing it already…