This video shows that Reddit refused to delete all of its users’ comments and posts when they closed their accounts via a CCPA / GDPR request.

  • crowsby@kbin.social · 38 upvotes · 1 year ago

    The creator of tildes.net is a former Reddit backend developer, and believes this behavior is likely due to how Reddit caching works (or doesn’t work), rather than an intentional subversion of user intent:

    Yes, this is almost certainly a technical issue. The way reddit caches things probably isn’t the standard way you’re thinking of, like a short-term cache that expires and refreshes itself. There are multiple layers of “cached” listings and items for almost everything, and a lot of these caches are actually data that’s stored permanently and kept up to date individually.

    For example, when you view your comments page, Reddit uses a cached (permanent) list of which comments are in that page. There is a separate list stored for each sorting method. For example, maybe you’d have something like this with some made-up comment IDs:

    Deimos’s comments by new: 948, 238, 153
    Deimos’s comments by hot: 238, 153, 948
    Deimos’s comments by controversial: 153, 238, 948

    If I post a new comment, it will go through each list and add the new ID in the right spot (for example, in the “new” list it always just goes at the start). If I delete a comment, it goes through every list, and removes the ID if it can find it in there.
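
    The per-sort listings described above can be sketched roughly as follows. This is only an illustration, not Reddit's actual code: the class name `UserListings`, the `positions` argument, and the way "hot" and "controversial" positions are supplied by the caller are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical sort orders; each one gets its own permanently stored ID list.
SORTS = ("new", "hot", "controversial")


class UserListings:
    """Toy model of one user's per-sort comment listings."""

    def __init__(self) -> None:
        self.listings: Dict[str, List[int]] = {sort: [] for sort in SORTS}

    def add_comment(self, comment_id: int, positions: Dict[str, int]) -> None:
        # "new" always inserts at the front; for the other sorts the caller
        # supplies the slot (a real system would compute it from scores).
        for sort, ids in self.listings.items():
            index = 0 if sort == "new" else positions.get(sort, len(ids))
            ids.insert(index, comment_id)

    def delete_comment(self, comment_id: int) -> None:
        # Walk every listing and drop the ID only if it is present there.
        for ids in self.listings.values():
            if comment_id in ids:
                ids.remove(comment_id)


# Rebuild the made-up example from the comment above.
deimos = UserListings()
deimos.add_comment(153, {})
deimos.add_comment(238, {"hot": 0, "controversial": 1})
deimos.add_comment(948, {"hot": 2, "controversial": 2})
for sort in SORTS:
    print(sort, deimos.listings[sort])
```

    The point of the sketch is that a delete touches every listing separately, so any listing that is missed or out of date keeps pointing at the comment.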

    One of the problems with this system (which is probably what’s causing @phedre’s issues, and affecting many other people trying to delete their whole history) is that all of these listings are capped at 1000 items. If you already have more than 1000 comments and you post a new one, the 1000th comment currently in the new list gets “pushed off the end”. The comment still exists, but you won’t be able to see it by looking through your comments page, because it’s no longer in that listing.

    Deleting comments also doesn’t cause previously “pushed off” ones to get re-added. If you have 5000 comments, your listing will only include 1000 of them. If you delete 50 of the ones in the listing, your listing now has 950 comments in it. If you delete all 1000 from the listing, your comments page will appear empty, but you actually still have 4000 comments that will be visible in the comments pages they were posted in.
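
    The effect of the 1000-item cap can be reproduced with a small model. Again this is a sketch under assumptions, not Reddit's implementation: `CappedNewListing`, `LISTING_CAP`, and the `all_comments` set standing in for permanent storage are hypothetical names chosen for the example.

```python
from typing import List, Set

LISTING_CAP = 1000  # cap described in the comment above


class CappedNewListing:
    """Toy model of a capped "new" listing next to the full comment store."""

    def __init__(self) -> None:
        self.all_comments: Set[int] = set()  # stand-in for permanent storage
        self.new_listing: List[int] = []     # what the profile page shows

    def post(self, comment_id: int) -> None:
        self.all_comments.add(comment_id)
        self.new_listing.insert(0, comment_id)
        # Anything past the cap is "pushed off the end" of the listing,
        # but the comment itself stays in storage.
        del self.new_listing[LISTING_CAP:]

    def delete(self, comment_id: int) -> None:
        self.all_comments.discard(comment_id)
        # Older comments are NOT pulled back into the listing.
        if comment_id in self.new_listing:
            self.new_listing.remove(comment_id)


feed = CappedNewListing()
for cid in range(5000):
    feed.post(cid)
print(len(feed.new_listing))   # 1000

for cid in feed.new_listing[:50]:
    feed.delete(cid)
print(len(feed.new_listing))   # 950

for cid in feed.new_listing[:]:
    feed.delete(cid)
print(len(feed.new_listing))   # 0
print(len(feed.all_comments))  # 4000
```

    A deletion tool that only walks the profile listing would stop here with an empty page, even though 4000 comments are still stored.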

    And this is only one aspect of it. There are also multiple other places and ways that comments are cached—comment trees are cached (order and nesting of comments on a comments page, for all the different sorting methods), rendered HTML versions of comments are cached, API data is probably cached, and so on.

    All of these issues are probably just some combination of all of your posts being difficult to find and access due to the listing limits or certain cached representations of posts not being cleared or updated properly.

    • eleitl@lemmy.world · 50 upvotes · 1 year ago

      Luckily GDPR deletion requests don’t care about how they are implemented. And failures to comply en masse tend to get really expensive.

      • JohnEdwa@kbin.social · 10 upvotes · edited · 1 year ago

        Yup. I’m waiting for Reddit to come back with my GDPR data request (which has a time limit of one month, which they can extend by up to two further months if they give their reasons, I believe), and assuming they have not reversed the API decision I’m ordering them to delete it all afterwards. And they even now have a handy list, the one they just gave me, of everything they have to purge - if they didn’t, it wouldn’t be on that list in the first place :)

        • ja534@kbin.social · 8 upvotes · 1 year ago

          Still waiting for the GDPR request I made at the start of this shitshow. It will be funny to witness the mass GDPR deletion requests of accounts at the start of July.

        • dan@upvote.au · 4 upvotes · 1 year ago

          It’s been 3-4 weeks since I submitted my CCPA request, and I still haven’t gotten my data yet. CCPA has a time limit of 45 days.

          • abff08f4813c@kbin.social · 2 upvotes · 1 year ago

            That’s what’s so awful about this. Prices were announced May 31st, so for a CCPA request that was done that very instant, they can delay until mid July, when the API changes will make it much more difficult to delete your data, and there’s no recourse.

            Even for GDPR, with the shorter 30 day limit, maybe you’d get it the day before. But a delay of a day or even a few hours could easily mean you’ve gone past the cutoff, and then the API is a problem for you too.

            This is some messed up timing, mates.
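
            The timing above can be checked with a few lines of date arithmetic. This is a sketch under assumptions: the thread gives no year, so 2023 is assumed, and July 1 is used for the API change because another comment mentions "the start of July". The 30 and 45 day limits are the figures quoted in the thread, and `response_deadline` is a hypothetical helper.

```python
from datetime import date, timedelta

# Assumptions: the year and the exact API cutoff are not stated in the thread.
ANNOUNCEMENT = date(2023, 5, 31)  # "Prices were announced May 31st"
API_CHANGE = date(2023, 7, 1)     # "start of July"

GDPR_LIMIT_DAYS = 30  # figure quoted in the thread
CCPA_LIMIT_DAYS = 45  # figure quoted in the thread


def response_deadline(request_date: date, limit_days: int) -> date:
    """Latest date a reply can arrive if the full time limit is used."""
    return request_date + timedelta(days=limit_days)


gdpr_deadline = response_deadline(ANNOUNCEMENT, GDPR_LIMIT_DAYS)
ccpa_deadline = response_deadline(ANNOUNCEMENT, CCPA_LIMIT_DAYS)

print(gdpr_deadline, (API_CHANGE - gdpr_deadline).days)  # 2023-06-30 1
print(ccpa_deadline, (ccpa_deadline - API_CHANGE).days)  # 2023-07-15 14
```

            Under these assumptions a GDPR reply could arrive one day before the cutoff, and a CCPA reply two weeks after it.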

            • DrNeurohax@kbin.social · 2 upvotes · 1 year ago

              I would hope that someone reaching out to press from ModCoord would pass these concerns on to journalists. A persistent journalist can uncover the extent of compliance with the GDPR and CCPA through proper questions. “Have you seen an increase in GDPR/CCPA requests since the controversy started? What percent of those have you completed? What about reports that users are unable to delete their data?” etc. (only better, because I’m not a journalist and am probably oversimplifying).

              • BrikoX@vlemmy.net · 3 upvotes · 1 year ago

                Reddit stopped answering requests for comment from objective journalists.

                People just need to start filing complaints with their Data Protection Authority. Then the mainstream media will be forced to cover the stories to get the clicks.

    • PositiveNoise@kbin.social · 21 upvotes · 1 year ago

      Based on this, I’d say that Reddit fully deserves to be banned in Europe and California, and fined into potential bankruptcy. Having deeply flawed technology that prevents them from ever being in compliance with a very serious law is no excuse.

      • lemmyvore@feddit.nl · 1 upvote · 1 year ago

        Not necessarily, although Reddit can definitely choose to play it that way.

        A lot of systems made in the pre-GDPR era (which is most of them) were not designed with the capability to decouple and erase content at a moment’s notice.

        Btw incompetence won’t hold up as a valid defence for violating GDPR. At most it can give them some stalling room.

    • Economizer@lemmy.world · 2 upvotes · 1 year ago

      Oh God. Somewhat unrelated, but I felt like I knew the name “Deimos” from somewhere. Couldn’t put my finger on it. Finally realized who he was.