Robert F. Kennedy Jr. sues Meta, citing chatbot’s reply as evidence of shadowban

jeffw@lemmy.world · 6 months ago

rottingleaf@lemmy.zip · 6 months ago

Because a good person would never need those. If you want to have shadowbans on your platform, you are not a good one.

It’s a bit like animal protection: animals can’t have rights balanced by obligations, but you would still want people who are cruel to animals kept somewhere you are not.

hedgehog@ttrpg.network · 6 months ago

“Because a good person would never need those. If you want to have shadowbans on your platform, you are not a good one.”

This basically reads as “shadow bans are bad and have no redeeming factors,” but you haven’t explained why you think that.

If you’re a real user and you only have one account (or have multiple legitimate accounts) and you get shadow-banned, it’s a terrible experience. Shadow bans should never be used on “real” users even if they break the ToS, and IME, they generally aren’t. That’s because shadow bans solve a different problem.

In content moderation, if a user posts something that’s unacceptable on your platform, generally speaking, you want to remove it as soon as possible. Depending on how bad the content they posted was, or how frequently they post unacceptable content, you will want to take additional measures. For example, if someone posts child pornography, you will most likely ban them and then (as required by law) report all details you have on them and their problematic posts to the authorities.

Where this gets tricky, though, is with bots and multiple accounts.

If someone is making multiple accounts for your site - whether by hand or with bots - and using them to post unacceptable content, how do you stop that?

Your site has a lot of users, and bad actors aren’t limited to only having one account per real person. A single person - let’s call them a “Bot Overlord” - could run thousands of accounts - and it’s even easier for them to do this if those accounts can only be banned with manual intervention. You want to remove any content the Bot Overlord’s bots post and stop them from posting more as soon as you realize what they’re doing. Scaling up your human moderators isn’t reasonable, because the Bot Overlord can easily outscale you - you need an automated solution.

Suppose you build an algorithm that detects bots with incredible accuracy - 0% false positives and an estimated 1% false negatives. Great! Then, you set your system up to automatically ban detected bots.

A couple of days later, your algorithm’s accuracy has dropped - from 1% false negatives to 10%. Ten times as many bots are making it past your algorithm. A few days after that, it gets even worse - first 20%, then 30%, then 50%, and eventually 90% of bots are bypassing your detection algorithm.

You can update your algorithm, but the same thing keeps happening. You’re stuck in an eternal game of cat and mouse - and you’re losing.

What gives? Well, you made a huge mistake when you set the system up to ban bots immediately. In your system, as soon as a bot gets banned, the bot creator knows. Since you’re banning every bot you detect as soon as you detect them, this gives the bot creator real-time data. They can basically reverse engineer your unpublished algorithm and then update their bots so as to avoid detection.
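To make the feedback problem concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the detection rule (a secret posts-per-hour limit), the value 37, and the function names are assumptions, not anything a real platform uses. The point is only that an instant ban acts as a yes/no oracle, and a yes/no oracle can be binary-searched.

```python
def is_banned_immediately(posts_per_hour: int, secret_limit: int = 37) -> bool:
    # Hypothetical detection rule: ban any account posting faster than
    # a secret limit. The attacker never sees this function's source,
    # only whether a given account got banned.
    return posts_per_hour > secret_limit


def probe_limit(oracle, low: int = 0, high: int = 1000) -> tuple[int, int]:
    """Binary-search the highest posting rate that does not trigger a ban.

    Each probe stands for one throwaway account run at a chosen rate.
    Returns (recovered_limit, number_of_probes).
    """
    probes = 0
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        probes += 1
        if oracle(mid):
            high = mid - 1  # banned: the limit is below this rate
        else:
            low = mid       # not banned: this rate is safe
    return low, probes


limit, probes = probe_limit(is_banned_immediately)
print(f"recovered limit: {limit} posts/hour after {probes} probes")
```

A real detector has far more than one dimension, but the same idea applies: the faster and more precise the feedback, the cheaper each probe is for the bot creator.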

One solution to this is ban waves. Those work by detecting bots (or cheaters, in the context of online games) and then holding off on banning them until you can ban them all at once.

Great! Now the Bot Overlord will have much more trouble reverse-engineering your algorithm. They won’t know specifically when a bot was detected, just that it was detected within a certain window - between its creation and ban date.
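A minimal sketch of the ban wave idea in Python, assuming accounts are identified by plain string IDs. The class and method names (`BanWaveQueue`, `flag`, `run_wave`) are hypothetical, and how often `run_wave` is called is left to the operator.

```python
from dataclasses import dataclass, field


@dataclass
class BanWaveQueue:
    """Hypothetical sketch: record detections now, ban them together later."""

    pending: set[str] = field(default_factory=set)
    banned: set[str] = field(default_factory=set)

    def flag(self, account_id: str) -> None:
        # Record the detection silently. Nothing visible happens to the
        # account yet, so the moment of detection is not revealed.
        if account_id not in self.banned:
            self.pending.add(account_id)

    def run_wave(self) -> set[str]:
        # Ban everything flagged since the last wave in a single batch.
        wave = set(self.pending)
        self.banned |= wave
        self.pending.clear()
        return wave


queue = BanWaveQueue()
queue.flag("bot-001")
queue.flag("bot-002")
# ...days pass, more detections accumulate...
wave = queue.run_wave()
print(sorted(wave))
```

Because both accounts are banned at the same moment, the ban time says nothing about which behavior tripped the detector or when.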

But there’s still a problem. You need to minimize the damage the Bot Overlord’s accounts can do between when you detect them and when you ban them.

You could try shortening the time between ban waves. The problem is that ban waves become more effective the longer that interval is. If you had an hourly ban wave, for example, the Bot Overlord could test a bunch of stuff out and get feedback every hour.

Shadow bans are one natural solution to this problem. That way, as soon as you detect a bot, you can prevent it from causing more damage. The Bot Overlord can’t quickly detect that their account was shadow-banned, so their bots will keep functioning, giving you more information about the Bot Overlord’s system and allowing you to refine your algorithm to be even more effective in the future, rather than the other way around.
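As a rough illustration of the mechanism, here is one way a shadow ban could be applied at read time, sketched in Python. The `Post` type and the `visible_posts` function are assumptions for the example, not how any particular platform implements it: a shadow-banned author still sees their own posts, while everyone else has them filtered out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    text: str


def visible_posts(posts: list[Post], viewer: str, shadow_banned: set[str]) -> list[Post]:
    """Return the posts that `viewer` should see.

    Posts from shadow-banned authors are hidden from everyone except
    the author, so from the author's side nothing appears to change.
    """
    return [
        post for post in posts
        if post.author not in shadow_banned or post.author == viewer
    ]


posts = [Post("alice", "hello"), Post("bot-042", "buy cheap pills")]
shadow_banned = {"bot-042"}

print(visible_posts(posts, viewer="alice", shadow_banned=shadow_banned))
print(visible_posts(posts, viewer="bot-042", shadow_banned=shadow_banned))
```

The filter runs when posts are read, so the bot's posts are still accepted and stored, which is what lets the platform keep observing its behavior.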

I’m not aware of another way to effectively manage this issue. Do you have a counter-proposal?

Out of curiosity, do you have any experience working in content moderation for a major social media company? If so, how did that company balance respecting user privacy with effective content moderation without shadow bans, accounting for the factors I talked about above?