The Role of Deep Learning in Filtering Hate Speech

Social media networks are in a great position to deploy deep learning to eradicate some of the darker aspects of online behavior. Read how it’s only a matter of time before society benefits from deep learning’s potential to filter abusive language.

By Mark Stone, Contributor

According to a Pew Research Center report, 41 percent of American adults have experienced some form of online harassment, and two-thirds have witnessed abusive online behavior directed at others. As the cost of AI technology declines, social media networks are in a stronger position to harness deep learning to eradicate some of these darker aspects of online behavior, specifically by identifying hateful or abusive language. In some cases, deep learning is already used to analyze online communication, particularly among youths.

Although deep learning cannot yet inhibit abusive language entirely, tools that monitor hate speech are becoming more prevalent, emerging across internet apps, services, and sites.

The questions many business leaders want answered: Can these deep learning tools curb troublesome social behavior? And if so, how can humans best work with this technology?

Processing Normal Behavior, Together

One company making headway in the online filtering space is Two Hat Security. Two Hat’s primary product, an AI-based communication filter called Community Sift, identifies and thwarts bullying, harassment, and child exploitation in chat forums for large communities, such as the popular apps Animal Jam and Roblox and Supercell’s games.

At a high level, Two Hat’s algorithm is based on pre-selected words and multi-word rules vetted by its Language & Culture specialists (experts in linguistic patterns and cultural knowledge, including popular lingo). Incoming client text is fed through Two Hat’s algorithms and normalized: punctuation and capital letters are stripped and Unicode characters are mapped to standard equivalents, exposing any hidden subversion or meaning.
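Two Hat hasn’t published its rule set, but a minimal sketch of that normalization-and-matching step, assuming a hypothetical homoglyph map and a toy list of vetted terms, might look like this:

```python
import string
import unicodedata

# Hypothetical map of Unicode and symbol look-alikes folded to plain letters.
HOMOGLYPHS = {"\u0430": "a", "\u0435": "e", "\u043e": "o", "@": "a", "$": "s", "0": "o"}

# Toy stand-in for the vetted single-word and multi-word rules.
FLAGGED_TERMS = {"badword", "kill yourself"}

def normalize(text: str) -> str:
    """Lowercase the text, fold look-alike characters, and strip punctuation."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
    return "".join(ch for ch in text if ch not in string.punctuation)

def matches_rules(text: str) -> bool:
    """Return True if any vetted term appears in the normalized text."""
    normalized = " ".join(normalize(text).split())
    return any(term in normalized for term in FLAGGED_TERMS)

print(matches_rules("B@dw0rd!!!"))  # True once the obfuscation is stripped away
```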

The AI software then assesses the “risk level” for the client, defining the level of abuse potentially involved. High-risk incidents can always be vetted by clients; mid-risk incidents, which fall into a gray area, almost always require human intervention because they are highly context-dependent.
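Two Hat hasn’t disclosed its thresholds, but the triage it describes can be sketched roughly as follows; the 0-to-1 risk scale, the cutoffs, and the action names are illustrative assumptions, not Community Sift’s actual configuration:

```python
from enum import Enum

class Action(Enum):
    ALLOW = "allow"                            # low risk: let the message through
    HUMAN_REVIEW = "human_review"              # gray area: context-dependent, route to a moderator
    BLOCK_AND_ESCALATE = "block_and_escalate"  # high risk: block immediately, notify the client

# Illustrative cutoffs; a real system would tune these per community and audience.
LOW_RISK_MAX = 0.4
MID_RISK_MAX = 0.8

def triage(risk_score: float) -> Action:
    """Route a message according to the assessed risk level."""
    if risk_score <= LOW_RISK_MAX:
        return Action.ALLOW
    if risk_score <= MID_RISK_MAX:
        return Action.HUMAN_REVIEW
    return Action.BLOCK_AND_ESCALATE

print(triage(0.65))  # Action.HUMAN_REVIEW
```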

Chris Priebe, the company’s founder and CEO, honed his skills as the online security guru at Disney, where he had a license to hack anything bearing a Mickey Mouse logo. In the AI space, he’s helping his company become a specialist in what he calls “unnatural language processing,” a multidisciplinary branch of AI that helps computers interpret complicated, contextual human language.

While AI technology is good at predicting common behavior (think autocorrect on a smartphone), it lacks an understanding of context, which makes outlier cases harder to handle. In the case of hate speech, “normal” abuse is flagrant and predictable, but Priebe and his company are interested in sifting through the more complex cases of abusive language.

By focusing on unnatural language processing, Community Sift can detect when people try to trick standard filtering systems. Priebe said that anyone who wants to get around the system may go to great lengths to bypass filters, such as spelling swear words backwards, putting parentheses around words, or using Unicode characters. (They’ve seen almost everything.)
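One of those tricks, spelling a blocked word backwards, is cheap to catch once text has been tokenized; the check below is a hypothetical illustration rather than Community Sift’s actual logic:

```python
def reversed_term_hits(text: str, flagged_terms: set[str]) -> list[str]:
    """Return any flagged terms that appear spelled backwards in the text."""
    tokens = text.lower().split()
    return [token[::-1] for token in tokens if token[::-1] in flagged_terms]

print(reversed_term_hits("you are such a drowdab", {"badword"}))  # ['badword']
```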

Yet there is only so much deep learning can do to eradicate the problem, said Carlos Figueiredo, Two Hat’s director of community trust and safety. The company processes around 22 billion lines of text per month, enough data for the machine to learn as it goes; nevertheless, he said, human interaction is required.

“We don’t believe in artificial intelligence alone,” Figueiredo clarified. “You can’t wait for a machine to learn something and then deploy a new model and retrain it, which is the case with traditionally pure AI.” By intentionally pairing AI with a human expert system, the company lets the two work together to address incidents as they arise.

“We empower our clients and partners with potential flows they can use to warn users, suspend them, block the text, escalate it for review, and other actions,” Figueiredo said. The eventual goal is to use AI to identify the issue, then escalate and block it on the spot.

Big Tech, Big Challenges

Two Hat’s AI is making a difference in some larger online communities, yet there is still a long way to go before a similar revolution reaches the massive social networks.

As a co-founder of the Fair Play Alliance, a cross-industry initiative spanning more than 30 gaming companies whose mission is to drive positive change, Figueiredo collaborates with heavyweights like Facebook, Blizzard Entertainment, and Epic Games to tackle the problem of abusive online behavior, and he also participates in industry roundtables around the globe. And while AI is at the forefront of efforts to reduce hateful communication, its adoption is another story.

What may be standing in the way is the sheer scale of data these communities must process. With billions of users, the volume of data overwhelms any current deep learning system.

Yet, according to Figueiredo, the effort is there. “[Social media giants] are at the forefront of the hardest challenge,” he said. “They have been more forthcoming about it, and they are hiring thousands of moderators. They’re putting in the effort, and you can see the change of tone over the last three years.”

Google, for instance, has a language-filtering AI called Perspective, which any developer or publisher can include in their code. Twitter, too, recently acquired an anti-abuse provider called Smyte, whose service fights online abuse, harassment, and spam. Austin-based Authenticated Reality, a blockchain company, takes it one step further, imagining an entirely new internet in which everyone is authenticated before going online.
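Perspective is exposed as a simple REST endpoint that returns attribute scores such as TOXICITY. The sketch below follows Perspective’s publicly documented request format; an API key is required, and the exact field names may vary by API version:

```python
import requests

API_KEY = "YOUR_API_KEY"  # issued through the Google Cloud console
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       f"comments:analyze?key={API_KEY}")

def toxicity_score(text: str) -> float:
    """Ask Perspective for a 0-to-1 probability that the text is toxic."""
    payload = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload, timeout=10)
    response.raise_for_status()
    return response.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

print(toxicity_score("You are a wonderful person."))  # typically a very low score
```

A publisher could then hold or flag comments whose score crosses a threshold of its own choosing.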

And while 100 percent authentication might be an extreme case, deep learning technologies are nonetheless having a profound effect, giving humans more say over what enters their feeds. In the internet of the future, it is likely that we will not only choose what content we want to watch or read, but also curate the type of language we see.

“It’s really a societal challenge, where we don’t take all the risk away from people, be it children or adults,” Figueiredo said. “Without risk or conflict, we don’t learn, we don’t grow, and we don’t win.”