Automated detection of criminal offences in social media postings: An AI-based case study focusing on 'Incitement to Hatred' (§ 130 StGB) in German law

Prof. Dr.-Ing. Torsten Zesch and Dr. Semire Yekta (Sprachtechnologie, Abteilung Informatik und Angew. Kognitionswissenschaft, Fakultät Ingenieurwissenschaften); Dr. iur. Frederike Zufall (Law, Science, Technology and Society Research Group, Free University Brussels (VUB))

Sep 01, 2020

Page 1

Automated detection of criminal offences in social media postings

An AI-based case study focusing on 'Incitement to Hatred' (§ 130 StGB) in German law

Prof. Dr.-Ing. Torsten Zesch, Dr. Semire Yekta

Sprachtechnologie, Abteilung Informatik und Angew. Kognitionswissenschaft, Fakultät Ingenieurwissenschaften

Dr. iur. Frederike Zufall

Law, Science, Technology and Society Research Group, Free University Brussels (VUB)

Page 2

Interdisciplinary Team

Language Technology / AI

Inputs:

▪ Written / Spoken / Handwriting / Pictures

▪ especially for data from social media

Outputs:

▪ Deep semantic analysis

▪ Sentiment

▪ Argumentation

NLP/AI Framework DKPro (https://dkpro.github.io)

Legal expertise

▪ fully-qualified German lawyer (Volljuristin)

▪ EU law, IT law

▪ computational law

▪ foundational research on data-driven law

▪ interdisciplinary background

▪ Law Science Technology & Society Research Group (LSTS), Brussels

▪ Waseda University Institute for Advanced Study (WIAS), Tokyo

Page 3

LTLab | Europäischer Polizeikongress - Forum: 2I Künstliche Intelligenz in der Polizeiarbeit

Hate Speech Expertise

Hate speech definitions

▪ Ross, B., et al. (2016). Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis. In Proceedings of NLP4CMC III: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication (Michael Beißwenger, Michael Wojatzki, Torsten Zesch, eds.).

Implicitness

▪ Benikova, D., Wojatzki, M., & Zesch, T. (2017). What does this imply? Examining the Impact of Implicitness on the Perception of Hate Speech. In GSCL 2017, Berlin, Germany.

Hate speech towards women

▪ Gold, D., Wojatzki, M., Horsmann, T., & Zesch, T. (2018). Do Women Perceive Hate Differently: Examining the Relationship Between Hate Speech, Gender, and Agreement Judgments. In KONVENS.

Hate speech detection systems

▪ Zhang, H., Wojatzki, M., Horsmann, T., & Zesch, T. (2019). ltl.uni-due at SemEval-2019 Task 5: Simple but Effective Lexico-Semantic Features for Detecting Hate Speech in Twitter. In SemEval 2019.

▪ Aggarwal, P., Horsmann, T., Wojatzki, M., & Zesch, T. (2019). LTL-UDE at SemEval-2019 Task 6: BERT and Two-Vote Classification for Categorizing Offensiveness. In SemEval 2019.

Legal perspective (basis of this talk)

▪ Zufall, F., Horsmann, T., & Zesch, T. (2019). From Legal to Technical Concept: Towards an Automated Classification of German Political Twitter Postings as Criminal Offenses. In NAACL.

Page 4

Hate Speech – Scientific Definition

[Diagram: Social media comments → AI scientist → Hate speech? (yes / no)]

Page 5

Hate Speech – Scientific Definition

Merkel Vasallen sind bei Twitter unerwünscht!

Merkel minions are not wanted on Twitter!

Deutsche Medien, Halbwahrheiten und einseitige Betrachtung, wie bei allen vom Staat finanzierten "billigen" Propagandainstitutionen 😜

German media, half-truths and one-sided consideration, as with all "cheap" propaganda institutions financed by the state 😜

Source: GermEval 2018. Annotation category: abusive comments (highest category)

Page 6

Hate Speech – Facebook’s Definition

[Diagram: Social media comments → illegal? / not wanted? → yes / no]

• probably overblocking

Page 8

Hate Speech – Legal Definition

[Diagram: Social media comments → illegal? → no]

• relatively few decided cases

Page 9

Deconstructing the Law

Page 10

Deconstructing the Law

Is it a group?

Page 11

Deconstructing the Law

Is it a group?

Set of target groups

• LGBTQ+

• Jews

• Refugees

• Muslims

• Politicians

• Disabled

• ...

Page 12

Deconstructing the Law

Is it a group?

Set of target groups

Which target act?

Page 13

Example

Target group?

Akademiker sind alles Lügner.

Academics are all liars.

Page 14

Example

• incitement of hatred

• call for violence

• call for arbitrary measures

• assault human dignity by

    • insult

    • maliciously maligning

    • defaming

Target group? Target act?

Akademiker sind alles Lügner.

Academics are all liars.

Page 15

More examples

Donald Trump is a liar

➔ no target group + no target act = legal

Muslims are nice

➔ target group + no target act = legal

Muslims are rapists

➔ target group + target act = illegal

Kill all liars

➔ no target group + target act = legal

Kill all Muslims

➔ target group + target act = illegal
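
The pattern in these examples can be written down as a simple conjunction: a posting is flagged only when both a target group and a target act are present. The sketch below is a minimal illustration of that rule, not the presenters' system. The `Assessment` class and `verdict` function are hypothetical names, and the two flags are copied by hand from the slide, whereas a working system would obtain them from classifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """Hypothetical container for the two elements checked on the slide."""
    has_target_group: bool
    has_target_act: bool

    @property
    def potentially_illegal(self) -> bool:
        # Both elements must be present; either one alone is not enough.
        return self.has_target_group and self.has_target_act


def verdict(assessment: Assessment) -> str:
    return "illegal" if assessment.potentially_illegal else "legal"


# Flags copied from the slide's examples (assumed, not predicted by a model).
EXAMPLES = {
    "Donald Trump is a liar": Assessment(False, False),
    "Muslims are nice": Assessment(True, False),
    "Muslims are rapists": Assessment(True, True),
    "Kill all liars": Assessment(False, True),
    "Kill all Muslims": Assessment(True, True),
}

if __name__ == "__main__":
    for text, assessment in EXAMPLES.items():
        print(f"{text!r}: {verdict(assessment)}")
```

The hard part, as the following slides show, is filling in the two flags reliably; the conjunction itself is trivial.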

Page 16

Challenges – Target Group

Akademiker sind alles Lügner. / Academics are all liars.

Spelling

▪ Akdemiker ... / Akademics ...

➔ solution: spell checking

Synonyms / Dynamic language use

▪ Akademtischks ... / Academchiks ...

➔ solution: contextualized distributional replacement vectors

Implicit language use

▪ Diese universitären Typen ... / Those college guys ...

➔ ongoing work
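
As a rough illustration of the spelling case only, the sketch below maps misspelled tokens onto a small target-group lexicon using fuzzy string matching from Python's standard library (`difflib.get_close_matches`). This stands in for a real spell checker and is an assumption, not the method used in the talk. The lexicon `GROUP_LEXICON`, the function name `find_target_groups`, and the cutoff of 0.8 are all hypothetical. Creative respellings and implicit references would need the distributional or contextual methods named above.

```python
from difflib import get_close_matches

# Hypothetical lexicon: known surface forms mapped to a canonical target group.
GROUP_LEXICON = {
    "akademiker": "academics",
    "academics": "academics",
    "muslime": "muslims",
    "muslims": "muslims",
    "flüchtlinge": "refugees",
    "refugees": "refugees",
}


def find_target_groups(text: str, cutoff: float = 0.8) -> set[str]:
    """Return canonical groups whose surface form closely matches a token."""
    groups = set()
    for raw in text.split():
        token = raw.strip(".,!?;:\"'()").lower()
        if not token:
            continue
        # Best lexicon entry with a similarity ratio of at least `cutoff`.
        match = get_close_matches(token, list(GROUP_LEXICON), n=1, cutoff=cutoff)
        if match:
            groups.add(GROUP_LEXICON[match[0]])
    return groups


if __name__ == "__main__":
    print(find_target_groups("Akdemiker sind alles Lügner."))
    print(find_target_groups("Donald Trump is a liar"))
```

A lower cutoff catches more misspellings but also more unrelated words, so the threshold would have to be tuned on annotated data.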

Page 17

Challenges – Target Act

▪ Akademiker sind alles Lügner. / Academics are all liars.

▪ Akademiker sind Helden / Academics are heroes

▪ Muslime sind Gläubige / Muslims are believers

▪ Muslime sind Vergewaltiger / Muslims are rapists

▪ Flüchtlinge sind Schmarotzer / Refugees are scroungers

▪ ...

X is Y
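
The sentences above share one surface form, so whether a target act is present depends on what fills the Y slot. The sketch below shows one naive way to pull X and Y out of such sentences with a regular expression and to look Y up in a word list. It is an illustration under stated assumptions: the pattern only handles single-word X and Y, and `NEGATIVE_Y` is a hypothetical mini-lexicon built from the slide's own examples, not a resource from the talk.

```python
import re

# Matches plain copula statements such as "Akademiker sind alles Lügner."
# Only single-word X and Y are handled; "alles"/"all" is an optional quantifier.
X_IS_Y = re.compile(
    r"^\s*(?P<x>\w+)\s+(?:sind|ist|are|is)\s+(?:(?:alles|all)\s+)?(?P<y>\w+)\W*$",
    re.IGNORECASE,
)

# Hypothetical mini-lexicon of negative predicates, taken from the examples.
NEGATIVE_Y = {"lügner", "liars", "vergewaltiger", "rapists",
              "schmarotzer", "scroungers"}


def parse_x_is_y(sentence):
    """Return (x, y) for an 'X is Y' sentence, or None if it does not fit."""
    m = X_IS_Y.match(sentence)
    return (m.group("x"), m.group("y")) if m else None


def has_negative_predicate(sentence):
    """True if the sentence fits 'X is Y' and Y is in the mini-lexicon."""
    parsed = parse_x_is_y(sentence)
    return parsed is not None and parsed[1].lower() in NEGATIVE_Y


if __name__ == "__main__":
    for s in ["Akademiker sind alles Lügner.", "Akademiker sind Helden",
              "Muslime sind Gläubige", "Flüchtlinge sind Schmarotzer"]:
        print(s, "->", parse_x_is_y(s), has_negative_predicate(s))
```

A fixed word list cannot cover paraphrases such as "Alle Akademiker lügen", which is why the same target act is shown in several surface forms on the next slides.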

Page 18

Challenges – Target Act

Assault human dignity by defaming

▪ Akademiker sind alles Lügner. / Academics are all liars.

▪ Alle Akademiker lügen. / All academics are lying.

▪ Akademiker sagen nie die Wahrheit. / Academics never tell the truth.

Call for violence

▪ Man sollte alle Akademiker an die Wand nageln. / All academics should be nailed to the wall.

▪ Ertränken das Akademikerpack. / Drown the academician pack.

▪ Akademiker in die Tonne treten. / Kick the academics into the bin.

Page 19

Challenges – Target Act

Call for arbitrary measures

▪ Einsperren allesamt die Herren in ihren Talaren. / Lock up all the men in their gowns.

Implicitness

▪ Eigentlich waren das doch alles Akademiker in Köln am Hbf. / Actually, they were all academics at Cologne Central Station.

▪ Akademikern sollte man jeden Tag einen Finger zuschicken. / Academics should be sent a finger every day.

Page 20

General Challenge – Irony and Sarcasm

Akademiker sind alles solche Lügner 😜

Academics are all such liars 😜

Wie können es diese Rapefugees wagen ohne Berufsabschluss vor Krieg abzuhauen.

How dare those rapefugees flee war without a professional qualification.

Page 21

Proof of Concept – Legal or not?

Page 22

Possible Use Cases

[Diagram: AI / NLP → Target group, Target act → ?]

▪ Ranking comments

▪ Find similar comment

▪ ???
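
To make the "find similar comment" and "ranking comments" use cases concrete, the sketch below ranks stored comments by cosine similarity of their bag-of-words vectors to a query comment. This is a purely lexical baseline chosen for illustration, built only from the standard library. It is an assumption rather than the approach behind the talk, which aims to go beyond keyword matching. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a deliberately simple text representation."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_by_similarity(query, comments):
    """Return (score, comment) pairs, most similar to the query first."""
    q = bag_of_words(query)
    scored = [(cosine(q, bag_of_words(c)), c) for c in comments]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    stored = [
        "Akademiker sind alles Lügner.",
        "Akademiker sind Helden",
        "Muslime sind Gläubige",
    ]
    for score, comment in rank_by_similarity("Alle Akademiker sind Lügner", stored):
        print(f"{score:.2f}  {comment}")
```

In a ranking scenario, the same scores could order incoming comments by their similarity to postings that were already assessed, so that reviewers see the closest matches first.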

Page 23

Beyond §130 StGB

§ 130 para. 1, para. 2 StGB → incitement to hatred

▪ this presentation, paper submitted

§§ 185, 186, 187 StGB → defamatory conduct

▪ Zufall, F., Horsmann, T., & Zesch, T. (2019). From Legal to Technical Concept: Towards an Automated Classification of German Political Twitter Postings as Criminal Offenses. In NAACL 2019.

§ 130 para. 3, para. 4 StGB → incitement to hatred with Nazi background

▪ future work

Page 24

Beyond German

Artificial Intelligence / Natural Language Processing

▪ relatively language independent

Legal situation

▪ Different in most countries

▪ But EU Framework quite similar to German law

Page 25

Summary

AI-based system

▪ State-of-the-art language technology → go beyond keyword search

▪ Operationalization of legal assessment in a working system

Limitations

▪ Not enough (annotated) comments for developing a truly robust system

▪ Implicitness still challenging (but manageable)

▪ Irony / Sarcasm really, really challenging

Thank You!