Swiss Asylum Lottering Scraping the Federal Administrative Court's Database and Analysing the Verdicts 17. Feb., Python Summit 2017, bsk

Jul 06, 2020


Swiss Asylum Lottering

Scraping the Federal Administrative Court's Database and Analysing the Verdicts

17. Feb., Python Summit 2017, bsk


How the story began


Where’s the data?


Textfiles


The Court


Who are these judges?


The Plan

A Scrape judges

B Scrape appeal DB

C Analyse verdicts


A The judges

Scraping the BVGer site using BeautifulSoup

import requests
from bs4 import BeautifulSoup
import pandas as pd

Code on Github


B The appeals

This is a little trickier, so I want to go into a little more depth here.

import requests
import selenium
from selenium import webdriver
import time
import glob


driver = webdriver.Firefox()
search_url = 'http://www.bvger.ch/publiws/?lang=de'
driver.get(search_url)
driver.find_element_by_id('form:tree:n-3:_id145').click()
driver.find_element_by_id('form:tree:n-4:_id145').click()
driver.find_element_by_id('form:_id189').click()

#Navigating to text files


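for counter in range(0, last_element):
    # Grab the panel that holds the text of the current document
    text = driver.find_element_by_class_name('icePnlGrp')
    # Use a separate name for the file handle so the loop variable is not overwritten
    with open('txtfiles/' + str(counter) + '.txt', 'w') as f:
        f.write(text.text)
    # Click on to the next document
    driver.find_element_by_id("_id8:_id25").click()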

#Visiting and saving all the text files
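The saving step in the loop above can also be written as a small helper. This is only a sketch: `save_verdict` and the demo text are hypothetical and not taken from the talk's code. It writes with an explicit UTF-8 encoding, which matters because the verdicts contain umlauts and accented characters.

```python
import tempfile
from pathlib import Path


def save_verdict(text, index, out_dir="txtfiles"):
    """Write one verdict's text to <out_dir>/<index>.txt and return the path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{index}.txt"
    # UTF-8 keeps umlauts and accented characters intact on every platform
    path.write_text(text, encoding="utf-8")
    return path


# Usage with a made-up snippet instead of a live Selenium element
demo_dir = tempfile.mkdtemp()
saved = save_verdict("Die Beschwerde wird gutgeheissen.", 0, out_dir=demo_dir)
print(saved.read_text(encoding="utf-8"))
```

Inside the scraping loop, the call would presumably look like `save_verdict(text.text, counter)`.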


C Analyse verdicts

This is the trickiest part. I won’t go through the whole code.

import re
import time
import glob
from collections import Counter

import dateutil.parser
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

%matplotlib inline

The entire code


whole_list_of_names = []
for name in glob.glob('txtfiles/*'):
    name = name.split('/')[-1]
    whole_list_of_names.append(name)

#Preparing list of file names


def extract_aktennummer(doc):
    nr = re.search(r'Abteilung [A-Z]+\n[A-Z]-[0-9]+/[0-9]+', doc)
    return nr

def extracting_entscheid_italian(doc):
    entscheid = re.findall(r'le Tribunal administratif fédéral prononce\s*:1.([^.]*)', doc)
    return entscheid[0][:150]

#Developing Regular Expressions
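To see what the case-number pattern picks up, here is a minimal sketch run on a made-up verdict header. The sample text, the capture groups and the name `extract_case_number` are assumptions for illustration; real verdicts may be laid out differently.

```python
import re

# Same pattern as in extract_aktennummer, with capture groups added
AKTENNUMMER = re.compile(r'Abteilung ([A-Z]+)\n([A-Z]-[0-9]+/[0-9]+)')


def extract_case_number(doc):
    """Return (division, case_number), or None if the header is not found."""
    match = AKTENNUMMER.search(doc)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical header snippet, not copied from a real verdict
sample = "Bundesverwaltungsgericht\nAbteilung IV\nD-1234/2016\nUrteil"

result = extract_case_number(sample)
missing = extract_case_number("no header here")
print(result)   # ('IV', 'D-1234/2016')
print(missing)  # None
```

Returning `None` for a missing header makes it easy to count the documents the pattern fails on.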


def decision_harm_auto(string):
    # 'Gutgeheissen' = appeal granted, 'Abgewiesen' = appeal rejected
    gutgeheissen = re.search(
        r'gutgeheissen|gutzuheissen|admis|accolto|accolta', string)
    if gutgeheissen is not None:
        string = 'Gutgeheissen'
    else:
        string = 'Abgewiesen'
    return string

#Categorising using Regular Expressions
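A short sketch of how this keyword approach behaves on German, French and Italian decision sentences. The sample sentences and the name `categorise` are made up for illustration; only the keyword pattern is taken from the slide.

```python
import re

# Keywords for a granted appeal in German, French and Italian (from the slide)
GRANTED = re.compile(r'gutgeheissen|gutzuheissen|admis|accolto|accolta')


def categorise(decision_text):
    """Label a decision as granted if any keyword occurs, else as rejected."""
    return 'Gutgeheissen' if GRANTED.search(decision_text) else 'Abgewiesen'


# Made-up example sentences, not quotes from real verdicts
samples = [
    "Die Beschwerde wird gutgeheissen",
    "Le recours est admis",
    "Il ricorso è respinto",
    "Die Beschwerde wird abgewiesen",
]
labels = [categorise(s) for s in samples]
print(labels)

# Caveat: a negated keyword still counts as a hit
negated = categorise("Die Beschwerde wird nicht gutgeheissen")
print(negated)
```

Everything without a keyword falls into 'Abgewiesen', and a negated phrase is still labelled 'Gutgeheissen', so a manual check on a sample of documents seems advisable with this kind of matching.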


for judge in relevant_clean_judges:
    match = re.search(judge, doc)
    if match is not None:
        # Keep the name as it appears in the verdict
        short_judge_list.append(match.group())

#Looking for the judges
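The same lookup as a self-contained sketch. The names and the text are invented, and `find_judges` is a hypothetical helper. Unlike the loop above, it wraps each name in `re.escape`, an addition that keeps characters such as "." in a name from being read as regex syntax.

```python
import re


def find_judges(doc, judge_names):
    """Return the names from judge_names that occur in doc, in list order."""
    found = []
    for name in judge_names:
        # re.escape makes the name match as literal text
        if re.search(re.escape(name), doc):
            found.append(name)
    return found


# Made-up names and text for illustration only
judges = ["Anna Muster", "Hans Beispiel", "J. Test"]
doc = "Besetzung: Richterin Anna Muster (Vorsitz), Richter J. Test"

hits = find_judges(doc, judges)
print(hits)  # ['Anna Muster', 'J. Test']
```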


#First results and visuals


#And after a bit of pandas wrangling, the softest judges...


#...and toughest ones
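The ranking behind the last two slides boils down to a share of granted appeals per judge. The talk did this with pandas; the sketch below uses `Counter` from the standard library instead, on invented records, purely to show the arithmetic.

```python
from collections import Counter

# Made-up (judge, decision) pairs for illustration only
records = [
    ("Judge A", "Gutgeheissen"),
    ("Judge A", "Abgewiesen"),
    ("Judge A", "Gutgeheissen"),
    ("Judge B", "Abgewiesen"),
    ("Judge B", "Abgewiesen"),
    ("Judge B", "Abgewiesen"),
    ("Judge B", "Gutgeheissen"),
]

# Appeals handled and appeals granted, counted per judge
total = Counter(judge for judge, _ in records)
granted = Counter(judge for judge, decision in records if decision == "Gutgeheissen")

# Share of granted appeals, highest ("softest") first
rates = {judge: granted[judge] / total[judge] for judge in total}
ranking = sorted(rates, key=rates.get, reverse=True)

for judge in ranking:
    print(f"{judge}: {rates[judge]:.0%} of {total[judge]} appeals granted")
```

Printing the number of appeals next to each rate helps to spot judges whose share rests on only a handful of cases.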


3%

This is the share of appeals I could not categorise automatically, so approximately 300. (If I were a scientist I would do this by hand now, but I’m a lazy journalist…)


Publishing

After talking to lawyers, experts and the court, we published our story “Das Parteibuch der Richter beeinflusst die Asylentscheide” (“The judges’ party membership influences asylum decisions”) on 10 Oct 2016. The whole research took us 3 weeks.


The court couldn’t accept the results (at first)


Thanks!

Barnaby Skinner, data journalist @tagesanzeiger & @sonntagszeitung.

@barjack, github.com/barjacks, www.barnabyskinner.com