Measuring the Impact of Sharing Abuse Data with Web Hosting Providers
Marie Vasek, Matthew Weeden, and Tyler Moore
University of Tulsa
WISCS, 24 October 2016
StopBadware
• Founded in 2006 by Harvard’s Berkman Klein Center for Internet and Society
• Now housed at the University of Tulsa
• Provides independent reviews of websites appearing on 3 malware blacklists
Review Requests for Individual URLs
Review Requests for Bulk URLs
Research Questions
Does sending bulk reports help?
• Short term:
  ◦ Do reported URLs get cleaned up?
  ◦ Which URLs are more likely to get cleaned up?
• Long term:
  ◦ Do ASes get better at cleaning URLs after receiving bulk reports?
Overview
• Brief overview of study
• Define metrics
• Direct impact of sharing abuse data
• Indirect impact of sharing abuse data
• Conclusions
Bulk Requests over Time
[Figure: # URLs shared (y-axis) against date shared (x-axis, 2010–2015).]
Summary Statistics
• Google Safe Browsing data used exclusively
• 6-year time frame (2010–2015)
• 69 stakeholders requested reports
• 41 web hosting providers in our study
  ◦ Responsible for an entire AS
  ◦ Sent Google Safe Browsing data
  ◦ Had at least a month of data before and after the report
• 28,548 URLs reported
Malware Cleanup Metrics
• Clean
  ◦ Off the blacklist
  ◦ Stays off for 3 weeks
• Recompromise
  ◦ A previously blacklisted URL is clean and then is blacklisted again
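The two metrics above can be sketched in code. This is a minimal illustration, not StopBadware's implementation: it assumes each URL's history is available as a date-sorted list of blacklist intervals, and the names `clean_dates`, `is_recompromised`, and `observed_until` are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Optional, Tuple

CLEAN_WINDOW = timedelta(weeks=3)  # "stays off for 3 weeks"

# (blacklisted on, delisted on); delisted is None if the URL is still listed.
Interval = Tuple[date, Optional[date]]

def clean_dates(intervals: List[Interval], observed_until: date) -> List[date]:
    """Return the delisting dates that count as cleanups.

    Assumes intervals are sorted by listing date and do not overlap.
    """
    cleans = []
    for i, (_, delisted) in enumerate(intervals):
        if delisted is None:
            continue  # still on the blacklist
        # The quiet period ends at the next listing, or at the end of observation.
        next_listed = intervals[i + 1][0] if i + 1 < len(intervals) else observed_until
        if next_listed - delisted >= CLEAN_WINDOW:
            cleans.append(delisted)
    return cleans

def is_recompromised(intervals: List[Interval], observed_until: date) -> bool:
    """True if the URL was clean at some point and was blacklisted again later."""
    cleans = clean_dates(intervals, observed_until)
    return any(listed > c for c in cleans for listed, _ in intervals)

# Made-up history for one URL, not data from the study.
history = [
    (date(2014, 1, 1), date(2014, 1, 10)),   # relisted 5 days later: not a cleanup
    (date(2014, 1, 15), date(2014, 2, 1)),   # stays off for months: a cleanup
    (date(2014, 6, 1), None),                # listed again: a recompromise
]
print(clean_dates(history, date(2014, 12, 31)))
print(is_recompromised(history, date(2014, 12, 31)))
```

A delisting too close to the end of the observation period is not counted as clean here, because the 3-week condition cannot yet be verified.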
Measuring Direct and Indirect Impact of Reporting
• Direct impact
  ◦ Are the URLs we shared cleaned up?
• Indirect impact
  ◦ Are networks “better” after receiving a bulk review from StopBadware?
    • Do they clean malware URLs faster?
    • Do they clean malware URLs more effectively?
Measurement Timeline
[Figure: timeline of events: blacklisted, then reported, then clean. Intervals measured: blacklist to report, report to clean, and blacklist to clean.]
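The three intervals on the timeline can be computed from the three event dates. A minimal sketch, assuming one date per event; the `Timeline` type and the function name are hypothetical and the dates are invented.

```python
from datetime import date
from typing import Dict, NamedTuple

class Timeline(NamedTuple):
    blacklisted: date  # URL first appears on the blacklist
    reported: date     # URL is shared with the hosting provider
    clean: date        # URL comes off the blacklist and stays off

def intervals_in_days(t: Timeline) -> Dict[str, int]:
    """Return the three intervals from the timeline, in days."""
    return {
        "blacklist_to_report": (t.reported - t.blacklisted).days,
        "report_to_clean": (t.clean - t.reported).days,
        "blacklist_to_clean": (t.clean - t.blacklisted).days,
    }

# Invented example dates, not data from the study.
example = Timeline(date(2013, 3, 1), date(2013, 3, 11), date(2013, 3, 25))
print(intervals_in_days(example))
# {'blacklist_to_report': 10, 'report_to_clean': 14, 'blacklist_to_clean': 24}
```

Blacklist to clean is the sum of the other two intervals, so long-lived infections that are reported late show up in both blacklist to report and blacklist to clean.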
Cleanup of URLs Shared with ASes
[Figure: URLs shared with ASes. x-axis: Report to Clean (days); y-axis: Pr(report to clean days >= X).]
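The quantity on the y-axis, Pr(report to clean days >= X), is an empirical survival function. A minimal sketch of how such a curve could be computed, assuming every URL in the input was eventually cleaned; the function name is hypothetical and the durations are invented.

```python
from typing import Dict, List

def survival_curve(durations: List[int]) -> Dict[int, float]:
    """Map each observed duration x to the fraction of URLs with duration >= x."""
    n = len(durations)
    return {x: sum(d >= x for d in durations) / n for x in sorted(set(durations))}

# Invented report-to-clean durations in days, not data from the study.
print(survival_curve([1, 3, 3, 10, 40]))
# {1: 1.0, 3: 0.8, 10: 0.4, 40: 0.2}
```

URLs that were never cleaned during the observation period are censored observations. This sketch ignores them; an estimator that handles censoring, such as Kaplan-Meier, would be the usual choice when they are present.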
Long Lived Malware Takes Longer to Clean
[Figure: x-axis: Decile for Blacklist to Report (Days), 0−10% to 90−100%; bars: Median Report to Clean (Days); line: Blacklist to Report (Days).]
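The grouping behind this plot can be sketched as follows: sort URLs by blacklist-to-report time, split them into ten equal-sized groups, and take the median report-to-clean time within each group. This is a minimal sketch with a hypothetical function name and synthetic input; the exact decile boundaries used in the study are not stated on the slide.

```python
from statistics import median
from typing import List, Tuple

def median_clean_by_decile(records: List[Tuple[float, float]]) -> List[float]:
    """records holds (blacklist_to_report_days, report_to_clean_days) per URL.

    Returns the median report-to-clean time for each decile of
    blacklist-to-report time, from shortest-lived to longest-lived.
    """
    ordered = sorted(records)  # sorts by blacklist-to-report first
    n = len(ordered)
    medians = []
    for k in range(10):
        chunk = ordered[k * n // 10 : (k + 1) * n // 10]
        values = [clean_days for _, clean_days in chunk]
        medians.append(median(values) if values else float("nan"))
    return medians

# Synthetic input in which cleanup time grows with time on the blacklist.
records = [(i, 2 * i) for i in range(1, 21)]
print(median_clean_by_decile(records))
```

With fewer than ten records some deciles are empty and are reported as NaN.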
Pre- vs. Post-Contact Cleanup
[Figure: Survival probability before and after contact. x-axis: Blacklist to Clean (days); y-axis: Pr(blacklist to clean days >= X); series: pre-contact, post-contact.]
Pre- vs. Post-Contact Cleanup: Improved AS
Pre- vs. Post-Contact Cleanup: Worsened AS
Pre- vs. Post-Contact Cleanup: Unclear effect AS
Change in Metrics Pre- and Post- Sharing
Category   # ASes   ∆ days to clean   ∆ recomp. rate
Improved       13                58            0.010
Worsened        3              -176            0.085
Unclear        17                13            0.008
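The two ∆ columns can be read as pre-sharing minus post-sharing differences, which is the sign convention given on the axis labels of the per-AS comparison plot. A minimal sketch under that assumption; the function names are hypothetical, the recompromise rate is assumed to be the fraction of cleaned URLs that are later blacklisted again, and the numbers are invented.

```python
from statistics import median
from typing import Sequence

def delta_days_to_clean(pre_days: Sequence[float], post_days: Sequence[float]) -> float:
    """Median blacklist-to-clean before sharing minus the median after sharing."""
    return median(pre_days) - median(post_days)

def recompromise_rate(recompromised: Sequence[bool]) -> float:
    """Fraction of cleaned URLs that were later blacklisted again (assumed definition)."""
    return sum(recompromised) / len(recompromised)

def delta_recompromise_rate(pre: Sequence[bool], post: Sequence[bool]) -> float:
    """Recompromise rate before sharing minus the rate after sharing."""
    return recompromise_rate(pre) - recompromise_rate(post)

# Invented numbers for one hypothetical AS, not data from the study.
pre_days, post_days = [30, 90, 200], [10, 25, 60]
print(delta_days_to_clean(pre_days, post_days))  # 65
```

Under this convention a positive ∆ days to clean means cleanup became faster after sharing, and a negative value means it became slower.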
Comparing Change in Metrics by AS
[Figure: x-axis: Median blacklist to clean, pre-sharing − post-sharing; y-axis: Median recompromise rate, pre-sharing − post-sharing; legend: Top Quartile Report to Clean, 2nd Quartile Report to Clean, 3rd Quartile Report to Clean, Bottom Quartile Report to Clean.]
Matched Pair Analysis
• What would happen if StopBadware had not sent out reviews?
• Matched pairs between reported-to ASes and similar ASes
• Similar?
  ◦ Same country
  ◦ Similar level of badness
• Key assumption: all else equal, ASes would exhibit similar patterns
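The matching step can be sketched as a greedy nearest-neighbour search within each country. This is an illustration only: the slides do not say how badness was scored or how ties and reuse were handled, so the `badness` field, the greedy strategy, and all names below are assumptions. The AS numbers come from the range reserved for documentation.

```python
from typing import Dict, List, NamedTuple, Optional

class ASInfo(NamedTuple):
    asn: int
    country: str
    badness: float  # hypothetical score, e.g. blacklisted URLs seen in the AS

def match_pairs(reported: List[ASInfo], candidates: List[ASInfo]) -> Dict[int, Optional[int]]:
    """Pair each reported-to AS with an unused candidate AS in the same
    country whose badness is closest; None if no candidate is available."""
    unused = list(candidates)
    pairs: Dict[int, Optional[int]] = {}
    for r in reported:
        same_country = [c for c in unused if c.country == r.country]
        if not same_country:
            pairs[r.asn] = None
            continue
        best = min(same_country, key=lambda c: abs(c.badness - r.badness))
        unused.remove(best)  # each control AS is used at most once
        pairs[r.asn] = best.asn
    return pairs

# Invented ASes, not data from the study.
reported = [ASInfo(64496, "US", 120), ASInfo(64497, "DE", 40)]
candidates = [ASInfo(64500, "US", 300), ASInfo(64501, "US", 110), ASInfo(64502, "FR", 40)]
print(match_pairs(reported, candidates))
# {64496: 64501, 64497: None}
```

The matched AS then stands in for what the reported-to AS would have done without a report, which is only as good as the key assumption above.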
Matched Pair: Cleanup of URLs Shared with ASes
[Figure: URLs shared with ASes. x-axis: Report to Clean (days); y-axis: Pr(report to clean days >= X); series: reported ASes, matched pairs.]
Matched Pair: Pre- vs. Post-Contact Cleanup
[Figure: Survival probability before and after contact. x-axis: Blacklist to Clean (days); y-axis: Pr(blacklist to clean days >= X); series: pre-contact, post-contact, pre-contact (mp), post-contact (mp).]
Responsive ASes Improve Long Term after Report
[Figure: x-axis: Median blacklist to clean, pre-sharing − post-sharing; y-axis: Median recompromise rate, pre-sharing − post-sharing; legend: Top Quartile Report to Clean, 2nd Quartile Report to Clean, 3rd Quartile Report to Clean, Bottom Quartile Report to Clean.]
Conclusions
• Directly sharing URLs helps clean up those URLs
  ◦ Consistent with prior work on individual reports
  ◦ This work finds it to be true for bulk reporting
• No evidence for long-term change overall
  ◦ Improvements at individual providers
• Long-lived malware is a scourge
  ◦ Lots of efforts concentrate on newly infected websites
  ◦ Lurking infections continue to harm, perhaps compounding
  ◦ Current efforts are not sufficient for stopping this “immortal” malware