MEERKAT: Detecting Website Defacements through Image-based Object Recognition
Kevin Borgolte, Christopher Kruegel, Giovanni Vigna
seclab, The Computer Security Group at UC Santa Barbara
August 13th, 2015, USENIX Security 2015
• Traditional evaluation has problems:
  • Same defacement possibly in two bins
  • Defacements from 1998 vs. 2014
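A time-wise split, as used in the fine-tuning experiment later in the talk, keeps every training sample older than every test sample, so old and new defacements are not mixed across bins. The following is a minimal sketch of that idea, not the authors' code: the `Sample` type, its field names, the `time_wise_split` helper, and the example URLs and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sample:
    # Hypothetical record layout, chosen only for this illustration.
    url: str
    seen: date
    defaced: bool


def time_wise_split(samples, cutoff):
    """Train on samples seen before `cutoff`, test on samples seen on or after it."""
    train = [s for s in samples if s.seen < cutoff]
    test = [s for s in samples if s.seen >= cutoff]
    return train, test


# Made-up samples; the dates only mimic a Dec 2013 / Jan 2014 boundary.
samples = [
    Sample("a.example", date(2013, 3, 1), True),
    Sample("b.example", date(2013, 11, 20), False),
    Sample("c.example", date(2014, 1, 15), True),
    Sample("d.example", date(2014, 4, 2), False),
]
train, test = time_wise_split(samples, cutoff=date(2014, 1, 1))
```

Unlike randomly assigned bins, this split never lets the classifier train on samples that are newer than the ones it is tested on.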
Limitations
• Fingerprinting and delayed defacements
• Tiny defacements
• Huge advertisements
• Concept drift (natural and adversarial)
  • Major: learn new features from new data (no feature engineering)
  • Minor: adjust weights of deeper classification layer
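The "minor" case can be illustrated with a toy example: the feature-producing layers stay frozen and only the weights of the final classification layer are updated on new data. This is a sketch under strong assumptions, not Meerkat's network: the final layer is modeled as plain logistic regression, and the feature vectors, starting weights, learning rate, and the `fine_tune` helper are all made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability of 'defaced' from the final (classification) layer only."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def fine_tune(weights, bias, data, lr=0.5, epochs=200):
    """Update only the final layer; `features` come from frozen earlier layers."""
    weights = list(weights)
    for _ in range(epochs):
        for features, label in data:
            # Gradient of the log loss with respect to the logit.
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical starting point: feature 0 marks old-style defacements,
# feature 2 (so far ignored) fires for a newly emerged defacer style.
old_weights, old_bias = [3.0, -3.0, 0.0], 0.0
new_style = [0.0, 1.0, 1.0]
legit = [0.0, 1.0, 0.0]
old_style = [1.0, 0.0, 0.0]
new_data = [(new_style, 1), (legit, 0), (old_style, 1)]

before = predict(old_weights, old_bias, new_style)
new_weights, new_bias = fine_tune(old_weights, old_bias, new_data)
after = predict(new_weights, new_bias, new_style)
```

In this toy setup the new style is missed before fine-tuning and detected afterwards, while the features themselves are never relearned. Major drift would instead require relearning the features from new data.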
Limitations: Minor Concept Drift & Fine-Tuning
• Train on Dec 2012 to Dec 2013
  • 1.78 million defacements
  • 1.76 million legitimate pages
• Test on Jan to May 2014
  • 1.54 million samples, 50/50 split
• Fine-tune Jan, Feb, Mar, Apr
• BDR in Jan: 98.583%
  • w/o FT drops to 97.177%
  • w/ FT increases to 98.717%
• Team System Dz started Jan 2014!
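Assuming "BDR" here stands for the Bayesian detection rate, that is, the probability that a page is actually defaced given that the detector raised an alert, it can be derived from the true positive rate, the false positive rate, and the base rate of defacements via Bayes' rule. The sketch below uses illustrative rates that are not taken from the talk; only the 50/50 base rate mirrors the test split above, and the function name is hypothetical.

```python
def bayesian_detection_rate(tpr, fpr, base_rate):
    """P(defaced | alert) computed with Bayes' rule.

    tpr:       P(alert | defaced)
    fpr:       P(alert | legitimate)
    base_rate: P(defaced) in the population being monitored
    """
    true_alerts = tpr * base_rate
    false_alerts = fpr * (1.0 - base_rate)
    return true_alerts / (true_alerts + false_alerts)


# Illustrative rates only, not results from the talk.
balanced = bayesian_detection_rate(tpr=0.98, fpr=0.02, base_rate=0.5)
rare = bayesian_detection_rate(tpr=0.98, fpr=0.02, base_rate=0.01)
```

With a balanced split the result stays close to the true positive rate. When defacements are rare, the same rates yield a much lower value, which is why the base rate matters when reading such a number.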
[Figure: Time-wise Split, with and without Fine-Tuning. Panels: True Positive Rate, False Positive Rate, and Difference (w/ FT - w/o FT) for both rates; x-axis: Month of 2014 (January to May); series: with fine-tuning, without fine-tuning.]
Conclusion
• Introduced MEERKAT
  • Learns features automatically, and they match domain knowledge
  • Does not require prior version of website
  • Outperforms state of the art
  • Gracefully tackles minor and major concept drift