Application of Machine Learning and Crowdsourcing to Detection of Cyber Threats Jaime G. Carbonell Eugene Fink Mehrbod Sharifi.

Application of Machine Learning and Crowdsourcing to Detection of Cyber Threats

Jaime G. Carbonell

Eugene Fink

Mehrbod Sharifi

http://www.cmu.edu/

Individual user differences

• Security needs
- Data confidentiality
- Data-loss tolerance
- Recovery costs

• Usage patterns

• Computer knowledge

Different users need different security tools.

Problems

• “Advanced user” assumption
- Complicated customization
- Unclear security warnings

• Inflexible engineered solutions with “too much security”
- Excessive security at high cost
- Insufficient customization

Population statistics

• Almost everyone uses a computer

• Most users are naïve, with limited technical knowledge

• Many security problems are due to user naïveté

Long-term goal

We need an intelligent security assistant that...

• Learns the user needs

• Detects complex threats

• Prevents human mistakes

• Helps the user to apply available security tools

Initial results

• Crowdsourcing architecture

• Identification of web scams

• Detection of cross-site request forgery

Crowdsourcing architecture

Gathering, sharing, and integration of opinions and warnings about web security threats.

[Diagram: Web Browser with Browser Extension (multiple users), Web Service, External Data Sources]
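The slides do not say how opinions are integrated. As a minimal sketch of one possibility, the code below assumes each user submits a +1 (safe) or -1 (threat) rating per host and the service issues a warning on a simple majority of threat votes. The class `OpinionStore`, its methods, and the `min_votes` threshold are hypothetical names chosen for illustration, not part of the presented system.

```python
from collections import defaultdict


class OpinionStore:
    """Hypothetical in-memory stand-in for the web service."""

    def __init__(self):
        # host -> {user_id: rating}; rating is +1 (safe) or -1 (threat)
        self._ratings = defaultdict(dict)

    def submit(self, user_id, host, rating):
        """Record one user's opinion; a later vote replaces an earlier one."""
        if rating not in (1, -1):
            raise ValueError("rating must be +1 or -1")
        self._ratings[host][user_id] = rating

    def summary(self, host, min_votes=3):
        """Integrate the votes for a host into a single verdict."""
        votes = self._ratings.get(host, {})
        n = len(votes)
        if n < min_votes:
            # Too few opinions to say anything (assumed policy)
            return {"host": host, "votes": n, "verdict": "unknown"}
        threat_share = sum(1 for r in votes.values() if r == -1) / n
        verdict = "warn" if threat_share > 0.5 else "ok"
        return {"host": host, "votes": n,
                "threat_share": threat_share, "verdict": verdict}


store = OpinionStore()
store.submit("user1", "a.example", -1)
store.submit("user2", "a.example", -1)
store.submit("user3", "a.example", 1)
result = store.summary("a.example")
```

In this sketch a browser extension would call `submit` when a user rates a site and `summary` before the user visits it. A real service would also need to weigh rater reliability and the external data sources, which are left out here.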

Identification of web scams

A web scam is fraudulent or intentionally misleading information posted on the web (e.g., work at home and miracle cures).

Machine learning approach:

• Collect data about websites, available from various public services

• Collect human opinions

• Apply machine learning (currently, logistic regression) to recognize scams based on the available data

Accuracy: 98%
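The approach above can be sketched with a small logistic regression trained by gradient descent. This is an illustration only: the three features, the toy training rows, and the function names `train` and `scam_probability` are assumptions made for the example, and the slides do not list the features or data actually used.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def scam_probability(model, x):
    """Estimated probability that a site with features x is a scam."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Hypothetical features, each scaled to [0, 1]:
# [domain age score, traffic rank score, share of negative user opinions]
ROWS = [
    [0.9, 0.8, 0.0], [0.8, 0.9, 0.1], [0.7, 0.6, 0.2],  # legitimate
    [0.1, 0.2, 0.9], [0.2, 0.1, 0.8], [0.3, 0.2, 0.7],  # scam
]
LABELS = [0, 0, 0, 1, 1, 1]

model = train(ROWS, LABELS)
```

A new site would then be scored with `scam_probability(model, features)` and flagged when the probability exceeds a chosen threshold such as 0.5.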

Detection of cross-site request forgery

A cross-site request forgery is an attack through a browser, in which a malicious website uses a trusted session to send unauthorized requests to a target site.

[Diagram: browser with sessions to Email, Ads, News, Bank, and Malicious sites]

Machine learning approach:

• Learn patterns of legitimate requests

• Detect deviations from these patterns

• Warn the user about potentially malicious sites and requests
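The slides do not specify the model used to learn request patterns. As a minimal sketch under that caveat, the code below substitutes a simple frequency profile: it counts which pages have legitimately sent requests to which origins and flags cross-site requests it has rarely or never seen. The class `RequestProfile`, the `min_seen` threshold, and the example URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def origin(url):
    """Reduce a URL to its scheme and host, e.g. 'https://bank.example'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class RequestProfile:
    """Hypothetical profile of legitimate cross-site request patterns."""

    def __init__(self, min_seen=2):
        self.min_seen = min_seen
        self.counts = Counter()

    def learn(self, page_url, request_url, method):
        """Record one request observed during normal browsing."""
        key = (origin(page_url), origin(request_url), method.upper())
        self.counts[key] += 1

    def is_suspicious(self, page_url, request_url, method):
        """Flag a cross-site request that deviates from learned patterns."""
        src, dst = origin(page_url), origin(request_url)
        if src == dst:
            return False  # same-origin requests are not cross-site
        return self.counts[(src, dst, method.upper())] < self.min_seen


profile = RequestProfile()
for _ in range(3):
    profile.learn("https://news.example/story",
                  "https://ads.example/banner", "GET")

alert = profile.is_suspicious("https://malicious.example/",
                              "https://bank.example/transfer", "POST")
```

When `is_suspicious` returns `True`, the browser extension could warn the user before the request is sent. A frequency count is the simplest possible stand-in for a learned model and would need richer features in practice.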

Future research

Application of machine learning and crowdsourcing to detect...

• ... newly evolving threats, not yet addressed by the standard defenses

• ... cyber attacks by their observed “symptoms” in addition to using direct analysis of attacking code

• ... “nontraditional” threats that go beyond malware attacks, such as scams and other social engineering
