SOPS: Stock Prediction using Web Sentiment. Presented by Vivek Sehgal, Charles Song, Department of Computer Science, University of Maryland. ICDMW 2007. 2009-05-29. Summarized by Jaeseok Myung

Dec 16, 2015


SOPS: Stock Prediction using Web Sentiment

Presented by Vivek Sehgal, Charles Song

Department of Computer Science, University of Maryland

ICDMW 2007

2009-05-29

Summarized by Jaeseok Myung

Copyright 2009 by CEBT

In this talk...

Introducing some papers about sentiment analysis in finance

[1] Event and Sentiment Detection in Financial Markets (ISWC 08)

– Simple Architecture

[2] SOPS: Stock Prediction using Web Sentiment (ICDMW 07)

– Entire Process

[3] Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web (Management Science 07)

– An Idea that can improve prediction performance

We will focus on SOPS, but the other two papers will also be introduced briefly

Center for E-Business Technology


Sentiment Analysis in Financial Markets

Sentiment analysis is one of my favorite research topics

I’ve conducted some research using product reviews

In my opinion, finance is a more suitable domain than products

Product sales statistics are not publicly available

– Stock values are always open to the public

Financial markets are closely related to investors’ sentiment

– ‘The economy is psychology’

– Behavioral finance

– Lots of evidence

Interesting and worthwhile


Research Problem from [1][2][3]

How can information from various, heterogeneous sources be integrated?

Different formats

How can the opinions in the documents be extracted?

Statistical and NLP approaches

How can the important opinions be filtered?

Reliable sources (news, blogs), trusted authors, promising algorithms

How can the users’ trading decisions be supported?

Finding out the relationships between investors’ sentiment and stock values


An Architecture from [1]

Monitor a huge number of relevant sources

Extract metadata and make a single representation

Decide whether the information has to be analyzed or not


SOPS: System Overview

Collect data from a message board

Remove HTML tags and extract features

Identify reliable users in order to filter noise

Use several classifiers
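The "remove HTML tags and extract features" step can be sketched with the Python standard library. This is only an illustration under assumptions: the function names, the `author=` token prefix, and the word pattern are hypothetical and do not come from the slides or the paper.

```python
from html.parser import HTMLParser
import re


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(raw_html):
    """Return the text content of raw_html with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def extract_features(raw_html, author):
    """Turn one post into lowercase word tokens plus one author token.

    The author token format is a hypothetical choice for this sketch.
    """
    words = re.findall(r"[a-z']+", strip_html(raw_html).lower())
    return words + ["author=" + author]


tokens = extract_features("<p>Great <b>earnings</b>, buy now!</p>", "trader42")
```

Keeping the author as an extra token mirrors the idea that each message is tied to a registered user, so later steps can weight posts by who wrote them.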


SOPS: Data Collection

260,000 messages for 52 popular stocks on Yahoo! Finance

The messages cover a period of over 6 months

A message board exists for each stock traded on a major stock exchange such as NYSE or NASDAQ

Users must sign up before they can post messages

Every message posted is associated with the author


SOPS: Data Collection


SOPS: Feature Representation

After the relevant information has been extracted, each message is converted to a vector of words and author names

The value of each entry in the vector is then calculated using the TF-IDF formula

M : set of all messages
m : a message
w : a term

Example vector: ( 3.2, 1.6, 1.09, 3.37. 90, 0.5, …)

Entry labels shown in the figure: “good”, “stop”, “asdf”, date, % of change in stock price
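The TF-IDF formula itself is not preserved in this transcript. The sketch below therefore assumes one common variant, weight(w, m) = tf(w, m) * log(|M| / df(w)), where df(w) is the number of messages in M that contain w. The function name and the choice of variant are assumptions, not the authors' definition.

```python
import math
from collections import Counter


def tfidf_vectors(messages):
    """messages: list of token lists. Returns one {term: weight} dict per message."""
    n_messages = len(messages)
    # Document frequency: number of messages containing each term at least once.
    doc_freq = Counter()
    for tokens in messages:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in messages:
        term_freq = Counter(tokens)
        vectors.append({
            term: count * math.log(n_messages / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


vectors = tfidf_vectors([["good", "good", "buy"], ["stop", "buy"]])
```

Under this variant a term that appears in every message (here "buy") gets weight 0, while a term that is frequent in one message and rare elsewhere gets a high weight.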


SOPS: Sentiment Prediction

What: a message (undisclosed) => Classifier => Strong Buy / Buy / Hold / Sell / Strong Sell

How: a message (disclosed) => Classifier (training) => Strong Buy / Buy / Hold / Sell / Strong Sell


SOPS: Sentiment Prediction

The sentiment for a message m at time instant i is modeled as follows:

m : a message
Mi : set of all messages
SVi : stock value

Classifier

1. Naïve Bayes
2. Decision Trees
3. Bagging

Strong Buy, Buy, Hold, Sell, Strong Sell

Example values shown in the figure: 0.2 0.3 0.1 0.4
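To make the classification step concrete, here is a minimal sketch of the first classifier named on the slide, Naïve Bayes, trained on messages whose sentiment is disclosed. It is a generic multinomial Naïve Bayes with add-one smoothing over word tokens. The class name, the smoothing choice, and the toy training messages are assumptions, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, token_lists, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_proba(self, tokens):
        """Return {label: probability} over the labels seen in training."""
        total = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)
            for tok in tokens:
                if tok in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            log_scores[label] = score
        # Normalize in log space to avoid numeric underflow.
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}

    def predict(self, tokens):
        probs = self.predict_proba(tokens)
        return max(probs, key=probs.get)


# Toy "disclosed" messages (hypothetical), one per sentiment class.
train_x = [["great", "earnings", "buy"], ["buy", "dip"], ["wait", "see"],
           ["weak", "sell"], ["crash", "dump", "sell"]]
train_y = ["Strong Buy", "Buy", "Hold", "Sell", "Strong Sell"]
model = NaiveBayesSentiment().fit(train_x, train_y)
probs = model.predict_proba(["crash", "dump"])
```

`predict_proba` returns one probability per class, which corresponds to the kind of per-class values a classifier assigns to an undisclosed message.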


TrustValue Calculation

Some authors are more knowledgeable than others about the stock market

Trusted authors’ posts should carry more weight => TrustValue

TrustValue

Considers not only the direction in which the stock price moved, but also the magnitude of the move

Takes into account the fact that a single author cannot be an expert on all stocks => an author can be assigned different trust values for different stocks


PredictionScore : the author’s prediction performance, i.e., how closely the author’s predictions follow the stock market

NumberOfPrediction : the total number of predictions made by the author

ExactPrediction : the number of exact predictions

ClosePrediction : the number of “good enough” predictions

ActivityConstant : a constant used to penalize authors with low activity, i.e., few predictions
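The TrustValue formula on this slide is not preserved in the transcript. The sketch below shows one way the five quantities defined above could be combined: the average prediction score multiplied by a hit rate whose denominator is inflated by ActivityConstant. This particular combination, the default constant of 10, and the author and ticker names are all assumptions for illustration, not the slide's formula.

```python
def trust_value(prediction_score, number_of_predictions,
                exact_predictions, close_predictions,
                activity_constant=10.0):
    """Assumed combination of the five quantities (not the slide's formula)."""
    if number_of_predictions == 0:
        return 0.0
    # How well the author's predictions tracked the market, on average.
    average_score = prediction_score / number_of_predictions
    # Share of exact or "good enough" predictions; the constant in the
    # denominator pulls the value down for authors with few predictions.
    hit_rate = (exact_predictions + close_predictions) / (
        activity_constant + number_of_predictions)
    return average_score * hit_rate


# Trust is tracked per (author, stock) pair, since nobody is an expert on all stocks.
trust = {
    ("alice", "AAPL"): trust_value(8.0, 10, 6, 2),
    ("alice", "MSFT"): trust_value(1.0, 2, 1, 0),
}
```

With this form, an author with two perfect predictions still scores lower than one with twenty perfect predictions, which is the penalizing effect ActivityConstant is described as having.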


SOPS: Stock Prediction

Classifier => Go up / Go down


SOPS: Evaluation Metrics
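The body of this slide is not preserved in the transcript. Since the conclusion reports results in terms of precision and recall, the sketch below shows the standard per-class definitions of those two metrics. The function name and the toy labels are hypothetical, and the slide may have defined additional metrics.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for one class, treating `positive` as the target label."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


truth = ["Buy", "Buy", "Sell", "Hold", "Buy"]
guess = ["Buy", "Sell", "Sell", "Buy", "Buy"]
buy_precision, buy_recall = precision_recall(truth, guess, "Buy")
```

Precision is the share of messages labeled with a class that truly belong to it. Recall is the share of messages truly in the class that the classifier found.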


SOPS: Experiments


Conclusion

SOPS can predict Web sentiment with high precision and recall

SOPS introduced TrustValue, which takes into account the trustworthiness of an author

In my opinion, there are some points that are unclear

Presentation

– About Summarization

Users

Time Period


Furthermore

We have the paper [3]


Research Problem from [1][2][3]

How can information from various, heterogeneous sources be integrated?

Different formats

How can the opinions in the documents be extracted?

Statistical and NLP approaches

How can the important opinions be filtered?

Reliable sources (news, blogs), trusted authors, promising algorithms

How can the users’ trading decisions be supported?

Finding out the relationships between investors’ sentiment and stock values
