
Mar 26, 2015

Page 1: Artificial Neural Network For Automated Prediction of Popularity of Digitized Images David Oranchak doranchak@gmail.com.

Artificial Neural Network For Automated Prediction of Popularity of Digitized Images

David Oranchak

doranchak@gmail.com

Objective
  Flickr.com ranks photographs based on their popularity in the site’s user base.
    “Interesting”: an image with high rank
    “Not Interesting”: an image with low or nonexistent rank
  For any image, can a neural network predict which group it belongs to?

Approach
  Obtain sample images from Flickr.com
    559 total samples in the training set
    “Very Interesting”: ranking in the top 25
    “Somewhat Interesting”: ranking between 300 and 500
    “Not Interesting”: no ranking data assigned by Flickr

“Very Interesting”: 25 samples

“Somewhat Interesting”: 11 samples

“Not Interesting”: 36 samples

Approach
  Input data sets based on original image data
    Raw pixel data, resampled for performance reasons
      10x10 RGB pixels
      20x20 Grayscale pixels
    Color analysis data
      One-dimensional color counts (histogram)
        RGB: three channels, 256 entries per channel
        Gray scale: one channel (luminosity), 256 entries
    Texture data
      Contrast, correlation (inertia), dissimilarity, energy, entropy, homogeneity, correlation matrix sum, symmetry
  Input data derived using JIU, a free set of Java image tools
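
The raw-pixel and histogram encodings listed above can be sketched in a few lines of Python. This is only an illustration, not the JIU code used for the talk: the function names are hypothetical, the image is assumed to be a list of rows of (r, g, b) tuples with values 0 to 255, and the nearest-neighbour resampling, the BT.601 luminosity weights, and the scaling of pixel values to [0, 1] are all assumptions that the slides do not specify.

```python
def to_gray(pixel):
    """Luminosity of an (r, g, b) pixel, using BT.601 weights (an assumption)."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def resample(image, width, height):
    """Nearest-neighbour resampling of a list-of-rows image to width x height."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def raw_rgb_inputs(image):
    """10x10 RGB pixels flattened to 300 inputs, scaled to [0, 1] (assumed)."""
    small = resample(image, 10, 10)
    return [channel / 255.0 for row in small for pixel in row for channel in pixel]


def raw_gray_inputs(image):
    """20x20 grayscale pixels flattened to 400 inputs, scaled to [0, 1] (assumed)."""
    small = resample(image, 20, 20)
    return [to_gray(pixel) / 255.0 for row in small for pixel in row]


def rgb_histogram(image):
    """Three 256-entry channel histograms concatenated into 768 counts."""
    hist = [0] * (3 * 256)
    for row in image:
        for pixel in row:
            for channel_index, value in enumerate(pixel):
                hist[channel_index * 256 + value] += 1
    return hist


def gray_histogram(image):
    """One 256-entry histogram of luminosity values."""
    hist = [0] * 256
    for row in image:
        for pixel in row:
            hist[to_gray(pixel)] += 1
    return hist
```

Under these assumptions the four encodings give 300, 400, 768 and 256 network inputs respectively. The texture features are not sketched here because the slides do not describe how they were computed.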

Approach
  Select a suitable neural network architecture
    Feedforward backprop architecture?
      Result: difficult to train based on input data
      Hard to determine suitable number of hidden neurons
    Kohonen unsupervised learning?
      Result: outputs do not naturally cluster based on “interestingness”
      No mapping between clusters and desired outputs
    Counter Propagation Network?
      Result: very easy to train on input data

Approach
  Training the CPN
    559 input patterns
      221 patterns for “Very Interesting”
      88 patterns for “Somewhat Interesting”
      250 patterns for “Not Interesting”
    Network simulated using the CPN algorithm in JavaNNS, the Java-based successor to SNNS
    Five networks trained successfully, one for each type of input
      Raw RGB pixel data, raw gray scale pixel data, 1D RGB histogram, 1D gray scale histogram, texture
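
For readers unfamiliar with counterpropagation, the following is a minimal sketch of a forward-only CPN: a competitive (Kohonen) layer picks the unit closest to the input, and an outstar (Grossberg) layer learns the target output for that unit. It is not the JavaNNS implementation used in the talk. The class name, the learning rates, the number of hidden units, and the choice to initialise hidden units from evenly spaced training patterns are all assumptions made for illustration.

```python
class CounterPropagationNetwork:
    """Forward-only counterpropagation sketch (hypothetical, not JavaNNS)."""

    def __init__(self, n_hidden):
        self.n_hidden = n_hidden
        self.kohonen = []    # one weight vector per hidden unit
        self.grossberg = []  # one output vector per hidden unit

    def winner(self, x):
        """Index of the hidden unit whose weights are closest to x."""
        def dist2(w):
            return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        return min(range(len(self.kohonen)), key=lambda j: dist2(self.kohonen[j]))

    def train(self, patterns, targets, epochs=50, alpha=0.3, beta=0.3):
        # Assumed initialisation: copy evenly spaced training patterns.
        step = len(patterns)
        self.kohonen = [list(patterns[i * step // self.n_hidden])
                        for i in range(self.n_hidden)]
        self.grossberg = [[0.0] * len(targets[0]) for _ in range(self.n_hidden)]
        for _ in range(epochs):
            for x, y in zip(patterns, targets):
                j = self.winner(x)
                # Kohonen step: move the winning unit toward the input.
                self.kohonen[j] = [w + alpha * (xi - w)
                                   for w, xi in zip(self.kohonen[j], x)]
                # Grossberg step: move the winner's output toward the target.
                self.grossberg[j] = [v + beta * (yi - v)
                                     for v, yi in zip(self.grossberg[j], y)]

    def predict(self, x):
        """Class index with the largest output for the winning unit."""
        out = self.grossberg[self.winner(x)]
        return max(range(len(out)), key=lambda k: out[k])
```

With one-hot targets for the three categories, one such network would be trained per input type, giving the five classifiers described above.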

Experiment 1: Comparison against Flickr images with known rankings
  2381 images from 67 different days obtained from Flickr
    1373 “Very Interesting” images
    557 “Somewhat Interesting” images
    451 “Not Interesting” images

Experiment 1: Comparison against Flickr images with known rankings
  Results:
    32% error rate when at least one network classifies images as “Very Interesting”
    28% error rate when at least two networks classify images as “Very Interesting”
    23% error rate when at least three networks classify images as “Very Interesting”
    14% error rate when at least four networks classify images as “Very Interesting”
    3% error rate when all five networks classify images as “Very Interesting”
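
The "at least k networks" rule above can be sketched as a simple vote count. The slides do not define the error rate precisely; this sketch assumes it is the share of flagged images whose known Flickr category is not “Very Interesting”. The function names and data layout are hypothetical.

```python
def flagged_by_at_least(votes, k, label="Very Interesting"):
    """Indices of images for which at least k networks predict `label`.

    votes: one list per image holding the label predicted by each network.
    """
    return [i for i, per_network in enumerate(votes)
            if sum(1 for p in per_network if p == label) >= k]


def error_rate(flagged, truth, label="Very Interesting"):
    """Assumed definition: fraction of flagged images not truly `label`."""
    if not flagged:
        return None  # nothing was flagged, so the rate is undefined
    wrong = sum(1 for i in flagged if truth[i] != label)
    return wrong / len(flagged)
```

Calling `flagged_by_at_least(votes, k)` for k from 1 to 5 and passing each result to `error_rate` would produce one figure per row of the results list.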

Experiment 1: Comparison against Flickr images with known rankings
  Results are greatly improved when we combine the categories “Very Interesting” and “Somewhat Interesting” into a single category: “Interesting”
    When one network classifies: 9% error rate
    When two networks classify: 9% error rate
    When three networks classify: 7% error rate
    When four networks classify: 4% error rate
    When five networks classify: 2% error rate
  Downside: as the number of networks required to agree goes up to reduce noise, the number of missed “Interesting” photos also goes up.
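
The trade-off described on this slide can be made concrete with a small sketch that merges the two positive categories and then counts, for each agreement threshold, how many images are flagged, how many of those are wrong, and how many truly “Interesting” images are missed. The function name, the tuple layout, and the exact counting rules are assumptions, since the slides report only the resulting percentages.

```python
# Collapse the three original categories into two.
MERGE = {
    "Very Interesting": "Interesting",
    "Somewhat Interesting": "Interesting",
    "Not Interesting": "Not Interesting",
}


def tradeoff(votes, truth, n_networks=5):
    """Return (k, flagged, wrong, missed) for k = 1..n_networks.

    votes: one list of per-network labels per image; truth: known labels.
    """
    merged_votes = [[MERGE[p] for p in per_network] for per_network in votes]
    merged_truth = [MERGE[t] for t in truth]
    rows = []
    for k in range(1, n_networks + 1):
        flagged = {i for i, v in enumerate(merged_votes)
                   if v.count("Interesting") >= k}
        wrong = sum(1 for i in flagged if merged_truth[i] != "Interesting")
        missed = sum(1 for i, t in enumerate(merged_truth)
                     if t == "Interesting" and i not in flagged)
        rows.append((k, len(flagged), wrong, missed))
    return rows
```

In such a table the `wrong` column can only shrink or stay level as k grows, while the `missed` column can only grow or stay level, which is the downside noted above.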

Experiment 2: Flickr photos with unknown rankings
  250 photos sampled at random from recently uploaded Flickr photos
  All five networks classify “Interesting” for 14 of the 250 photos

Experiment 2: Flickr photos with unknown rankings
  Result

Experiment 2: Flickr photos with unknown rankings

Relaxing the constraint to 4 out of 5 networks produces 57 images

Experiment 2: Flickr photos with unknown rankings

Experiment 2: Flickr photos with unknown rankings
  Very subjective results. In my opinion, most of the photos are interesting!

Experiment 3: Personal photo collection
  2912 samples from personal photo collection
  When all 5 networks classify “Interesting”, 98 images result.
  Flickr results are better. Personal collection experiment resulted in many “ordinary-looking” photos.
  Test data setup may contribute to lack of success in this case (resizing of input photos, differences between Flickr image management and personal photo formats)
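
To put the counts from Experiments 2 and 3 side by side, the share of photos accepted in each run can be computed directly from the figures on the slides. The dictionary keys and the function name are illustrative only; the numbers are the ones reported above.

```python
# (accepted, total) pairs taken from the slides.
RUNS = {
    "Flickr, 5 of 5 networks": (14, 250),
    "Flickr, 4 of 5 networks": (57, 250),
    "Personal collection, 5 of 5 networks": (98, 2912),
}


def acceptance_rates(runs):
    """Percentage of photos classified as Interesting in each run."""
    return {name: round(100.0 * accepted / total, 1)
            for name, (accepted, total) in runs.items()}


for name, rate in acceptance_rates(RUNS).items():
    print(f"{name}: {rate}%")
```

These are acceptance rates only. They say nothing about how many of the accepted photos are actually interesting, which the slides judge subjectively.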

Conclusions
  Current CPN technique is very successful within the Flickr image data at locating interesting photographs
  Further experimentation must be performed to improve success in locating interesting photographs outside of Flickr
  More experimentation and refinement must be done to improve detection rates and reduce false positives