Augmented Media for Traditional Magazines
Vinh-Tiep Nguyen
University of Science &
John von Neumann Institute
VNU-HCM
Ho Chi Minh city, Vietnam
[email protected]
Trung-Nghia Le
University of Science
VNU-HCM
Ho Chi Minh city, Vietnam
[email protected]
Quoc-Minh Bui
University of Science
VNU-HCM
Ho Chi Minh city, Vietnam
[email protected]
Minh-Triet Tran
University of Science
VNU-HCM
Ho Chi Minh city, Vietnam
[email protected]
Anh-Duc Duong
University of Information Technology
VNU-HCM
Ho Chi Minh city, Vietnam
[email protected]
ABSTRACT
Reading traditional newspapers or magazines is a common way to get the latest information about events or new products. However, these printed materials provide readers with only static information. Readers may want more detailed information about a product mentioned in an article, or to watch video clips related to an event, right at the moment they read about it. The authors propose a mobile system that provides extra information and multimedia for readers by applying augmented reality to traditional magazines. A user can enjoy rich multimedia information about a product or news item simply by looking at an article in a traditional magazine through his or her mobile device. The system detects which article on which page of a magazine is being viewed through the mobile device and provides the reader with related information and multimedia objects. The key feature of our proposed system is a lightweight filter that efficiently discards candidate covers or articles that do not visually match the image captured by the mobile device. Experiments show that our proposed system achieves an average accuracy of more than 90% and can run in real time.
Keywords
Magazine, Augmented Reality, Planar Object Recognition
1. INTRODUCTION
With traditional newspapers, readers receive information only in printed form. Although this format is prevalent, users easily get bored with the traditional way of reading news, and to learn more about the events they read about, they often have to search the web themselves. We can address this problem by integrating AR technology into existing newspapers. With this approach, users get a new experience through the interactive media added to the traditional newspaper.
We are inspired by the Harry Potter films, in which newspapers contain not only static images but also characters that move, talk, and act like living people. With the support of AR, such animations can be brought into everyday life: on newspapers, comic books, user guides, or even educational materials such as textbooks and research papers.
Imagine traditional newspapers equipped with digital information: readers would no longer need to open a web browser to search for a video whenever they see images of an event in the newspaper. We would not need to go online for information about a newly released film; we could watch the trailer directly and quickly through an application associated with the newspaper.
In addition, this application is not limited to video enhancement. It can be extended with additional illustrations for the events in the newspaper. Sound can also be added to convey the emotional tone of people in specific situations. For example, in the case of the earthquakes in Japan, readers could watch video of the tsunami, see more pictures of each earthquake, and hear victims describe their feelings when facing the disaster.
To get more information or to join online discussions, a reader can open the URL associated with an event directly from the paper in a browser. To share feelings with friends while reading, a reader can use a "Like" function to post remarks on social networks such as Facebook. This can raise people's interest in reading the news, because they not only read alone but also read and share with their friends.
Besides, augmented information such as videos, images, sounds, URLs, and even 3D models can be used for advertising in the newspaper. For example, in a new advertisement for Toyota cars, the manufacturer can offer more information about its vehicles: a 3D model helps customers view the product visually. Customers can not only observe the car from all angles but also interact by touching car parts to get detailed information about the vehicle, such as fuel consumption per kilometer or warranty information. Frequently asked questions can be answered appropriately according to customers' requests. Another example is real estate: sellers can advertise their land more easily with images, videos, and 3D models, and customers can view the property and easily contact the seller. In this way, brokers can reduce their advertising budget by attaching a range of information to the same printed page.
Kompas, an Indonesian newspaper, was among the first in Asia to deploy an augmented reality application for its readers [2]. Commonwealth Bank uses the technology to enhance newspaper advertising [4]. The watch company Tissot has a practical application that lets users virtually try watches on their wrists to find the most suitable model without visiting a store [3].
In this paper, we propose a system that gives newspaper readers a new interactive experience by supplying related information in multiple dimensions. With this system, users can express their thoughts about an article, for example with "like"s and comments. We also propose a lightweight filtering method, a pruning strategy that lets the matching process skip unnecessary computation. Experiments show that our proposed method can be used in practice with many types of magazines.
This paper is organized as follows. In section 2, we present
the background about augmented reality and related works
in detection and matching. Our proposed system and
method are presented in section 3. Experiments to evaluate
the performance and efficiency of our proposed system are
in section 4. Sample usage scenarios of our proposed
system with different types of mobile devices are presented
in section 5. Conclusion and future work are discussed in
section 6.
2. BACKGROUND
2.1 Augmented Reality
With the development of virtual reality technology, everything in the real world can be simulated by computer [1]. Objects are created in a lifelike 3D environment, and humans can fully experience them through vision, hearing, and interaction, or even smell the fragrance of an object. However, these objects are still virtual, and the user cannot feel the real world around them. For this reason, augmented reality combines virtual objects with the real world to make the experience more familiar.
Augmented Reality (AR) is a combination of virtual objects and the real world [7]. Virtual objects are used to enhance the relevant information of a scene recorded from reality. What users see is augmented information displayed over real-world objects, or associated with the real space in which those objects are observed. The user does not feel a separation between the virtual and real components. The main purpose of augmented reality is to blur the boundaries and differences between real and virtual objects in order to increase human awareness of, and interaction with, the real world [8].
AR provides information of various types, such as text, images, and video, and can be applied in many different fields such as education [6], health [7], geographic information systems [9], and painting [10].
2.2 Marker based matching
These methods calculate the camera pose in real time from markers [10], special images [11], or bokodes [12]. Markers are like barcodes attached to the objects that need to be tracked. ARToolKit [9] is one of the most famous toolkits widely used in AR applications. After thresholding the input image, regions whose outline contour can be fitted by four line segments are extracted. The regions are normalized, and the sub-image within each region is compared by template matching against the registered patterns. This recognition is linear in the number of registered markers, so performance degrades when many markers must be distinguished.
ARToolKitPlus [5] and ARTag [22], for example, overcome this scalability issue by using a barcode-like system to encode the marker index in its appearance. Tracking with markers offers high speed and high accuracy. Markers can be attached to any object, and if desired, a video see-through system can hide the markers from the user within the display zone. However, newspapers and magazines do not have much space for attaching markers, since their pages are used for content and advertising. Moreover, in practice, one of the biggest disadvantages of markers is that they look unnatural to human readers.
2.3 Natural image based matching
This method recognizes objects using the outside appearance of a book or magazine as natural features. Natural image based matching is a common technique that finds a sub-template in a bigger image. There are two main approaches: local feature-based matching and template-based matching. The template-based approach uses the color information of the template as the main factor to determine the similarity between the template and a pattern extracted from the source image. In this approach, many distance measures can be used, such as Sum of Squared Differences (SSD) and Sum of Absolute Differences (SAD) [20]. Area-based matching methods are simple and easy to implement. Moreover, they work efficiently with both simple and complex texture patterns. However, this approach is usually not robust to changes of scale, rotation, and viewpoint, so it is not suitable for our problem.
Feature-based approaches use features such as edges [16], corners [25], and blobs [17][18], together with a similarity measure, to find the best match between features in a template image and a source image. There are two main steps: detecting interest points, and describing those key points. This approach is very popular in object recognition because of its robustness to scale and rotation transformations, occlusion, viewpoint changes, and noise. However, one of its weaknesses is its high computational cost. There are ways to improve the low speed of SIFT-like features, such as Randomized Trees and Ferns [24]; however, they require a long training process. Combining detection with tracking also reduces the computational cost, but it only works with a limited number of objects.
Returning to our problem, we only need to process a static image captured by a mobile camera, so we do not need tracking techniques. Features of all patterns are extracted in advance and stored on the server. Hence, in this paper, we propose a lightweight filtering approach that skips as many patterns as possible in order to avoid unnecessary computation.
2.4 Marker and markerless AR applications
Using markers, barcodes, bokodes, or other natural markers, AR can power applications in many aspects of life such as entertainment, health care, sports, and education. Augmented Book [13] is a system that uses Hybrid Visual Tracking to display information about a book. Hybrid Visual Tracking combines fiducial marker tracking and markerless tracking: a fiducial marker is surrounded by a black shape so it can be detected easily, while markerless tracking uses key points to match between different scenes. Using the FAST detector [14], the authors of this system find the key points of the image, and the augmented reality information is displayed to users in real time.
The Virtual Pop-up Book [15] is another application that uses Augmented Reality to provide extra information for users. In this system, the authors avoid markers, which gives users a natural way to interact with the system. Another attraction of the system is its use of 3D scenes: users feel the scene is alive and become involved in the story.
3. PROPOSED METHOD
3.1 Overview of the system
The overview of our proposed system is shown in Figure 1. When reading a book or magazine, a reader may want information about a product on a page right at the moment he or she sees it. The user points a mobile device at the page of the magazine. After the device sends the query image, the server finds information about the products in the query image and displays it on the mobile device's screen. The user then selects the product whose information he or she wants.
Figure 1. The proposed system’s overview.
Figure 2. The process of proposed system.
The specific steps of the system are illustrated in Figure 2. First, the mobile device captures the visual appearance of the magazine page containing the product the user is interested in. This query image must be a planar object. On the user's smart device, we use a color histogram of the visual query to filter out pages of magazines in the database that cannot match the magazine the user wants information about. The device then sends the visual query image, together with the list of candidates, to the server for matching and information search.
The server receives the query photo and verifies the best match for it among the candidates sent by the user's mobile device. After this step, the server finds all products in the visual query photo and sends the collection of products to the user. The user then selects one of them to get its information.
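As a sketch of this two-stage flow (the function names, and the convention that both scorers return a dissimilarity where lower is better, are our assumptions, not the paper's API):

```python
def handle_visual_query(query_img, covers, filter_fn, match_fn, n_k=5):
    """Two-stage visual query: a cheap filter prunes the database to at most
    n_k candidates (on the mobile device), then a more expensive matcher
    picks the best candidate (on the server)."""
    # Stage 1: lightweight filtering keeps the n_k most similar covers
    candidates = sorted(covers, key=lambda c: filter_fn(query_img, c))[:n_k]
    # Stage 2: full matching runs only against the surviving candidates
    return min(candidates, key=lambda c: match_fn(query_img, c))
```

The point of the design is that the expensive matcher never sees more than n_k items, regardless of database size.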
3.2 Lightweight Filtering
This module is the pre-processing step for the visual query process. Its purpose is to quickly filter out books and magazines in the database that cannot be candidates for the next step. Another benefit of this step is that images are compared at low computational cost; thus, this step can run on the user's mobile device.
The dominant colors of the external appearance of a book or magazine are the main cue customers use to recognize which book or magazine it is. Moreover, a cover usually has only a few dominant colors, so customers can remember it and recognize the book at first sight. This natural approach can be employed to build the lightweight filtering module of our proposed system. In our implementation, the module uses only the color distributions of two images to evaluate the dissimilarity between them. The color distribution of an image is used as its lightweight feature.
RGB (Red, Green, and Blue) is the most common model for encoding colors, but it is not appropriate for representing human photosensitivity. Unlike RGB, the HSV model reflects human color perception. In HSV color space, each component plays a different role: Hue (H) corresponds to the color itself, Saturation (S) refers to the dominance of hue in the color, and Value (V) is the brightness of the color. In reality, when capturing a book or magazine, lighting conditions differ, so the lightweight filtering module needs to reduce the effect of brightness when comparing two images. Thus, we use only the Hue component to create the lightweight visual feature of an image.
Let MaxH be the maximum value of Hue; in practice, MaxH = 360 (degrees). Let nH > 0 be the total number of bins for the Hue channel. We calculate the nH-bin histogram of the Hue channel for each image as follows:

  Hk(I) = nk(I) / nI,  for 0 <= k < nH

where nk(I) is the number of pixels in I whose Hue value lies in [k MaxH/nH, (k+1) MaxH/nH) and nI is the total number of pixels in I. The nH-bin histogram of the Hue channel of an image is its lightweight feature.
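A minimal NumPy sketch of this lightweight feature (the function name and the default bin count of 16 are our assumptions; the 0-360 Hue range follows the text):

```python
import numpy as np

def hue_histogram(hue_values, n_bins=16, max_hue=360.0):
    """nH-bin normalized Hue histogram: H_k(I) = n_k(I) / n_I."""
    hue = np.asarray(hue_values, dtype=float)
    # bin k covers [k * max_hue / n_bins, (k + 1) * max_hue / n_bins)
    edges = np.linspace(0.0, max_hue, n_bins + 1)
    counts, _ = np.histogram(hue, bins=edges)
    return counts / hue.size  # normalize by the total pixel count n_I
```

The returned vector sums to 1, so histograms of images with different resolutions remain comparable.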
Based on the nH-bin Hue histogram of each image, we apply a dissimilarity measure between the features of a query image and of a book's external appearance to calculate the difference between the two images. There are two main categories of such measures: bin-to-bin distances and cross-bin distances [19].

Bin-to-bin distances are sensitive to quantization, i.e., the size of a bin: as the number of bins decreases, robustness increases but distinctiveness decreases, and vice versa. In order to achieve both robustness and distinctiveness, we use a cross-bin distance, namely the Quadratic-Chi histogram distance [19].
Figure 3. Lightweight filtering with a query image and three covers (the Hue histograms of the query image and the covers are shown).
The Quadratic-Chi histogram distance between the Hue histograms P and Q of a query image I* and a cover image Ik is defined as [19]:

  QC(P, Q) = sqrt( sum_{i,j} [ (Pi - Qi) / (sum_c (Pc + Qc) Ac,i)^m ] [ (Pj - Qj) / (sum_c (Pc + Qc) Ac,j)^m ] Ai,j )

where Ai,j is the similarity between bin i and bin j, and m (0 <= m < 1) is a normalization factor. After calculating the distance between the query image and each cover image, we choose at most nk candidate books or magazines whose distance is less than a threshold.
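A sketch of the Quadratic-Chi distance of [19] in NumPy (the choice m = 0.9 and the zero-division guard are our assumptions):

```python
import numpy as np

def quadratic_chi(p, q, A, m=0.9):
    """Quadratic-Chi histogram distance between histograms p and q,
    with bin-similarity matrix A and normalization factor m in [0, 1)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    z = (p + q) @ A              # per-bin normalizer sum_c (p_c + q_c) * A[c, i]
    z[z == 0] = 1.0              # guard against division by zero on empty bins
    d = (p - q) / z ** m
    return float(np.sqrt(max(float(d @ A @ d), 0.0)))
```

With A equal to the identity this reduces to a chi-squared-like bin-to-bin distance; giving nearby bins nonzero similarity in A makes the distance cross-bin.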
This module is illustrated in Figure 3 with a query image and three cover images. The cover of magazine 1 has a histogram similar to the query image's histogram, with the peak of both histograms at bin 4. On the other hand, magazine 2 has a histogram with its peak at bin 6, so it cannot be a candidate because it is dissimilar to the query image.
3.3 Product Matching
After the lightweight filtering module prunes the books and magazines, the next step is to verify whether each candidate found in the lightweight filtering step can be accepted as the result of the visual query process. Our main purpose is to find books and magazines whose visual appearance is similar to the query image I*, so we can apply template matching in this step.
Template matching is a technique for finding a sub-template
in an image. This technique can be divided into two
approaches: template-based approach and feature-based
approach.
Template-based approaches use the color information of a template as global features to determine the similarity between the template and a pattern extracted from a source image. Sum-comparing metrics (such as Sum of Squared Differences (SSD), Sum of Absolute Differences (SAD), and Cross-Correlation [20]) measure how well the template fits at each location in an image. These methods are simple, easy to implement, and can handle objects with little texture. However, they are not robust to scale, rotation, or viewpoint changes.
Feature-based approaches use local features such as edges [16], corners [25], and blobs [17][18], together with a similarity measure, to find the best match between local features in a template image and a source image. Because these methods are robust to scale, rotation, and viewpoint changes, they are suitable for matching images in which a book or magazine may be captured at different scales, poses, and orientations. Furthermore, the template for each book or magazine is large enough and has sufficient texture for this approach.
For each candidate selected by the lightweight filtering module, if the template T (the candidate) can be matched with the query image I*, the corresponding book or magazine is considered a result of the visual query process. This process consists of two main steps: key point extraction, and key point matching between the template image and the query image.
In the first step, we extract key points from the query image I*. Each key point is a blob-like structure described by its center and the properties of its neighboring region. Scale Invariant Feature Transform (SIFT) by D. Lowe [17] is the most popular feature-based method. In this method, each key point is described by a 128-dimensional descriptor vector, and the main advantage is invariance to scale, rotation, illumination, and viewpoint. Another method is Speeded-Up Robust Features (SURF) [18], whose key point descriptor is only a 64-dimensional vector, so key point extraction in SURF is faster than in SIFT. We therefore use SURF: it is not only faster than SIFT but also invariant to scale, rotation, illumination, and viewpoint.
In the next step of the matching process, we match key points between a template T and the query image I* to decide whether the template can be a result of the visual query. For each key point p in T, we find its corresponding key point q in I* by nearest-neighbour search. The pair (p, q) is called a match and is only valid if the distance between p and q is not greater than a threshold. We thus obtain a collection of key point matches between T and I*.
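The nearest-neighbour matching step can be sketched as a brute-force search over descriptor arrays (the function name and the threshold parameter tau are our labels, not the paper's):

```python
import numpy as np

def match_keypoints(desc_t, desc_q, tau):
    """For each template descriptor, find its nearest query descriptor by
    Euclidean distance and keep the pair only if that distance is <= tau.
    Returns a list of (template_index, query_index) matches."""
    desc_q = np.asarray(desc_q, float)
    matches = []
    for i, d in enumerate(np.asarray(desc_t, float)):
        dists = np.linalg.norm(desc_q - d, axis=1)  # distances to all query descriptors
        j = int(np.argmin(dists))
        if dists[j] <= tau:
            matches.append((i, j))
    return matches
```

In practice a k-d tree or approximate nearest-neighbour index would replace the linear scan, but the accept/reject rule is the same.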
If the template T is a result of the visual query process, we can find a matrix M that maps most of the key points in T into I*. In our proposed system, we use the RANSAC method [21]. This method estimates a homography transform M from a randomly selected subset of the matches between the two images (at least four matches are needed to determine a homography), then counts the number of outliers, i.e., matches that do not support the estimated transform. This selection process repeats until the number of iterations exceeds a threshold.
Let M0 be the best homography transform, i.e., the one with the minimum number of outliers found in the RANSAC process. If the number of outliers corresponding to M0 is less than a threshold, the template T is accepted as a result for the query image I*. Otherwise, the template T can be conditionally accepted if the number of matches between T and I* is greater than a given threshold; if the number of matches is lower than that threshold, the template T is rejected.
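The accept/conditional/reject rule above can be written as a small decision function (the threshold names are illustrative parameters, not values from the paper):

```python
def accept_template(n_outliers, n_matches, max_outliers, min_matches):
    """Decide a candidate template's fate after RANSAC: accept if the best
    homography has few outliers, conditionally accept if there are many raw
    matches anyway, reject otherwise."""
    if n_outliers < max_outliers:
        return "accepted"
    if n_matches > min_matches:
        return "conditional"
    return "rejected"
```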
Figure 4 shows an example of template matching using SURF features, with a book or magazine cover T (left) and the query image I* (right). Each line from left to right connects a pair of corresponding SURF features.
Figure 4. Detecting a book or magazine cover in a query image using SURF features.
4. EXPERIMENTS
We present experiments to test different properties of our proposed system, covering three main tasks: the efficiency of Lightweight Filtering (cf. Section 4.1), the performance gain of using the Lightweight Filter (cf. Section 4.2), and the accuracy of Template Matching (cf. Section 4.3). The experiments were run on a system with a Core Quad 2.4 GHz CPU, 2 GB RAM, and a GeForce GTX 460 graphics card (1 GB memory).
4.1 Efficiency of Lightweight Filtering
This experiment evaluates how efficiently the Lightweight Filter limits the number of matching candidates. We observe that most magazines and journals have few pages with similar color distributions, apart from pages that use only two colors (black and white) or contain no photos. To verify this, we collected 30 issues of 10 kinds of magazines, listed in Table 1. For each magazine, we group pages with similar color distributions using the Lightweight Filter.
The experiment shows that the number of pages in each group is much smaller than the total number of pages in the magazine. In the worst case, a single group contains 10.29% of the magazine's total pages.
Table 1. Efficiency of Lightweight Filtering

Magazine                     Number of pages   Max. pages in a group with similar color distribution
Tiep thi gia dinh            158               9
Tuoi Tre (Sunday Edition)    44                3
Thanh Nien (Weekly edition)  68                7
Echip mobile                 60                3
Game world                   82                4
PC world                     132               11
Sai Gon Saturday             44                3
The gioi dien anh            84                4
Kien truc va doi song        110               6
Sieu thi o to                154               7
4.2 Performance of using Lightweight Filter
This experiment compares the performance of the system with and without the Lightweight Filter. Our dataset includes 150 magazines, each with 50 to 150 pages. We divide the dataset into five subsets of different sizes: 200, 300, 500, 800, and 1000 pages.
For each dataset, we perform 100 visual queries with
different input images. For each visual query, we conduct
the visual query process in two situations: without
Lightweight Filtering and with Lightweight Filtering. The
experimental results are illustrated in Figure 5. In the first
situation, a query image I* is matched with each cover in a
dataset. Thus the total time to process a query linearly
increases with the number of covers in that dataset. In the
second context, only the top nk candidate covers are
considered for matching with SURF features. In our
experiment, we choose nk = 5.
As Figure 5 shows, the time to process a visual query in the second case increases only slightly with the total number of covers in a dataset, because image matching (with SURF features) is executed for at most nk candidate covers per query. The average elapsed time is slightly higher than the total time for matching a query image with nk = 5 candidates because of the extra time spent on the Lightweight Filtering itself.
Figure 5. Comparison of the performance (in milliseconds) of processing a visual query with and without Lightweight Filtering.
4.3 Accuracy of Template Matching
This experiment evaluates the accuracy of Template Matching. Covers captured by mobile devices are matched against the datasets. We conduct the experiments in four scenarios: a cover obscured by fingers, a plastic cover causing glare, a cover in shadow, and a cover with motion blur due to fast movement. Figure 6 illustrates sample images from the four scenarios.
In each scenario, we detect 40 magazine covers, processing 300 frames per cover. The accuracy percentages are shown in Table 2. In the motion blur scenario, the cover cannot be detected in some consecutive frames. However, we can stabilize the results by applying a Kalman filter [23] to correct the detections and make the processing smoother and more accurate.
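As an illustration of the smoothing idea, here is a minimal 1-D Kalman filter over a sequence of noisy detections (e.g., one coordinate of a detected cover corner across frames); the noise parameters q and r are illustrative, not values from the paper:

```python
def kalman_smooth_1d(measurements, q=1e-3, r=0.1):
    """Minimal 1-D Kalman filter: predict-then-update on each measurement.
    q is the process-noise variance, r the measurement-noise variance."""
    x, p = measurements[0], 1.0       # initial state estimate and its variance
    smoothed = [x]
    for z in measurements[1:]:
        p += q                        # predict: uncertainty grows
        k = p / (p + r)               # Kalman gain
        x += k * (z - x)              # update toward the new measurement
        p *= (1.0 - k)                # uncertainty shrinks after the update
        smoothed.append(x)
    return smoothed
```

A real tracker would filter the full homography or corner positions with a constant-velocity model, but the predict/update structure is the same.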
Figure 6. Sample images of the 4 scenarios: (a) Being obscured, (b) Glare lighting, (c) Shadow, (d) Motion blur.
Table 2. Accuracy of Template Matching

Scenario              Without Kalman Filter   With Kalman Filter
(a) Being obscured    89.8%                   93.4%
(b) Glare lighting    83.2%                   88.3%
(c) Shadow            92.6%                   96.8%
(d) Motion blur       84.8%                   91.8%
5. SAMPLE USAGE SCENARIOS OF THE PROPOSED SYSTEM
In this section, we briefly present several features of our proposed system in practical contexts.
Figure 7 shows an example of a regular page of a magazine or newspaper with extra information and multimedia objects marked on it. When the page is detected, a user can interact with each augmented object on the page to trigger its behavior.
In Figure 8, an audio clip and a color photo are augmented into a regular article in the Thanh Nien newspaper. Through the mobile device, the grayscale photo in the printed article is replaced by a color photo. When the user touches the audio clip icon, he or she can listen to the whole content of the article.
Figure 7. Extra information and multimedia objects are marked on a regular page of a magazine or newspaper.
Figure 8. An audio clip and a color photo are augmented into an article in the Thanh Nien newspaper.
Figure 9 demonstrates the proposed system on a tablet. A video clip corresponding to an article on the front page of the Tuoi Tre newspaper plays when a reader views the article through the tablet. First, the first frame of the video clip is displayed in place of the grayscale photo in the article. When the user starts the clip by touching the tablet's touchscreen, he or she can watch it in different sizes and views, e.g., projective view or fullscreen view.
Figure 9. A video clip is embedded into a regular article in the Tuoi Tre newspaper.
6. CONCLUSION
In this paper, we introduce a system that provides extra information to users by applying Augmented Reality technology to traditional newspapers. A user can access extra information about a product or news item as soon as he or she first reads about it.
The lightweight filter is the key feature of the system: it quickly filters out candidates that do not match the product or news item. The experiments show that the system runs in real time and can be applied in practice.
In the future, we plan to apply parallel processing in the matching step to improve the system's performance, and to display not only multimedia information (videos, clips, and product details) but also social media content (comments, "like"s, ratings) from social networks.
7. ACKNOWLEDGEMENT
This research was supported by the John von Neumann Institute, Vietnam National University, and the Faculty of Information Technology, Ho Chi Minh University of Science, Vietnam National University.
REFERENCES
[1] Ig-Jae Kim, "Introduction to augmented reality and its applications", ACM SIGGRAPH ASIA 2010 Courses (SA '10), 2010.
[2] Kompas Augmented Reality, http://www.kompas.com/ar
[3] Tissot Reality, http://www.tissot.ch/reality
[4] http://www.commbank.com.au/about-us/news/media-
releases/interactive/iphone
[5] Daniel Wagner, Dieter Schmalstieg, "ARToolKitPlus for Pose Tracking on Mobile Devices", Proceedings of the 12th Computer Vision Winter Workshop (CVWW07), 2007, pp. 139-146.
[6] Shalin Hai-Jew, “Virtual Immersive and 3D Learning
Spaces: Emerging Technologies and Trends”, IGI Global,
2010
[7] Michael Haller, Mark Billinghurst, Bruce Thomas,
“Emerging Technologies of Augmented Reality: Interfaces
and Design”, IGI Global, 2006
[8] Nils Petersen, Didier Stricker, "Continuous natural user interface: Reducing the gap between real and digital world", in Proceedings of the 8th IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2009, pp. 23-26.
[9] Sandy Martedi, Hideaki Uchiyama, Guillermo Enriquez, Hideo Saito, Tsutomu Miyashita, Takenori Hara, "Foldable augmented maps", in Proceedings of the 9th IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2010, pp. 65-72.
[10] M. Knecht, C. Traxler, O. Mattausch, W. Purgathofer, M.
Wimmer, “Differential Instant Radiosity for Mixed Reality”,
ISMAR 2010, pp. 99-107 (2010).
[11] W. Lee, Y. Park, V. Lepetit, “Point-and-Shoot for
Ubiquitous Tagging on Mobile Phones”, ISMAR 2010, pp.
57-64 (2010).
[12] A. Mohan, G. Woo, S. Hiura, Q. Smithwick, R. Raskar.
Bokode, “Imperceptible Visual Tags for Camera-based
Interaction from a Distance”, SIGGRAPH 2009 (2009).
[13] Hyun S. Yang, Kyusung Cho, Jaemin Soh, Jinki Jung, and
Junseok Lee. 2008, “Hybrid Visual Tracking for Augmented
Books”, In Proceedings of the 7th International Conference
on Entertainment Computing (ICEC '08).
[14] Rosten, E., Drummond, T., “Fusing points and lines for high
performance tracking”, In: 9th IEEE International
Conference on Computer Vision, pp. 1508–1511 (2005).
[15] Nobuko Taketa, Kenichi Hayashi, Hirokazu Kato, and Shogo
Noshida. 2007, “Virtual pop-up book based on augmented
reality”, In Proceedings of the 2007 conference on Human
interface: Part II, Michael J. Smith and Gavriel Salvendy
(Eds.). Springer-Verlag, Berlin, Heidelberg, 475-484.
[16] J. Shi and C. Tomasi, “Good Features to Track”. In IEEE
Conference on Computer Vision and Pattern Recognition,
pp. 593 – 600, 1994.
[17] D. G. Lowe, “Distinctive Image Features from Scale-
Invariant Keypoints”, International Journal of Computer
Vision (IJCV), pp. 91-110, 2004.
[18] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, “SURF:
Speeded Up Robust Features”, Computer Vision and Image
Understanding (CVIU), pp. 346-359, 2008.
[19] O. Pele and M. Werman. “The quadratic-chi histogram
distance family”, In Proceedings of European conference on
Computer vision (ECCV). pp. 749-762 (2010).
[20] J. P. Lewis. “Fast normalized cross-correlation”, In Vision
Interface, Canadian Image Processing and Pattern
Recognition Society. pp. 120 – 123 (1995).
[21] M. A. Fischler, R. C. Bolles. “Random Sample Consensus: A
Paradigm for Model Fitting with Applications to Image
Analysis and Automated Cartography”, Comm. of the ACM,
Vol 24, pp 381-395 (1981)
[22] M. Fiala, “ARTag, a fiducial marker system using digital
techniques”, Conference on Computer Vision and Pattern
Recognition, pp. 590-596, 2005.
[23] R. E. Kalman, “A New Approach to Linear Filtering and
Prediction Problems”, Transaction of ASME-Journal of
Basic Engineering, 1960.
[24] M. Ozuysal, M. Calonder, P. Fua, V. Lepetit, “Fast keypoint
recognition using random ferns”, IEEE Transactions on
Pattern Analysis and Machine Intelligence ,448-461, 2010.
[25] J. Canny, “A Computational Approach to Edge Detection”,
In IEEE Transactions on Pattern Analysis and Machine
Intelligence, pp. 679-698, 1986.