Supporting End Users in the Creation of Dependable Web Clips. Sandeep Lingam, Sebastian Elbaum. Proceedings of the 16th international conference on World Wide Web (WWW2007). Reporter: Shih-Feng Yang, 2007/7/2

Supporting End Users In The Creation Of Dependable Web Clips

Jan 28, 2015

Transcript
Page 1: Supporting End Users In The Creation Of Dependable Web Clips


Supporting End Users in the Creation of Dependable Web Clips

Sandeep Lingam, Sebastian Elbaum

Proceedings of the 16th international conference on World Wide Web (WWW2007)

Reporter: Shih-Feng Yang

2007/7/2

Page 2: Supporting End Users In The Creation Of Dependable Web Clips


Outline

Introduction
Web Clipper
Evaluation
Conclusion

Page 3: Supporting End Users In The Creation Of Dependable Web Clips


Introduction

Web authoring environments have enabled end-users who are non-programmers to design and quickly construct web pages.

Web clip: a component within the end-user's website that dynamically extracts information from other web sources.

Page 4: Supporting End Users In The Creation Of Dependable Web Clips


Introduction

Web Clip

Page 5: Supporting End Users In The Creation Of Dependable Web Clips


Introduction

Goal
Web clipper: an approach to support end-users through the entire process of creating a dependable web clip.

Three fundamental aspects:
1. The tool is embedded in the web authoring tool interface.
2. Training: increase the robustness of the web clip.
3. Deploy multiple filters to increase confidence in the correctness of the retrieved information.

Page 6: Supporting End Users In The Creation Of Dependable Web Clips


Introduction

Challenges
We cannot expect end-users to have any programming experience with web clips.
The content within the target site of a web clip will change.

Page 7: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper

Approach Overview

Page 8: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Clipping

Target Clip Selection
A custom browser is provided for controlling the web clip.
Every extractable document element is highlighted as the user moves the mouse over it, and the user makes a selection by clicking on it.

Extraction Pattern
Once a selection is made, an extraction pattern is generated.
During the clipping process, the user's selection is uniquely identified by its HTML-Path.
HTML-Path: a specialized XPath expression.
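To make the idea of identifying a selection by its path concrete, here is a minimal Python sketch. It is not the paper's HTML-Path format, which the slides do not spell out; it assumes a simplified positional XPath-style path over a well-formed (XHTML-like) document, and the function name `html_path` and the sample page are hypothetical.

```python
import xml.etree.ElementTree as ET

def html_path(root, target):
    """Build a positional XPath-style path from root down to target.

    Assumption: each step is "tag[n]", where n is the 1-based position of
    the element among its siblings that share the same tag.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        same_tag = [c for c in parent if c.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "./" + "/".join(reversed(steps)) if steps else "."

# Hypothetical page: the user "clicks" the third <p> element.
PAGE = ("<html><body><div><p>ad</p></div>"
        "<div><p>news</p><p>weather</p></div></body></html>")
root = ET.fromstring(PAGE)
selection = root.findall(".//p")[2]
path = html_path(root, selection)
```

Resolving `path` with `root.find(path)` returns the same element the user selected, which is the property the slide describes: the path uniquely identifies the selection on the page as it looked at clipping time.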

Page 9: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Clipping

Page 10: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Training

To increase the robustness of the web clip, the authors construct extraction patterns that uniquely characterize the end-user's selection.

Several clips are created using different extraction patterns.

Every time the user marks a clipping as valid, the system generates a filter corresponding to that clipping.
Filter: JavaScript code embedded within the user's web page.
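The validation loop described on this slide can be sketched roughly as follows. This is an illustrative Python stand-in, not the tool's implementation: the real filters are JavaScript, and the names `Clipping`, `train`, and the `filter(...)` string form are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Clipping:
    pattern: str   # extraction pattern that produced this clipping
    content: str   # what that pattern extracted from the page

def train(clippings: List[Clipping],
          user_marks_valid: Callable[[Clipping], bool]) -> List[str]:
    """Return one filter per clipping that the user marks as valid."""
    filters = []
    for clipping in clippings:
        if user_marks_valid(clipping):
            # Stand-in for filter generation: a filter is just a tagged
            # string here, whereas the slides describe JavaScript code.
            filters.append(f"filter({clipping.pattern})")
    return filters

# Hypothetical session: two candidate patterns, the user accepts one.
candidates = [Clipping("pattern_1", "Sunny, 75F"), Clipping("pattern_2", "Buy now!")]
filters = train(candidates, lambda c: c.content.startswith("Sunny"))
```

Only the patterns whose clippings the user accepts contribute a filter, so rejected patterns play no part after training.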

Page 11: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Training

Validation of the extraction patterns presented by the system.

Page 12: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Training

Extraction Patterns

Page 13: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Training

Page 14: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Deployment

The URL and extraction patterns of the clipped content are used to generate an AJAX script.
HTML documents -> XHTML.
Relative URLs -> absolute URLs.
Filters are generated from pre-defined templates for each of the extraction patterns obtained during training.
The user can move, resize, or annotate the web clip to suit her preference.
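The relative-to-absolute URL step matters because clipped markup is displayed on a different site than the one it came from. A minimal Python sketch of that one step, assuming double-quoted `href` and `src` attributes only (the function name `absolutize` and the example URLs are hypothetical, and the actual tool generates an AJAX script rather than Python):

```python
import re
from urllib.parse import urljoin

def absolutize(html: str, base_url: str) -> str:
    """Rewrite double-quoted href/src values as absolute URLs.

    Assumption: a regex pass is enough for this illustration; a real
    implementation would work on the parsed XHTML document instead.
    """
    def fix(match: re.Match) -> str:
        attr, url = match.group(1), match.group(2)
        return f'{attr}="{urljoin(base_url, url)}"'
    return re.sub(r'\b(href|src)="([^"]*)"', fix, html)

clipped = '<a href="/news/today.html">News</a><img src="img/a.png"/>'
fixed = absolutize(clipped, "http://example.com/section/index.html")
```

`urljoin` resolves each value against the URL of the clipped page, so links and images keep working once the clip is embedded elsewhere; URLs that are already absolute are left unchanged.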

Page 15: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Filtering and Assessment

The content that the user wants to see in the web clip

Page 16: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Filtering and Assessment

Page 17: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Filtering and Assessment

Page 18: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Filtering and Assessment

Confidence (as defined in the paper): the ratio of the maximum filter score to the number of valid extraction patterns generated during the training session.

The prototype alerts the user when the content within the target site changes.

The user can also configure web clips to provide alerts when the confidence score falls below a particular threshold.

Page 19: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Filtering and Assessment

The Label filter has the highest score, so the system will use this pattern to extract content, and the confidence score = 2/3 = 67%.
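The arithmetic in this example can be reproduced with a short Python sketch. It assumes that confidence is the highest filter score divided by the number of valid extraction patterns from training, which matches the 2/3 on this slide. Only the Label score of 2 and the total of 3 come from the slide; the other filter names and scores, the threshold values, and the function names are hypothetical.

```python
def confidence(filter_scores: dict, valid_pattern_count: int):
    """Return (best filter name, confidence as a fraction in [0, 1])."""
    best = max(filter_scores, key=filter_scores.get)
    return best, filter_scores[best] / valid_pattern_count

def needs_alert(score: float, threshold: float) -> bool:
    """True when the confidence score falls below the user's threshold."""
    return score < threshold

# "Label": 2 and the count of 3 are from the slide; the rest is made up.
scores = {"Label": 2, "other_filter_a": 1, "other_filter_b": 0}
best, conf = confidence(scores, valid_pattern_count=3)
percent = round(conf * 100)
```

With these inputs the Label filter is selected and the confidence rounds to 67%. A user who set a threshold of 75% would be alerted, while one who set 50% would not.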

Page 20: Supporting End Users In The Creation Of Dependable Web Clips


Web Clipper-Filtering and Assessment

Alert the user when the content within the target site changes

Page 21: Supporting End Users In The Creation Of Dependable Web Clips


Evaluation

Effectiveness of the extraction patterns used in generating web clips.

Dependability of web clips in providing sufficiently correct information over time.

Robustness of web clips to changes in the clipped web site.

Page 22: Supporting End Users In The Creation Of Dependable Web Clips


Evaluation Effectiveness of extraction patterns

Page 23: Supporting End Users In The Creation Of Dependable Web Clips


Evaluation Dependability of web clips

confidence scores

Page 24: Supporting End Users In The Creation Of Dependable Web Clips


Evaluation Robustness

This experiment tests how robust web clips are to the following types of change in the clipped site:

1. Block Insertion

2. Block Movement

3. Block Deletion

4. Enclosing Element Changes

5. Target Clipping Removed
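To illustrate why these change types matter, the following Python sketch simulates a block insertion and a removed target on a toy page. It is not the paper's experiment: the page snippets and both patterns are hypothetical, and the attribute-based pattern merely stands in for a pattern that does not depend on position.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before and after a block is inserted above the target.
BEFORE = "<body><div id='ads'>ad</div><div id='weather'>Sunny</div></body>"
AFTER = ("<body><div id='ads'>ad</div><div id='promo'>sale</div>"
         "<div id='weather'>Sunny</div></body>")
REMOVED = "<body><div id='ads'>ad</div></body>"

POSITIONAL = "./div[2]"                # depends on element position
BY_ATTRIBUTE = "./div[@id='weather']"  # depends on an attribute value

def extract(page: str, pattern: str):
    """Return the text matched by pattern, or None if nothing matches."""
    node = ET.fromstring(page).find(pattern)
    return None if node is None else node.text

results = {
    "positional_before": extract(BEFORE, POSITIONAL),
    "positional_after": extract(AFTER, POSITIONAL),
    "attribute_after": extract(AFTER, BY_ATTRIBUTE),
    "attribute_removed": extract(REMOVED, BY_ATTRIBUTE),
}
```

After the insertion, the positional pattern silently returns the wrong block, while the attribute-based pattern still finds the target. When the target is removed, nothing matches at all. Keeping several patterns from training lets disagreement between them surface as a lower confidence score.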

Page 25: Supporting End Users In The Creation Of Dependable Web Clips


Evaluation Robustness

Page 26: Supporting End Users In The Creation Of Dependable Web Clips


Conclusion

This paper presented an approach to support end-users through the entire process of creating a dependable web clip.

Web clipper addresses the shortcomings of existing tools by introducing the notion of training and of dynamic confidence evaluation.

Page 27: Supporting End Users In The Creation Of Dependable Web Clips


Finish

Thanks for your patience!