Visualization of the Webpage Popularity for Ping Wales Visualization of the Popularity of the Web Access for Ping Wales Xiaochuan Huang (George) Supervised by Dr Markus Roggenbach Department of Computer Science University of Wales Swansea Nov. 2005 @ Gregynog

Dec 25, 2015


Claribel Tate
Transcript
Page 1:

Visualization of the Webpage Popularity for Ping Wales

Visualization of the Popularity of the Web Access for Ping Wales

Xiaochuan Huang (George)

Supervised by Dr Markus Roggenbach

Department of Computer Science

University of Wales Swansea

Nov. 2005 @ Gregynog

Page 2:

Overview

1. A Regular Website Report

2. Specification

3. Technologies Involved

4. A First Approach

Page 3:

1. A Regular Website Report

What the project is about
  Our customer, Ping Media Ltd; the website, Ping Wales;
  What they need; and the technical infrastructure

Page 4:

1. A Regular Website Report

What the project is about

Introducing similar tools
  Log file analyzers;
  AWStats and Analog 6.0;
  Graphic statistics generated by AWStats and Analog

Page 5:

1. A Regular Website Report

Page 6:

1. A Regular Website Report

What the project is about
  Our customer, Ping Media Ltd; the website, Ping Wales;
  What they need; and the technical infrastructure

Introducing similar tools
  Log file analyzers;
  AWStats and Analog 6.0;
  Graphic statistics generated by AWStats and Analog

Why this application is necessary
  Customer's needs;
  The shortcomings of existing applications;
  Extensible project

Page 7:

2. Specification

Components
  The filter/parser;
  The analyzer;
  Two databases;
  Visualization

Going through the processes
  Take daily log file -> parse with DB1 -> output filtered result -> write result into DB2
  Given a specified duration -> access DB2 -> generate the records -> output a visualized report

Page 8:

3. Technologies Involved

The Apache log files
  Introduction;

Page 9:

3. Technologies Involved

The Apache log files
  Introduction;
  Format;

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] "GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 "http://www.pingwales.co.uk/business/apple-keynote.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"

Page 10:

The Apache log files
  Introduction;
  Format

"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] "GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 "http://www.pingwales.co.uk/business/apple-keynote.html" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"

Log string analysis:
  (%h) 220.244.224.104: the IP address of the client
  (%l) the RFC 1413 identity of the client
  (%u) the userid of the requesting person
  (%t) [12/Jan/2005:00:12:38 +0000]: the request time
  (\"%r\") "GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0": method, requested page, client protocol
  (%>s) 200: the status code
  (%b) 11020: the size of the object returned to the client
  (\"%{Referer}i\") the site that the client reports having been referred from
  (\"%{User-agent}i\") identifying information of the client browser
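The field breakdown above could be turned into a small Ruby parser along the following lines. This is only a sketch: the names COMBINED and parse_hit, and the keys of the returned Hash, are made up for illustration and do not come from the slides, and the pattern assumes that quoted fields contain no escaped double quotes.

```ruby
require 'time'

# Pattern for one line in the "combined" format shown above.
# Assumption: quoted fields contain no escaped double quotes.
COMBINED = Regexp.new(
  '\A(?<host>\S+) (?<ident>\S+) (?<user>\S+) ' \
  '\[(?<time>[^\]]+)\] "(?<request>[^"]*)" ' \
  '(?<status>\d{3}) (?<bytes>\d+|-) ' \
  '"(?<referer>[^"]*)" "(?<agent>[^"]*)"\s*\z'
)

# Returns a Hash of the fields, or nil when the line does not match.
def parse_hit(line)
  m = COMBINED.match(line)
  return nil unless m

  # The request string holds method, requested page and client protocol.
  method, page, protocol = m[:request].split(' ', 3)
  {
    ip: m[:host],
    time: Time.strptime(m[:time], '%d/%b/%Y:%H:%M:%S %z'),
    method: method,
    page: page,
    protocol: protocol,
    status: m[:status].to_i,
    bytes: m[:bytes] == '-' ? 0 : m[:bytes].to_i,
    referer: m[:referer],
    agent: m[:agent]
  }
end

line = '220.244.224.104 - - [12/Jan/2005:00:12:38 +0000] ' \
       '"GET /hardware/toshiba-small-80gb-hdd.html HTTP/1.0" 200 11020 ' \
       '"http://www.pingwales.co.uk/business/apple-keynote.html" ' \
       '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041204 Epiphany/1.4.4"'
hit = parse_hit(line)
puts hit[:page]
```

Returning nil for lines that do not match lets a caller skip malformed entries instead of aborting the whole daily run.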

Page 11:

3. Technologies Involved

The Apache log files

Programming language – Ruby
  interpreted scripting language for quick and easy object-oriented programming

% ruby
puts "Hello, world!"
^D
Hello, world!

% cd sample
% ruby eval.rb
ruby> a = "Hello, world!"
   "Hello, world!"
ruby> puts a
Hello, world!
   nil
ruby> ^D
%

Page 12:

3. Technologies Involved

The Apache log files

Programming language – Ruby

Database access
  MySQL;
  The two databases;
  Access DB with Ruby

Page 13:

4. A First Approach

load the daily log file

Parsing/Filtering
  while not end of file
    read hit, line by line
    for each hit, getIP(%h), getTime(%t), getReq(\"%r\"), getSt(%>s)
    check if even(first(getSt())), then go through the articles database looking for getIP()
    if there is a match, write the hit to database 2, read next
    go to next hit
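The filtering step could look roughly like this in Ruby. It is a sketch under several assumptions: an in-memory Set stands in for Database 1, hits are Hashes with :page and :status keys, and the lookup uses the requested page rather than getIP() as written on the slide, because Database 1 is described as holding webpage addresses. All names (ARTICLES, first_digit_even?, filter_hits) are hypothetical. The status test follows the slide literally, so 4xx codes also pass it and are only rejected by the article lookup.

```ruby
require 'set'

# Hypothetical stand-in for Database 1 (the webpage address database).
ARTICLES = Set[
  '/hardware/toshiba-small-80gb-hdd.html',
  '/business/apple-keynote.html'
]

# The slide's even(first(getSt())): the first digit of the status code is even.
def first_digit_even?(status)
  status.to_s[0].to_i.even?
end

# Keeps the hits that pass the status test and request a known article.
# Assumption: the lookup key is the requested page, not the client IP.
def filter_hits(hits, articles)
  hits.select do |hit|
    first_digit_even?(hit[:status]) && articles.include?(hit[:page])
  end
end

hits = [
  { ip: '220.244.224.104', page: '/hardware/toshiba-small-80gb-hdd.html', status: 200 },
  { ip: '10.0.0.1', page: '/images/logo.png', status: 200 },       # not an article
  { ip: '10.0.0.2', page: '/business/apple-keynote.html', status: 304 }, # first digit odd
  { ip: '10.0.0.3', page: '/no-such-page.html', status: 404 }      # not an article
]

kept = filter_hits(hits, ARTICLES)
puts kept.length
```

In the real pipeline the kept hits would be written to Database 2 instead of being collected in an array.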

Analyzing
  Specify StartingTime, EndTime; build an array/stack: myArray
  Read through records from database 2; for those within the specified time
    for each hit,
      if getIP() is in myArray, then counter += 1
      otherwise, write this hit to myArray, initialize counter
  Sort myArray according to the counter of each element
  Write out the top N results to file, for visualizing
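The analyzing step might be sketched in Ruby as follows. A Hash with a default count of 0 replaces the array/stack myArray from the slide, which avoids searching the array for every hit. The function name top_counts and the record layout are hypothetical. The slide counts by getIP(); here the counting key is a parameter that defaults to the page, which is an assumption based on the project's aim of measuring page popularity.

```ruby
# Counts records within [start_time, end_time] by the given key and
# returns the top n as [key, count] pairs, highest count first.
# Assumption: records are Hashes with a :time entry plus the key field.
def top_counts(records, start_time, end_time, n, key: :page)
  counts = Hash.new(0)
  records.each do |record|
    next unless record[:time] >= start_time && record[:time] <= end_time

    counts[record[key]] += 1
  end
  # Ties are broken by key so that the order is deterministic.
  counts.sort_by { |k, count| [-count, k] }.first(n)
end

records = [
  { ip: '1.1.1.1', page: '/a.html', time: Time.utc(2005, 1, 12, 0, 12, 38) },
  { ip: '2.2.2.2', page: '/a.html', time: Time.utc(2005, 1, 12, 8, 0, 0) },
  { ip: '1.1.1.1', page: '/b.html', time: Time.utc(2005, 1, 13, 9, 30, 0) },
  { ip: '3.3.3.3', page: '/c.html', time: Time.utc(2005, 2, 1, 0, 0, 0) } # outside the period
]

top = top_counts(records, Time.utc(2005, 1, 12), Time.utc(2005, 1, 14), 2)
top.each { |page, count| puts "#{page}\t#{count}" }
```

Passing key: :ip reproduces the per-IP counting written on the slide.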

Page 14:

Water flow model

Take daily log file -> parse with DB1 -> output filtered result -> write result into DB2
Given a specified duration -> access DB2 -> generate the records -> output a visualized report

[Figure: flow diagram with the boxes Daily Log File, Filter, Database 1 <webpage add DB>, Database 2 <page visits records>, Analyzer (Period entry, Records), Visualization Tool, Graphic Report]

Page 15:

Summary

What I have done so far & what I am planning to do next