Towards Automated Web Design Advisors Melody Y. Ivory Marti A. Hearst School of Information Management & Systems UC Berkeley IBM Make IT Easy Conference.

Towards Automated Web Design Advisors

Melody Y. Ivory, Marti A. Hearst
School of Information Management & Systems, UC Berkeley

IBM Make IT Easy Conference, June 4, 2002

The Problem: Poor Website Design by Non-Professionals

A Solution

Automatic recommendations and context-specific guidelines.

“Grammar checkers” for web design
– Create good templates to incorporate into web design tools
– Compare current design to high-quality designs and show differences

The WebTango Goal

[Diagram: Quality Checker; User’s Design; Profiles; High Quality Designs]
• Predictions
• Similarities
• Differences
• Suggestions
• Modification

The Approach: Develop Statistical Profiles

1. Create a large set of measures to assess various design attributes
2. Obtain a large set of evaluated sites
3. Create models of good vs. avg. vs. poor sites
   Take into account the context and type of site
4. Use models to evaluate other sites
5. Use models to suggest improvements

Idea: Reverse engineer design patterns from high-quality sites and use them to assess the quality of other sites

Step 1: Measuring Web Design Aspects

Identified key aspects from the literature
– Extensive survey of Web design literature: texts from recognized experts; user studies
  • amount of text on a page, text alignment, fonts, colors, consistency of page layout in the site, use of frames, …
– Example guidelines
  • Use 2–4 words in text links [Nielsen00].
  • Use links with 7–12 useful words [Sawyer & Schroeder00].
  • Consistent layout of graphical interfaces results in a 10–25% speedup in performance [Mahajan & Shneiderman96].
– There are no theories about what to measure

157 Web Design Measures (Metrics Computation Tool)

– Text Elements (31): # words, type of words
– Link Elements (6): # graphic links, type of links
– Graphic Elements (6): # images, type of images
– Text Formatting (24): # font styles, colors, alignment, clustering
– Link Formatting (3): # colors used for links, standard colors
– Graphics Formatting (7): max width of images, page area
– Page Formatting (27): quality of color combos, scrolling
– Page Performance (37): download time, accessibility
– Site Architecture (16): consistency, breadth, depth

[Diagram labels: TE, LE, GE, TF, LF, GF; information, navigation, & graphic design; experience design]

Word Count: 157

Content Word Count: 81

Body Word Count: 94
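To make the idea of a page-level measure concrete, here is a minimal sketch that counts words, links, and graphics in a page. It is not the Metrics Computation Tool itself; the class name `PageMeasures`, the function `measure_page`, and the three measure names are assumptions chosen for illustration, and only Python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class PageMeasures(HTMLParser):
    """Collects a few illustrative page-level counts (hypothetical names)."""

    def __init__(self):
        super().__init__()
        self.word_count = 0      # visible words on the page
        self.link_count = 0      # <a> tags that carry an href
        self.graphic_count = 0   # <img> tags
        self._skip_depth = 0     # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            self.link_count += 1
        elif tag == "img":
            self.graphic_count += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count whitespace-separated tokens, ignoring script and style bodies.
        if not self._skip_depth:
            self.word_count += len(data.split())


def measure_page(html):
    """Return a dict of the three illustrative measures for one HTML page."""
    parser = PageMeasures()
    parser.feed(html)
    parser.close()
    return {
        "word_count": parser.word_count,
        "link_count": parser.link_count,
        "graphic_count": parser.graphic_count,
    }
```

A call such as `measure_page(open("index.html").read())` would yield one row of measures per page; the real tool computes 157 of them, most far more involved than these counts.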

Step 2: Obtaining a Sample of Evaluated Sites

Webby Awards 2000
– Only large corpus of rated Web sites
3000 sites initially
– 27 topical categories
  • Studied sites from informational categories
    – Finance, education, community, living, health, services
100 judges
– International Academy of Digital Arts & Sciences
  • Internet professionals, familiarity with a category
– 3 rounds of judging (only first round used)
  • Scores are averaged from 3 or more judges
  • Converted scores into good (top 33%), average (middle 34%), and poor (bottom 33%)
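The score conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study: the function names `average_score` and `label_sites` are hypothetical, ties are broken by input order, and the 33% cut is rounded to a whole number of sites.

```python
from statistics import mean


def average_score(judge_scores):
    """Average one site's first-round scores; the slides say 3 or more judges."""
    if len(judge_scores) < 3:
        raise ValueError("expected scores from 3 or more judges")
    return mean(judge_scores)


def label_sites(site_scores):
    """Map each site to 'good', 'average', or 'poor' by rank of its mean score.

    site_scores: dict of site name -> list of judge scores (1-10 scale).
    The top 33% are 'good', the bottom 33% are 'poor', the rest 'average'.
    """
    averages = {site: average_score(s) for site, s in site_scores.items()}
    ranked = sorted(averages, key=averages.get, reverse=True)
    cut = round(len(ranked) * 0.33)  # assumed rounding rule
    labels = {}
    for rank, site in enumerate(ranked):
        if rank < cut:
            labels[site] = "good"
        elif rank >= len(ranked) - cut:
            labels[site] = "poor"
        else:
            labels[site] = "average"
    return labels
```

With 333 sites this rule gives 110 good, 113 average, and 110 poor sites, which is close to the 33/34/33 split on the slide.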

Webby Awards 2000

6 criteria
– Content
– Structure & navigation
– Visual design
– Functionality
– Interactivity
– Overall experience

Scale: 1–10 (highest)

Nearly normally distributed

Example Page from Good Site

Example Page from Avg. Site

Example Page from Poor Site

The Data Set

Downloaded pages from sites
– Downloads informational pages at multiple levels of the site
Computed measures for the sample
– Processes static HTML, English pages
  • Measures for 5346 pages
  • Measures for 333 sites
– Categorize by
  • Topic: education, health, finance, …
  • Page Type: content, homepage, link page, …

Step 3: Creating Prediction Models

Statistical analysis of quantitative measures
– Methods
  • Classification & regression tree, linear discriminant classification, & K-means clustering analysis
– Context sensitive models
  • Content category, page style, etc.
– Models identify a subset of measures relevant for each prediction
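One way to see how a tree model singles out relevant measures is to look at a single split. The sketch below searches every measure and threshold for the split with the lowest weighted Gini impurity, which is the criterion commonly used by classification and regression trees. It is a one-node illustration only, not the models built in this work; the function names and the toy measure names are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Find the single (measure, threshold) split with the lowest impurity.

    rows:   list of dicts mapping measure name -> numeric value
    labels: list of class labels, one per row
    Returns (measure, threshold, weighted_gini), or None if no split exists.
    """
    best = None
    n = len(rows)
    for measure in rows[0]:
        values = sorted({row[measure] for row in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [lab for row, lab in zip(rows, labels) if row[measure] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[measure] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (measure, threshold, score)
    return best
```

A full tree would apply `best_split` recursively to each side; measures that are never chosen for a split drop out of the model, which is one reading of "a subset of measures relevant for each prediction".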


Page-Level Models (5346 Pages)

Model (sample size): Method; Accuracy
– Overall page quality (~1782 pgs/class): C&RT; 96%, 94%, 93%
– Content category quality (~297 pgs/class & cat): LDC; 92%, 91%, 94%
– Page type quality (~356 pgs/class & type): LDC; 84%, 78%

ANOVAs showed that all differences in measures were significant (good vs. avg, good vs. poor, etc.)

Page Type Classifier (decision tree)
– Home page, content, form, link, other
– 1770 manually-classified pages, 84% accurate

Clustering Good Pages

K-means clustering to identify 3 subgroups
ANOVAs revealed key differences
– # words on page, HTML bytes, table count
Characterize clusters as:
– Small-page cluster (1008 pages)
– Large-page cluster (364 pages)
– Formatted-page cluster (450 pages)
Use for detailed analysis of pages
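The clustering step can be illustrated with a small pure-Python version of K-means (Lloyd's algorithm). This is a sketch under assumptions, not the analysis performed on the 5346 pages: the initial centroids are supplied by the caller so that the result is deterministic, the function name `kmeans` is hypothetical, and the toy points use two unscaled measures, whereas a real analysis would normally standardize the measures first.

```python
import math


def kmeans(points, initial_centroids, iterations=20):
    """Cluster numeric tuples with Lloyd's algorithm.

    points:            list of equal-length numeric tuples
    initial_centroids: list of k starting centroids (assumed given)
    Returns (centroids, assignment), where assignment[i] is the index of
    the cluster that points[i] belongs to.
    """
    centroids = [tuple(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignment
```

With k set to 3 and measures such as word count and table count, the resulting groups could then be inspected and named, in the spirit of the small-page, large-page, and formatted-page clusters above.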

Small page

Large page

Formatted page

Step 4: Evaluate Other Sites

Make predictions for an existing design
– good, average, poor
– How do the scores on the metrics vary from good pages?

Example

Site drawn from Yahoo Education/Health
– Discusses training programs on numerous health issues
– Chose one that looked good at first glance, but on further inspection seemed to have problems.
– Only 9 pages were available, at level 0 and 1
– Not present in the original study

Sample Content Page (Before)

Page-Level Assessment

Decision tree predicts: all 9 pages consistent with poor pages
– Content page does not have accent color; has colored, bolded body text words
  • Avoid mixing text attributes (e.g., color, bolding, and size) [Flanders & Willis98]
  • Avoid italicizing and underlining text [Schriver97]

Page-Level Assessment

Cluster mapping
– All pages mapped into the small-page cluster
– Deviated on key measures, including
  • text link, link cluster, interactive object, content link word, ad
  • Most deviations can be attributed to using graphic links without corresponding text links
– Use corresponding text links [Flanders & Willis98, Sano96]
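The comparison of a page against good pages in its cluster can be sketched as a per-measure deviation report. This is only an illustration of the idea: expressing deviation in standard deviations from the good-page mean, the threshold of 1.0, the function name `deviations`, and the measure names in the example are all assumptions rather than the method used here.

```python
from statistics import mean, stdev


def deviations(page, good_pages, threshold=1.0):
    """Report measures on which a page differs from a set of good pages.

    page:       dict of measure name -> value for the page being assessed
    good_pages: list of such dicts for good pages (at least two)
    Returns a dict of measure -> deviation in standard deviations, keeping
    only measures whose absolute deviation exceeds the threshold.
    """
    report = {}
    for measure, value in page.items():
        sample = [good[measure] for good in good_pages]
        mu, sigma = mean(sample), stdev(sample)
        if sigma == 0:
            continue  # no spread among good pages, so no scale to compare on
        z = (value - mu) / sigma
        if abs(z) > threshold:
            report[measure] = round(z, 2)
    return report
```

A strongly negative value for a text link measure, for example, would point to the kind of problem noted above, where graphic links are used without corresponding text links.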

[Chart labels: Link Count, Text Link, Good Link Word Count, Font Count, Sans Serif Word Count, Display Word Count]
