Page 1: Detecting Communities in Science Blogs

Detecting Communities in Science Blogs

Christina K. Pikas

[email protected]

http://terpconnect.umd.edu/~cpikas/ScienceBlogging

Page 2: Detecting Communities in Science Blogs

Problem Area

• eScience includes using electronic tools both for conducting science and for communicating about science

• There is an abundance of tools, both online and offline, to help scientists communicate

• Lots of scientists and members of the interested public maintain blogs (~2500?)

• Ultimate Questions: Why? With whom are scientists communicating? What are scientists communicating about? What is the value to the scientists and to science?

Page 3: Detecting Communities in Science Blogs

Specific Problem Addressed

• What is the nature of the science blogosphere?

– What is its shape?

– Who are the central participants?

– What is the connectivity?

– Where are the potential information flows?

Page 4: Detecting Communities in Science Blogs

Outline

•Background

•Methods

–Data gathering

–Analysis

•Results

•Discussion

Page 5: Detecting Communities in Science Blogs

Background: Blogs

• Defined by format

– Individual posts, with permanent URLs

– Comments

• Links

– In content

– In blogroll

– In comments and trackbacks

• Community develops around single blogs and among blogs through commenting

Page 6: Detecting Communities in Science Blogs

Posts

Links to Static Pages

Links and automatically generated content

http://dorigo.wordpress.com/

Page 7: Detecting Communities in Science Blogs

Access to posts by search and older posts using the calendar

A list of most recent posts is automatically generated

Page 8: Detecting Communities in Science Blogs

A list of categories the blogger used to describe his posts. Clicking will list all of the posts in that category.

The blogroll is a list of blogs the author reads or endorses to some extent.

Access to the older posts by month.

Page 9: Detecting Communities in Science Blogs

The individual post page looks a lot like the blog home page

Page 10: Detecting Communities in Science Blogs

But with Comments, which may be signed with the commenter’s URL

And a form to leave your own comment. Typically your e-mail will not appear on the site

Page 11: Detecting Communities in Science Blogs

Background: Social Network Analysis

• Uses connections between actors to understand potential flows of information and influence

• Uses graph-theoretic methods to find

– Central or prestigious actors

– Cohesive subgroups including communities

Page 12: Detecting Communities in Science Blogs

Methods: Sample Selection

Operational Definition of Science Blog

• Blogs maintained by scientists that deal with any aspect of being a scientist

• Blogs about scientific topics by non-scientists

Omitted

• Primarily political speech

• Ones maintained by corporations

• Non-English language

Page 13: Detecting Communities in Science Blogs

Methods: Data Gathering

• Two Networks: Links and Commenters

• Link Data (Blogroll)

– Used seed list developed in previous study using directories and searches

– Snowball sampled using links from blogrolls

– Visited and copied links

• Commenter Data

– Selected most central blogs from blogroll data

– Used Perl scripts to pull the commenter URLs from each of the last 10 posts
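
The commenter-gathering step can be sketched roughly as follows. The original work used Perl scripts that are not shown in the slides; this is a Python stand-in, not the author's code. It assumes (hypothetically) that each commenter's link sits inside an element whose class contains "comment-author"; real blog templates differ and would need their own marker. The arc direction (commenter to blog) is also an assumption.

```python
from html.parser import HTMLParser


class CommenterLinkParser(HTMLParser):
    """Collect hrefs of links found inside comment-author elements.

    ASSUMPTION: the class name "comment-author" is hypothetical markup,
    chosen for illustration only.
    """

    def __init__(self, marker="comment-author"):
        super().__init__()
        self.marker = marker
        self.open_tag = None  # tag name of the marked element we are inside
        self.depth = 0        # nesting count of that same tag name
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.open_tag is None:
            if self.marker in (attrs.get("class") or "").split():
                self.open_tag, self.depth = tag, 1
        elif tag == self.open_tag:
            self.depth += 1
        if self.open_tag is not None and tag == "a" and attrs.get("href"):
            self.urls.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.open_tag is not None and tag == self.open_tag:
            self.depth -= 1
            if self.depth == 0:
                self.open_tag = None


def commenter_urls(post_html):
    """Return commenter URLs from one post page, in document order."""
    parser = CommenterLinkParser()
    parser.feed(post_html)
    return parser.urls


def commenter_arcs(blog_url, post_pages):
    """Return the set of (commenter URL, blog URL) arcs for one blog."""
    arcs = set()
    for page in post_pages[:10]:  # the slide reports using the last 10 posts
        for url in commenter_urls(page):
            arcs.add((url, blog_url))
    return arcs


# Made-up sample page: two signed comments, one anonymous comment.
sample = """
<div class="comment"><span class="comment-author">
<a href="http://alice.example.org">Alice</a></span>
<p>See <a href="http://elsewhere.example.com">this</a>.</p></div>
<div class="comment"><span class="comment-author">Anonymous</span></div>
<div class="comment"><span class="comment-author">
<a href="http://bob.example.net">Bob</a></span></div>
"""
print(commenter_urls(sample))
```

Links inside the comment text itself are skipped; only the signature link counts as the commenter's URL.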

Page 14: Detecting Communities in Science Blogs

Methods: Analysis

• Used social network analysis and graphing software

• Examined graph and calculated basic descriptive statistics

• Found centrality and prestige measures

–Degree: the links in and out

–Betweenness: the number of shortest paths that flow through that node

–Closeness: how short the paths are from that node to the other nodes
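
As a rough illustration of the three measures, here is a minimal sketch on a made-up five-node directed graph. The slides do not name the software or the exact variants used, so the definitions below are assumptions: betweenness is unnormalized (computed with Brandes' algorithm), and closeness is the number of reachable nodes divided by the total distance to them.

```python
from collections import defaultdict, deque


def build_adjacency(arcs):
    """Map every node to the set of nodes it links to."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    return adj


def degrees(adj):
    """Return (in-degree, out-degree) dictionaries."""
    out_deg = {v: len(ws) for v, ws in adj.items()}
    in_deg = dict.fromkeys(adj, 0)
    for ws in adj.values():
        for w in ws:
            in_deg[w] += 1
    return in_deg, out_deg


def closeness(adj):
    """Reachable nodes divided by summed distance (assumed variant)."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[s] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(adj):
    """Unnormalized shortest-path betweenness via Brandes' algorithm."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # farthest nodes first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Toy data, not from the study: a -> b -> c -> d, plus e -> c.
adj = build_adjacency([("a", "b"), ("b", "c"), ("c", "d"), ("e", "c")])
in_deg, out_deg = degrees(adj)
print(in_deg["c"], betweenness(adj)["c"], closeness(adj)["a"])
```

In the toy graph, node c has the highest in-degree and sits on the most shortest paths, which is the kind of node the slides call central or prestigious.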

Page 15: Detecting Communities in Science Blogs

Methods: Analysis

Located cohesive subgroups

• Link methods

– Components

– LS Sets

• Clustering methods

• Community detection techniques

– Newman-Girvan

– Spin Glass
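
Of the methods listed, components are the simplest to show in code. The sketch below finds weakly connected components (arc direction ignored) with a breadth-first search. Whether the study used weak or strong components is not stated, so that choice is an assumption, and the arcs are toy data. LS sets, Newman-Girvan, and spin glass are not covered here.

```python
from collections import deque


def weak_components(arcs):
    """Return components as sets of nodes, largest first."""
    neighbors = {}
    for u, v in arcs:
        # Ignore direction: each arc connects both endpoints.
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy data: one three-node group and one two-node group.
toy_arcs = [("a", "b"), ("b", "c"), ("d", "e")]
print([len(c) for c in weak_components(toy_arcs)])
```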

Page 16: Detecting Communities in Science Blogs

Results: Link Analysis (Blogroll)

• One large component

• There were 1091 nodes, 6621 arcs

• Diameter is 9

• In-degree ranges from 1 to 292, with a median of 3 and a mean of 6

– 10 of the top 20 blogs by in-degree are authored or co-authored by women

– 4 of the top 5 blogs by closeness are authored or co-authored by women
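
The reported mean can be checked from the node and arc counts alone: every arc adds exactly one to some node's in-degree, so the mean in-degree equals arcs divided by nodes. A quick check using only the numbers on this slide:

```python
# Counts reported on the slide for the blogroll network.
nodes = 1091
arcs = 6621

# Each arc contributes one unit of in-degree, so the mean is arcs / nodes.
mean_in_degree = arcs / nodes
print(round(mean_in_degree, 2))
```

This gives about 6.07, matching the reported mean of 6. A mean twice the median (3) is what one would expect when a few blogs attract very many links.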

Page 17: Detecting Communities in Science Blogs
Page 18: Detecting Communities in Science Blogs

Results: Commenter

• 5 components: the largest has 911 nodes, the others have 11 or fewer

• 938 nodes (starting with the 46 selected blogs), 1152 arcs

• The largest component has a diameter of 5
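
The diameter is the longest of all shortest paths in a component. A minimal sketch on toy data follows; it assumes arc direction is ignored and that the arcs passed in form a single connected component. The slides do not say which convention the analysis software applied.

```python
from collections import deque


def eccentricity(neighbors, source):
    """Distance from source to the farthest node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


def diameter(arcs):
    """Longest shortest path, ignoring direction (single component assumed)."""
    neighbors = {}
    for u, v in arcs:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    return max(eccentricity(neighbors, s) for s in neighbors)


# Toy data: a chain a-b-c-d with an extra node e attached to b.
toy_arcs = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "e")]
print(diameter(toy_arcs))
```

This runs one breadth-first search per node, which is fine for networks of roughly a thousand nodes like the ones reported here.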

Page 19: Detecting Communities in Science Blogs
Page 20: Detecting Communities in Science Blogs

Discussion: Links (Blogroll)

• Most of the blogs were connected in one dense component

– A result of the diffusion of blogs?

• There were a few very central blogs, and then many less central

– Typical skewed distribution

• The community of women scientists merits further study

Page 21: Detecting Communities in Science Blogs

Discussion: Commenters

• Analysis easily located a notorious commenter who leaves incendiary comments on physics and chemistry blogs

– High out-degree, no links in

• Traffic on the women scientist blogs is more uniform, with frequent comments that are widely distributed among the blogs

– Indicates a different use
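
The "high out-degree, no links in" pattern can be screened for directly from the arc list. The sketch below uses toy data and a hypothetical threshold (min_out); it assumes arcs run from commenter to blog. It only surfaces candidates: a commenter without a blog also has no in-links, so the results still need reading by hand.

```python
def one_way_commenters(arcs, min_out=3):
    """Return (node, out-degree) pairs with no in-links, busiest first.

    ASSUMPTION: min_out is an illustrative cutoff, not a value from the study.
    """
    out_deg, in_deg = {}, {}
    for source, target in set(arcs):  # count each distinct arc once
        out_deg[source] = out_deg.get(source, 0) + 1
        in_deg[target] = in_deg.get(target, 0) + 1
    hits = [
        (node, count)
        for node, count in out_deg.items()
        if count >= min_out and in_deg.get(node, 0) == 0
    ]
    return sorted(hits, key=lambda pair: -pair[1])


# Toy data: x comments on three blogs and receives nothing back.
toy_arcs = [
    ("x", "blog1"), ("x", "blog2"), ("x", "blog3"),
    ("blog1", "blog2"), ("blog2", "blog1"), ("blog3", "blog1"),
]
print(one_way_commenters(toy_arcs))
```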

Page 22: Detecting Communities in Science Blogs

Take Home Messages

• The science blogosphere is densely connected with many opportunities for influence and information diffusion

• Communities tend to form within disciplinary boundaries

• An exception is the community of women scientist bloggers who are from many different disciplines

Page 23: Detecting Communities in Science Blogs

Acknowledgements

• Thanks to Dr. Jen Golbeck for supervising this work as part of an independent study

• Thanks also to

– Dr. Alan Neustadtl for SNA advice

– Dr. Dagobert Soergel for research advice

Page 24: Detecting Communities in Science Blogs

Christina K. Pikas

Doctoral Student

University of Maryland

College of Information Studies

[email protected]

http://terpconnect.umd.edu/~cpikas/ScienceBlogging

