ICS 691: Trust and Social Capital on CouchSurfing and OkCupid
BJ Peter DeLaCruz and Michael Claveria
Abstract
Researchers in social computing traditionally evaluate online relationships under the same
assumptions that they use when evaluating offline relationships. Despite this fact, researchers also
regard online relationships as separate spheres from offline interaction. Any crossing between the two
occurs predominantly between friends, relatives, and acquaintances who are also members of those
online communities. However, some online social communities operate under the premise that people
will meet and form online relationships that translate into offline relationships through real-world
interactions. We examined two such online social communities, CouchSurfing and OkCupid, to
determine how online interactions affect potential offline interactions.
Introduction
In many online social communities, members remain anonymous and interact solely through the
mediums of online communication without the intention of face-to-face encounters. However, with the
advent of Web 2.0 technologies, online participation produced more dynamic relationships between
users in online environments. In some online communities, face-to-face interactions between users are
essential for the success of the community. One example is CouchSurfing [1], an online community
where users agree to host or visit other people who live in different areas. A dating website like
OkCupid [2] is another example on which people typically sign up with the intention of meeting other
members in person. We address the following questions in this paper: Do online social communities
that encourage face-to-face interaction between members operate under different assumptions than
those online communities that do not? What are some factors that influence people to meet others
offline and also create a sense of trust between them?
Trust and Verification Systems on Online Communities
From a research standpoint, trust has many definitions that vary by context, from business to
sociology. We examined trust under two of these definitions. Reed defines
trust as “the foundation of any relationship, and a relationship is exactly what you are looking to
establish with your users” [3]. This definition applies to our study because of the emphasis on
developing relationships with other users. This description of trust can also apply to two people in the
offline world or two users in an online community. A second definition of trust is one person’s “reliance
on the integrity” of another [4]. This definition relates to the communities we studied because
evaluating the reliability or integrity of user profiles is essential to forming relationships. For example, if
Bill read Joe’s profile on a social networking site like Facebook or MySpace and did not find Joe’s
personal information reliable, he would probably not trust Joe and thus would not “friend” him
on that website.
Trust exists in the offline world and can easily be established between two people with the help
of one’s observations of another’s actions, gestures, words, and personality. However, this concept
cannot easily be translated to the online community because of the anonymity of the Internet. For
instance, if Joe helped Bill on MySpace but never met him face-to-face before, are Joe’s actions enough
for Bill to trust Joe? More than likely, Bill would want to know more about the person from whom he is
getting help. Thus, it would be in Joe’s best interest to be as truthful as possible, or at least to be
careful about what he writes on his MySpace profile, because the information there determines how
likely Bill is to trust him.
Massa [5] discusses “trust statements,” which are opinions expressed by one user of another,
and gives examples of online communities that use them to help a user decide as to whether to trust
someone else online. Consider the scenario that Bill follows Joe’s reviews for products on websites like
Epinions and Amazon. If Bill found his reviews to be very valuable, he would more than likely consider
buying the next product that Joe reviews. In online communities such as CouchSurfing, this type of
system would especially be useful because “judgments entered about other users … are used to
personalize a specific user’s experience on the system” [5]. If Bill wanted to stay at Joe’s place but
had not yet contacted him on CouchSurfing, his decision to get in touch would be
influenced not only by what Joe wrote on his profile page but also by what other people wrote about him there.
Not all online communities would benefit from having this particular type of verification system;
in fact, it could be detrimental to both the user and the online community. OkCupid is one example in
which a review system may not work. One reason is that the reviewers would be other users
whom one had dated. Dating is considered an exclusive practice, and people do not want to display
their history of past relationships because they consider it private information. Past transgressions may
reflect poorly on a user’s character, and displaying them publicly can be detrimental to building
relationships with others. In general, most users do not want reviews from other users who went on
failed dates because of the potential for negative reviews and the ineffectiveness of positive reviews.
The paradox of a positive review is that other users will wonder why the reviewer is not dating that
person. The problem with a negative review is the strong bias of an ex-girlfriend or ex-boyfriend. For
OkCupid, implementing a verification system comes at the risk of privacy issues. Members tend to keep
a user’s identity confidential even when complaining about a failed date with that user. However, there
are examples of users who post negative remarks about others whom they met in real life.
A negative review of a user in the forum on OkCupid
OkCupid does not advocate the creation of such reviews, but their existence remains largely
unregulated by moderators. The difficulty in establishing a system of reviews on OkCupid is that there is
little incentive and value to writing a positive review while a negative review has serious consequences
for the parties involved.
Trust is the determining factor for evaluating whether or not users in social networking
communities should meet face-to-face after meeting each other online. Thus, these websites have
implemented mechanisms or ways that help a user build trust in the person with whom he or she is
communicating. On CouchSurfing, the References section is similar to the review system on e-marketing
websites. A user writes comments about his or her host (the person who let the user stay over at his or
her place) or guest (the person who stayed over at the user’s place). The user has the option of giving a
positive, neutral, or negative rating. Interestingly, we did not see any negative ratings for users on
CouchSurfing. One reason for the dearth of negative reviews is the lack of anonymity in the review
process. A review has the writer’s profile displayed next to it; thus, the writer of a negative review risks
repercussions from the reviewed user. This inevitably leads to biased reviews as users may feel inclined
to leave neutral ratings for their hosts or guests instead and also phrase their words in such a way so as
to not embarrass or humiliate them. A negative rating could forever tarnish a user’s reputation in this
online community. However, as Massa [5] pointed out, if these trust mechanisms are not enough for a user to
make a decision, the actual physical location (latitude and longitude) of the person can be verified by the
system. According to CouchSurfing,
Verification [helps] our community stay safe. By confirming your name and address, you show other CouchSurfers that you are who you say you are. This simple gesture strengthens the trust system that allows CS to function. Your verified profile will help other CouchSurfers feel comfortable reaching out to you, and without that comfort, our mission of bringing people together across barriers could never be realized. Getting verified shows the community your commitment to the success of the project.
On the other hand, OkCupid has no verification system at all, so a user has to rely on the
information on a person’s profile and decide on his or her own whether to trust that person.
In addition to the typical sections that one would find on a profile on an online dating website
(e.g., My Self-Summary, What I’m Doing with My Life, and I’m Really Good At), a user can upload
photographs of himself or herself, take tests, post to a forum, and answer questions that would help
increase the chances of him or her being matched with a potential date.
Using a person’s answers to questions, the algorithm on OkCupid calculates how well he or she
matches with every other user in this online community and displays a match percentage to reflect it.
The algorithm also determines whether two users should become friends or stay away from each other
by displaying friend and enemy percentages, respectively. Altogether, the three percentages can be
used to help determine the trustworthiness of an individual. For example, a user may trust another who
has very high match and friend percentages because they share similar tastes. On the other hand, if the
person’s tastes are very dissimilar to the user’s, as indicated by a very high enemy percentage,
then the user would probably have to read that person’s profile more closely and
work harder to build trust in him or her.
Another algorithm used on OkCupid estimates how frequently a user responds to
messages. Others can use this feature to gauge how trustworthy that person is through
status messages such as “Replies often” and “Replies very selectively.” These messages are indicative of
a user’s selectivity and popularity. However, there is one caveat about this feature. We noticed that
some users who had last logged in a couple of years earlier still
displayed “Replies often,” which is misleading. Instead of calculating the frequency
over the period from the user’s registration up to the current date, the algorithm may base the
calculation on the period from registration up to the user’s last login. Thus, unless one reads a user’s
profile (or sorts the list of potential matches by Last Login), one may type a long, detailed
message and send it to a user who will most likely never read it, which is
certainly frustrating for someone looking for a romantic relationship.
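The effect of the two possible time windows can be illustrated with a small sketch. OkCupid's actual calculation is not known to us; the function name `reply_rate`, the message format, and all dates below are hypothetical and serve only to show how the choice of cutoff changes the displayed rate.

```python
from datetime import date

def reply_rate(messages, cutoff):
    """Fraction of messages received on or before `cutoff` that were answered.

    `messages` is a list of (date_received, replied) pairs (hypothetical format).
    Returns None when no message falls inside the window.
    """
    in_window = [replied for received, replied in messages if received <= cutoff]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)

# Hypothetical inbox of a user who stopped logging in after April 2008.
messages = [
    (date(2008, 1, 10), True),
    (date(2008, 2, 3), True),
    (date(2008, 3, 15), True),
    (date(2008, 4, 1), False),
    (date(2009, 6, 20), False),
    (date(2009, 11, 5), False),
    (date(2010, 2, 14), False),
    (date(2010, 3, 1), False),
]
last_login = date(2008, 4, 30)   # assumed last login
today = date(2010, 4, 1)         # assumed current date

# Window ends at the last login: 3 of 4 messages answered.
rate_until_last_login = reply_rate(messages, last_login)
# Window ends today: the unanswered later messages count, 3 of 8 answered.
rate_until_today = reply_rate(messages, today)
```

Under these assumptions the same user looks like a frequent replier (0.75) when the window stops at the last login, but not when messages sent after that date are counted (0.375).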
Social Capital on Online Communities
According to Robert Putnam [6], social capital exists in two different forms: bridging and
bonding. Bonding capital occurs between people with similar traits, and these connections are more
emotional and more meaningful than those resulting from bridging capital. Bridging capital centers on
relationships with people from different backgrounds with more frequent but less meaningful
connections than those established by bonding capital. These two types of capital are interdependent
and the downfall of one brings about the downfall of the other. One difficulty in determining the
difference between the two types of capital is that the definitions are subjective and bound by the
method through which “closeness” is measured. In an effort to distinguish and measure the different
types of social capital, Williams [7] created a matrix of social capital measures. His two-by-two matrix
divides social capital into four types: offline bridging, offline bonding, online bridging, and online
bonding. The creation of such a matrix implies that there are clear divisions between these quadrants.
However, our study of CouchSurfing and OkCupid suggests that these categories are not distinct entities
and that strengthening the social capital in one area often affects other areas in a similar fashion.
Although Williams mentions that there is a gray area between bridging and bonding, he does
not allude to blurring between offline and online relationships. Thus, we argue that websites like
CouchSurfing and OkCupid transcend the distinction between offline and online relationships. Trust and
social capital both allow for a successful transition from an online relationship to an offline one and vice
versa. We will examine the mechanisms employed by the two aforementioned online communities
that, we believe, enable these transitions to take place.
CouchSurfing
CouchSurfing users create bridging relationships through offline interactions organized by online
communication via the website. The premise of the website is that the majority of relationship-building
occurs when a user sleeps over at another person’s house or hosts that person at the user’s house.
These relationships fall under the category of bridging because hosting is meant to be temporary and
may occur with many different users from different locations. An individual user’s success rate in
establishing relationships on CouchSurfing depends on information expressed on his or her profile. The
online profile is the means through which a user will evaluate his or her compatibility with another user.
Offline and online interactions on CouchSurfing reinforce each other
in a variety of ways. Offline interaction feeds online interaction through references provided
by other users. These references are qualitative and thus give users more information about the
evaluated individual than numerical measurements would. However, qualitative
evaluations make it difficult to compare two users directly.
The reverse also occurs: online interactions, such as posting events on forums,
encourage offline interactions. CouchSurfing has a forum in which users can post information about
community events that encourage members to get together for social gatherings in specific areas.
Since the primary purpose of this online community is to form relationships with people travelling to
other locations, these events bring together people who would not normally meet
in person through online communities that do not encourage face-to-face, offline interactions.
OkCupid
OkCupid differs from CouchSurfing in that one of the main supported features of the site is
searching for romantic relationships. Both sites support offline encounters through features such as
listing contact information of other users. Unlike CouchSurfing, however, where one user’s network of
friends is visible on his or her profile, OkCupid does not have built-in ways of displaying social
connections or relationships on a user’s profile, and this lack of linking between profiles does not allow
users to obtain bridging capital and form online relationships with other people (besides those with
whom they are communicating through OkCupid’s message system).
On the other hand, OkCupid encourages bonding capital through a system that measures
compatibility using a user’s responses to a databank of questions. Just by looking at a user’s profile, one
can determine the match, friend, and enemy percentages of that user. In addition, users can tell the
search algorithm to display profiles with the highest match percentage if they were looking for a date.
We believe that the algorithm supports bonding capital because it quickly matches users who have very
high compatibility according to their responses.
OkCupid also has the ability to compare two users based on information from questions they
answered, and the algorithm used here makes definitive judgments on users. For example, the compare
feature allows a user to compare any two users based on personality traits as determined by their
responses. The ability to filter through people makes it easier for users to evaluate potential matches
and theoretically increases the chances of finding a desirable partner.
Methodology
To measure trust and social capital on CouchSurfing, we examined one hundred random user
profiles. Our hypothesis was that the more pictures a user posts of himself or herself on his or her
profile, the more positive ratings and friends the user will have. In addition to a user’s pictures, we
recorded the number of friends that he or she has and the number of people who vouched for,
stayed with, and hosted that user, because factors
other than pictures could be responsible for building trust in a user.
After perusing profiles on OkCupid, we were unable to find any sort of verification
system that we could compare against content that the user created or posted (for example,
pictures) in order to measure trust. Also, we could not view a user’s list of friends or see the
profiles of those whom he or she contacted, so we were unable to measure trust this way. As we
mentioned before, on dating websites like OkCupid, individual contact is regarded as private
information between two people. Unlike CouchSurfing, there are no vouchers, no recommendations,
and no easy way to verify contact between users other than forum participation. Nonetheless, we
collected information on users including age, gender, sexual orientation, relationship status (for
example, single or married), number of tests taken, number of forum posts, number of pictures, number
of questions answered, and response frequency. We looked for general patterns within the variables
that suggested that they affected user interaction within the OkCupid community.
Results and Data Analysis
For the CouchSurfing dataset, we examined seven major variables: age, gender, number of
positive ratings, number of photos, and number of people who vouched for, stayed with, and hosted a
given user. We examined these variables so that we could create a model that best predicted the
number of people whom a user hosted and another model that best predicted the number of people
who hosted a user. We looked at individual x-y plots of each variable against the number of users
who hosted or were hosted, and t-tests revealed that there were relationships, although they were not
particularly strong. We assumed a linear model after testing
quadratic, exponential, and logarithmic alternatives.
[Figure: Example plots of relationships between variables. Panels: “Couch Surfing Photos vs. Recommendations” (No.Photos vs. No.Pos_Ratings); “Couch Surf. Photos vs. Num. vouch” (No.Photos vs. No.People_who_Vouch); “Couch Surf Vouchers vs. Num. people who stay” (No.People_who_Vouch vs. No.Stay_with_Person)]
As expected, there were correlations between variables that indicated that they affected trust.
For example, more people tended to stay with someone who has many vouchers than with one who has
only a few. Rather than examining each pair of variables, we created a model that predicted the number
of CouchSurfers who stayed with a user based on the significant variables.
We used a linear model for predicting the number of users who stayed with a person because
individual plots between this variable and the other variables appeared linear for each case.
Call:
lm(formula = No.Stay_with_Person ~ No.Pos_Ratings + No.Photos +
    No.People_who_Vouch + Gender + Age)

Residuals:
     Min       1Q   Median       3Q      Max
-22.3910  -3.9675  -0.8579   2.3569  50.3580

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)         -11.32414    5.39354  -2.100 0.038445 *
No.Pos_Ratings        0.06359    0.04531   1.403 0.163800
No.Photos             0.22121    0.05738   3.855 0.000212 ***
No.People_who_Vouch  -0.20647    0.25997  -0.794 0.429082
GenderM               2.25199    1.94229   1.159 0.249210
Age                   0.39478    0.19029   2.075 0.040751 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.339 on 94 degrees of freedom
Multiple R-squared: 0.3654, Adjusted R-squared: 0.3317
F-statistic: 10.83 on 5 and 94 DF, p-value: 3.056e-08
The number of photos and age were significant factors in the model, indicating that they
helped to predict how many other CouchSurfers would stay with that person.
Interestingly, other factors like gender, the number of vouchers, and the number of positive
ratings do not seem to be significant. However, with a multiple R-squared of 0.3654, the model
explains only about 37 percent of the variance in the data.
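To make the fitted model concrete, the sketch below plugs a profile into the coefficients reported in the summary above. The function name `predict_stays` and the example profile are hypothetical; the coefficients are copied from the output, and GenderM is treated as an indicator equal to 1 for male users and 0 otherwise, which is R's default coding for a two-level factor.

```python
# Coefficients copied from the full model summary above.
COEF = {
    "intercept": -11.32414,
    "pos_ratings": 0.06359,
    "photos": 0.22121,
    "vouches": -0.20647,
    "gender_m": 2.25199,
    "age": 0.39478,
}

def predict_stays(pos_ratings, photos, vouches, is_male, age):
    """Predicted number of people who stayed with a user (linear model)."""
    return (
        COEF["intercept"]
        + COEF["pos_ratings"] * pos_ratings
        + COEF["photos"] * photos
        + COEF["vouches"] * vouches
        + COEF["gender_m"] * (1 if is_male else 0)
        + COEF["age"] * age
    )

# Hypothetical profile: 20 positive ratings, 30 photos, 5 vouches, female, age 30.
base = predict_stays(20, 30, 5, False, 30)
# Same profile with ten more photos.
more_photos = predict_stays(20, 40, 5, False, 30)
photo_effect = more_photos - base
```

Holding everything else fixed, ten additional photos raise the predicted count by about 2.2 stays, which is the photo coefficient multiplied by ten. Given the residual standard error of 9.339, such point predictions are rough.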
Both age and the number of photos appear to be significant factors in the combined linear
model. Individual graphs reveal rather weak linear relationships. The relationship between the number of
photos and the number of people who stayed with a user fits a linear model decently and is only slightly
worse under a quadratic model. Age on its own, however, does not appear to be a significant factor.
Call:
lm(formula = No.Stay_with_Person ~ No.Photos + Age)

Residuals:
     Min       1Q   Median       3Q      Max
-23.4743  -4.3592  -0.7885   2.7253  51.3576

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.96464    2.83465   2.104   0.0379 *
No.Photos    0.21592    0.03124   6.911 5.09e-10 ***
Age         -0.16442    0.09146  -1.798   0.0753 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 9.446 on 97 degrees of freedom
Multiple R-squared: 0.3302, Adjusted R-squared: 0.3164
F-statistic: 23.91 on 2 and 97 DF, p-value: 3.612e-09
In a model with only age and the number of photos as predictor variables, the
latter remains the only variable significant at the .05 level in predicting the number of users who stayed with a person.
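As a consistency check, the adjusted R-squared and the F-statistic of the reduced model follow from the multiple R-squared, the number of observations n, and the number of predictors p. The sketch below uses the standard formulas; the function names are hypothetical, n = 100 is taken from the sample size stated in the Methodology section, and small differences in the last digit are expected because the reported R-squared is already rounded.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a linear model with intercept and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2, n, p):
    """Overall F-statistic on p and n - p - 1 degrees of freedom."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

# Reduced model: n = 100 profiles, p = 2 predictors (No.Photos and Age),
# multiple R-squared 0.3302 as reported in the summary above.
reduced_adj = adjusted_r2(0.3302, 100, 2)
reduced_f = f_statistic(0.3302, 100, 2)
residual_df = 100 - 2 - 1
```

Both values agree with the reported adjusted R-squared of 0.3164 and F-statistic of 23.91 on 2 and 97 degrees of freedom to within rounding.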
[Figure: “# who stay with person vs. # of photos” (No.Photos vs. No.Stay_with_Person) and “# who stay with person vs. age” (Age vs. No.Stay_with_Person)]
We ran a similar analysis to find the best predictor variables of the number of times that a person was
hosted.
Call:
lm(formula = No.Hosted_person ~ No.Pos_Ratings + No.Photos +
    No.People_who_Vouch + Gender + Age)

Residuals:
    Min      1Q  Median      3Q     Max
-7.2762 -2.8102 -0.5626  1.7734 17.2913

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)          6.336465   2.379382   2.663  0.00911 **
No.Pos_Ratings      -0.004237   0.019990  -0.212  0.83258
No.Photos            0.003826   0.025315   0.151  0.88018
No.People_who_Vouch  0.311009   0.114686   2.712  0.00796 **
GenderM             -1.618380   0.856849  -1.889  0.06201 .
Age                 -0.060857   0.083946  -0.725  0.47028
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.12 on 94 degrees of freedom
Multiple R-squared: 0.3354, Adjusted R-squared: 0.3001
F-statistic: 9.488 on 5 and 94 DF, p-value: 2.38e-07
According to this model, the most significant factor in predicting how often a person gets hosted by
other users is the number of people who vouched for that user. Gender is marginally significant
(p = 0.062), although not as strong as the number of vouchers. The other factors (age, number of
photos, and number of positive ratings) are not significant in the model.
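Each t value in the coefficient table is the estimate divided by its standard error, which makes it easy to see why the voucher count stands out in the hosting model. The sketch recomputes these ratios from the reported figures; the dictionary and variable names are hypothetical.

```python
# (estimate, standard error) pairs copied from the hosting model summary above.
COEFFICIENTS = {
    "No.Pos_Ratings": (-0.004237, 0.019990),
    "No.Photos": (0.003826, 0.025315),
    "No.People_who_Vouch": (0.311009, 0.114686),
    "GenderM": (-1.618380, 0.856849),
    "Age": (-0.060857, 0.083946),
}

# t value = estimate / standard error
t_values = {name: est / se for name, (est, se) in COEFFICIENTS.items()}

# Rank predictors by the magnitude of their t value, largest first.
ranked = sorted(t_values, key=lambda name: abs(t_values[name]), reverse=True)
```

The ranking puts No.People_who_Vouch first and GenderM second, matching the order of significance in the table.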
The boxplot of hosting counts by gender suggests that females are hosted more often than males on CouchSurfing.
The individual plot comparing the number of vouchers to the number of people who hosted a person
indicates a positive correlation between being vouched for and being hosted. T-tests on the individual
variables are also significant at the .05 level, indicating that there is a correlation between gender
and the number of hosts for a person, as well as between vouchers and the number of hosts.
For OkCupid, predictive models were not as helpful in displaying trust, but we found a few
interesting trends in the data. A quick look at forum posts indicates that most of the users whom we
found did not have any postings in the forum. On the other hand, there was a user who had more than
three hundred posts. The lack of posting in the forum for the vast majority of users indicates that
forums are not a primary tool for creating bridging relationships in OkCupid.
A majority of users (88 out of 100) filled out each of the sections on the profile page. We also
found that response frequency differed greatly by gender. None of the
men whom we sampled fit into the category of very selective when responding to other users’
messages. Most men (around ninety percent of them) either responded on a frequent basis or were
never messaged. On the other hand, more than sixty percent of women were either selective or very
selective in their responses.
[Figure: boxplot “Number hosted by gender” (F, M) and scatter plot “# of people who hosted person vs. # of vouchers” (No.People_who_Vouch vs. No.Hosted_person)]
Histogram of forum posts and a plot of response rate by gender for OkCupid
Conclusion
OkCupid and CouchSurfing both have hidden factors that affect how trust and
social capital are established. One significant variable in both online communities is gender. Our data
suggest that females are hosted on CouchSurfing more often than males. Females are also more selective in
responding to messages than males on OkCupid. It appears that females are more trusted or more
desired in the CouchSurfing community when evaluating whether to host a particular user. However,
there is evidence that suggests that the built-in verification system on CouchSurfing helps to foster trust.
Vouchers were the strongest factor in determining whether to host a person. When evaluating a host,
the most important factors according to our study were the number of photos and age. This
observation suggests that visual verification plays a large role in establishing trust and promoting social
capital. The positive coefficient for the age variable suggests that older age correlates with more people
being hosted by a person. Perhaps age is associated with more responsibility, or maybe older people
are simply more likely to host other users.
[Figure: “Histogram of the Number of Forum Posts” (NoForumPosts vs. Frequency) and “Response Frequency by Gender” (F, M; categories NC, OW, Often, Selectively, Very Selectively)]
Evaluating the transition between online and offline relationships on sites like CouchSurfing and
OkCupid is difficult because we only got a few glimpses of offline behavior. On CouchSurfing, the
vouchers and other forms of verification come from offline meetings and occurrences. For OkCupid,
there is no clear way to determine how a user’s offline transgressions affect that user’s online
relationships. However, there are still discernible patterns in some variables that suggest that
establishing the right credentials in the community is essential for establishing trust and social capital
when engaging in offline interaction with other users.
References
[1] http://www.couchsurfing.com
[2] http://www.okcupid.com
[3] Reed, Martin. “The importance of trust.” 14 March 2007. Community Spark.
http://www.communityspark.com/the-importance-of-trust/
[4] Dictionary.com, LLC. http://dictionary.reference.com/browse/trust
[5] Massa, Paolo (2006). A Survey of Trust Use and Modeling in Current Real Systems. Trust in E-services:
Technologies, Practices and Challenges. Idea Group.
[6] Putnam, R. D. (2000). Bowling Alone: The Collapse and Revival of American Community. New York:
Simon & Schuster.
[7] Williams, D. (2006). On and off the 'net: Scales for social capital in an online era. Journal of
Computer-Mediated Communication, 11(2), article 11.