Document Type


Peer Reviewed


Publication Date


NLM Title Abbreviation


Journal/Book/Conference Title

Journal of medical Internet research [electronic resource]

PubMed ID


PubMed Central ID


DOI of Published Version


Start Page


End Page



BACKGROUND: Online health communities (OHCs) have become a major source of social support for people with health problems. Members of OHCs interact online with similar peers to seek, receive, and provide different types of social support, such as informational support, emotional support, and companionship. As active participation in an OHC benefits both the OHC and its users, it is important to understand the factors related to users' participation and to predict user churn for user retention efforts.

OBJECTIVE: This study aimed to analyze OHC users' Web-based interactions, reveal which types of social support activities are related to users' participation, and predict whether and when a user will churn from the OHC.

METHODS: We collected a large-scale dataset from a popular OHC for cancer survivors. We used text mining techniques to determine which types of social support each post contained. We illustrated how we built text classifiers for 5 social support categories: seeking informational support (SIS), providing informational support (PIS), seeking emotional support (SES), providing emotional support (PES), and companionship (COM). We conducted survival analysis to identify the types of social support related to users' continued participation. Using supervised machine learning methods, we developed a predictive model for user churn.
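Survival analysis of this kind needs, for each user, a duration and an indicator of whether churn was observed. The abstract does not state how churn was defined, so the following is only a minimal sketch under assumed conventions: the function name `survival_record`, the 90-day inactivity threshold, and the use of first-to-last-post time as the duration are all hypothetical choices made for illustration, not the authors' definitions.

```python
from datetime import date

def survival_record(post_dates, observation_end, inactivity_days=90):
    """Return (duration_days, churned) for one user.

    ASSUMPTION (not from the source): a user counts as churned when the
    last post is more than `inactivity_days` before the end of the
    observation window; otherwise the user is treated as censored.
    """
    first, last = min(post_dates), max(post_dates)
    duration_days = (last - first).days            # active lifetime in days
    gap_days = (observation_end - last).days       # silence before data ends
    churned = gap_days > inactivity_days
    return duration_days, churned

# Hypothetical users, for illustration only.
window_end = date(2016, 1, 1)
early_leaver = survival_record([date(2015, 1, 1), date(2015, 3, 1)], window_end)
still_active = survival_record([date(2015, 11, 1), date(2015, 12, 20)], window_end)
print(early_leaver, still_active)
```

Records of this shape (a duration plus an event flag) are the usual input to a survival model, with per-user counts of SIS, PIS, SES, PES, and COM activity as candidate covariates.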

RESULTS: Users' PIS, SES, and COM activities had hazard ratios significantly lower than 1 (0.948, 0.972, and 0.919, respectively) and were indicative of continued participation in the OHC. The churn prediction model based on social support activities offers accurate predictions of whether and when a user will leave the OHC.
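To make the reported hazard ratios easier to read, the sketch below converts each one into a percent change in churn hazard. It assumes the ratios act multiplicatively per unit of the covariate, as in a proportional hazards model; the abstract does not state the model or what one unit of each activity is, so the function names and the `units` parameter are illustrative only.

```python
import math

# Hazard ratios reported in the RESULTS paragraph.
hazard_ratios = {"PIS": 0.948, "SES": 0.972, "COM": 0.919}

def percent_change_in_hazard(hr, units=1):
    """Percent change in hazard for `units` extra units of a covariate.

    ASSUMPTION: hazards scale multiplicatively (proportional hazards),
    so `units` extra units multiply the hazard by hr ** units.
    """
    return (hr ** units - 1.0) * 100.0

def log_hazard_coefficient(hr):
    """Coefficient on the log-hazard scale; negative when hr < 1."""
    return math.log(hr)

for name, hr in hazard_ratios.items():
    change = percent_change_in_hazard(hr)
    coef = log_hazard_coefficient(hr)
    print(f"{name}: HR={hr:.3f}, change per unit={change:.1f}%, coef={coef:.4f}")
```

Under these assumptions, a hazard ratio of 0.948 corresponds to a 5.2% lower churn hazard per unit of PIS activity, and 0.919 to an 8.1% lower hazard per unit of COM activity.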

CONCLUSIONS: Detecting different types of social support activities via text mining contributes to better understanding and prediction of users' participation in an OHC. The outcome of this study can help the management and design of a sustainable OHC via more proactive and effective user retention strategies.


OAfund, Blogging, Data Mining, Health Services, Humans, Internet, Neoplasms, Patient Participation, Peer Group, Self-Help Groups, Social Media, Social Support, Supervised Machine Learning, Survival Analysis, Survivors

Granting or Sponsoring Agency

National Natural Science Foundation of China (Award #: 71572013)

Grant Number


Journal Article Version

Version of Record

Published Article/Book Citation

J Med Internet Res 2017;19(4):e130


©Xi Wang, Kang Zhao, Nick Street. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.