Hashtags in online social networks have gained tremendous popularity during the past five years. The resulting large quantity of
data has provided a new lens into modern society. Previous research has mainly relied on data collected from Twitter to study either
a certain type of hashtag or a certain property of hashtags. In this
paper, we perform the first large-scale empirical analysis of hashtags shared on Instagram, the major platform for hashtag-sharing.
We study hashtags along three dimensions: the
temporal-spatial dimension, the semantic dimension, and the social
dimension. Extensive experiments performed on three large-scale
datasets with more than 7 million hashtags in total yield several observations. First, we show that the temporal
patterns of hashtags fall into four clusters, and that people tend to share fewer hashtags at certain places and
more hashtags at others. Second, we observe that a non-negligible
proportion of hashtags exhibit large semantic displacement. We
demonstrate that hashtags that are more uniformly shared among users,
as quantified by the proposed hashtag entropy, are less prone to
semantic displacement. Finally, we propose a bipartite graph
embedding model to summarize users’ hashtag profiles, and rely
on these profiles to perform friendship prediction. Evaluation results show that our approach achieves effective prediction, with an
AUC (area under the ROC curve) above 0.8, which demonstrates the
strong social signals carried by hashtags.
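
The abstract does not define hashtag entropy, so the following is only an illustrative sketch. It assumes the measure is the Shannon entropy of how a hashtag's uses are distributed across the users who share it; the function name and the input format (one user ID per use of the hashtag) are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def hashtag_entropy(user_ids):
    """Shannon entropy (in bits) of one hashtag's usage across users.

    Assumption: `user_ids` holds one entry per use of the hashtag,
    naming the user who shared it. This is a guess at the measure,
    not the paper's stated definition.
    """
    counts = Counter(user_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum -p * log2(p) over each user's share p of the hashtag's uses.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A hashtag used once each by four users is spread uniformly.
uniform = hashtag_entropy(["u1", "u2", "u3", "u4"])
# A hashtag used four times by a single user is fully concentrated.
concentrated = hashtag_entropy(["u1", "u1", "u1", "u1"])
```

Under this assumption, a higher value means the hashtag is shared more evenly among users, which is the property the abstract associates with lower semantic displacement.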
Preferred Citation
Yang Zhang. Language in Our Time: An Empirical Analysis of Hashtags. In: The Web Conference (WWW). 2019.
Primary Research Area
Trustworthy Information Processing
Name of Conference
The Web Conference (WWW)
Legacy Posted Date
2019-02-04
Open Access Type
Unknown
BibTeX
@inproceedings{cispa_all_2797,
  title = "Language in Our Time: An Empirical Analysis of Hashtags",
  author = "Zhang, Yang",
  booktitle = "{The Web Conference (WWW)}",
  year = "2019",
}