Investigation of the Diversity of News Reading Article and Browsing Trends Using Sentence-BERT
DOI:
https://doi.org/10.52731/liir.v005.302
Keywords:
Diversity, News, Recommender system, Sentence Embedding
Abstract
In modern life, platforms such as social networking services (SNS) play a crucial role by using recommender systems to surface useful information from vast datasets. However, the advancement of these systems has biased the information users are exposed to, causing societal problems such as conflicts in public opinion and defamation. Sentiment biases in the information viewed compound this problem, which has been described as “informational malnutrition”. This highlights the need for “informational health”, in which access to varied information lets users maintain the balance of information intake they seek. In this work, we explored how user viewing tendencies differ with the diversity of the articles viewed, using a dataset of news articles and user logs. We used Sentence-BERT, a natural language processing model known for effective sentence similarity analysis, to vectorize articles and score their similarity, thereby measuring the diversity of each user’s viewed articles. Our analysis, which accounts for sentiment bias in content, used multiple regression. The results suggest that users with diverse viewing habits tend to prefer articles with a negative bias. They also suggest that news in categories such as music, current affairs, and politics contributes little to the diversity of information viewed, whereas categories such as entertainment and lifestyle tend to contribute strongly.
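As an illustration of how a similarity-based diversity score of this kind might be computed, the following minimal sketch takes the embedding vectors of the articles a user viewed and returns the mean pairwise cosine distance. The abstract does not give the exact formula, so the choice of mean pairwise distance, the function names `cosine_similarity` and `user_diversity`, and the toy two-dimensional vectors are all assumptions made for illustration. In practice the vectors would come from a Sentence-BERT encoder rather than being written by hand.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat a zero vector as having no similarity to anything.
        return 0.0
    return dot / (norm_u * norm_v)


def user_diversity(embeddings):
    """Mean pairwise cosine distance (1 - similarity) over viewed articles.

    Hypothetical diversity score: 0.0 when all articles point in the same
    direction, larger when the viewed articles are less similar.
    """
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        # Fewer than two articles: diversity is undefined, reported as 0.0.
        return 0.0
    total = sum(1.0 - cosine_similarity(u, v) for u, v in pairs)
    return total / len(pairs)


# Toy embeddings standing in for Sentence-BERT article vectors.
focused_reader = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
mixed_reader = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

print(user_diversity(focused_reader))  # identical articles, score 0.0
print(user_diversity(mixed_reader))    # two of three pairs are orthogonal
```

A per-user score of this form could then serve as a variable in a multiple regression alongside sentiment and category features, which is the kind of analysis the abstract describes.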