Summarization of the Methodology of Applying N-gram to Obtain Factor Scores of Q&A Statements
Abstract
To address mismatches between questioners and respondents on Question and Answer (Q&A) sites, an impression evaluation experiment was conducted, yielding nine impression factors for Q&A statements. Factor scores were then estimated by multiple regression analysis using feature values of the statements. The obtained and estimated factor scores were subsequently used to find appropriate respondents who would be likely to answer a posted question. However, this methodology has so far depended substantially on syntactic information extracted through morphological analysis. It also has a significant drawback: it requires many variables and complex multiple regression equations to estimate factor scores. An alternative approach was therefore taken, applying N-grams instead of morphological analysis. So far, analyses using 2-gram through 5-gram have shown good estimation accuracy. To corroborate this tendency, this paper applies 6-gram to the feature values. The analysis shows that 6-gram is also applicable to the method. In terms of estimation accuracy, N-grams outperform morphological analysis, with 2-gram and 3-gram showing the best accuracy. These results suggest that N-grams should play a more important role than morphological analysis alone in estimating factor scores.
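As a rough illustration of the general idea described above, the following minimal Python sketch counts N-grams in a statement, turns the counts into feature values, and applies a multiple regression equation to obtain an estimated factor score. It is not the authors' implementation. Several points are assumptions made only for illustration: the N-grams are taken at the character level, the feature values are simple appearance counts, and the function names, the vocabulary, the intercept, and the coefficients are all hypothetical placeholders rather than values from the study.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> list[str]:
    """Return all overlapping character n-grams of text, in order."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_features(text: str, n: int, vocabulary: list[str]) -> list[int]:
    """Count how often each n-gram in vocabulary appears in text."""
    counts = Counter(char_ngrams(text, n))
    return [counts[gram] for gram in vocabulary]


def estimate_score(features: list[int], intercept: float,
                   coefficients: list[float]) -> float:
    """Evaluate a regression equation: intercept + sum(b_i * x_i)."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have equal length")
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Hypothetical inputs: a toy statement, a toy 2-gram vocabulary, and
# made-up regression coefficients (not taken from the paper).
statement = "abcabc"
vocabulary = ["ab", "bc", "ca"]
features = ngram_features(statement, 2, vocabulary)
score = estimate_score(features, 0.5, [1.0, -2.0, 0.25])
print(features, score)
```

In practice the coefficients would come from fitting a multiple regression on statements whose factor scores are already known, and the vocabulary would be chosen from N-grams observed in the collected statements. Changing the `n` argument switches between 2-gram through 6-gram feature values without altering the rest of the sketch.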