Program Comment Generation with Improved Distributed Representation by Seq2seq Model Using Parse Tree Information

  • Sakuei Ohnishi Okayama University of Science
  • Yukimoto Fumihiro Benesse InfoShell
  • Hiromitsu Shiina Okayama University of Science
Keywords: Programming Learning, Comment Generation, Seq2seq Model, Encoder-Decoder Translation Model, Distributed Representation, Parse Tree

Abstract

Comments in a program’s source code are important for understanding the program. For beginners learning a programming language in particular, the next step is to understand the logical flow and overall procedure of a program, and appropriate comments on the source code can plausibly support this. In this study, we generate comments for source code with a Seq2seq model whose input combines a distributed representation of line dependencies, constructed with Word2Vec, with parse tree information obtained from the source code. We generate comments not only for each line of source code but also for blocks, which are the logical units of processing.
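
As a rough illustration of how parse tree information can be supplied to a sequence model alongside the source tokens, the sketch below flattens a small hand-built tree into a bracketed token sequence. The abstract does not specify the encoding used in the study, so the `Node` class, the `linearize` function, the `<sep>` marker, and the node labels are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical parse-tree node: a syntactic label plus child nodes."""
    label: str
    children: List["Node"] = field(default_factory=list)


def linearize(node: Node) -> List[str]:
    """Flatten a tree in pre-order, bracketing inner nodes to keep its shape."""
    if not node.children:
        return [node.label]
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens


# Hand-built tree for the C statement `sum = sum + i;` (assumed labels).
tree = Node("Assignment", [
    Node("ID:sum"),
    Node("BinaryOp:+", [Node("ID:sum"), Node("ID:i")]),
])

# One possible encoder input: the line's tokens, a separator, then the tree.
source_tokens = ["sum", "=", "sum", "+", "i", ";"]
encoder_input = source_tokens + ["<sep>"] + linearize(tree)
print(" ".join(encoder_input))
```

The brackets let a decoder-side model distinguish sibling nodes from nested ones after flattening. In a real pipeline the tree would come from a C parser rather than being written by hand, and each token would be mapped to its learned vector before being fed to the encoder.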

Published
2022-03-11
Section
Technical Papers