

Automatic Summarization for Application Programming Interface


SUMMARY

Automated source code summarization generates natural language descriptions of code entities (e.g., classes and methods). In this article, we propose an automatic approach for summarizing Android API methods discussed on Stack Overflow. Our approach takes the name of an API method as input and generates a natural language summary based on Stack Overflow discussions of that method. We conducted a survey involving 16 Android developers to evaluate the quality of our generated summaries and compare them with the official Android documentation. Our results show that, while developers find the official documentation more useful in general, the generated summaries are competitive and can serve as a complementary source to guide developers in software development tasks.

Keywords: code summarization, unsupervised learning, unofficial documentation, survey, professional developers.

Using Stack Overflow as Additional Documentation

Developers are often unaware of the purpose or usage of a code entity and must examine a large volume of code and/or documentation to grasp it. An automatic approach that provides summaries of the purpose, implementation, and/or usage of the code entities involved in their tasks would therefore be valuable. Consider, for instance, a developer trying to fix a bug introduced by someone else’s work. To understand the bug and how to reproduce it, the developer must first read all related bug reports and review previous discussions. Similarly, a developer who wants to implement a function for the first time needs to read all the related documentation to become familiar with it and understand how to use it. Although developers use official documentation as their main source of information on code entities [1], researchers have shown that official documentation sometimes lacks completeness, insight, and conciseness [2][3]. As a result, developers may turn to other sources such as Stack Overflow, a question-and-answer website for programmers.

To fill this gap, we propose an automatic approach that summarizes APIs by leveraging unofficial documentation and unsupervised learning. In this study, we used Stack Overflow as our source of unofficial documentation. We focused on extractive summarization, which selects the most important sentences from the source documents, in our case Stack Overflow posts.
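The first step of such an extractive pipeline, gathering the sentences that discuss a given method, might look like the sketch below. The article does not describe how posts are matched to a method, so the whole-word match, the naive sentence splitter, the function name `sentences_mentioning`, and the sample posts are all assumptions made for illustration.

```python
import re


def sentences_mentioning(posts, method_name):
    """Return the sentences in `posts` that mention `method_name` as a whole word.

    Hypothetical helper: the matching rule is an assumption, not the
    published approach.
    """
    # \b keeps "startActivity" from matching inside "startActivityForResult".
    pattern = re.compile(r"\b" + re.escape(method_name) + r"\b")
    selected = []
    for post in posts:
        # Naive split: sentence-ending punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", post):
            sentence = sentence.strip()
            if sentence and pattern.search(sentence):
                selected.append(sentence)
    return selected


# Invented sample posts, for illustration only.
sample_posts = [
    "Use startActivity to launch a screen. It needs an Intent.",
    "startActivityForResult is different. I prefer Kotlin.",
]
candidates = sentences_mentioning(sample_posts, "startActivity")
print(candidates)
```

Only the sentences that name the method are kept as candidates; a ranking step then decides which of them belong in the summary.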

Generating and Evaluating Summaries

We divided our study into two major parts: generating summaries and evaluating them. For the first part, we collected Stack Overflow’s Android posts from January 2009 to April 2020, which gave us 3,084,143 unique posts. We then used TextRank, an unsupervised machine learning algorithm, to generate the summaries.

Figure 1. Overview of the TextRank algorithm.
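As a rough illustration of how TextRank can rank sentences for an extractive summary, here is a minimal, dependency-free sketch. It is not the implementation used in the study: the word-overlap similarity follows the original TextRank formulation, the damping factor of 0.85 is the conventional default, and the tokenizer, function names, and sample sentences are assumptions made for this example.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9_]+", sentence.lower())


def similarity(a, b):
    """Shared-word count normalized by the log of both sentence lengths."""
    if len(a) < 2 or len(b) < 2:
        return 0.0  # avoids a zero denominator, since log(1) == 0
    return len(set(a) & set(b)) / (math.log(len(a)) + math.log(len(b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over a similarity graph."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Edge weights between every pair of distinct sentences.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbor j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) + damping * rank)
        scores = updated
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Invented sample sentences, for illustration only.
sample = [
    "Call startActivity with an Intent to open a new screen.",
    "The Intent passed to startActivity names the target Activity class.",
    "You can attach extras to the Intent before you call startActivity.",
    "Thanks, that fixed my problem!",
]
print(summarize(sample, k=2))
```

The last sample sentence shares no words with the others, so it has no edges in the graph and keeps only the baseline score of 1 - damping. This is how off-topic remarks tend to fall out of an extractive summary.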

For the second part, we invited 28 Android developers to evaluate the quality of our generated summaries and then compare them with the official Android documentation. Sixteen of them agreed to participate. We assigned only three APIs to each participant to prevent confusion and fatigue.

A Useful Tool for Developers

We analyzed a total of 3,084,143 unique Stack Overflow posts and summarized the 15 most popular Android APIs. The most important findings of our survey are as follows:

  • All developers involved in this study (100%) agreed that the length of summaries was appropriate.
  • More than half of the participants (58%) believed that the automatically generated summaries were coherent.
  • Most of the participants (73%) found that our summaries included accurate information about Android methods.
  • 59% of participants believed that automatically generated summaries contained important and necessary information about Android methods.
  • When compared with the official Android documentation, the automatically generated summaries showed few differences: the proportions obtained were almost the same for both, apart from a slight difference regarding implementation versus usage.
  • Participants gave an average rating of 4.1 out of 5 when asked whether an integrated plugin showing our automatically generated summaries would be helpful.

Figure 2. Which one is better – generated summaries or official documentation?

Figure 3. Developers’ satisfaction with the quality of generated summaries.

Conclusion

We presented a novel approach to summarizing API methods based on unofficial documentation and unsupervised learning. We used Stack Overflow Android posts as our dataset and applied the TextRank algorithm as our main summarization technique. The generated summaries were evaluated by 16 professional developers. We found that our automatically generated summaries could be useful to developers in software development and that they are almost as useful as official documentation for understanding the usage and implementation of Android methods. Participants also agreed that the generated summaries can serve as a complementary source to official documentation.

Additional Information

For more information on this research, please read the following conference paper:

Naghshzan, A.; Guerrouj, L. and Baysal, O. 2021. “Leveraging Unsupervised Learning to Summarize APIs Discussed in Stack Overflow”. IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM). pp. 142-152.