Fig 5 - available via license: Creative Commons Attribution 4.0 International
Source publication
A source code summary of a subroutine is a brief description of that subroutine. Summaries underpin a majority of documentation consumed by programmers, such as the method summaries in JavaDocs. Source code summarization is the task of writing these summaries. At present, most state-of-the-art approaches for code summarization are neural network-ba...
Contexts in source publication
Context 1
... our results from RQ1 show that the models with the largest improvement when ensembled also show very little overlap in their inputs when compared with other models. Figure 5a shows histograms of encoder comparisons. In these figures the x-axis is the number of methods that overlap in the encoders' 100-most-similar lists, as explained in the previous section. ...
Context 2
... these figures the x-axis is the number of methods that overlap in the encoders' 100-most-similar lists, as explained in the previous section. For example, in Figure 5a the bar labeled '5' is the count of all methods in the testing set that had an overlap count of 5. An overlap count of 5 indicates that, for a given method, the two encoders agreed on 5 entries in their 100-most-similar lists. In this example, the number of methods in the testing set with an overlap count of 5 is between 100 and 1,000. ...
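The excerpt does not give the exact procedure used to compute these overlap counts, but a minimal sketch of one standard reading is below: take each encoder's per-method output vectors, rank the other methods by cosine similarity, keep each method's top-100 list, and count the intersection of the two encoders' lists per method. The function names and the use of cosine similarity are assumptions for illustration, not the paper's confirmed implementation.

```python
import numpy as np

def top_k_neighbors(embeddings: np.ndarray, k: int = 100) -> np.ndarray:
    """Indices of the k most similar methods for each method (cosine similarity)."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)  # exclude each method from its own list
    return np.argsort(-sims, axis=1)[:, :k]

def overlap_counts(enc_a: np.ndarray, enc_b: np.ndarray, k: int = 100) -> np.ndarray:
    """Per-method overlap between two encoders' top-k similar-method lists."""
    nn_a = top_k_neighbors(enc_a, k)
    nn_b = top_k_neighbors(enc_b, k)
    return np.array([len(set(a) & set(b)) for a, b in zip(nn_a, nn_b)])

# Synthetic stand-in for two encoders' outputs over a 500-method testing set.
rng = np.random.default_rng(0)
counts = overlap_counts(rng.normal(size=(500, 64)), rng.normal(size=(500, 64)))

# hist[5] is the number of methods with an overlap count of 5 -- the bar
# heights in the Figure 5 histograms.
hist = np.bincount(counts, minlength=101)
```

With independent random embeddings, as here, most overlap counts land near zero, which is the "orthogonal representations" pattern the paper describes for Figures 5a-5c.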
Context 3
... histogram shows the overlap distribution over the testing set. Figure 5a shows the overlap between the source code sequence encoder and the AST encoder of the AST-Flat model. Comparing these two encoders, we see very little overlap in method similarity. ...
Context 4
... in Figure 5b we see a similar situation: there is almost no overlap in the methods the encoders find similar. This figure compares the source code sequence encoder and the AST encoder of the AST-GNN model. ...
Context 5
... could be due to the GNN providing a more orthogonal representation of the AST than the AST-Flat model does. In Figure 5c we show the overlap between the AST encoders of the AST-Flat and AST-GNN models. In this histogram we see that while there is still very little overlap, many more methods have an overlap count of 1 or 2. This could mean that, although the two models represent the AST differently, they learn some features that allow them to identify a small subset of similar methods in the same way. ...
Context 6
... all encoder overlaps show orthogonal representations. In Figure 5d there is significantly more overlap between the encoders, with some methods sharing 70-80 of their top 100 similar methods across the two encoders. This figure shows the overlap between the source code sequence encoders of the Transformer and AST-Flat models. ...
Context 7
... found that many of the source code sequence encoders shared a high level of similarity. Similarly, in Figure 5e we compare the average file context encoder output of the Seq2Seq-FC and AST-Flat-FC models. In this comparison we see substantial overlap between the encoders; the only difference between these two models is that the AST-Flat-FC model takes the flattened AST as an additional input. ...