Repositorio Institucional de la Universidad de Oviedo

Data from "Mining Common Syntactic Patterns used by Java Programmers"

Author:
Losada de Castro, Álvaro; Facundo Colunga, Guillermo; García Rodríguez, Miguel; Ortín Soler, Francisco
Subject:
Syntactic patterns; Rule mining; Abstract Syntax Trees; Association rules; Java

Publication date:
2022-01-26
Abstract:

Open source code repositories provide massive amounts of data in the form of programs, which have been used to develop different tools. This kind of work belongs to the active Big Code and Mining Software Repositories research fields. Although different machine learning works already classify the syntactic constructs used by programmers, there are no reports about the most common syntactic patterns used by Java programmers. In this article, we describe a system we built to provide such a report. Our system retrieves the syntactic patterns used by Java programmers, distinguishing those utilized by experts from those utilized by beginners. We also present the anomalies found in the usage of different syntactic constructs. We modify the OpenJDK compiler to double the syntactic information included in its Abstract Syntax Tree (AST), define a mechanism to translate ASTs into n-dimensional vectors, combine the information of different syntax constructs to build heterogeneous patterns, and apply the Frequent Pattern Growth algorithm to mine the syntactic patterns as association rules. The mined patterns can express hierarchical subpatterns connected to one another, providing a high level of expressiveness.
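
As a rough illustration of what "syntactic patterns as association rules" means, the sketch below computes the support and confidence of a rule over a handful of feature sets, one set per method. It is not the authors' system and it does not implement Frequent Pattern Growth; it only evaluates a single candidate rule by direct counting. The class name RuleSketch and the feature labels (has_loop=true and so on) are hypothetical and do not come from the dataset.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RuleSketch {

    /** Fraction of transactions that contain every item in {@code items}. */
    public static double support(List<Set<String>> transactions, Set<String> items) {
        if (transactions.isEmpty()) {
            return 0.0;
        }
        long hits = transactions.stream()
                .filter(t -> t.containsAll(items))
                .count();
        return (double) hits / transactions.size();
    }

    /** confidence(A -> B) = support(A union B) / support(A); 0 if A never occurs. */
    public static double confidence(List<Set<String>> transactions,
                                    Set<String> antecedent,
                                    Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double supportA = support(transactions, antecedent);
        return supportA == 0.0 ? 0.0 : support(transactions, both) / supportA;
    }

    public static void main(String[] args) {
        // Hypothetical feature sets, one per method declaration.
        List<Set<String>> methods = List.of(
                Set.of("returns_void=false", "has_loop=true", "uses_lambda=false"),
                Set.of("returns_void=false", "has_loop=true", "uses_lambda=true"),
                Set.of("returns_void=true", "has_loop=false", "uses_lambda=false"),
                Set.of("returns_void=false", "has_loop=false", "uses_lambda=false"));

        Set<String> antecedent = Set.of("has_loop=true");
        Set<String> consequent = Set.of("returns_void=false");

        System.out.println("support(antecedent) = " + support(methods, antecedent));
        System.out.println("confidence(rule)    = "
                + confidence(methods, antecedent, consequent));
    }
}
```

A frequent-pattern miner such as FP-Growth avoids this kind of per-rule scan by building a compact prefix tree of the transactions first; the definitions of support and confidence stay the same.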

Description:

Data from the article "A. Losada, G. Facundo, M. Garcia, F. Ortin. Mining Common Syntactic Patterns used by Java Programmers. Latin America Transactions, Volume 20(5), pp. 753-762, 2022. https://doi.org/10.1109/TLA.2022.9693559"

URI:
https://hdl.handle.net/10651/70846
DOI:
10.17811/ruo_datasets.70846
Link to related resource:
http://hdl.handle.net/10651/64137
Sponsored by:

This work has been partially funded by the Spanish Department of Science, Innovation and Universities: project RTI2018-099235-B-I00. The authors have also received funds from the University of Oviedo, Spain through its support of official research groups (GR-2011-0040).

Collections
  • Datos de investigación [79]
Files in this item
  • Dataset (206.8Mb)
  • Readme.txt (2.955Kb)
The content of the Repository, unless otherwise specified, is protected with a Creative Commons license: Attribution-NonCommercial-NoDerivatives 4.0 International