TY - GEN
T1 - Using n-grams for the automated clustering of structural models
AU - Babur, Önder
AU - Cleophas, Loek
PY - 2017
Y1 - 2017
AB - Model comparison and clustering are important for dealing with many models in data analysis and exploration, e.g. in domain model recovery or model repository management. Particularly in structural models, information is captured not only in model elements (e.g. in names and types) but also in the structural context, i.e. the relation of one element to the others. Some approaches involve a large number of models while ignoring the structural context of model elements; others handle very few (typically two) models while applying sophisticated structural techniques. In this paper we address both aspects and extend our previous work on model clustering based on the vector space model with a technique for incorporating structural context in the form of n-grams. We compare the n-gram accuracy on two datasets of Ecore metamodels in AtlanMod Zoo: small random samples using up to trigrams and a larger one (∼100 models) up to bigrams.
KW - Hierarchical clustering
KW - Model comparison
KW - Model-driven engineering
KW - N-grams
KW - Vector space model
UR - http://www.scopus.com/inward/record.url?scp=85010689255&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-51963-0_40
DO - 10.1007/978-3-319-51963-0_40
M3 - Conference contribution
AN - SCOPUS:85010689255
SN - 9783319519623
T3 - Lecture Notes in Computer Science
SP - 510
EP - 524
BT - SOFSEM 2017: Theory and Practice of Computer Science - 43rd International Conference on Current Trends in Theory and Practice of Computer Science, Proceedings
PB - Springer
T2 - 43rd Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2017), January 16-20, 2017, Limerick, Ireland
Y2 - 16 January 2017 through 20 January 2017
ER -