Text mining and software engineering: an integrated source code and document analysis approach

Documents written in natural languages constitute a major part of the artefacts produced during the software engineering life cycle. Especially during software maintenance or reverse engineering, semantic information conveyed in these documents can provide important knowledge for the software engineer. A text mining system capable of populating a software ontology with information detected in documents is presented. A particular novelty is the integration of results from automated source code analysis into a natural language processing pipeline, which allows software artefacts represented in code and in natural language to be cross-linked on a semantic level.
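
The cross-linking idea can be illustrated with a minimal sketch. It is not the system described in the article: it assumes Python source as input, uses the standard-library `ast` module as a stand-in for the source code analysis, and represents the "ontology" as plain subject-predicate-object tuples. The function names `code_entities` and `link_mentions`, the predicate `mentions`, and the `sentenceN` identifiers are hypothetical and chosen only for illustration.

```python
import ast
import re


def code_entities(source: str) -> dict[str, str]:
    """Map each class or function name defined in the source to a kind label."""
    entities = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            entities[node.name] = "Class"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entities[node.name] = "Method"
    return entities


def link_mentions(text: str, entities: dict[str, str]) -> list[tuple[str, str, str]]:
    """Build type triples for code entities and 'mentions' triples for sentences.

    A sentence is linked to an entity when the entity name occurs in it as a
    whole token. This is simple name matching, not semantic analysis.
    """
    triples = [(name, "rdf:type", kind) for name, kind in sorted(entities.items())]
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for index, sentence in enumerate(sentences):
        tokens = set(re.findall(r"[A-Za-z_]\w*", sentence))
        for name in sorted(tokens & set(entities)):
            triples.append((f"sentence{index}", "mentions", name))
    return triples


if __name__ == "__main__":
    source = "class Parser:\n    def parse(self):\n        pass\n"
    document = "The Parser reads input. Call parse to start. Nothing here."
    for triple in link_mentions(document, code_entities(source)):
        print(triple)
```

Feeding the entities found in code into the document pass is what makes the two sides comparable: the same identifier ends up as a typed code entity and as the target of a document-level link.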

http://iet.metastore.ingenta.com/content/journals/10.1049/iet-sen_20070110