Doctoral Position

Modelling of terminology acquisition in physics teaching

Supervisors: Prof. Dr. Christian Wartena and Prof. Dr. Gunnar Friege

In the STEM fields, writing test protocols is one of the terminologically challenging tasks, and one in which students should progressively improve from the 5th to the 13th grade. Machine evaluation methods and models of learning progression can support teachers in correcting students' work and in providing individual support.

To learn a subject, it is important to acquire the related technical language. Technical languages differ from general language at several linguistic levels. At the word level, technical languages are characterized primarily by the extensive use of terminology (Härtig et al., 2015). The recognition of terminology in text corpora and the automatic creation of terminology collections are among the classic application fields of computational linguistics. An overview of the methods used is given, e.g., by Pazienza et al. (2005). Conde et al. (2016) present a system for terminology extraction and the construction of terminology systems specifically for educational purposes. Although the fundamental importance of a good knowledge of the technical language and its terminology is often emphasized (e.g. Diethelm and Goschler, 2014; Pineker-Fischer, 2017) and specific teaching methods for the acquisition of technical languages have been developed (see e.g. Poupova, 2018), so far there have been no documented attempts to automatically assess learners' terminology knowledge from written texts such as test protocols.

In a first step, German terminology for physics will be collected. Besides building on existing terminology collections, terminology extraction procedures will be applied to physics texts such as school books or lecture notes. In addition to words and phrases, typical uses of the terms should also be recorded. In a second phase, the students' texts are to be analyzed to determine how well the students have mastered physics terminology. The great challenge here is that the texts of an individual student are too few and too short for statistical analyses. Instead, a precise syntactic and semantic analysis of individual sentences should provide information about the correct use, or the absence, of a desired word. In addition to traditional structure-based methods, it is conceivable to use newer methods based on artificial neural networks, in particular (contextual) word embeddings, e.g. building on models for error correction in the context of foreign language learning (Devlin et al., 2018; Amjadian et al., 2016; Kochmar and Briscoe, 2014; Herbelot and Kochmar, 2016). Finally, models for the development of students' terminology skills, as well as concepts for applying the developed methods in physics lessons, should be developed. We aim to transfer the knowledge gained from physics to other STEM fields.
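As an illustration of the first step only, the following is a minimal sketch of a statistical term-candidate ranking, assuming a simple smoothed frequency ratio between a domain text and a general reference text. The function names, the scoring scheme and the toy sentences are assumptions made for this example; they are not the procedure that will be used in the project, which is not specified here.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens; the character class keeps German umlauts and ß.
    return re.findall(r"[a-zäöüß]+", text.lower())


def term_candidates(domain_text, reference_text, top_n=5):
    """Rank words by how much more frequent they are in the domain text
    than in the reference text (add-one smoothed relative frequencies)."""
    domain = Counter(tokenize(domain_text))
    reference = Counter(tokenize(reference_text))
    domain_total = sum(domain.values())
    reference_total = sum(reference.values())
    vocabulary_size = len(set(domain) | set(reference))

    scores = {}
    for word, count in domain.items():
        p_domain = (count + 1) / (domain_total + vocabulary_size)
        p_reference = (reference[word] + 1) / (reference_total + vocabulary_size)
        scores[word] = p_domain / p_reference

    # Highest ratio first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Hypothetical toy data: a physics-like text and a general-language text.
domain_text = "Die Spannung wird gemessen. Die Spannung steigt mit der Stromstärke."
reference_text = "Die Katze schläft. Der Hund spielt mit der Katze. Die Sonne scheint."

print(term_candidates(domain_text, reference_text, top_n=3))
```

A ranking of this kind only proposes single-word candidates. Multi-word terms, and the typical uses of terms mentioned above, would need linguistic processing beyond this sketch.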


Härtig, H., Bernholt, S., Prechtl, H., & Retelsdorf, J. (2015). Unterrichtssprache im Fachunterricht–Stand der Forschung und Forschungsperspektiven am Beispiel des Textverständnisses. Zeitschrift für Didaktik der Naturwissenschaften, 21(1), 55-67.

Pazienza, M. T., Pennacchiotti, M., & Zanzotto, F. M. (2005). Terminology extraction: an analysis of linguistic and statistical approaches. In Knowledge mining (pp. 255-279). Springer, Berlin, Heidelberg.

Conde, A., Larrañaga, M., Arruarte, A., Elorriaga, J. A., & Roth, D. (2016). LiTeWi: A combined term extraction and entity linking method for eliciting educational ontologies from textbooks. Journal of the Association for Information Science and Technology, 67, 380-399. doi:10.1002/asi.23398

Diethelm, I., & Goschler, J. (2014). On human language and terminology used for teaching and learning CS/informatics. In Proceedings of the 9th Workshop in Primary and Secondary Computing Education (pp. 122-123). ACM.

Pineker-Fischer, A. (2017). Sprach- und Fachlernen im naturwissenschaftlichen Unterricht: Umgang von Lehrpersonen in soziokulturell heterogenen Klassen mit Bildungssprache. Springer-Verlag.

Poupova, J. (2018). Biological Terminology: an Opportunity for Teaching in Tandem. In Conference proceedings: New Perspectives in Science Education (pp. 382-387). Libreria Universitaria, Padua.

Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

Amjadian, E., Inkpen, D., Paribakht, T. S., & Faez, F. (2016). Local-global vectors to improve unigram terminology extraction. In Proceedings of the 5th international workshop on computational terminology (pp. 2-11).

Kochmar, E., & Briscoe, T. (2014). Detecting learner errors in the choice of content words using compositional distributional semantics. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers (pp. 1740-1751).

Herbelot, A., & Kochmar, E. (2016). ‘Calling on the classical phone’: a distributional model of adjective-noun errors in learners’ English. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers (pp. 976-986).
