Settles, Burr, LaFlair, Geoffrey T., and Hagiwara, Masato
Transactions of the Association for Computational Linguistics, Vol 8, Pp 247-263 (2020)
Subjects
Computational linguistics. Natural language processing and P98-98.5
Abstract
We describe a method for rapidly creating language proficiency assessments, and provide experimental evidence that such tests can be valid, reliable, and secure. Our approach is the first to use machine learning and natural language processing to induce proficiency scales based on a given standard, and then use linguistic models to estimate item difficulty directly for computer-adaptive testing. This alleviates the need for expensive pilot testing with human subjects. We used these methods to develop an online proficiency exam called the Duolingo English Test, and demonstrate that its scores align significantly with other high-stakes English assessments. Furthermore, our approach produces test scores that are highly reliable, while generating item banks large enough to satisfy security requirements.
Health Information Systems : Concepts, Methodologies, Tools, and Applications. 2010, v. 2, p975-984.
Chapter 3.17: A Software Tool for Biomedical Information Extraction (And Beyond)
Abstract
ABNER (A Biomedical Named Entity Recognizer) is an open-source software tool for text mining in the molecular biology literature. [...]
The key idea behind active learning is that a machine learning algorithm can achieve greater accuracy with fewer labeled training instances if it is allowed to choose the training data from which it learns. An active learner may ask queries in the form of unlabeled instances to be labeled by an oracle (e.g., a human annotator). Active learning is well-motivated in many modern machine learning problems, where unlabeled data may be abundant but labels are difficult, time-consuming, or expensive to obtain. This report provides a general introduction to active learning and a survey of the literature. This includes a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date. An analysis of the empirical and theoretical evidence for active learning, a summary of several problem setting variants, and a discussion of related topics in machine learning research are also presented.
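The query loop described in this abstract can be illustrated with a small sketch. It is not code from the report: it assumes a toy one-dimensional threshold classifier, a pool of unlabeled points in [0, 1], and a simulated annotator. The names `oracle`, `estimate_boundary`, and `active_learn`, and the hidden threshold of 0.62, are hypothetical and chosen only for illustration. The learner repeatedly queries the unlabeled instance closest to its current decision boundary, which is one simple uncertainty-style query rule.

```python
import random


def oracle(x, threshold=0.62):
    """Hypothetical annotator: returns the true label of x (assumed threshold)."""
    return 1 if x >= threshold else 0


def estimate_boundary(labeled):
    """Midpoint between the largest known negative and the smallest known positive."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    lo = max(negatives) if negatives else 0.0
    hi = min(positives) if positives else 1.0
    return (lo + hi) / 2.0


def active_learn(pool, budget):
    """Pool-based active learning: the learner chooses which instances get labeled."""
    unlabeled = list(pool)
    labeled = []
    for _ in range(budget):
        if not unlabeled:
            break
        boundary = estimate_boundary(labeled)
        # Query the instance the current model is least certain about:
        # the unlabeled point closest to the estimated decision boundary.
        query = min(unlabeled, key=lambda x: abs(x - boundary))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return estimate_boundary(labeled), labeled


rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]
boundary, labeled = active_learn(pool, budget=15)
print(f"labels requested: {len(labeled)}, estimated boundary: {boundary:.3f}")
```

In this toy setting the learner behaves much like a binary search, so a handful of labels narrows the boundary far more quickly than labeling randomly drawn points would. Real query strategies operate on model confidence rather than a known geometry.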