A Question Answering Software for Assessing AI Policies of OECD Countries

Category

Conference Article

Published

24 July 2024

Abstract

Recent advances in Natural Language Processing (NLP), and in Artificial Intelligence (AI) more generally, have made it possible to build systems that were not feasible in the past. Given the vast volume of text data now available, a large number of heterogeneous language models have already been trained on those data to perform different tasks. The complexity of those tasks, in terms of both the computational resources they demand and the efficiency of the corresponding approach, increases exponentially, driven mostly by the volume and complexity of the available data. A typical example of such a task is Question Answering (QA), where the data selected as the knowledge base for providing answers directly affect the efficiency, accuracy, and computational cost of the whole approach. As a result, when developing a QA tool, it is crucial to consider a number of factors throughout the QA pipeline to ensure that the approach provides the right answers as efficiently as possible. To this end, this paper proposes an efficient QA pipeline for assessing large documents that contain the AI policies of OECD countries. The pipeline consists of several steps, ranging from the retrieval and preprocessing of the data to the selection of the appropriate language model.