Few-shot natural language understanding (NLU) has attracted much recent attention. However, prior methods have been evaluated under a diverse set of protocols, which hinders fair comparison and makes it hard to measure the field's progress. A unified evaluation protocol, together with a general-purpose toolkit, is therefore needed. FewNLU addresses this issue in the following ways.
- FewNLU introduces an evaluation framework for few-shot NLU, and uses comprehensive experiments to justify its choices of data-split construction and hyperparameter search space formulation. The framework can be viewed as a correction, improvement, and unification of previous evaluation protocols.
- Under this evaluation framework, we re-evaluate a number of recently proposed state-of-the-art methods. Through these experiments, we benchmark the performance of each prior method individually, as well as the best performance achieved by combining them.
- Throughout our exploration, we arrive at several key findings summarized in the FewNLU paper.
- We open-source the FewNLU toolkit to facilitate future research based on our evaluation framework.
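To make the data-split construction concrete, below is a minimal sketch of one plausible strategy along the lines described above: the small labeled set is randomly re-partitioned into several train/dev splits, and each hyperparameter configuration is scored by its dev performance averaged over the splits. All function names, the number of splits `k`, and the `train_and_eval` callback are illustrative assumptions, not the toolkit's actual API.

```python
import random
from statistics import mean

def multi_splits(labeled_data, k=4, seed=0):
    """Illustrative: produce k random 50/50 train/dev partitions
    of the small labeled set (names and k are assumptions)."""
    rng = random.Random(seed)
    splits = []
    for _ in range(k):
        data = labeled_data[:]
        rng.shuffle(data)
        mid = len(data) // 2
        splits.append((data[:mid], data[mid:]))
    return splits

def select_config(labeled_data, configs, train_and_eval, k=4):
    """Pick the configuration whose dev score, averaged over the
    k splits, is highest. `train_and_eval(cfg, train, dev)` is a
    hypothetical callback returning a scalar dev metric."""
    splits = multi_splits(labeled_data, k)
    scores = {
        name: mean(train_and_eval(cfg, tr, dev) for tr, dev in splits)
        for name, cfg in configs.items()
    }
    return max(scores, key=scores.get)
```

Averaging over multiple random splits reduces the variance that a single tiny dev set would introduce into hyperparameter selection, which is one motivation for fixing the split-construction procedure inside the evaluation protocol itself.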