QRA produces a single score estimating the degree of reproducibility of a given system and evaluation measure, on the basis of the scores from, and differences between, different reproductions. This paper describes and tests a method for carrying out quantified reproducibility assessment (QRA) that is based on concepts and definitions from metrology. Quantified Reproducibility Assessment of NLP Results In comparison to other widely used strategies for selecting important tokens, such as saliency and attention, our proposed method has a significantly lower false positive rate in generating rationales. We also validate the quality of the selected tokens in our method using human annotations in the ERASER benchmark. Our experiments on several diverse classification tasks show speedups up to 22x during inference time without much sacrifice in performance. To determine the importance of each token representation, we train a Contribution Predictor for each layer using a gradient-based saliency method. Our method dynamically eliminates less contributing tokens through layers, resulting in shorter lengths and consequently lower computational cost. ![]() In this work, we propose a novel approach for reducing the computational cost of BERT with minimal loss in downstream performance. But, this usually comes at the cost of high latency and computation, hindering their usage in resource-limited settings. Pre-trained language models have shown stellar performance in various downstream tasks.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |