Task Location

Contrary to the interpretation suggested by the title of the original GPT-3 paper by Brown et al. [3], that language models are "few-shot learners", we argue that GPT-3 often does not actually learn the Task at runtime from few-shot examples.

Rather than Instruction, the method’s primary function is task location in the model’s existing space of learned tasks [⇒ Model Free Method Zoo · Understanding Machines · Table Lookup · Interrupt Vector Table · Dispatch Table · Table Lookup and Dispatch]. This is evidenced by the effectiveness of alternative prompts which, with no examples, can elicit comparable or superior performance to the few-shot format. >> task location table lookup dispatch table
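To make the table-lookup analogy concrete, here is a minimal Python sketch (my own illustration, not anything from the paper): the prompt acts like a key that selects one of a fixed set of already-learned behaviors, rather than a lesson that teaches a new one. All task names and stub behaviors are invented.

```python
# Illustrative analogy only: a prompt as a key into a dispatch table of
# behaviors the model already knows, not a lesson that teaches a new one.
# All task names and stub behaviors below are invented for this sketch.
LEARNED_TASKS = {
    "translate french to english": lambda x: f"<English rendering of {x!r}>",
    "summarize": lambda x: f"<summary of {x!r}>",
    "continue the story": lambda x: f"<continuation of {x!r}>",
}

def locate_task(prompt: str):
    """Return the already-learned behavior that the prompt most plainly signals."""
    for key, behavior in LEARNED_TASKS.items():
        if key in prompt.lower():
            return behavior
    return None  # the prompt failed to locate any known task

prompt = "Translate French to English: chat =>"
behavior = locate_task(prompt)
print(behavior("chat") if behavior else "task not located")
```

On this reading, a few-shot prompt and a well-written 0-shot prompt are just two different keys pointing at the same entry in the table.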

This motivates new approaches which explicitly pursue the goal of task location. We propose exploring more general methods of Prompt Programming, and specifically techniques for communicating task intention and structure to a self-supervised model in the modality it was trained on: Natural Language. With a few caveats, we want to find prompts which we would expect a human to complete in a way that accomplishes the desired Task.
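As a concrete illustration of what "a prompt a human would complete in the desired way" might look like, here is a small sketch contrasting the few-shot format with a 0-shot, description-style prompt. The wording is hypothetical, and `complete` stands in for whatever text-completion API is available.

```python
# Hypothetical prompts; neither is quoted from the paper.

few_shot_prompt = (
    "Translate French to English.\n"
    "fromage => cheese\n"
    "maison => house\n"
    "chat =>"
)

zero_shot_prompt = (
    "A patient French tutor translates the word below into English for a student.\n"
    "French: chat\n"
    "English:"
)

def complete(prompt: str) -> str:
    """Stand-in for any text-completion API; returns the model's continuation."""
    raise NotImplementedError("plug in your language model here")

# The claim under investigation: a prompt like `zero_shot_prompt`, with no
# examples at all, can match or beat `few_shot_prompt`, because both mainly
# signal *which* already-learned task the model should perform.
```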

In this work, we investigate the few-shot paradigm and find that its performance can be matched or exceeded by simple 0-shot prompts. We explore the nature of successful 0-shot prompts and propose general methods of prompt programming through the lens of natural language Semiotics. We demonstrate novel prompts which force a language model to break a problem into components before producing a verdict, and we introduce the concept of metaprompt programming, an approach which offloads the job of writing a task-specific prompt to the language model itself. Finally, we discuss how these ideas can be incorporated into existing and future benchmarks to allow us to better probe the capabilities of large language models.
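To fix ideas on the last two points, here is a rough sketch of (a) a prompt that pushes the model to decompose a problem before answering and (b) a metaprompt that asks the model to write the task-specific prompt itself. The wording is my own illustration, not the paper's.

```python
# Illustrative templates only; the exact wording is not taken from the paper.

# (a) Decomposition: steer the model into working through components
#     before it commits to a verdict.
serial_reasoning_prompt = (
    "{question}\n"
    "Before giving a verdict, let's work through each part of the problem.\n"
    "First,"
)

# (b) Metaprompt: instead of hand-writing a task-specific prompt, ask the
#     model to write one, then use its output as the prompt for the real task.
metaprompt = (
    "I want a language model to perform the following task:\n"
    "{task_description}\n"
    "Write a prompt that would lead a careful writer to complete this task correctly:\n"
)

print(serial_reasoning_prompt.format(question="Is 127 a prime number?"))
print(metaprompt.format(task_description="Translate informal English into formal English."))
```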

~

REYNOLDS, Laria and MCDONELL, Kyle, 2021. Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm. In: Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems (CHI EA ’21). New York, NY, USA: Association for Computing Machinery, pp. 1–7. [Accessed 29 January 2023]. ISBN 978-1-4503-8095-9. DOI 10.1145/3411763.3451760.

[3] Tom B Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot learners. arXiv preprint arXiv:2005.14165 (2020).