What is prompt engineering in AI research and why is it important?

Last Updated on August 3, 2022

Prompt engineering is the process of discovering prompts (inputs) that yield useful or desired results. An everyday example of this is the way we search Google. Practically any piece of information we desire can be found on Google, but getting access to it can sometimes be difficult. We might need to use a set of keywords which don't make sense to a human, but which can guide the machine towards the answer we want.

For instance, we might add the year onto the end of a question to get more up-to-date results, or append the word "reddit" in order to get concise answers from forum users rather than long-winded content from bloggers.

For machine learning, this relates to how we phrase our inputs. DALL-E 2, for instance, yields higher-quality images when the keyword "artstation" or "featured on artstation" is appended to the end of the prompt. This is because the model associates ArtStation, a portfolio website for digital artists, with high-quality pieces of work.
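This keyword-appending trick can be sketched as a simple string operation. The helper below is illustrative only, not part of any real API:

```python
# Hedged sketch: append quality modifiers such as "featured on artstation"
# to an image-generation prompt. The function name is a hypothetical example.

def with_style_keywords(prompt: str, keywords: list[str]) -> str:
    """Join a base prompt with comma-separated style keywords."""
    return ", ".join([prompt] + keywords)

print(with_style_keywords("a castle on a hill", ["featured on artstation"]))
# -> a castle on a hill, featured on artstation
```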

A recent discovery (arXiv:2205.11916) found that similar increases in quality can be extracted from large language models (LLMs) by adding the answer prompt "Let's think step by step" after a question. The research claims an increase in "the accuracy on MultiArith from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with an off-the-shelf 175B parameter model".

See the example I made below using GPT-3.

GPT-3 prompt engineering example using “Let’s think step by step”

In the above example, GPT-3 got the answer and its reasoning right in 9 of the 10 trials I ran, using a temperature of 0.7.
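As described in the paper, the trigger phrase is placed at the start of the answer, immediately after the question, so the model begins by generating its reasoning. A minimal sketch of how such a prompt might be assembled (the function name is illustrative, not from any specific library):

```python
# Zero-shot chain-of-thought prompting per arXiv:2205.11916: the trigger
# phrase is appended where the model's answer would begin, so the model
# continues with step-by-step reasoning before stating its final answer.

COT_TRIGGER = "Let's think step by step."

def build_zero_shot_cot_prompt(question: str) -> str:
    """Format a question so the completion starts with the reasoning trigger."""
    return f"Q: {question}\nA: {COT_TRIGGER}"

prompt = build_zero_shot_cot_prompt(
    "A juggler can juggle 16 balls. Half of the balls are golf balls, "
    "and half of the golf balls are blue. How many blue golf balls are there?"
)
print(prompt)
```

The resulting string would then be sent to the model as the prompt, with the model's completion carrying on from the trigger phrase.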

The effectiveness of this single prompt across very diverse reasoning tasks hints at untapped and understudied fundamental zero-shot capabilities of LLMs, suggesting that high-level, multi-task cognitive capabilities may be extracted through simple prompting.


A similar thing has been found with DALL-E 2, where increasing the number of "very"s before the word "beautiful" in a prompt improves the image. For instance, the prompt "A very beautiful painting of a mountain next to a waterfall" gives worse results than "A very very very beautiful painting of a mountain next to a waterfall", and so on.
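Generating these increasingly emphatic variants is straightforward to sketch. The helper name below is hypothetical, purely to illustrate the pattern:

```python
# Illustrative sketch: build DALL-E 2 prompt variants with an increasing
# number of "very"s before "beautiful", as described above.

def emphasised_prompt(template: str, n_verys: int) -> str:
    """Fill a prompt template with 'beautiful' preceded by n_verys 'very's."""
    emphasis = " ".join(["very"] * n_verys)
    adjective = f"{emphasis} beautiful" if n_verys else "beautiful"
    return template.format(adjective=adjective)

template = "A {adjective} painting of a mountain next to a waterfall"
for n in range(1, 4):
    print(emphasised_prompt(template, n))
```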

Check out this Twitter thread for some visual examples:

