Last Updated on August 3, 2022
Prompt engineering is the process of discovering prompts/inputs which yield useful or desired results. An everyday example of this is the way we use Google. Practically any piece of information we desire can be found on Google, but getting access to it can sometimes be difficult. We might need to use a set of keywords which don't read naturally to a human, but which guide the machine towards the answer we want.
For instance, we might add the year onto the end of a question to get more up-to-date results, or append the word "reddit" in order to get concise answers from forum users rather than long-winded content from bloggers.
For machine learning, this relates to how we phrase our inputs. DALL-E 2, for instance, yields higher-quality images when the keyword "artstation", or "featured on artstation", is appended to the end of the prompt. This is because the model associates ArtStation, a portfolio website for digital artists, with high-quality pieces of work.
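As a minimal sketch of this kind of prompt augmentation (the helper function is my own illustration, not part of any image-generation library):

```python
def add_style_suffix(prompt: str, suffix: str = "featured on artstation") -> str:
    """Append a quality-boosting keyword suffix to an image-generation prompt.

    The augmented string would then be submitted to a text-to-image model
    such as DALL-E 2.
    """
    return f"{prompt}, {suffix}"

print(add_style_suffix("a castle at sunset"))
# a castle at sunset, featured on artstation
```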
A recent paper (arXiv:2205.11916) found that similar increases in quality can be extracted from large language models (LLMs) by adding the answer prompt "Let's think step by step" to a question. The research claims an increase in "the accuracy on MultiArith from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with an off-the-shelf 175B parameter model".
See the example I made below using GPT-3.
In the above example, GPT-3 got the answer and its reasoning right in 9 of the 10 runs I tested, using a temperature of 0.7.
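The trick is purely a matter of how the prompt string is assembled before it is sent to the model. A minimal sketch (the function name is my own; the Q/A template and trigger phrase follow arXiv:2205.11916):

```python
def zero_shot_cot_prompt(question: str) -> str:
    """Build a zero-shot chain-of-thought prompt.

    The "Let's think step by step." trigger is appended at the start of the
    answer, nudging the model to produce intermediate reasoning. The resulting
    string would be sent to an LLM such as GPT-3 (e.g. at temperature 0.7,
    as in the example above).
    """
    return f"Q: {question}\nA: Let's think step by step."

print(zero_shot_cot_prompt(
    "A juggler has 16 balls. Half are golf balls, and half of the golf "
    "balls are blue. How many blue golf balls are there?"
))
```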
The success of this single prompt across very diverse reasoning tasks hints at untapped and understudied fundamental zero-shot capabilities of LLMs, suggesting that high-level, multi-task cognitive capabilities may be extracted through simple prompting.
A similar thing has been found with DALL-E 2, where increasing the number of "very"s before the word "beautiful" in a prompt improves the image. For instance, the prompt "A very beautiful painting of a mountain next to a waterfall" gives worse results than "A very very very beautiful painting of a mountain next to a waterfall", and so on.
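This "[very]^n" pattern can be sketched as a simple prompt builder (the function name and template are my own illustration):

```python
def very_beautiful_prompt(subject: str, n: int) -> str:
    """Build a DALL-E 2 style prompt with n repetitions of 'very'.

    With n=0 this yields the plain prompt; larger n adds intensifiers,
    which reportedly acts as a crude 'quality knob' on the output image.
    """
    intensifier = "very " * n
    return f"A {intensifier}beautiful painting of {subject}."

for n in range(3):
    print(very_beautiful_prompt("a mountain next to a waterfall", n))
# A beautiful painting of a mountain next to a waterfall.
# A very beautiful painting of a mountain next to a waterfall.
# A very very beautiful painting of a mountain next to a waterfall.
```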
Check out this Twitter thread for some visual examples:
Language-conditional models can act a bit like decision transformers, in that you can prompt them with a desired level of "reward".
E.g., want prettier #dalle creations? "Just ask" by adding "[very]^n beautiful":
n=0: "A beautiful painting of a mountain next to a waterfall." pic.twitter.com/vu0NceTxAv
— Phillip Isola (@phillip_isola) June 2, 2022