LLM as a text completion engine
Why an LLM is not a database and not an oracle but a probabilistic next-token predictor, and why prompt wording dramatically changes the output.
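To make "probabilistic next-token predictor" concrete, here is a minimal toy sketch, not how any real model is implemented. The candidate tokens and their scores in `TOY_LOGITS` are invented for illustration, as are the helper names `softmax_with_temperature` and `sample_token`. The sketch only shows the general idea: the model assigns scores to possible next tokens, the scores become probabilities, and one token is drawn at random, so repeated runs can differ.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [score / temperature for score in logits.values()]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return {token: e / total for token, e in zip(logits, exps)}

def sample_token(logits, temperature, rng):
    """Draw one token at random, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Hypothetical scores for the token that follows "The review is"
# (made-up numbers, not taken from any real model).
TOY_LOGITS = {"positive": 2.0, "negative": 1.2, "mixed": 0.5, "about": 0.1}

rng = random.Random(0)
samples = [sample_token(TOY_LOGITS, 0.7, rng) for _ in range(5)]
print(samples)

# Compare how concentrated the distribution is at different temperatures.
for temperature in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(TOY_LOGITS, temperature)
    print(temperature, {tok: round(p, 2) for tok, p in probs.items()})
```

In this picture, changing the prompt wording corresponds to changing the scores themselves, which is why a small rewording can shift the output as much as changing the temperature.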
Take one classification task (e.g., review sentiment). Write 3 prompt versions: (1) a one-line question, (2) the same question with a role and an explicit label set, (3) version 2 plus a required output format. Run each version 5 times at temperature=0.7 on the same input and measure which version is most stable, for example by counting how many of the 5 runs return the same answer. Write a hypothesis explaining why.
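One possible way to score stability in this exercise is the share of runs that agree with the most common answer. The sketch below assumes that metric; `agreement_rate`, `compare_prompts`, the three example prompts, and the `complete` callable are all hypothetical names introduced here. `complete` stands in for whatever client you use to call a model at temperature=0.7, and the `always_positive` stub exists only so the script runs without an API.

```python
from collections import Counter
from typing import Callable, Dict, List

def agreement_rate(outputs: List[str]) -> float:
    """Share of runs that match the most common output (1.0 = fully stable)."""
    normalized = [o.strip().lower() for o in outputs]
    most_common_count = Counter(normalized).most_common(1)[0][1]
    return most_common_count / len(normalized)

def compare_prompts(
    prompts: Dict[str, str],
    review: str,
    complete: Callable[[str], str],
    runs: int = 5,
) -> Dict[str, float]:
    """Run every prompt version `runs` times on the same input and score each one."""
    scores = {}
    for name, template in prompts.items():
        prompt = template.format(review=review)
        outputs = [complete(prompt) for _ in range(runs)]
        scores[name] = agreement_rate(outputs)
    return scores

# Three illustrative prompt versions (assumed wording, adapt to your task).
PROMPTS = {
    "v1_one_line": "Is this review positive or negative? {review}",
    "v2_role_labels": (
        "You are a sentiment classifier. "
        "Labels: positive, negative, mixed.\nReview: {review}"
    ),
    "v3_with_format": (
        "You are a sentiment classifier. "
        "Labels: positive, negative, mixed.\nReview: {review}\n"
        "Answer with exactly one label in lowercase and nothing else."
    ),
}

def always_positive(prompt: str) -> str:
    """Placeholder for a real model call; replace with your own client."""
    return "positive"

if __name__ == "__main__":
    scores = compare_prompts(PROMPTS, "Great battery, awful screen.", always_positive)
    for name, score in scores.items():
        print(name, score)
```

Exact-match agreement is a deliberately strict choice: a version that answers "Positive." in one run and "positive" in another counts as less stable, which is usually what matters if code downstream parses the answer.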
Copy and adapt to your context. Text in angle brackets should be replaced.
Help me reframe a prompt through the "text completion engine" mental model.
Current prompt: <paste>
Goal: <what I want>
Observed issue: <unstable / hallucinates / wrong format>
Explain what "region of text" my prompt currently defines and how to narrow it with a role, an explicit set of allowed answers, and a format. Give the rewrite.