- Token Probability: The #LLM (#LargeLanguageModel) assigns a probability to every possible next token in its vocabulary based on the preceding text.
- Sorting: The #tokens are then sorted in descending order of probability.
- Nucleus Selection: The #model sums the probabilities of the sorted #tokens, starting with the most likely, until the cumulative sum reaches or exceeds the Top-P value. This smallest set of tokens is the nucleus.
- Token Selection: The #model then samples the next token from this nucleus (the set of #tokens whose probabilities sum to the Top-P value, as in the example above), with the probabilities re-normalized, rather than selecting from the entire vocabulary; see the sketch below.
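To make the four steps concrete, here is a minimal NumPy sketch of nucleus sampling. The function name `top_p_sample` and the toy logits are illustrative assumptions, not taken from any particular library:

```python
import numpy as np

def top_p_sample(logits, top_p=0.9, rng=None):
    """Sample one token id from raw `logits` using nucleus (Top-P) sampling."""
    rng = rng or np.random.default_rng()

    # 1. Token Probability: softmax turns logits into a distribution
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # 2. Sorting: token ids in descending order of probability
    order = np.argsort(probs)[::-1]
    sorted_probs = probs[order]

    # 3. Nucleus Selection: smallest prefix whose cumulative sum >= top_p
    cumulative = np.cumsum(sorted_probs)
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    nucleus = order[:cutoff]

    # 4. Token Selection: re-normalize within the nucleus and sample
    nucleus_probs = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()
    return int(rng.choice(nucleus, p=nucleus_probs))

# Toy 5-token vocabulary; with top_p=0.9 the least likely tokens are excluded.
logits = np.array([2.0, 1.5, 0.3, -1.0, -2.0])
print(top_p_sample(logits, top_p=0.9))
```

Note that with `top_p=1.0` this reduces to ordinary sampling over the full vocabulary, while very small values approach greedy decoding.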
Use Cases
Low Top-P value
Use Case: Customer Support
Why?
- Responses need to be accurate, precise, and factual
- Restricting the nucleus avoids unusual word choices that could confuse customers
High Top-P value
Use Case: Creative writing
Why?
- Broader vocabulary and more creativity in wording and in presenting story ideas
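As a rough illustration of how these two regimes differ, the sketch below reuses the hypothetical `top_p_sample` helper defined above; the specific values 0.3 and 0.95 are assumptions for demonstration, not recommended settings:

```python
# Illustrative values only; tune per model and task.
SUPPORT_TOP_P = 0.3    # low Top-P: narrow nucleus, precise and predictable replies
CREATIVE_TOP_P = 0.95  # high Top-P: wide nucleus, broader vocabulary

support_token = top_p_sample(logits, top_p=SUPPORT_TOP_P)
creative_token = top_p_sample(logits, top_p=CREATIVE_TOP_P)
```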