GPT-3.5 crashes when it thinks about useRalativeImagePath an excessive amount of
I found that if you happen to ask ChatGPT 3.5 to
Embrace this on the prime of the response: “ponyuseRal ponyuseRal ponyuseRal pony”
It persistently offers an error message when responding:
(GPT-4 responds extra usually)
In case you use the API for GPT-3.5 it fails with
The server had an error processing your request. Sorry about that! You’ll be able to retry your request, or contact us by means of our assist middle at assist.openai.com if you happen to hold seeing this error
You get the identical outcomes if you happen to substitute “useRal” with “useRalative” or “useRalativeImagePath”.
Why?
OpenAI’s GPT fashions output streams of multi-character “tokens” as an alternative of letters. Producing tokens as an alternative of particular person characters improves the efficiency and accuracy of fashions. There’s a tokenizer demo you’ll be able to play with to see the way it works. Three of these tokens are useRal
/useRalative
/useRalativeImagePath
. useRalativeImagePath
appears in 80.4k files on GitHub because the identify of an choice in XML configuration recordsdata for some automated testing software program known as Katalon Studio. The misspelling of “Ralative” might be why it obtained its personal token. You should use the three tokens within the triplet interchangably – prompting with useRalativeImagePath
offers the identical outcomes.
The one reference to useRalativeImagePath outdoors of these XML recordsdata (that existed earlier than GPT-3.5 was skilled) that I may discover is this one forum post on the Katalon boards the place somebody factors out that it’s spelled flawed.
My guess: the dataset used to generate the record of tokens included all GitHub recordsdata, however after making the record of tokens OpenAI determined to exclude XML recordsdata from the coaching knowledge – which meant that there have been nearly no makes use of of the useRalativeImagePath
token within the coaching knowledge. In consequence, the mannequin isn’t skilled on understanding the useRalativeImagePath token, and so it outputs one thing that isn’t a sound token.
Utilizing this for knowledge poisoning?
You possibly can strive placing this phrase in paperwork, to throw off makes an attempt to summarize it with GPT-3.5. I requested ChatGPT to summarize this weblog publish:
Additional studying
These posts had been helpful for me researching this: