It’s about the counting subreddit. It was used on the token generation database, but then removed on the training. This user posted so much on that subreddit that a token with its username was created, but then it had nothing associated with it in the training and the model dosen’t know how to act when the token is present.
ChatGPT was already trained on Reddit data. Check this video to see how one reddit username caused bugs on it: https://youtu.be/WO2X3oZEJOA?si=maWhUpJRf0ZSF_1T
I’m not gonna watch, but I assume little Bobby Tables strikes again.
It’s about the counting subreddit. It was used on the token generation database, but then removed on the training. This user posted so much on that subreddit that a token with its username was created, but then it had nothing associated with it in the training and the model dosen’t know how to act when the token is present.
Here is an alternative Piped link(s):
https://piped.video/WO2X3oZEJOA?si=maWhUpJRf0ZSF_1T
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I’m open-source; check me out at GitHub.