This is an alpha demo of the 175B parameter model.
WARNING: This model will generate MANY offensive things. Due to this being an alpha demo, NO safety measures are in place.
OPT-175B is a large language model. It predicts the next word based on all of the previous words. It's the same idea as the predictive auto-complete on your phone or email, but supercharged.
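As a minimal sketch of that next-word idea, here is what "predict the next word given all the previous words" looks like using the Hugging Face transformers library with the small facebook/opt-125m checkpoint (our choice for illustration only; this demo serves the 175B model, which cannot be loaded this way on ordinary hardware):

```python
# pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small OPT checkpoint, used purely for illustration (not the 175B demo model).
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# The last position holds the model's distribution over the *next* word,
# conditioned on all previous words.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, i in zip(top.values.tolist(), top.indices.tolist()):
    print(f"{tokenizer.decode([i])!r}: {p:.3f}")
```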
The training data is a combination of CCNews, CommonCrawl, DM Mathematics, the Enron corpus, Project Gutenberg, HackerNews, OpenSubtitles, OpenWebText2, USPTO, Wikipedia, BookCorpus, Stories, and Pushshift.io Reddit.
The newest data in the model extends roughly through September 2021, but most of the data is older. The model knows about COVID-19 and knows Joe Biden is president, but it isn't aware of Facebook's rebrand to Meta.

Tips for better generation:

- You can encourage the model to focus on newer information by putting "2021" somewhere in your prompt.
- The model doesn't work well with declarative instructions or point-blank questions. See the Examples to understand how best to use the model, and the prompt sketch after this list.
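As a rough, hypothetical illustration of that second tip (these prompts are ours, not taken from the demo's Examples), phrasing a request as a completion or a few-shot pattern tends to work better than asking point blank:

```python
# A point-blank question tends to work poorly with a raw language model:
weak_prompt = "What is the capital of France?"

# Leading the model into a completion it can continue tends to work better:
better_prompt = "The capital of France is"

# Few-shot examples set up a pattern the model can extend:
few_shot_prompt = (
    "Q: What is the capital of Spain?\n"
    "A: Madrid\n"
    "Q: What is the capital of France?\n"
    "A:"
)
```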
This is a well-known problem with language models, and there is active research at Meta AI to address it. Due to the alpha nature of this demo, we have not been able to incorporate that work yet.
Yes, generation can definitely be repetitive. We're sampling from the model, so you can try generating again and see if that helps. We will address repetition in future iterations.
During this alpha stage, we are not collecting or storing any data from this demo. We encourage you to create your own document with feedback or analysis of the model, and to provide a link in the comments of the Workplace post.
The model has a context length of 2048 tokens, but it is limited to 512 in this demo for compute reasons.
For this demo, we use nucleus sampling (p = 0.9) with a softmax temperature of T = 0.7.
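As a rough sketch of what those settings mean (not the demo's actual implementation): temperature rescales the logits, then nucleus sampling keeps only the smallest set of most-likely tokens whose probabilities sum to at least p, renormalizes, and samples from that set. In a real decoder this step runs once per generated token, within the demo's 512-token budget:

```python
import numpy as np

def nucleus_sample(logits, p=0.9, temperature=0.7, rng=None):
    """Sample one token id from `logits` via nucleus (top-p) sampling.

    Minimal NumPy sketch: temperature is applied first, then the smallest
    set of tokens whose cumulative probability reaches `p` is kept and
    renormalized before sampling.
    """
    rng = rng or np.random.default_rng()
    # Temperature-scaled softmax (shift by max for numerical stability).
    scaled = logits / temperature
    scaled -= scaled.max()
    probs = np.exp(scaled)
    probs /= probs.sum()
    # Sort tokens by probability, descending.
    order = np.argsort(probs)[::-1]
    sorted_probs = probs[order]
    # Keep the smallest prefix whose cumulative mass reaches p.
    cutoff = np.searchsorted(np.cumsum(sorted_probs), p) + 1
    top = order[:cutoff]
    top_probs = probs[top] / probs[top].sum()  # renormalize the nucleus
    return int(rng.choice(top, p=top_probs))
```

Because this samples rather than always taking the single most likely token, regenerating from the same prompt can yield different outputs, which is why retrying (as noted above for repetition) can help.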