Under the hood
Preparing LLaMA 2 for launch required a lot of tweaking to make the model safer and less likely to spew toxic falsehoods than its predecessor, Al-Dahle says.
Meta has plenty of past mistakes to learn from. Its language model for science, Galactica, was taken down after just three days, and its previous research-only model, LLaMA, was leaked online, prompting criticism from politicians who questioned whether Meta had adequately considered the risks associated with AI language models, such as disinformation and stalking.
To reduce the risk of repeating those mistakes, Meta applied a combination of machine-learning techniques aimed at making the model more helpful and safe.
Meta’s approach to training LLaMA 2 involved more steps than is usual for generative AI models, says Sasha Luccioni, a researcher at AI startup Hugging Face.
The model was trained on 40% more data than its predecessor. Al-Dahle says there were two sources of training data: data scraped from the web, and a dataset fine-tuned on the basis of feedback from human annotators to make the model behave in more desirable ways. The company says it didn’t use Meta user data in LLaMA 2 and excluded data from sites it knew contained a lot of personal information.
Despite this, LLaMA 2 still produces offensive, harmful, and otherwise problematic language, just as competing models do. Meta says it didn’t remove toxic data from the dataset because leaving it in could help LLaMA 2 better detect hate speech, and removing it could risk inadvertently filtering out certain demographic groups.
Still, Meta’s commitment to openness is exciting, Luccioni says, because it allows researchers like her to properly study the biases, ethics, and efficiency of AI models.
The fact that LLaMA 2 is an open-source model will also allow external researchers and developers to probe it for safety flaws, which will make it safer than proprietary models, Al-Dahle says.
Liang agrees. “I’m really excited to try things out and I think it’s going to be good for the community,” he says.