Analysis shows that Meta's Llama 2 LLM is still prone to hallucinations and other serious security vulnerabilities

Unless you're directly involved in developing or training large language models, you won't even be aware of their potential security vulnerabilities. Whether it's providing misinformation or leaking personal data, these weaknesses pose a risk to LLM providers and users.

In a recent third-party evaluation conducted by AI security company DeepKeep, Meta's Llama LLM did not perform well. The researchers tested the model in 13 risk assessment categories, but it only passed 4 categories. The severity of its manifestations is particularly evident in the hallucinations, timely injections, and PII/data leakage categories, where it exhibits significant weaknesses.

When it comes to LLMs, hallucinations are when models take inaccurate or fabricated information as fact, and sometimes even insist that it is true when confronted with that information. In DeepKeep's tests, Llama 2 7B had an "extremely high" hallucination score, with a hallucination rate of 48%. In other words, your odds of getting an accurate answer are equivalent to flipping a coin.

"The results showed that the model had a distinct hallucinatory tendency, with about a 50 percent chance of providing a correct answer or making up an answer," DeepKeep said. "In general, the more common the misunderstanding, the higher the chance that the model will respond to the wrong message. "

Hallucinations are a well-known old problem for Llama. Stanford University removed Llama-based chatbot "Alpaca" from the internet last year because it was prone to hallucinations. As a result, it's as bad as ever in this regard, which also reflects Meta's efforts to address this issue.

Llama's vulnerabilities in just-in-time injection and PII/data leakage are also of particular concern.

Prompt injection involves manipulating the LLM to override its internal programs in order to execute the attacker's instructions. In tests, 80% of the time, prompt injection successfully manipulated Llama's output, which is a worrying statistic considering that bad actors could use it to direct users to malicious websites.

"For prompts that contain a hint injection context, the model is manipulated 80% of the time, meaning it follows the prompt injection instructions and ignores the system instructions," DeepKeep said. [Prompt injection] can take many forms, from personally identifiable information (PII) exfiltration to triggering a denial-of-service and facilitating phishing attacks. "

Llama also has a tendency to data breaches. It mostly avoids revealing personally identifiable information such as phone numbers, email addresses, or street addresses. However, it appears overzealous when editing information, often mistakenly deleting unnecessary, benign items. It is highly restrictive to inquiries about race, gender, sexual orientation, and other categories, even in appropriate circumstances.

In other PII areas such as health and financial information, Llama leaks data almost "randomly". The model often acknowledges that the information may be confidential, but then exposes it anyway. This type of security issue is another headache when it comes to reliability.

"The performance of LlamaV2 7B is closely related to randomness, with data breaches and unnecessary data deletion occurring in about half of the cases," the study revealed. Sometimes, the model claims that some information is private and cannot be made public, but it recklessly references context. This suggests that while the model may recognize the concept of privacy, it does not consistently apply this understanding to effectively redact sensitive information. "

On the bright side, DeepKeep says that Llama's responses to inquiries are mostly well-founded, that is, when it's not hallucinating, its answers are reasonable and accurate. It also effectively handles toxic, harmful, and semantic jailbreaks. Its answers, however, tend to oscillate between being too detailed and too vague.

While Llama is well immune to prompts that use linguistic ambiguity to make LLMs violate their filters or programs (semantic jailbreaking), the model is still vulnerable to other types of adversarial jailbreaking. As mentioned earlier, it is very vulnerable to both direct and indirect prompt injection, which is a standard way to override the model's hard-coded features (jailbreak).

Meta isn't the only LLM provider with similar security risks. Last June, Google warned its employees not to hand over confidential information to Bard, possibly because of the possibility of a leak. Unfortunately, companies that adopt these models are eager to be number one, so many weaknesses may not be repaired for a long time.

At least once, an automated menu bot gets a customer order wrong 70% of the time. Instead of solving problems or pulling down products, it masks the failure rate by outsourcing human labor to help correct orders. The company, called Presto Automation, lightly described the bot's poor performance, revealing that 95% of the orders it took when it first launched needed help. No matter how you look at it, this is a dishonorable gesture.

Analysis shows that Meta's Llama 2 LLM is still prone to hallucinations and other serious security vulnerabilities

Read on

Pingyin News | Our county held the fourth meeting of the Urban Gas Professional Committee and the promotion meeting of the special management of urban gas (pipeline facilities) safety

God 17 Returns Safely (Outer 6 Songs)

"Red Door" Open Visiting Day Mengwa immersed herself in learning fire safety knowledge

Don't let safety become a "stumbling block" during the May Day holiday!

Chery's "largest SUV in history" will be launched in 8 days!Extended range + pure electric, safer than the M7?

Come to Huizhou Shangchun Mountain | Give the guests a pleasant, warm, civilized, safe and orderly "May Day". Spring outing to Jianghuai

The Chinese astronauts got out of the capsule safely, but the return capsule was half black and half white, was there a safety accident?

Before lung cancer comes, you will first experience these symptoms! Advice: Once it appears, don't easily ignore "I thought it was just a cold, but I didn't expect it to be lung cancer." Mr. Li recalled when he was diagnosed

Xing Kai investigated the work of safety production, environmental protection and project procedures during the "May Day" period

Linyi Traffic: "May Day" holiday safety production and crackdown on illegal operations special action

European exchange students came to China, and after a while, they sighed: China is really the safest country in the world

The high-tech industrial park along the beach held a staff safety emergency skills competition

Taiwan does not ban motorcycles, it is also very safe, motorcycles are a necessary means of transportation for ordinary people, why?

May Day Fire Safety Tips|How to do "Three Clearances and Three Customs" for "May Day" travel?

Highway Maintenance | Strengthen deployment, grasp implementation, and do a solid job in road safety production

May Day holiday, fire safety tips!