
Meta Launches Llama 3 Model, But How Powerful Is It?

By Dhruv Kudalkar
April 19, 2024

Meta recently introduced an upgraded version of Meta AI, powered by the company’s new advanced LLM, Meta Llama 3. It has been integrated into WhatsApp, Instagram, Facebook, and Messenger, and Meta claims that it outperforms competing open-source models on key benchmarks.

Highlights:

  • Meta introduced an upgraded version of Meta AI powered by Llama 3, the company’s latest open-source advanced LLM.
  • It has been integrated into the search features of Meta’s social media apps WhatsApp, Instagram, Facebook, and Messenger.
  • It beats open-source competitors such as Mistral 7B and Google’s Gemma 7B on key benchmarks.

What is Llama 3?

Llama 3 is the successor to Meta’s previous language models, Llama and Llama 2, which were released in 2023. The newly released advanced LLM Llama 3 has improved performance, enhanced capabilities, and a more extensive knowledge base.

Meta says that Llama 3 is among the best open models currently available, offering users a powerful tool for generating text, creating AI images, and assisting with various tasks.

Introducing Meta Llama 3: the most capable openly available LLM to date.

Today we’re releasing 8B & 70B models that deliver on new capabilities such as improved reasoning and set a new state-of-the-art for models of their sizes.

Today's release includes the first two Llama 3… pic.twitter.com/Q80lVTeS7m

— AI at Meta (@AIatMeta) April 18, 2024

Meta’s CEO said the following about the launch:

“We’re upgrading Meta AI with our new state-of-the-art Llama 3 AI model, which we’re open sourcing. With this new model, we believe Meta AI is now the most intelligent AI assistant that you can freely use.”

Mark Zuckerberg

The Llama 3 family includes pretrained and instruction-fine-tuned language models in 8-billion and 70-billion-parameter sizes that can support a wide range of use cases.

Meta described the new models, Llama 3 8B and Llama 3 70B, as a significant advancement compared to the previous generation of Llama 2 models in terms of performance.

Model Architecture

For Llama 3, Meta opted for a relatively standard decoder-only transformer architecture with several key improvements made compared to Llama 2. It utilizes a tokenizer with a vocabulary of 128,000 tokens that encodes language much more efficiently, leading to substantially improved model performance.
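
As a rough illustration, the new tokenizer can be inspected through Hugging Face’s transformers library, one of the channels Meta distributes the model through. This is a minimal sketch, assuming transformers is installed and access to the gated meta-llama/Meta-Llama-3-8B repository has been granted:

```python
# Minimal sketch: inspect the Llama 3 tokenizer via Hugging Face transformers.
# Assumes `transformers` is installed and access to the gated repo has been granted.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# The vocabulary size should reflect the ~128K-token vocabulary described above.
print(len(tokenizer))

# A larger vocabulary generally yields fewer tokens per sentence than
# Llama 2's 32K-token vocabulary did.
print(tokenizer.tokenize("Meta Llama 3 encodes language much more efficiently."))
```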

To enhance inference efficiency, grouped query attention (GQA) was adopted across both the 8-billion and 70-billion-parameter models.
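
In GQA, several query heads share a single key/value head, which shrinks the key/value cache and speeds up inference. The snippet below is a generic illustration in PyTorch, not Meta’s implementation; the 32-query-head / 8-KV-head split is an assumption based on commonly reported Llama-style configurations.

```python
# Illustrative sketch of grouped query attention (GQA): groups of query heads
# share one key/value head, reducing KV-cache size at inference time.
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    # q: (batch, n_q_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim)
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_q_heads // n_kv_heads          # query heads per KV head
    k = k.repeat_interleave(group, dim=1)    # broadcast each KV head to its group
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

q = torch.randn(1, 32, 16, 128)   # 32 query heads (illustrative)
k = torch.randn(1, 8, 16, 128)    # only 8 KV heads, as in Llama-style GQA
v = torch.randn(1, 8, 16, 128)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 32, 16, 128])
```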

The models were trained on sequences of 8,192 tokens, using a mask to ensure self-attention does not cross document boundaries. This prevents the model from attending to tokens across different documents during training.
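
To make the masking idea concrete, here is a toy sketch (generic PyTorch, not Meta’s training code) of a mask that combines the usual causal constraint with a same-document constraint, so documents packed into one 8,192-token sequence never attend to each other:

```python
# Illustrative sketch: an attention mask that keeps self-attention within
# document boundaries when several documents are packed into one sequence.
import torch

def document_mask(doc_ids):
    # True where query token i may attend to key token j (same document, j <= i)
    same_doc = doc_ids[:, None] == doc_ids[None, :]
    causal = torch.tril(torch.ones(len(doc_ids), len(doc_ids), dtype=torch.bool))
    return same_doc & causal

doc_ids = torch.tensor([0, 0, 0, 1, 1, 2])   # three short documents packed together
print(document_mask(doc_ids).int())
```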

Training Data

Meta invested heavily in pretraining data for Llama 3, which is pre-trained on over 15 trillion tokens collected from publicly available sources. This training dataset is seven times larger than the one used for Llama 2 and includes four times more code data.

To ensure the model was trained on the highest quality data, Meta developed a series of data-filtering pipelines. These included using heuristic filters, NSFW filters, semantic deduplication approaches, and text classifiers to predict data quality.

Meta found that previous generations of Llama were surprisingly effective at identifying high-quality data, so Llama 2 was used to generate the training data for the text-quality classifiers that power Llama 3.
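
As a purely hypothetical illustration of how such a pipeline might be wired together (the helper names, thresholds, and scoring function below are invented for illustration; Meta has not published this code):

```python
# Hypothetical sketch of a pretraining-data filtering pipeline in the spirit of
# what Meta describes: cheap heuristic filters first, then a model-based quality
# score. All thresholds and helpers here are invented.

def heuristic_ok(doc: str) -> bool:
    words = doc.split()
    if len(words) < 50:                      # drop very short fragments
        return False
    if len(set(words)) / len(words) < 0.3:   # drop highly repetitive text
        return False
    return True

def quality_score(doc: str) -> float:
    # Stand-in for a text-quality classifier; Meta says Llama 2 was used to
    # generate training data for such classifiers. Here: a dummy length proxy.
    return min(len(doc) / 2000, 1.0)

def filter_corpus(docs, threshold=0.5):
    # A real pipeline would also apply NSFW filters and semantic deduplication.
    return [d for d in docs if heuristic_ok(d) and quality_score(d) >= threshold]
```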

Platform Integration

One of the most notable applications of Llama 3 is its integration with WhatsApp. This allows users to generate AI images and text, request the latest news, and compose birthday messages and greetings directly within the messaging app. The feature is being rolled out gradually, making AI-powered creative tools accessible to a broader audience.

With Llama 3, WhatsApp users can easily generate images and text from their prompts, opening up new possibilities for creative expression and communication.

Llama 3 is also available on Meta’s other social media apps such as Instagram, Facebook, and Messenger. The new upgraded Meta AI is designed to help users with all their queries across Meta apps and glasses. For easier access, Meta has integrated its AI assistant with the search features of the mentioned social media apps along with the launch of its official website.

The new models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM Watson, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.
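
For developers, the Hugging Face route is one of the simplest ways to try the model. The sketch below assumes a recent transformers release, an accepted Llama 3 license for the gated meta-llama/Meta-Llama-3-8B-Instruct repository, and a GPU with enough memory:

```python
# Minimal sketch: run the 8B Instruct model through Hugging Face transformers,
# one of the distribution channels listed above. The repo is gated, so the
# Llama 3 license must be accepted first.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [{"role": "user", "content": "In two sentences, what is new in Llama 3?"}]
outputs = generator(messages, max_new_tokens=128)
print(outputs[0]["generated_text"][-1])  # the assistant's reply
```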

To further support the adoption and deployment of Llama, Meta has partnered with Microsoft to make the model available on the Azure cloud computing platform. This collaboration enables developers and businesses to easily integrate Llama 3 into their applications and services, leveraging the scalability and reliability of Azure.

The availability of Llama 3 on Azure is expected to accelerate the development of AI-powered solutions across various industries.

Results

Meta’s new language models, Llama 3 8B and Llama 3 70B, demonstrate impressive performance across multiple benchmarks compared to other open-source and industry models.

The smaller 8B parameter version outperforms models like Mistral 7B and Google’s Gemma 7B on at least 9 benchmarks covering areas such as reasoning, math, coding, and general knowledge.

These include MMLU, ARC, DROP, GPQA (a set of biology-, physics- and chemistry-related questions), HumanEval (a code generation test), GSM-8K (math word problems), MATH (another mathematics benchmark), AGIEval (a problem-solving test set) and BIG-Bench Hard (a commonsense reasoning evaluation).
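
For a rough sense of how benchmarks like GSM-8K are typically scored, a model’s free-form answer is reduced to its final number and compared against the reference answer. This is a generic sketch, not the evaluation harness Meta used:

```python
# Toy illustration of exact-match scoring for math word problems (GSM-8K style):
# extract the final number from the model's answer and compare it to the reference.
import re

def final_number(text: str) -> str | None:
    numbers = re.findall(r"-?\d[\d,]*\.?\d*", text)
    return numbers[-1].replace(",", "") if numbers else None

def exact_match(prediction: str, reference: str) -> bool:
    return final_number(prediction) == final_number(reference)

print(exact_match("Adding the costs gives 18 + 24 = 42 apples.", "The answer is 42."))  # True
```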

While some of these competitor models are not the latest versions, Llama 3 8B still scores a few percentage points higher on several benchmarks.

Meta Llama 3 Instruct model performance

More notably, the larger 70B parameter Llama 3 model is competitive with flagship industry models like Google’s Gemini 1.5 Pro. It outperforms Gemini 1.5 Pro on benchmarks like MMLU, HumanEval, and GSM-8K math word problems.

Additionally, while not rivalling Anthropic’s top-tier Claude 3 Opus model, Llama 3 70B scores better than the mid-tier Claude 3 Sonnet model on 5 benchmarks.

Meta Llama 3 Pre-trained model performance

To optimize Llama 3 for practical performance beyond standard benchmarks, Meta developed a new 1,800-prompt evaluation set covering 12 real-world use cases and restricted access to it during training to prevent overfitting. Human evaluations on this set show how Llama 3 compares to models like Claude, Mistral, and GPT-3.5 across these scenarios.

The chart below shows the aggregated results of human evaluations across the discussed categories and prompts against Claude Sonnet, Mistral Medium, GPT-3.5, and Meta Llama 2.

Meta Llama 3 Instruct Human Evaluation

Overall, Meta claims their new Llama 3 models, especially the 70B version, demonstrate state-of-the-art performance that is competitive with or superior to other leading open-source and commercial language models across a wide range of capabilities and benchmarks.

Conclusion

Meta’s decision to make Llama 3 an open model is a significant step towards democratizing AI technology. By allowing researchers, developers, and businesses to access and build upon the model, Meta aims to encourage innovation and collaboration within the AI community.
