Articles by FavTutor

ExecuTorch Alpha: A New Tool to Deploy Large LLMs on Edge

by Dhruv Kudalkar
May 6, 2024

PyTorch recently announced the release of ExecuTorch Alpha, a new tool for deploying LLMs and other large machine-learning models on resource-constrained edge devices. It aims to bridge the gap between advanced AI capabilities and environments with limited compute and memory.

Highlights:

  • ExecuTorch Alpha is a new tool released by PyTorch for the deployment of LLMs and large ML models on edge devices.
  • It leverages quantization and other techniques to pack LLMs for efficient execution on edge devices.
  • It supports running models like Meta’s Llama 2 7B and Llama 3 8B on smartphones and wearables.

Why do we need ExecuTorch Alpha?

Existing methods for running large language models require machines with substantial compute and memory. This has limited their use on edge devices like smartphones and wearables.

With this new tool, PyTorch aims to optimize model execution on edge devices while maintaining performance and efficiency.

Built on the PyTorch framework, ExecuTorch Alpha offers a complete workflow for deploying models on edge devices. To bring LLMs to edge devices, it heavily leverages quantization and other techniques to pack these models appropriately.
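To see why quantization matters so much here, a back-of-the-envelope estimate of weight storage helps. The sketch below is illustrative only: the parameter counts are the nominal figures implied by the model names (an assumption, not exact counts), and the helper `weight_footprint_gb` is a hypothetical name. It ignores quantization scales, activations, and the KV cache, so real memory use will be higher.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts read off the model names (assumption).
MODELS = {"Llama 2 7B": 7e9, "Llama 3 8B": 8e9}

for name, params in MODELS.items():
    fp16 = weight_footprint_gb(params, 16)  # 16-bit floating point weights
    int4 = weight_footprint_gb(params, 4)   # 4-bit quantized weights
    print(f"{name}: fp16 ~{fp16:.1f} GB, 4-bit ~{int4:.1f} GB")
```

Under these assumptions, a nominal 7B-parameter model drops from roughly 14 GB of weights at 16 bits to roughly 3.5 GB at 4 bits, which is the difference between not fitting and plausibly fitting in a phone's memory.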

It is focused on deploying large language models and large ML models to edge devices, stabilizing the API surface, and improving the installation process.

ExecuTorch Alpha supports 4-bit post-training quantization using GPTQ. PyTorch has also broadened CPU support by landing dynamic shape support and new dtypes in XNNPACK. The team has also made significant improvements in export and lowering, reduced memory overhead, and improved runtime performance, all of which lower resource usage.
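For readers new to the idea, here is a minimal sketch of what storing weights in 4 bits means. It is not GPTQ and not ExecuTorch code: it uses plain round-to-nearest with one scale per group of weights, whereas GPTQ uses calibration data to choose quantized values that reduce layer output error. All function names and the group size are hypothetical choices for illustration.

```python
from typing import List, Tuple

QMIN, QMAX = -8, 7  # value range of a signed 4-bit integer


def quantize_group(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest quantization of one group of weights."""
    max_abs = max(abs(v) for v in values)
    # One floating-point scale is stored per group; avoid dividing by zero.
    scale = max_abs / QMAX if max_abs > 0 else 1.0
    codes = [min(QMAX, max(QMIN, round(v / scale))) for v in values]
    return codes, scale


def dequantize_group(codes: List[int], scale: float) -> List[float]:
    """Recover approximate weights from 4-bit codes and the group scale."""
    return [c * scale for c in codes]


def quantize(weights: List[float], group_size: int = 4) -> List[Tuple[List[int], float]]:
    """Split the weights into fixed-size groups and quantize each one."""
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]


# Toy weights: the second group is much smaller in magnitude than the first.
weights = [0.7, -0.35, 0.1, 0.0, 0.02, -0.01, 0.005, 0.014]
for codes, scale in quantize(weights):
    print(codes, scale, dequantize_group(codes, scale))
```

Giving each group its own scale is the design choice worth noticing: the small-magnitude weights in the second group get a much finer step size than they would if a single scale covered the whole tensor.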

By prioritizing portability and efficient memory management, it makes small, efficient model runtimes practical on a wide range of edge devices. This connects powerful LLMs with environments that have limited computational resources.

Support for various models

ExecuTorch Alpha enables running Meta’s Llama 2 7B efficiently on iPhone 15 Pro, iPhone 15 Pro Max, Samsung Galaxy S22, S23, and S24 phones, and other edge devices. It also provides early support for Meta’s latest model, Llama 3 8B.

In addition to other improvements, this release enables running Meta Llama 2 7B efficiently on devices like the iPhone 15 Pro, Samsung Galaxy S24 and other edge devices — it also includes early support for Llama 3 8B.

More details on ExecuTorch Alpha ⬇️ https://t.co/aVkecCkQeQ

— AI at Meta (@AIatMeta) April 30, 2024

PyTorch has also closely collaborated with its partners at Apple, Arm, Qualcomm Technologies, Google, and MediaTek to build ExecuTorch Alpha.

👇 Get started with ExecuTorch Alpha for optimal performance of LLMs on the CPU, alongside delegation to GPU and NPU.

With LLMs already running on our efficient CPUs, our close partnership with @PyTorch is making this easier on @Meta’s Llama 2, 3 and other broad models. https://t.co/HuwGHoJR8t

— Arm (@Arm) April 30, 2024

They have also significantly expanded their list of supported models across NLP, vision, and speech. Although support for on-device LLMs is early, they expect most traditional models to function seamlessly out of the box, with delegation to XNNPACK, Core ML, MPS, TOSA, and HTP for performance.

The ExecuTorch framework has already been tested at the production level. Meta has been using it for hand tracking on Meta Quest 3 and for various models on Ray-Ban Meta Smart Glasses. The company has also begun integrating ExecuTorch with Instagram, WhatsApp, and other Meta products.

With ExecuTorch Alpha, PyTorch also intends to provide a powerful software development kit (SDK) that will help monitor the entire process from model authoring to deployment. The SDK lets developers debug a model much as they would debug a Python program, which helps them analyze model performance and identify bottlenecks.

Conclusion

PyTorch’s release of ExecuTorch Alpha presents an innovative solution for deploying LLMs and large machine-learning models on resource-constrained edge devices. NVIDIA is also working in a similar space with its NIM platform for deploying LLMs.

Dhruv Kudalkar

Hello, I'm Dhruv Kudalkar, a final year undergraduate student pursuing a degree in Information Technology. My research interests revolve around Generative AI and Natural Language Processing (NLP). I constantly explore new technologies and strive to stay up-to-date in these fields, driven by a passion for innovation and a desire to contribute to the ever-evolving landscape of intelligent systems.

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.
