{"id":3932,"date":"2024-04-22T06:24:44","date_gmt":"2024-04-22T06:24:44","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=3932"},"modified":"2024-04-22T06:24:58","modified_gmt":"2024-04-22T06:24:58","slug":"access-llama-3-api","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/access-llama-3-api\/","title":{"rendered":"Here&#8217;s How Developers Can Access The Llama 3 API &amp; Locally"},"content":{"rendered":"\n<p>Meta\u2019s latest<a href=\"https:\/\/favtutor.com\/articles\/meta-llama-3-benchmarks\/\"> Llama 3 open-source model release<\/a> has exceeded all expectations, beating top-of-the-line models on industry benchmarks. Developers are also <a href=\"https:\/\/favtutor.com\/articles\/meta-llama-3-developer-insights\/\">testing Llama 3<\/a> and checking it from various perspectives. But how to access Llama 3 to test it for yourself??<\/p>\n\n\n\n<p>For normal users, <a href=\"https:\/\/favtutor.com\/articles\/meta-ai-chatbot-whatsapp-instagram-access\/\">Meta AI Assistant<\/a> has integrated Llama 3 into their social media applications: Instagram, WhatsApp, and Facebook. 
It can also be accessed through the Meta AI <a href=\"https:\/\/www.meta.ai\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">web interface<\/a> if available in your country.<\/p>\n\n\n\n<p>For developers wishing to incorporate Llama 3 into their applications, it can be accessed in two ways:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The model can be run locally by downloading the model weights or quantized files from official sources like the Meta webpage, <a href=\"https:\/\/github.com\/meta-llama\/llama3\" data-type=\"link\" data-id=\"https:\/\/github.com\/meta-llama\/llama3\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">GitHub<\/a>, <a href=\"https:\/\/huggingface.co\/meta-llama\/Meta-Llama-3-8B\" data-type=\"link\" data-id=\"https:\/\/huggingface.co\/meta-llama\/Meta-Llama-3-8B\" target=\"_blank\" rel=\"noreferrer noopener\">Huggingface<\/a>, or <a href=\"https:\/\/ollama.com\/library\/llama3\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/ollama.com\/library\/llama3\" rel=\"noreferrer noopener nofollow\">Ollama<\/a>, and then running it on your local machine.<\/li>\n\n\n\n<li>It can also be accessed through APIs on authorized sites like <a href=\"https:\/\/replicate.com\/blog\/run-llama-3-with-an-api\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/replicate.com\/blog\/run-llama-3-with-an-api\" rel=\"noreferrer noopener nofollow\">Replicate<\/a>, Hugging Face, or Kaggle.<\/li>\n<\/ul>\n\n\n\n<p>Which model should you choose from the Llama 3 family?<\/p>\n\n\n\n<p><strong>There are four Llama 3 variants,<\/strong> each with its own strengths. Llama 3 comes in two parameter sizes, 70 billion and 8 billion, each with both a base and a chat-tuned model.<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>meta-llama-3-70b-instruct:<\/strong> 70 billion parameter model fine-tuned on chat completions. 
If you want to build a chatbot with the best accuracy, this is the one to use.<\/li>\n\n\n\n<li><strong>meta-llama-3-8b-instruct:<\/strong> 8 billion parameter model fine-tuned on chat completions. Use this if you\u2019re building a chatbot and would prefer it to be faster and cheaper at the expense of accuracy.<\/li>\n\n\n\n<li><strong>meta-llama-3-70b:<\/strong> 70 billion parameter base model, before the instruction tuning on chat completions.<\/li>\n\n\n\n<li><strong>meta-llama-3-8b:<\/strong> 8 billion parameter base model, before the instruction tuning on chat completions.<\/li>\n<\/ol>\n\n\n\n<p>Let&#8217;s take a detailed look at the methods to access these models!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>1) Locally Run Llama 3<\/strong><\/h2>\n\n\n\n<p><strong>Pretrained model weights can be downloaded to run the models on your local system, but only after requesting access from Meta AI.<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>a) Official Site<\/strong><\/h3>\n\n\n\n<p>To download the models through the official site:<\/p>\n\n\n\n<p>Through <a href=\"https:\/\/llama.meta.com\/llama-downloads\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">this webpage<\/a>, users can request access to the files by entering their details; Meta may approve or reject each application.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1002\" height=\"892\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/1-image.jpg\" alt=\"Access to Meta Llama Website\" class=\"wp-image-3934\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/1-image.jpg 1002w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/1-image-300x267.jpg 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/1-image-768x684.jpg 768w, 
https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/1-image-750x668.jpg 750w\" sizes=\"(max-width: 1002px) 100vw, 1002px\" \/><\/figure>\n<\/div>\n\n\n<p>Users have to then agree to the Meta terms and conditions including the acceptable use policy.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full is-resized\"><img decoding=\"async\" width=\"964\" height=\"615\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/2image.jpg\" alt=\"Download Llama from website\" class=\"wp-image-3935\" style=\"width:964px;height:auto\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/2image.jpg 964w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/2image-300x191.jpg 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/2image-768x490.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/2image-750x478.jpg 750w\" sizes=\"(max-width: 964px) 100vw, 964px\" \/><\/figure>\n<\/div>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"915\" height=\"170\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/3image.jpg\" alt=\"continuation of llama website\" class=\"wp-image-3936\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/3image.jpg 915w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/3image-300x56.jpg 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/3image-768x143.jpg 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/3image-750x139.jpg 750w\" sizes=\"(max-width: 915px) 100vw, 915px\" \/><\/figure>\n<\/div>\n\n\n<h3 class=\"wp-block-heading\"><strong>b) Hugging Face<\/strong><\/h3>\n\n\n\n<p>The model can also be downloaded from <strong><a href=\"https:\/\/huggingface.co\/meta-llama\/Meta-Llama-3-8B\/tree\/main\" target=\"_blank\" data-type=\"link\" 
data-id=\"https:\/\/huggingface.co\/meta-llama\/Meta-Llama-3-8B\/tree\/main\" rel=\"noreferrer noopener nofollow\">Huggingface Hub<\/a><\/strong>:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"713\" height=\"710\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/image333.png\" alt=\"Llama on Hugging Face\" class=\"wp-image-3939\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/image333.png 713w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/image333-300x300.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/image333-150x150.png 150w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/image333-75x75.png 75w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/image333-350x350.png 350w\" sizes=\"(max-width: 713px) 100vw, 713px\" \/><\/figure>\n<\/div>\n\n\n<p>Once your request is approved, you can access the repository and download the weights.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"718\" height=\"92\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/55.png\" alt=\"Llama on Hugging Face Continuation\" class=\"wp-image-3940\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/55.png 718w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/55-300x38.png 300w\" sizes=\"(max-width: 718px) 100vw, 718px\" \/><\/figure>\n<\/div>\n\n\n<h3 class=\"wp-block-heading\"><strong>c) Ollama Platform<\/strong><\/h3>\n\n\n\n<p>For Linux\/MacOS users, <a href=\"https:\/\/ollama.com\/library\/llama3\" data-type=\"link\" data-id=\"https:\/\/ollama.com\/library\/llama3\" target=\"_blank\" rel=\"noreferrer noopener\">Ollama<\/a> is the best choice to locally run LLMs. 
Ollama now includes Llama 3 models as part of its library.<\/p>\n\n\n\n<p>Here are two ways to run Llama 3 with Ollama:<\/p>\n\n\n\n<p><strong>CLI<\/strong><\/p>\n\n\n\n<p>Open the terminal and run this command:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ollama run llama3<\/code><\/pre>\n\n\n\n<p><strong>API<\/strong><\/p>\n\n\n\n<p>Example using curl:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>curl -X POST http:\/\/localhost:11434\/api\/generate -d '{\n  \"model\": \"llama3\",\n  \"prompt\": \"Why is the sky blue?\"\n}'<\/code><\/pre>\n\n\n\n<p>The instruct variants are fine-tuned for chat\/dialogue use cases.<\/p>\n\n\n\n<p>Example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ollama run llama3\nollama run llama3:70b<\/code><\/pre>\n\n\n\n<p>The pre-trained variants are the base models.<\/p>\n\n\n\n<p><em>Example:<\/em> <\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>ollama run llama3:text\nollama run llama3:70b-text<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2) Running Llama 3 through APIs<\/strong><\/h2>\n\n\n\n<p><strong>Llama-3 is hosted on several websites like Hugging Face Hub, Kaggle, and Replicate.<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>a) Hugging Face<\/strong><\/h3>\n\n\n\n<p>The Hugging Face repository contains two versions of Meta-Llama-3-70B-Instruct: one for use with transformers and one for the original llama3 codebase.<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import transformers\nimport torch\n\nmodel_id = &quot;meta-llama\/Meta-Llama-3-70B-Instruct&quot;\n\npipeline = transformers.pipeline(\n    &quot;text-generation&quot;,\n    
model=model_id,\n    model_kwargs={&quot;torch_dtype&quot;: torch.bfloat16},\n    device_map=&quot;auto&quot;,\n)\n\nmessages = [\n    {&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You are a pirate chatbot who always responds in pirate speak!&quot;},\n    {&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;Who are you?&quot;},\n]\n\nprompt = pipeline.tokenizer.apply_chat_template(\n        messages, \n        tokenize=False, \n        add_generation_prompt=True\n)\n\nterminators = [\n    pipeline.tokenizer.eos_token_id,\n    pipeline.tokenizer.convert_tokens_to_ids(&quot;&lt;|eot_id|&gt;&quot;)\n]\n\noutputs = pipeline(\n    prompt,\n    max_new_tokens=256,\n    eos_token_id=terminators,\n    do_sample=True,\n    temperature=0.6,\n    top_p=0.9,\n)\nprint(outputs[0][&quot;generated_text&quot;][len(prompt):])<\/pre><\/div>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"889\" height=\"314\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/66.png\" alt=\"Llama API on Hugging Face\" class=\"wp-image-3944\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/66.png 889w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/66-300x106.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/66-768x271.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/66-750x265.png 750w\" sizes=\"(max-width: 889px) 100vw, 889px\" \/><\/figure>\n<\/div>\n\n\n<h3 class=\"wp-block-heading\"><strong>b) Kaggle<\/strong><\/h3>\n\n\n\n<p>The Llama-3 model can be accessed through Kaggle after verifying the access granted on the official Meta page.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"399\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67-1024x399.png\" alt=\"Llama API on Kaggle\" 
class=\"wp-image-3947\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67-1024x399.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67-300x117.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67-768x299.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67-750x292.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67-1140x445.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/67.png 1195w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"553\" height=\"527\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/68.png\" alt=\"Access Llama on Kaggle\" class=\"wp-image-3948\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/68.png 553w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/68-300x286.png 300w\" sizes=\"(max-width: 553px) 100vw, 553px\" \/><\/figure>\n<\/div>\n\n\n<p>Once access is granted, users can load the model into either Kaggle or any other notebook using<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\u201cmodel = \"\/kaggle\/input\/llama-3\/transformers\/8b-chat-hf\/1\"\n\npipeline = transformers.pipeline(\n    \"text-generation\",\n    model=model,\n    torch_dtype=torch.float16,\n    device_map=\"auto\",\n)\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>c) Replicate<\/strong><\/h3>\n\n\n\n<p>Developers can access the model through the <a href=\"https:\/\/replicate.com\/blog\/run-llama-3-with-an-api\" target=\"_blank\" data-type=\"link\" data-id=\"https:\/\/replicate.com\/blog\/run-llama-3-with-an-api\" rel=\"noreferrer noopener nofollow\">replicate API<\/a> using their replicate API token in Python, JavaScript, or cURL libraries.<\/p>\n\n\n\n<p>For <strong>python<\/strong>,<\/p>\n\n\n\n<p>Install 
Replicate\u2019s Python client library:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install replicate<\/code><\/pre>\n\n\n\n<p>Set the REPLICATE_API_TOKEN environment variable:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>export REPLICATE_API_TOKEN=r8_I11**********************************<\/code><\/pre>\n\n\n\n<p>Import the client:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;python&quot;,&quot;mime&quot;:&quot;text\/x-python&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;Python&quot;,&quot;modeName&quot;:&quot;python&quot;}\">import replicate\n\n\n# The meta\/meta-llama-3-70b-instruct model can stream output as it's running.\nfor event in replicate.stream(\n    &quot;meta\/meta-llama-3-70b-instruct&quot;,\n    input={\n        &quot;prompt&quot;: &quot;Can you write a poem about open source machine learning?&quot;\n    },\n):\n    print(str(event), end=&quot;&quot;)<\/pre><\/div>\n\n\n\n<p>For JavaScript:<\/p>\n\n\n\n<p>Install Replicate\u2019s Node.js client library:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>npm install replicate<\/code><\/pre>\n\n\n\n<p>Set the REPLICATE_API_TOKEN environment variable:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>export REPLICATE_API_TOKEN=r8_I11**********************************<\/code><\/pre>\n\n\n\n<p>This is your Default API token. 
Keep it to yourself.<\/p>\n\n\n\n<p>Import and set up the client:<\/p>\n\n\n\n<div class=\"wp-block-codemirror-blocks-code-block code-block\"><pre class=\"CodeMirror\" data-setting=\"{&quot;mode&quot;:&quot;javascript&quot;,&quot;mime&quot;:&quot;text\/javascript&quot;,&quot;theme&quot;:&quot;material&quot;,&quot;lineNumbers&quot;:true,&quot;styleActiveLine&quot;:false,&quot;lineWrapping&quot;:false,&quot;readOnly&quot;:true,&quot;language&quot;:&quot;JavaScript&quot;,&quot;modeName&quot;:&quot;js&quot;}\">import Replicate from &quot;replicate&quot;;\n\nconst replicate = new Replicate({\n  auth: process.env.REPLICATE_API_TOKEN,\n});<\/pre><\/div>\n\n\n\n<p>For cURL,<\/p>\n\n\n\n<p>Set the REPLICATE_API_TOKEN environment variable:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>export REPLICATE_API_TOKEN=r8_I11**********************************<\/code><\/pre>\n\n\n\n<p>Run meta\/meta-llama-3-70b-instruct using Replicate\u2019s API.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ curl -s -X POST \\\n  -H \"Authorization: Bearer $REPLICATE_API_TOKEN\" \\\n  -H \"Content-Type: application\/json\" \\\n  -d $'{\n    \"input\": {\n      \"prompt\": \"Can you write a poem about open source machine learning?\"\n    }\n  }' \\\n  https:&#47;&#47;api.replicate.com\/v1\/models\/meta\/meta-llama-3-70b-instruct\/predictions\n<\/code><\/pre>\n\n\n\n<p>The model can be tested for different prompts in the AI playground as well:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"916\" height=\"630\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/69.png\" alt=\"Llama 3 on AI Playground\" class=\"wp-image-3951\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/69.png 916w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/69-300x206.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/69-768x528.png 768w, 
https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/69-750x516.png 750w\" sizes=\"(max-width: 916px) 100vw, 916px\" \/><\/figure>\n<\/div>\n\n\n<h3 class=\"wp-block-heading\"><strong>d) Other cloud platforms<\/strong><\/h3>\n\n\n\n<p>Llama-3 is hosted on several other cloud platforms like Vertex AI, Azure AI, and Cloudflare Workers AI.<\/p>\n\n\n\n<p>Meta Llama 3 is available on <a href=\"https:\/\/console.cloud.google.com\/vertex-ai\/publishers\/meta\/model-garden\/llama3?pli=1\" data-type=\"link\" data-id=\"https:\/\/console.cloud.google.com\/vertex-ai\/publishers\/meta\/model-garden\/llama3?pli=1\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Vertex AI Model Garden<\/a>. Like its predecessors, Llama 3 is freely licensed for research as well as many commercial applications. Llama 3 is available in two sizes, 8B and 70B, as both a pre-trained and instruction fine-tuned model.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"1024\" height=\"544\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-1024x544.png\" alt=\"Llama on Vertex AI\" class=\"wp-image-3952\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-1024x544.png 1024w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-300x159.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-768x408.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-1536x816.png 1536w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-750x398.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70-1140x606.png 1140w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/70.png 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/div>\n\n\n<p>To start building, click on \u201copen notebook\u201d<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter 
size-full\"><img decoding=\"async\" width=\"880\" height=\"211\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/71.png\" alt=\"Llama 3 on vertex, continuation\" class=\"wp-image-3953\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/71.png 880w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/71-300x72.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/71-768x184.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/71-750x180.png 750w\" sizes=\"(max-width: 880px) 100vw, 880px\" \/><\/figure>\n<\/div>\n\n\n<p>Here&#8217;s the code mentioned above.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\u201cgsutil -m cp -R gs:\/\/vertex-model-garden-public-us\/llama3 YOUR_CLOUD_STORAGE_BUCKET_URL\u201d<\/code><\/pre>\n\n\n\n<p>The model is available in the catalogue of\u00a0 <a href=\"https:\/\/ai.azure.com\/explore\/models\/Meta-Llama-3-70B-Instruct\/version\/1\/registry\/azureml-meta\" data-type=\"link\" data-id=\"https:\/\/ai.azure.com\/explore\/models\/Meta-Llama-3-70B-Instruct\/version\/1\/registry\/azureml-meta\" target=\"_blank\" rel=\"noreferrer noopener nofollow\"><strong>Azure AI<\/strong> <\/a>as well and is supported by integration platforms like Langchain.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"999\" height=\"567\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/72.png\" alt=\"Llama on Azure AI\" class=\"wp-image-3955\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/72.png 999w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/72-300x170.png 300w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/72-768x436.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/04\/72-750x426.png 750w\" sizes=\"(max-width: 999px) 100vw, 999px\" \/><\/figure>\n<\/div>\n\n\n<h2 
class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p>There are a large number of platforms hosting Llama-3 and developers can select the one best suited to their application. For low-cost local applications, free API calls might be the best approach, and for large quantities of data, cloud-based applications might be more suited. There is an option for everyone, so happy coding!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Find out how to access Llama 3 by locally downloading the model weights or through APIs on authorized sites like Huggingface or Kaggle.<\/p>\n","protected":false},"author":20,"featured_media":3964,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[57],"tags":[56,171,172,72,81],"class_list":["post-3932","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-ai","tag-llama","tag-llama-3","tag-llm","tag-meta"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/3932","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=3932"}],"version-history":[{"count":18,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/3932\/revisions"}],"predecessor-version":[{"id":3966,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/3932\/revisions\/3966"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/3964"}],"wp:attachment":[{"href":"https:\/\/favtutor.
com\/articles\/wp-json\/wp\/v2\/media?parent=3932"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=3932"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/tags?post=3932"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}