LongWriter LLM Pushed Boundaries with 10,000 Words Output

by Geethanjali Pedamallu
August 27, 2024

Ever wanted to write an 8-page essay with an LLM at the last minute, only to run into its limited output window? LongWriter aims to fix that. This newly introduced AI model can write up to 10,000 words in a single output, which traditional LLMs fail to do.

Highlights:

  • LongWriter can generate up to 10,000 words of text, surpassing the roughly 2,000-word limit of traditional LLMs.
  • AgentWrite, an agent-based pipeline, breaks long tasks into short subtasks to achieve this.
  • The code and model for LongWriter are available on GitHub and HuggingFace for open use.

What Makes LongWriter Special?

Large Language Models have always struggled with producing long outputs. Even though long-context models can take 100,000 tokens or more as input, they fail to produce more than about 2,000 words without external intervention.

Researchers at Tsinghua University in Beijing have developed a new artificial intelligence system named “LongWriter” as a solution to this problem. It can write up to 10,000 words on a given topic without compromising the quality of the generated output.

“We introduce AgentWrite, an agent-based pipeline that decomposes ultra-long generation tasks into subtasks, enabling off-the-shelf LLMs to generate coherent outputs exceeding 20,000 words. … We also develop LongBench-Write, a comprehensive benchmark for evaluating ultra-long generation capabilities. Our 9B parameter model, further improved through DPO, achieves state-of-the-art performance on this benchmark, surpassing even much larger proprietary models. In general, our work demonstrates that existing long context LLM already possesses the potential for a larger output window–all you need is data with extended output during model alignment to unlock this capability.”

The study found that existing LLMs cannot produce lengthy outputs because they have not been trained on outputs longer than 2,000 words during supervised fine-tuning (SFT). If a model is trained on enough examples with longer outputs, this problem can be solved.
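To make the finding concrete, here is a minimal sketch of how one might check the output lengths in a fine-tuning set. The `toy_sft` records, the `output` field name, and the `output_length_stats` helper are all hypothetical illustrations, not part of the LongWriter code, and word counts are approximated by splitting on whitespace.

```python
from typing import Dict, List, Tuple


def output_length_stats(examples: List[Dict[str, str]],
                        threshold: int = 2000) -> Tuple[int, float]:
    """Return the longest output (in words) and the share of examples
    whose output exceeds `threshold` words."""
    lengths = [len(ex["output"].split()) for ex in examples]
    long_share = sum(n > threshold for n in lengths) / len(lengths)
    return max(lengths), long_share


# Hypothetical toy dataset: every output is shorter than 2,000 words.
toy_sft = [
    {"instruction": "Summarize X", "output": "w " * 150},
    {"instruction": "Explain Y", "output": "w " * 900},
    {"instruction": "Write essay Z", "output": "w " * 1800},
]

longest, share = output_length_stats(toy_sft)
print(f"longest output: {longest} words, share above 2,000: {share:.0%}")
```

A dataset like this toy one never shows the model an output longer than 2,000 words, which is the gap the study points to.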

The researchers also developed a system that can be integrated with existing LLMs. This removes the need to train LLMs again from scratch and helps them produce longer outputs. Look at this demonstration that shows how the model works:

LongWriter-glm4-9b from @thukeg is capable of generating 10,000+ words at once!🚀

Paper identifies a problem with current long context LLMs — they can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding lengths of 2,000 words.

Paper proposes that an… pic.twitter.com/2jfKyIpShK

— Gradio (@Gradio) August 14, 2024

The way they achieved this is quite interesting. AgentWrite is an agent-based pipeline that breaks long generation tasks into several subtasks. For example, if the task is to generate a 30,000-word article on the history of the Roman Empire, AgentWrite divides it into 15 subtasks.

To avoid repetition, the agent specifies what each paragraph should cover along with its required word count. These act as checkpoints and help ensure that each part the LLM writes covers new ground.

AgentWrite adopts a plan-and-write pipeline
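The plan-and-write idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `Step` class, `make_plan`, `write_from_plan`, and the `fake_generate` stub are hypothetical names, and the planning stage here splits the word budget evenly instead of asking an LLM to draft the plan.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One planned section: what it covers and how long it should be."""
    main_point: str
    word_count: int


def make_plan(topic: str, total_words: int, n_steps: int) -> List[Step]:
    """Stand-in for the planning stage: split the word budget evenly.
    A real pipeline would ask an LLM to write the plan."""
    base, extra = divmod(total_words, n_steps)
    return [
        Step(main_point=f"{topic}, part {i + 1} of {n_steps}",
             word_count=base + (1 if i < extra else 0))
        for i in range(n_steps)
    ]


def write_from_plan(instruction: str, plan: List[Step],
                    generate: Callable[[str], str]) -> str:
    """Writing stage: produce sections one by one, showing the model
    the text written so far so it can avoid repeating itself."""
    sections: List[str] = []
    for step in plan:
        prompt = (
            f"Task: {instruction}\n"
            f"Already written:\n{' '.join(sections)}\n"
            f"Now write about: {step.main_point} "
            f"(about {step.word_count} words)."
        )
        sections.append(generate(prompt))
    return "\n\n".join(sections)


def fake_generate(prompt: str) -> str:
    """Hypothetical model stub so the sketch runs without any API."""
    target = prompt.rsplit("Now write about: ", 1)[1]
    return f"[section on {target}]"


plan = make_plan("history of the Roman Empire", 30_000, 15)
article = write_from_plan("Write a 30,000-word article", plan, fake_generate)
print(len(plan), plan[0].word_count)
```

With the article's example, 30,000 words split across 15 subtasks comes to 2,000 words per subtask, so each individual call stays within the length that current models already handle.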

Using AgentWrite, the researchers created the “LongWriter-6k” dataset, which contains 6,000 supervised fine-tuning (SFT) examples with output lengths between 2,000 and 32,000 words. After training on this dataset, different LLMs were able to produce more than 10,000 words. They also developed:

  • LongWrite-Ruler: This probes the maximum output length an LLM can provide, using a lightweight test based on a set of instructions.
  • LongBench-Write: This evaluates a model’s performance on a diverse range of long-form writing instructions.
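A length probe in the spirit of LongWrite-Ruler can be sketched as follows. The prompt wording, the list of requested lengths, and the `capped_model` stub (which stops at 2,000 words no matter what is asked) are assumptions made for illustration; the actual test instructions are defined in the paper and repository.

```python
from typing import Callable, Dict, List


def count_words(text: str) -> int:
    """Approximate word count by splitting on whitespace."""
    return len(text.split())


def probe_max_output(generate: Callable[[str], str],
                     requested_lengths: List[int]) -> Dict[int, int]:
    """Ask for increasingly long outputs and record what comes back."""
    results: Dict[int, int] = {}
    for n in requested_lengths:
        prompt = f"Write a {n}-word article on the history of the Roman Empire."
        results[n] = count_words(generate(prompt))
    return results


def capped_model(prompt: str) -> str:
    """Hypothetical stub: obeys the requested length up to 2,000 words."""
    requested = int(prompt.split()[2].split("-")[0])
    return " ".join(["word"] * min(requested, 2000))


lengths = probe_max_output(capped_model, [500, 1000, 2000, 5000, 10000])
max_len = max(lengths.values())
print(lengths, max_len)
```

The point where the returned length stops following the requested length gives a rough estimate of the model's maximum output length.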

Here is a graph that shows the difference in performance between regular GPT-4o and GPT-4o with AgentWrite, as evaluated on LongBench-Write:

GPT-4o vs GPT-4o with AgentWrite

In related research, researchers at Sakana AI developed a fully automated AI system that writes research papers for about 15 dollars per paper. So, things are getting easier for LLMs!

Conclusion

LongWriter can act as an important tool for industries requiring large volumes of text, such as publishing, marketing, and technical documentation. To find out about this in detail, read the official research paper published here.
