Ever wanted to write an 8-page essay with an LLM at the last minute, only to run into its limited output length? Worry no more, because LongWriter is here. This newly introduced AI system can write up to 10,000 words in one go, something traditional LLMs fail to do.
Highlights:
- LongWriter can generate text of up to 10,000 words, surpassing the roughly 2,000-word limit of traditional LLMs.
- AgentWrite, an agent-based pipeline, achieves this by breaking long tasks down into short subtasks.
- The code and model for LongWriter are available on GitHub and HuggingFace for open usage.
What makes LongWriter Special?
Large Language Models have always struggled with producing long outputs. Even though long-context models can take 100,000 tokens or more as input, they fail to produce more than about 2,000 words without external intervention.
Researchers at Tsinghua University in Beijing have developed a new artificial intelligence system named “LongWriter” as a solution to this problem. It can write up to 10,000 words on a given topic without compromising the quality of the generated output.
“We introduce AgentWrite, an agent-based pipeline that decomposes ultra-long generation tasks into subtasks, enabling off-the-shelf LLMs to generate coherent outputs exceeding 20,000 words. … We also develop LongBench-Write, a comprehensive benchmark for evaluating ultra-long generation capabilities. Our 9B parameter model, further improved through DPO, achieves state-of-the-art performance on this benchmark, surpassing even much larger proprietary models. In general, our work demonstrates that existing long context LLM already possesses the potential for a larger output window–all you need is data with extended output during model alignment to unlock this capability.”
The study found that existing LLMs fail to produce lengthy outputs because they have not been trained on outputs longer than 2,000 words during supervised fine-tuning (SFT). If a model is trained on enough longer outputs, the problem can be solved.
The researchers have also developed a system that can be integrated with existing LLMs. This removes the need to train LLMs again from scratch and helps produce better outputs. This demonstration shows how the model works:
LongWriter-glm4-9b from @thukeg is capable of generating 10,000+ words at once!🚀
— Gradio (@Gradio) August 14, 2024
Paper identifies a problem with current long context LLMs — they can process inputs up to 100,000 tokens, yet struggle to generate outputs exceeding lengths of 2,000 words.
Paper proposes that an… pic.twitter.com/2jfKyIpShK
The way they achieved this is quite interesting. AgentWrite is an agent-based pipeline that breaks long generation tasks down into several subtasks. For example, if the task is to generate a 30,000-word article on the history of the Roman Empire, AgentWrite divides it into 15 subtasks.
To avoid repetition, the agent specifies what each paragraph should cover along with its required word count, and these act as checkpoints. This keeps each part that the LLM writes distinct from the others.
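The plan-then-write idea described above can be sketched roughly as follows. This is only an illustration, not the researchers' code: `Subtask`, `make_plan`, and `write_long` are hypothetical names, the even split of the word count stands in for an outline that an LLM would normally write itself, and `generate` is a placeholder for whatever model call you use.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subtask:
    main_point: str  # what this part of the text should cover
    word_count: int  # target length for this part


def make_plan(topic: str, total_words: int, num_subtasks: int) -> List[Subtask]:
    """Stand-in planner: split the target length evenly across subtasks.

    Assumption: in a real pipeline an LLM would write this outline itself.
    """
    base, extra = divmod(total_words, num_subtasks)
    return [
        Subtask(
            main_point=f"Part {i + 1} of {num_subtasks} on {topic}",
            word_count=base + (1 if i < extra else 0),
        )
        for i in range(num_subtasks)
    ]


def write_long(plan: List[Subtask], generate: Callable[[str], str]) -> str:
    """Ask the model for each subtask in order and join the results."""
    outline = "\n".join(f"- {s.main_point} ({s.word_count} words)" for s in plan)
    parts = []
    for step in plan:
        # Assumption: showing the full outline in every prompt is one simple
        # way to tell the model what the other parts already cover.
        prompt = (
            f"Full outline:\n{outline}\n\n"
            f"Write only this part: {step.main_point}\n"
            f"Target length: {step.word_count} words. "
            "Do not repeat material that belongs to other parts."
        )
        parts.append(generate(prompt))
    return "\n\n".join(parts)


plan = make_plan("the history of the Roman Empire", 30_000, 15)
print(len(plan), plan[0].word_count)  # 15 2000
```

With the Roman Empire example, 15 subtasks of 2,000 words each keep every individual request within the length that ordinary LLMs already handle well.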
Using AgentWrite, the researchers created the “LongWriter-6k” dataset, which contains 6,000 supervised fine-tuning (SFT) examples with output lengths between 2,000 and 32,000 words. By training different LLMs on this dataset, they were able to produce outputs of more than 10,000 words. They also developed:
- LongWrite-Ruler: a lightweight test, based on a set of instructions, that probes the maximum output length an LLM can provide.
- LongBench-Write: a benchmark that evaluates a model’s performance on a diverse range of long-form writing instructions.
Here is a small graph that shows the difference in performance between regular GPT-4o and GPT-4o with AgentWrite as evaluated on LongBench-Write:
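A length probe of the kind described in the first item could look something like this. It is a toy sketch under stated assumptions, not the actual test: `probe_max_output`, the target lengths, and the prompt wording are all hypothetical, and `capped_model` is a fake model that simply stops at 2,000 words so the example can run without an LLM.

```python
import re
from typing import Callable, Sequence


def probe_max_output(
    generate: Callable[[str], str],
    targets: Sequence[int] = (1_000, 2_000, 5_000, 10_000, 20_000),
) -> int:
    """Request increasingly long texts and return the longest output seen, in words."""
    longest = 0
    for target in targets:
        prompt = f"Write a {target}-word article on the history of the Roman Empire."
        longest = max(longest, len(generate(prompt).split()))
    return longest


def capped_model(prompt: str, cap: int = 2_000) -> str:
    """Toy stand-in for an LLM that never writes more than `cap` words."""
    requested = int(re.search(r"(\d+)-word", prompt).group(1))
    return " ".join(["word"] * min(requested, cap))


print(probe_max_output(capped_model))  # 2000
```

Swapping `capped_model` for a real model call would show where that model's outputs stop growing as the requested length increases.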
In similar research, researchers at Sakana AI developed a fully automated AI system that writes research papers for about 15 dollars each. So, things are getting easier for LLMs!
Conclusion
LongWriter can act as an important tool for industries requiring large volumes of text, such as publishing, marketing, and technical documentation. To find out about this in detail, read the official research paper published here.