Articles by FavTutor
  • AI News
  • Data Structures
  • Web Developement
  • AI Code GeneratorNEW
  • Student Help
  • Main Website
No Result
View All Result
FavTutor
  • AI News
  • Data Structures
  • Web Developement
  • AI Code GeneratorNEW
  • Student Help
  • Main Website
No Result
View All Result
Articles by FavTutor
No Result
View All Result
Home AI News, Research & Latest Updates

OpenAI’s Model Spec for How AI Should Behave

Ruchi Abhyankar by Ruchi Abhyankar
June 12, 2024
Reading Time: 8 mins read
OpenAI Model Spec
Follow us on Google News   Subscribe to our newsletter

OpenAI, the (arguably) largest AI company in the world recently released their model specifications, which is a new document that determines how a model should behave and interact with human users.

Highlights:

  • OpenAI released their model specs, which elaborate on how their models are supposed to respond to user queries.
  • These specs cover the objectives, rules, and the defaults (default assumptions) of the AI models.
  • They offer very interesting insight into how the guardrails around LLMs work, and how OpenAI regulates its generated content.

This model spec appears to be OpenAI’s attempt to make the model behaviour more transparent. With the rise of open-source AI, the desire to know exactly what goes on under the hood of the OpenAI engine has been rising. This model spec gives an insight into the guardrails surrounding OpenAI chatbots and the set of rules that they operate under.

The model spec provides the developer’s perspective on the need for the rules as well as establishes clear cases for their implementation.

What exactly is in the Model Spec?

Model specifications are a document that specifies the company’s approach to shaping the desired behaviour of their AI models and evaluating trade-offs when conflicts arise.

It consists of three main components:

  1. Objectives: Broad, high-level principles that guide the desired behaviour
    • Assist the developer and end-user: Help users achieve their goals by following instructions and providing helpful responses.
    • Benefit humanity: Consider potential benefits and harms to a broad range of stakeholders, including content creators and the general public, per OpenAI’s mission.
    • Reflect well on OpenAI: Respect social norms and applicable law.
  2. Rules: Specific instructions to address complexity and ensure safety and legality
    • Follow the chain of command
    • Comply with applicable laws
    • Don’t provide information about hazards
    • Respect creators and their rights
    • Protect people’s privacy
    • Don’t respond with NSFW (not safe for work) content
  3. Default Behaviors: Guidelines consistent with the objectives and rules, serving as a template for handling conflicts and prioritizing objectives.
    • Assume the best intentions from the user or developer
    • Ask clarifying questions when necessary
    • Be as helpful as possible without overstepping
    • Support the different needs of interactive chat and programmatic use
    • Assume an objective point of view
    • Encourage fairness and kindness, and discourage hate
    • Don’t try to change anyone’s mind
    • Express uncertainty
    • Use the right tool for the job
    • Be thorough but efficient, while respecting length limits

The outline emphasizes that this approach is incomplete and is expected to evolve over time, incorporating documentation, experience, ongoing research, and inputs from domain experts to guide the development of future AI models.

What are Objectives?

The objectives that an OpenAI model follows or aims towards are derived from the different goals of stakeholders. The three main objectives that need to be fulfilled by OpenAI models are given above.

The model specifications deal with detailing these objectives and defining how a model should behave when the objectives come into conflict.

The company explained this with an example in their specification document.

“The assistant is like a talented, high-integrity employee. Their personal “goals” include being helpful and truthful.
The ChatGPT user is like the assistant’s manager. In API use cases, the developer is the assistant’s manager, and they have assigned the assistant to help with a project led by the end user (if applicable).
Like a skilled employee, when a user makes a request that’s misaligned with broader objectives and boundaries, the assistant suggests a course correction. However, it always remains respectful of the user’s final decisions. Ultimately, the user directs the assistant’s actions, while the assistant ensures that its actions balance its objectives and follow the rules.”

Some examples of Rules

Once the objectives of an assistant are established, the rules naturally follow to ensure the assistant fulfils its objectives.

The most important rule for the AI model is that it must follow the chain of command. The model should follow the Model Spec, together with any additional rules provided to it in platform messages. However, much of the Model Spec consists of defaults that can be overridden at a lower level.

The Model Spec explicitly delegates all remaining power to the developer (for API use cases) and end user. In some cases, the user and developer will provide conflicting instructions; in such cases, the developer message should take precedence.

Platform > Developer > User > Tool

This is the default ordering of priorities. The model spec has platform-level priority. If developer instructions conflict with the model specs, the model specs must be followed by the AI assistant.

Let’s take a look at a few prompt examples covering the different types of conflicts.

platform developer conflict in AI

In case of a user-developer conflict, the developer’s rules must be followed.

user developer conflict in OpenAI

In case the developer specifies that his prompt verbatim or paraphrased must not be revealed to the user, the model has to deflect any non-compliant questions without explicitly revealing that the question is non-compliant.

private prompt openai model spec

The AI assistant also cannot promote any unlawful activities like stealing or attacking someone.

asking chatgpt about shoplifting tips

However, this particular problem has a loophole that many users exploit.

In the above example, the user exploits the objective to be helpful, and the default that the model assumes the best intentions of the user since this prompt does not explicitly indicate that the user is trying to do something unlawful.

The assistant also cannot encourage or provide information about harming oneself.

While the current specs specify no NSFW content, there are many who believe the model should be allowed to generate age-relevant content.

The only exception to the rules stated above is the task of transformation, i.e. translating, paraphrasing, summarizing, or classifying content.

Some examples of Defaults

The Defaults defined in the model specifications are the assumptions the model must follow while dealing with prompts. These are things that the model must believe to be true even if there is a clear indication to the contrary.

If a model refuses to answer a question that goes against the rules, it must always assume the best intentions from the user/developer. Refusals should be kept to a sentence and never be preachy. The assistant should acknowledge that the user’s request may have nuances that the assistant might not understand.

While chatGPT does get preachy sometimes, as we found in this example,

Nonetheless, the default exists.

OpenAI assistants are conversational models, and they should ask questions to get clarification on the user’s request. That way, they can supply the user with the best possible solution considering all the context. However, if a developer sets “interactive = False”, no follow-up questions should be asked.

Conclusion

OpenAI has attempted to codify its behavioural principles for the model in this document. Sometimes models like chatGPT don’t exactly follow the defaults and may go against them. However, the rules and objectives of the company are followed by all models.

This gives very interesting insights into the direction of future developments and the level of control or censorship at OpenAI.

ShareTweetShareSendSend
Ruchi Abhyankar

Ruchi Abhyankar

Hi, I'm Ruchi Abhyankar, a final year BTech student graduating with honors in AI and ML. My academic interests revolve around generative AI, deep learning, and data science. I am very passionate about open-source learning and am constantly exploring new technologies.

RelatedPosts

Candidate during Interview

9 Best AI Interview Assistant Tools For Job Seekers in 2025

May 1, 2025
AI Generated Tom and Jerry Video

AI Just Created a Full Tom & Jerry Cartoon Episode

April 12, 2025
Amazon Buy for Me AI

Amazon’s New AI Makes Buying from Any Website Easy

April 12, 2025
Microsoft New AI version of Quake 2

What Went Wrong With Microsoft’s AI Version of Quake II?

April 7, 2025
AI Reasoning Model Better Method

This Simple Method Can Make AI Reasoning Faster and Smarter

April 3, 2025

About FavTutor

FavTutor is a trusted online tutoring service to connects students with expert tutors to provide guidance on Computer Science subjects like Java, Python, C, C++, SQL, Data Science, Statistics, etc.

Categories

  • AI News, Research & Latest Updates
  • Trending
  • Data Structures
  • Web Developement
  • Data Science

Important Subjects

  • Python Assignment Help
  • C++ Help
  • R Programming Help
  • Java Homework Help
  • Programming Help

Resources

  • About Us
  • Contact Us
  • Editorial Policy
  • Privacy Policy
  • Terms and Conditions

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.

No Result
View All Result
  • AI News
  • Data Structures
  • Web Developement
  • AI Code Generator
  • Student Help
  • Main Website

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.