{"id":5568,"date":"2024-06-12T09:53:14","date_gmt":"2024-06-12T09:53:14","guid":{"rendered":"https:\/\/favtutor.com\/articles\/?p=5568"},"modified":"2024-06-12T09:53:15","modified_gmt":"2024-06-12T09:53:15","slug":"openai-model-spec","status":"publish","type":"post","link":"https:\/\/favtutor.com\/articles\/openai-model-spec\/","title":{"rendered":"OpenAI&#8217;s Model Spec for How AI Should Behave"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">OpenAI, arguably the largest AI company in the world, recently released its <a href=\"https:\/\/openai.com\/index\/introducing-the-model-spec\/\" target=\"_blank\" rel=\"noopener\">Model Spec<\/a>, a new document that specifies how its models should behave and interact with human users.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Highlights:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OpenAI released its Model Spec, which elaborates on how its models are supposed to respond to user queries.<\/li>\n\n\n\n<li>The spec covers the objectives, rules, and defaults (default behaviours) of the AI models.<\/li>\n\n\n\n<li>It offers an interesting insight into how the guardrails around LLMs work, and how OpenAI regulates its generated content.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This model spec appears to be OpenAI\u2019s attempt to make the model behaviour more transparent. With the rise of open-source AI, demand to know exactly what goes on under the hood of OpenAI\u2019s models has been growing. 
This model spec gives insight into the guardrails surrounding OpenAI chatbots and the set of rules that they operate under.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The model spec provides the developer&#8217;s perspective on the need for the rules and establishes clear cases for applying them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>What exactly is in the Model Spec?<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The Model Spec is a document that specifies the company\u2019s approach to shaping the desired behaviour of its AI models and evaluating trade-offs when conflicts arise.<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It consists of three main components:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Objectives:<\/strong> Broad, high-level principles that guide the desired behaviour\n<ul class=\"wp-block-list\">\n<li><strong>Assist the developer and end-user:<\/strong> Help users achieve their goals by following instructions and providing helpful responses.<\/li>\n\n\n\n<li><strong>Benefit humanity:<\/strong> Consider potential benefits and harms to a broad range of stakeholders, including content creators and the general public, per OpenAI&#8217;s mission.<\/li>\n\n\n\n<li><strong>Reflect well on OpenAI:<\/strong> Respect social norms and applicable law.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Rules:<\/strong> Specific instructions to address complexity and ensure safety and legality\n<ul class=\"wp-block-list\">\n<li>Follow the chain of command<\/li>\n\n\n\n<li>Comply with applicable laws<\/li>\n\n\n\n<li>Don&#8217;t provide information hazards<\/li>\n\n\n\n<li>Respect creators and their rights<\/li>\n\n\n\n<li>Protect people&#8217;s privacy<\/li>\n\n\n\n<li>Don&#8217;t respond with NSFW (not safe for work) content<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Default Behaviors:<\/strong> Guidelines consistent with the objectives and rules, serving as a template for handling 
conflicts and prioritizing objectives.\n<ul class=\"wp-block-list\">\n<li>Assume the best intentions from the user or developer<\/li>\n\n\n\n<li>Ask clarifying questions when necessary<\/li>\n\n\n\n<li>Be as helpful as possible without overstepping<\/li>\n\n\n\n<li>Support the different needs of interactive chat and programmatic use<\/li>\n\n\n\n<li>Assume an objective point of view<\/li>\n\n\n\n<li>Encourage fairness and kindness, and discourage hate<\/li>\n\n\n\n<li>Don&#8217;t try to change anyone&#8217;s mind<\/li>\n\n\n\n<li>Express uncertainty<\/li>\n\n\n\n<li>Use the right tool for the job<\/li>\n\n\n\n<li>Be thorough but efficient, while respecting length limits<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">The outline emphasizes that this approach is incomplete and is expected to evolve over time, incorporating documentation, experience, ongoing research, and inputs from domain experts to guide the development of future AI models.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>What are Objectives?<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The objectives that an OpenAI model follows or aims towards are derived from the different goals of stakeholders. The three main objectives that need to be fulfilled by OpenAI models are given above.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The model specifications deal with detailing these objectives and defining how a model should behave when the objectives come into conflict.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The company explained this with an example in their specification document.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"wp-block-paragraph\">\u201cThe assistant is like a talented, high-integrity employee. Their personal &#8220;goals&#8221; include being helpful and truthful.<br>The ChatGPT user is like the assistant&#8217;s manager. 
In API use cases, the developer is the assistant&#8217;s manager, and they have assigned the assistant to help with a project led by the end user (if applicable).<br>Like a skilled employee, when a user makes a request that&#8217;s misaligned with broader objectives and boundaries, the assistant suggests a course correction. However, it always remains respectful of the user&#8217;s final decisions. Ultimately, the user directs the assistant&#8217;s actions, while the assistant ensures that its actions balance its objectives and follow the rules.\u201d<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Some examples of Rules<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Once the objectives of an assistant are established, the rules naturally follow to ensure the assistant fulfils its objectives.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The most important rule for the AI model is that it must follow the chain of command. The model should follow the Model Spec, together with any additional rules provided to it in platform messages. However, much of the Model Spec consists of defaults that can be overridden at a lower level.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Model Spec explicitly delegates all remaining power to the developer (for API use cases) and end user. In some cases, the user and developer will provide conflicting instructions; in such cases, the developer message should take precedence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Platform > Developer > User > Tool<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is the default ordering of priorities. The model spec has platform-level priority. 
If developer instructions conflict with the Model Spec, the AI assistant must follow the Model Spec.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let&#8217;s take a look at a few prompt examples covering the different types of conflicts.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1185\" height=\"367\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/1-1.png\" alt=\"platform developer conflict in AI\" class=\"wp-image-5569\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/1-1.png 1185w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/1-1-768x238.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/1-1-750x232.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/1-1-1140x353.png 1140w\" sizes=\"(max-width: 1185px) 100vw, 1185px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">In case of a user-developer conflict, the developer\u2019s instructions must be followed.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1179\" height=\"425\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/2-1.png\" alt=\"user developer conflict in OpenAI\" class=\"wp-image-5570\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/2-1.png 1179w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/2-1-768x277.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/2-1-750x270.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/2-1-1140x411.png 1140w\" sizes=\"(max-width: 1179px) 100vw, 1179px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">If the developer specifies that their prompt must not be revealed to the user, whether verbatim or paraphrased, the model has to deflect any non-compliant questions without explicitly revealing 
that the question is non-compliant.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1184\" height=\"628\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/3-1.png\" alt=\"private prompt openai model spec\" class=\"wp-image-5571\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/3-1.png 1184w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/3-1-768x407.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/3-1-750x398.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/3-1-1140x605.png 1140w\" sizes=\"(max-width: 1184px) 100vw, 1184px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">The AI assistant also cannot promote any unlawful activities like stealing or attacking someone.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1196\" height=\"235\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/4-1.png\" alt=\"asking chatgpt about shoplifting tips\" class=\"wp-image-5572\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/4-1.png 1196w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/4-1-768x151.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/4-1-750x147.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/4-1-1140x224.png 1140w\" sizes=\"(max-width: 1196px) 100vw, 1196px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">However, this particular problem has a loophole that many users exploit.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1185\" height=\"237\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/5-1.png\" alt=\"\" class=\"wp-image-5573\" 
srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/5-1.png 1185w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/5-1-768x154.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/5-1-750x150.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/5-1-1140x228.png 1140w\" sizes=\"(max-width: 1185px) 100vw, 1185px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">In the above example, the user exploits both the objective to be helpful and the default of assuming the user&#8217;s best intentions, since the prompt does not explicitly indicate that the user is trying to do something unlawful.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The assistant also cannot encourage or provide information about harming oneself.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1181\" height=\"261\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/6-1.png\" alt=\"\" class=\"wp-image-5574\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/6-1.png 1181w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/6-1-768x170.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/6-1-750x166.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/6-1-1140x252.png 1140w\" sizes=\"(max-width: 1181px) 100vw, 1181px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">While the current spec rules out NSFW content, many believe the model should be allowed to generate age-appropriate content.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1182\" height=\"577\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/7-1.png\" alt=\"\" class=\"wp-image-5575\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/7-1.png 
1182w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/7-1-768x375.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/7-1-750x366.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/7-1-1140x556.png 1140w\" sizes=\"(max-width: 1182px) 100vw, 1182px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">One exception to the rules stated above is the task of transformation, i.e. translating, paraphrasing, summarizing, or classifying content that the user has supplied.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1183\" height=\"490\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/8-1.png\" alt=\"\" class=\"wp-image-5576\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/8-1.png 1183w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/8-1-768x318.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/8-1-750x311.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/8-1-1140x472.png 1140w\" sizes=\"(max-width: 1183px) 100vw, 1183px\" \/><\/figure>\n<\/div>\n\n\n<h3 class=\"wp-block-heading\"><strong>Some examples of Defaults<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The Defaults defined in the Model Spec are the behaviours the model falls back on when it has no other instructions. Unlike the rules, they can be overridden by the developer or the user.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If a model refuses to answer a question that goes against the rules, it must always assume the best intentions from the user\/developer. Refusals should be kept to a sentence and never be preachy. 
The assistant should acknowledge that the user&#8217;s request may have nuances that the assistant might not understand.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1191\" height=\"269\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/9-1.png\" alt=\"\" class=\"wp-image-5577\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/9-1.png 1191w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/9-1-768x173.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/9-1-750x169.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/9-1-1140x257.png 1140w\" sizes=\"(max-width: 1191px) 100vw, 1191px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">In practice, ChatGPT does still get preachy sometimes, as we found in this example:<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"846\" height=\"317\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/10-1.png\" alt=\"\" class=\"wp-image-5578\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/10-1.png 846w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/10-1-768x288.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/10-1-750x281.png 750w\" sizes=\"(max-width: 846px) 100vw, 846px\" \/><\/figure>\n<\/div>\n\n\n<p class=\"wp-block-paragraph\">Nonetheless, the default exists.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">OpenAI assistants are conversational models, and they should ask clarifying questions when the user&#8217;s request needs them. That way, they can supply the user with the best possible solution considering all the context. 
However, if a developer sets \u201cinteractive = False\u201d, no follow-up questions should be asked.<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"aligncenter size-full\"><img decoding=\"async\" width=\"1174\" height=\"337\" src=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/11-1.png\" alt=\"\" class=\"wp-image-5579\" srcset=\"https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/11-1.png 1174w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/11-1-768x220.png 768w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/11-1-750x215.png 750w, https:\/\/favtutor.com\/articles\/wp-content\/uploads\/2024\/06\/11-1-1140x327.png 1140w\" sizes=\"(max-width: 1174px) 100vw, 1174px\" \/><\/figure>\n<\/div>\n\n\n<h2 class=\"wp-block-heading\"><strong>Conclusion<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">OpenAI has attempted to codify its behavioural principles for the model in this document. Models like ChatGPT don\u2019t always follow the defaults and sometimes go against them. However, the rules and objectives are meant to apply to all of the company\u2019s models. 
<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This gives very interesting insights into the direction of future developments and the level of control or censorship at OpenAI.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Let&#8217;s discuss OpenAI&#8217;s model specs, which elaborate on how their models are supposed to respond to user queries.<\/p>\n","protected":false},"author":20,"featured_media":5581,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jnews-multi-image_gallery":[],"jnews_single_post":null,"jnews_primary_category":{"id":"","hide":""},"footnotes":""},"categories":[57],"tags":[56,61,60],"class_list":["post-5568","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","tag-ai","tag-chatgpt","tag-openai"],"_links":{"self":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/5568","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/comments?post=5568"}],"version-history":[{"count":2,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/5568\/revisions"}],"predecessor-version":[{"id":5582,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/posts\/5568\/revisions\/5582"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media\/5581"}],"wp:attachment":[{"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/media?parent=5568"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/favtutor.com\/articles\/wp-json\/wp\/v2\/categories?post=5568"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/favtutor.com\/article
s\/wp-json\/wp\/v2\/tags?post=5568"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}