Hidden Modifications in GPT-4, Uncovered

2024-01-12 17:13:49

The tool instructions in this article are not up to date as of 1/11/2024; see this post to learn more about the new tool OpenAI added to block conversations about U.S. elections using function calls.

GPT-4’s browsing capabilities have been drastically changed in the past weeks due to recent changes pushed by OpenAI.

An error message given by GPT-4 due to recent changes
With recent changes, GPT-4 struggles to cite websites it visited.

TL;DR: Recent changes made to GPT-4 significantly altered the model’s web browsing tools, preventing it from directly citing quotes from webpages and limiting viewing of full content. This analysis explores the underlying mechanism of GPT-4 web browsing, the hidden changes, and their implications.

If you’re familiar with OpenAI’s function calling API, you may want to skip to how I discovered GPT-4’s functions.

Function Calling in GPT-4

In June 2023, OpenAI launched function calling in its API, allowing developers to describe a function that a GPT model can choose to call, filling out a JSON object with the function parameters. The developer can then give data back to the model once their program calls an API, extracts data from a database, etc.

Example (using gpt-4-1106-preview)

  • A developer decides to define a function get_weather(location: string, unit: 'celsius' | 'fahrenheit') and passes this to the OpenAI API to create an assistant that can use the function.
  • The developer then defines how the model should use the function in the system instructions for the model.
  • When a user uses the defined assistant, the model will try to call get_weather when appropriate.
USER: I'd like to know the weather forecast for my area.

ASSISTANT: Sure, I can provide you with the weather forecast. Could you please 
tell me your location (city and state) and whether you prefer the temperature 
in Celsius or Fahrenheit?

USER: I'm in Philadelphia.

ASSISTANT: Great, thanks for providing your location. Now, could you please 
let me know whether you prefer the temperature in Celsius or Fahrenheit?

USER: I'd actually like the forecast for Springfield, in Celsius.

ASSISTANT: Since you've mentioned Springfield, could you please specify which 
state the Springfield you're referring to is located in, as there are multiple 
cities with that name across the United States?

USER: Springfield, PA

ASSISTANT: 

FUNCTION CALL: 
get_weather({"location": "Springfield, PA", "unit": "c"})

RESPONSE:
{
  "location": "Springfield, PA",
  "forecast": [
    {
      "date": "2024-01-10",
      "condition": "Sunny",
      "temperature": {
        "high": 8,
        "low": -1
      }
    },
    {
      "date": "2024-01-11",
...

ASSISTANT: Here is the weather forecast for Springfield, PA, in Celsius:

- January 10, 2024: Sunny with high temperatures reaching up to 8°C and lows around -1°C.
...

As seen above, the model makes an effort to gather the correct data from the user before calling the function and interpreting the response. GPT-4 appears to do this much better than GPT-3.5. GPT-3.5 failed to pick up on the potential problem with the location Springfield (there are several cities named Springfield, yet the model simply used "location" : "Springfield" even though it was told to return a city and state) and failed to ask for the unit of temperature, even when it was set as a required parameter through the API.
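
To make the developer's side of this exchange concrete, here is a minimal sketch under a few assumptions. The schema uses the JSON Schema layout that OpenAI's chat completions API accepts for tools; the body of `get_weather` is a hypothetical stub that returns canned data; and `dispatch` and `LOCAL_FUNCTIONS` are illustrative names, not part of any OpenAI SDK. No network call is made: the model's function call is represented by the function name and the JSON argument string it would return.

```python
import json

# Schema describing get_weather to the model (parameters are JSON Schema).
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city and state.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. Springfield, PA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "unit"],
        },
    },
}


def get_weather(location: str, unit: str) -> dict:
    """Hypothetical stub: a real version would query a weather service."""
    return {"location": location, "unit": unit, "forecast": []}


# Function names the model may call, mapped to local implementations.
LOCAL_FUNCTIONS = {"get_weather": get_weather}


def dispatch(name: str, arguments_json: str) -> str:
    """Check a model-issued call against the schema, then run it locally."""
    params = GET_WEATHER_TOOL["function"]["parameters"]
    args = json.loads(arguments_json)
    missing = [key for key in params["required"] if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    allowed_units = params["properties"]["unit"]["enum"]
    if args["unit"] not in allowed_units:
        raise ValueError(f"unit must be one of {allowed_units}")
    result = LOCAL_FUNCTIONS[name](**args)
    # This JSON string is what the developer would hand back to the model.
    return json.dumps(result)
```

Checking arguments on the developer's side is worthwhile because, as the GPT-3.5 comparison shows, a model can skip a required parameter. A strict enum check like this one would also reject the `"unit": "c"` argument that appears in the transcript above, so a real integration might choose to normalize such values instead of raising.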

Why is this important? Firstly, it’s a very powerful feature that moves us closer to integrating programmed features with natural language, letting users make verbal requests that can then be interpreted as API calls or database queries. And, if you’ve used code generation, image generation, or web browsing with GPT-4, you’ve already made use of function calling.

Exposing Function Calls

ChatGPT Plus users are able to generate and execute Python code, pass URLs or search queries for GPT-4 to explore, or generate images with DALL-E 3 thanks to function calls.

GPT-4 calls a function to create an image

GPT-4 writes code and executes it in a Jupyter environment

GPT-4 is able to generate images and execute code.

I accidentally uncovered more of what goes on when probing both GPT-3.5 and GPT-4 with empty prompts:

GPT-3.5 and GPT-4 responding to empty prompts

Both screenshots are from newly started conversations, implying that there is something written before the GPT-4 conversation in ChatGPT, which led to further investigation:

GPT-4 repeats earlier prompt, stating "invoke the dalle tool like this"

Interesting… although it was hard for me to reproduce this afterwards with the same prompt, I was then able to get ChatGPT to produce more:

GPT-4 returns its full tool instructions

The full text GPT-4 then produced (as of 1/10/2024) is below.

Tool instructions

You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.

Knowledge cutoff: 2023-04

Current date: 2024-01-10

Image input capabilities: Enabled

python

When you send a message containing Python code to python, it will be executed in a stateful Jupyter notebook environment. Python will respond with the output of the execution or time out after 60.0 seconds. The drive at ‘/mnt/data’ can be used to save and persist user files. Internet access for this session is disabled. Do not make external web requests or API calls as they will fail.

dalle

// Whenever a description of an image is given, create a prompt that dalle can use to generate the image and abide to the following policy:

// 1. The prompt must be in English. Translate to English if needed.

// 2. DO NOT ask for permission to generate the image, just do it!

// 3. DO NOT list or refer to the descriptions before OR after generating the images.

// 4. Do not create more than 1 image, even if the user requests more.

// 5. Do not create images in the style of artists, creative professionals or studios whose latest work was created after 1912 (e.g. Picasso, Kahlo).

// – You can name artists, creative professionals or studios in prompts only if their latest work was created prior to 1912 (e.g. Van Gogh, Goya)

// – If asked to generate an image that would violate this policy, instead apply the following procedure: (a) substitute the artist’s name with three adjectives that capture key aspects of the style; (b) include an associated artistic movement or era to provide context; and (c) mention the primary medium used by the artist

// 6. For requests to include specific, named private individuals, ask the user to describe what they look like, since you don’t know what they look like.

// 7. For requests to create images of any public figure referred to by name, create images of those who might resemble them in gender and physique. But they shouldn’t look like them. If the reference to the person will only appear as TEXT out in the image, then use the reference as is and do not modify it.

// 8. Do not name or directly / indirectly mention or describe copyrighted characters. Rewrite prompts to describe in detail a specific different character with a different specific color, hair style, or other defining visual characteristic. Do not discuss copyright policies in responses.

The generated prompt sent to dalle should be very detailed, and around 100 words long.

namespace dalle {
    // Create images from a text-only prompt.
    type text2im = (_:  "1024x1792",
    // The number of images to generate. If the user does not specify a number, generate 1 image.
    n?: number, // default: 2
    // The detailed image description, potentially modified to abide by the dalle policies. If the user requested modifications to a previous image, the prompt should not simply be longer, but rather it should be refactored to integrate the user suggestions.
    prompt: string,
    // If the user references a previous image, this field should be populated with the gen_id from the dalle image metadata.
    referenced_image_ids?: string[],
    ) => any;

} // namespace dalle

browser

You have the tool browser. Use browser in the following circumstances:
– User is asking about current events or something that requires real-time information (weather, sports scores, etc.)
– User is asking about some term you are totally unfamiliar with (it might be new)
– User explicitly asks you to browse or provide links to references

Given a query that requires retrieval, your turn will consist of three steps:

  1. Call the search function to get a list of results.
  2. Call the mclick function to retrieve a diverse and high-quality subset of these results (in parallel). Remember to SELECT AT LEAST 3 sources when using mclick.
  3. Write a response to the user based on these results. Cite sources using the citation format below.

In some cases, you should repeat step 1 twice, if the initial results are unsatisfactory, and you believe that you can refine the query to get better results.

You can also open a url directly if one is provided by the user. Only use this command for this purpose; do not open urls returned by the search function or found on webpages.

The browser tool has the following commands:
search(query: str, recency_days: int) Issues a query to a search engine and displays the results.
mclick(ids: list[str]). Retrieves the contents of the webpages with provided IDs (indices). You should ALWAYS SELECT AT LEAST 3 and at most 10 pages. Select sources with diverse perspectives, and prefer trustworthy sources. Because some pages may fail to load, it is fine to select some pages for redundancy even if their content might be redundant.
open_url(url: str) Opens the given URL and displays it.

For citing quotes from the ‘browser’ tool: please render in this format: 【{message idx}†{link text}】.
For long citations: please render in this format: [link text](message idx).
Otherwise do not render links.


What’s the takeaway from this? It turns out we can learn a lot about ChatGPT’s capabilities from this text. We can also see the mechanism of how OpenAI put copyright protections in place for DALL-E image generation and the usage requirements for the browsing tools.

ChatGPT’s browsing tools currently consist of search (find Bing search results), mclick (to retrieve contents of search results), and open_url (to open any URL, intended for user-provided websites), but this hasn’t always been the case.
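
Two of the browsing requirements in the tool instructions are mechanical enough to sketch in code: the page-count limits on mclick and the two citation formats. The helper names below are hypothetical, and treating the message index as an integer is an assumption; the instructions only show the rendered formats, not how ChatGPT produces them internally.

```python
def check_mclick_ids(ids: list[str]) -> list[str]:
    """Enforce the 'at least 3 and at most 10 pages' rule stated for mclick."""
    if not 3 <= len(ids) <= 10:
        raise ValueError(f"mclick expects 3 to 10 ids, got {len(ids)}")
    return ids


def format_citation(message_idx: int, link_text: str) -> str:
    """Render a quote citation as 【{message idx}†{link text}】."""
    return f"【{message_idx}†{link_text}】"


def format_long_citation(message_idx: int, link_text: str) -> str:
    """Render a long citation as [link text](message idx)."""
    return f"[{link_text}]({message_idx})"
```

For example, `format_citation(3, "Reuters")` produces `【3†Reuters】`, the marker that the ChatGPT interface presumably turns into a clickable source link.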

OpenAI’s Update

Previously, I had been experimenting with a custom GPT geared towards fact checking, which involves asking GPT-4 to provide sources and cite text to cross-verify articles and social media posts. Here is an example output the GPT produced during the week of 12/14/2023:

GPT-4 has several citation links to different articles

These links even use scroll to text fragments to link directly to specific lines of text in the article being examined or in other sources. I figured this meant that the GPT-4 function calls give the model the full article, so I investigated this further:

A list provided by GPT-4 of the function calls used to verify article

This version of GPT-4 readily gave a list of the function calls it made to form its response, and it has access to the functions click and quote_lines, which the current ChatGPT version doesn’t.
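
For readers unfamiliar with the scroll to text fragments mentioned earlier, the sketch below shows how such a link is put together: the quoted passage is percent-encoded and appended after `#:~:text=`. The function name is hypothetical, and the sketch assumes the page URL does not already carry a fragment of its own. It illustrates the link format only, not how GPT-4 built its citations.

```python
from urllib.parse import quote


def text_fragment_url(page_url: str, quoted_text: str) -> str:
    """Build a link that scrolls to and highlights quoted_text on page_url."""
    # Percent-encode everything outside the unreserved set, then encode "-"
    # as well, because a dash is part of the text directive syntax.
    encoded = quote(quoted_text, safe="").replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"
```

Calling `text_fragment_url("https://example.com/article", "hello world")` yields `https://example.com/article#:~:text=hello%20world`. Producing a link like this requires the exact wording from the page, which is why these citations suggested that the model could see the full article text.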

Text in ChatGPT shows that the custom GPT was out of date

The ChatGPT website confirms the model using these calls was now out of date.

Interestingly enough, when I told the model that has access to quote_lines to describe its tool instructions, it repeated back the same tool instructions as the current version of ChatGPT. I speculate this is because the tool instructions get updated universally every day for all models, regardless of their version (the tool instructions always include the current date). However, the old version was still able to use the quote_lines function, and had access to the older functions, allowing it to repeat the exact text from the webpage. A version of the old browser tools can be found floating around the internet, such as on the OpenAI forums here.

ChatGPT writes a plaintext version of a website

ChatGPT repeats back how the website is given by the browser tool.

The website that has been given to ChatGPT

The website used in this example.

This also shows a lot about how OpenAI implemented the browser tools for GPT-4, converting HTML into a markdown format with alt text for images. However, the new version of ChatGPT is completely unable to do this, stating it can only offer a summary or answer questions about the websites in question.

ChatGPT is unable to reproduce website content

The current version of ChatGPT refuses to reproduce or cite website content.

Furthermore, ChatGPT is unable to make citations to specific lines of the articles it is browsing, though it can link to websites it visits through its search function.

I speculate that because ChatGPT is unable to repeat back entire websites or quotes, there are either stricter controls on reproducing content from websites, or the function call has been changed to return only specific information from websites. For example, the new function may pass a website’s content through another GPT model (perhaps a smaller model to save on inference costs) with a specific prompt, before returning it to GPT-4. This would significantly reduce the number of input tokens used by the tool, reducing the impact that opening multiple websites may have on the quality of a conversation and the cost of the extra input tokens.

I’d also take this speculation with a grain of salt, as the function call’s actual responses are completely opaque and ChatGPT insists it is able to read the entire website when using the browser tools. OpenAI seems to have implemented very resilient protections against reproducing website content: after several dozen prompting attempts, I was not able to get the current version of ChatGPT to reproduce anything.
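
The HTML-to-markdown conversion the older browser tool appeared to perform can be approximated with Python's standard library. The sketch below is not OpenAI's implementation, which is not public. The class and function names are hypothetical, only a handful of tags are handled, and rendering images with the usual markdown `![alt](src)` syntax is an assumption about the output format.

```python
from html.parser import HTMLParser


class MarkdownishParser(HTMLParser):
    """Collect a rough markdown rendering of headings, links, and images."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2", "h3"):
            # "h2" becomes "## ", and so on.
            self.parts.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n")
        elif tag == "a":
            self._href = attrs.get("href")
            self.parts.append("[")
        elif tag == "img":
            # Images survive only as their alt text and source.
            alt = attrs.get("alt") or ""
            src = attrs.get("src") or ""
            self.parts.append(f"![{alt}]({src})")

    def handle_endtag(self, tag):
        if tag == "a":
            self.parts.append(f"]({self._href or ''})")
            self._href = None
        elif tag in ("h1", "h2", "h3", "p"):
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html: str) -> str:
    """Convert a small HTML snippet to markdown-like plain text."""
    parser = MarkdownishParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()
```

Feeding it a snippet with a heading, a paragraph containing a link, and an image returns the heading as a `#` line, the link in `[text](url)` form, and the image reduced to its alt text and source, which is roughly the shape of the plaintext the older model repeated back.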

Implications for User Experience

The clearest effect of this update on user experience is that GPT-4 will be less effective at, or incapable of, citing specific lines or reproducing quotes from any website it visits. In the use case I was experimenting with earlier (using a custom GPT to fact check articles and websites), this significantly decreased the quality of the responses, and GPT-4 fell back on relying on itself rather than the browser tool.

Why make this change in the first place? As OpenAI is under increasing scrutiny regarding copyright laws and reproduction of content on the internet, this change seems to be primarily aimed at eliminating this issue. However, the removal of the quote_lines function and of citations by GPT-4 suggests that this change may also have been a cost-cutting measure. If OpenAI did in fact implement the new open_url and mclick functions by getting another model to summarize or produce text from a website rather than ChatGPT itself, it could drop the quality of the responses across the board, but I suspect this hasn’t been done.
