Code Collector Technical Tutorials 2024-03-14

A Detailed Guide to Calling the GPT-4 API from Python

Import the `os` and `openai` Python packages.

If you are using a Jupyter Notebook (for example, DataCamp Workspace), it is also helpful to import some functions from `IPython.display`.

# Import the os package
import os

# Import the openai package
import openai

# From the IPython.display package, import display and Markdown
from IPython.display import display, Markdown

# Import yfinance as yf
import yfinance as yf

Another setup task is to put the environment variable you just created where the openai package can see it.

# Set openai.api_key to the OPENAI environment variable
openai.api_key = os.environ["OPENAI"]
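Indexing `os.environ` directly raises a bare `KeyError` when the variable is missing. A minimal sketch of a more explicit check is shown here; the helper name `get_api_key` is hypothetical and not part of the `openai` package, and the variable name `OPENAI` is taken from the line above.

```python
import os


def get_api_key(var_name="OPENAI", env=None):
    """Return the API key stored in an environment variable.

    `get_api_key` is a hypothetical helper used for illustration only.
    """
    # Accept an optional mapping so the function can be tried without
    # touching the real environment.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {var_name!r} is not set or is empty."
        )
    return key
```

The result could then be assigned with `openai.api_key = get_api_key()`.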

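The Code Pattern for Calling GPT via the API

The code pattern for calling the OpenAI API and getting a chat response is shown below. It uses the `openai.ChatCompletion` interface from versions of the `openai` package earlier than 1.0.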

response = openai.ChatCompletion.create(
              model="MODEL_NAME",
              messages=[{"role": "system", "content": 'SPECIFY HOW THE AI ASSISTANT SHOULD BEHAVE'},
                        {"role": "user", "content": 'SPECIFY WHAT YOU WANT THE AI ASSISTANT TO SAY'}
              ])

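First Conversation: Generating a Dataset

Generating sample datasets is useful for testing code against different data scenarios or for demonstrating code to other people. To get a useful response from GPT, you need to be precise and specify the details of the dataset, including:

the number of rows and columns

the names of the columns

a description of what each column contains

the output format of the dataset

For example: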

Create a small dataset about total sales over the last year.
The format of the dataset should be a data frame with 12 rows and 2 columns.
The columns should be called "month" and "total_sales_usd".
The "month" column should contain the shortened forms of month names
from "Jan" to "Dec". The "total_sales_usd" column should
contain random numeric values taken from a normal distribution
with mean 100000 and standard deviation 5000. Provide Python code to
generate the dataset, then provide the output in the format of a markdown table.

Let's include this message in the code pattern from earlier.

# Define the system message
system_msg = 'You are a helpful assistant who understands data science.'

# Define the user message
user_msg = 'Create a small dataset about total sales over the last year. The format of the dataset should be a data frame with 12 rows and 2 columns. The columns should be called "month" and "total_sales_usd". The "month" column should contain the shortened forms of month names from "Jan" to "Dec". The "total_sales_usd" column should contain random numeric values taken from a normal distribution with mean 100000 and standard deviation 5000. Provide Python code to generate the dataset, then provide the output in the format of a markdown table.'

# Create a dataset using GPT
response = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                        messages=[{"role": "system", "content": system_msg},
                                         {"role": "user", "content": user_msg}])
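The user message above follows the checklist of rows, columns, column descriptions, and output format. As a rough sketch, the same kind of message could be assembled from those details programmatically; the function `build_dataset_prompt` and its parameters are hypothetical names chosen for illustration, and the wording it produces is only an approximation of the hand-written prompt.

```python
def build_dataset_prompt(topic, n_rows, columns, output_format):
    """Assemble a dataset-generation prompt (hypothetical helper).

    `columns` maps each column name to a description of its contents.
    """
    names = ", ".join(f'"{name}"' for name in columns)
    parts = [
        f"Create a small dataset about {topic}.",
        "The format of the dataset should be a data frame with "
        f"{n_rows} rows and {len(columns)} columns.",
        f"The columns should be called {names}.",
    ]
    # One sentence per column describing what it should contain.
    for name, description in columns.items():
        parts.append(f'The "{name}" column should contain {description}.')
    parts.append(f"Provide the output in the format of {output_format}.")
    return " ".join(parts)


prompt = build_dataset_prompt(
    topic="total sales over the last year",
    n_rows=12,
    columns={
        "month": 'the shortened forms of month names from "Jan" to "Dec"',
        "total_sales_usd": (
            "random numeric values taken from a normal distribution "
            "with mean 100000 and standard deviation 5000"
        ),
    },
    output_format="a markdown table",
)
print(prompt)
```

Keeping the details in arguments makes it easier to vary one of them, such as the number of rows, without retyping the whole message.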

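Checking That GPT's Response Is OK

The GPT model returns a status code (the `finish_reason` field) with one of four values, which are documented in the response format section of the chat documentation.

stop: the API returned complete model output

length: the model output is incomplete because of the max_tokens parameter or the token limit

content_filter: content was omitted because of a flag from the content filters

null: the API response is still in progress or incomplete

The GPT API sends data to Python in JSON format, so the response variable contains deeply nested lists and dictionaries. It is a bit of a pain to work with!

For a response variable named `response`, the status code is stored in the following location.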

response["choices"][0]["finish_reason"]
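To illustrate how the four values might be handled, here is a small sketch that looks up a description for the status code. It runs against a hand-written dictionary that mimics the nested shape described above rather than a real API response; `FINISH_REASONS`, `describe_finish_reason`, and `mock_response` are hypothetical names. A JSON `null` arrives in Python as `None`.

```python
# Descriptions of the four documented values (JSON null becomes None).
FINISH_REASONS = {
    "stop": "the API returned complete model output",
    "length": "output was cut off by max_tokens or the token limit",
    "content_filter": "content was omitted by the content filter",
    None: "the response is still in progress or incomplete",
}


def describe_finish_reason(response):
    """Return (finish_reason, description) for the first choice."""
    reason = response["choices"][0]["finish_reason"]
    return reason, FINISH_REASONS.get(reason, "unrecognized finish reason")


# A mock response for illustration only, not real API output.
mock_response = {
    "choices": [
        {
            "finish_reason": "length",
            "message": {"role": "assistant", "content": "| month | total_"},
        }
    ]
}

reason, description = describe_finish_reason(mock_response)
print(f"{reason}: {description}")
```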

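The response content can be printed as usual with `print(content)`, but it is Markdown content, which Jupyter notebooks can render with `display(Markdown(content))`. Here, `content` is the string stored at `response["choices"][0]["message"]["content"]`.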

Here's the Python code to generate the dataset:

import numpy as np
import pandas as pd
# Set random seed for reproducibility
np.random.seed(42)
# Generate random sales data
sales_data = np.random.normal(loc=100000, scale=5000, size=12)
# Create month abbreviation list
month_abbr = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
# Create dataframe
sales_df = pd.DataFrame({'month': month_abbr, 'total_sales_usd': sales_data})
# Print dataframe
print(sales_df)

And here's the output in markdown format:

| month | total_sales_usd |
|-------|----------------|
| Jan | 98728.961189 |
| Feb | 106931.030292 |
| Mar | 101599.514152 |
| Apr | 97644.534384 |
| May | 103013.191014 |
| Jun | 102781.514665 |
| Jul | 100157.741173 |
| Aug | 104849.281004 |
| Sep | 100007.081991 |
| Oct | 94080.272682 |
| Nov | 96240.993328 |
| Dec | 104719.371461 |
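The chat API returns generated text and does not execute the code it writes, so a table in a reply is best treated as illustrative. To get a table that really comes from running code, the dataset can be generated and formatted locally. The sketch below is one way to do that using only the standard library; it swaps NumPy and pandas for the `random` module, so its numbers will differ from anything NumPy produces, and `make_sales_rows` and `to_markdown_table` are hypothetical helper names.

```python
import random

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def make_sales_rows(seed=42):
    """Return 12 (month, sales) pairs drawn from a normal distribution."""
    rng = random.Random(seed)  # Seeded generator for reproducibility.
    return [(month, rng.gauss(100000, 5000)) for month in MONTHS]


def to_markdown_table(rows):
    """Format (month, value) pairs as a Markdown table string."""
    lines = ["| month | total_sales_usd |", "|-------|-----------------|"]
    lines += [f"| {month} | {value:.2f} |" for month, value in rows]
    return "\n".join(lines)


rows = make_sales_rows()
table = to_markdown_table(rows)
print(table)
```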

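Calling GPT with a Helper Function

Hopefully OpenAI will improve the interface of its Python package so that this kind of functionality is built in. In the meantime, feel free to use the helper function below in your own code.

The function takes two arguments.

system: a string containing the system message.

user_assistant: a list of strings that alternates between user messages and assistant messages, starting with a user message.

The return value is the generated content.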

def chat(system, user_assistant):
  assert isinstance(system, str), "`system` should be a string"
  assert isinstance(user_assistant, list), "`user_assistant` should be a list"
  system_msg = [{"role": "system", "content": system}]
  user_assistant_msgs = [
      {"role": "assistant", "content": user_assistant[i]} if i % 2 else {"role": "user", "content": user_assistant[i]}
      for i in range(len(user_assistant))]

  msgs = system_msg + user_assistant_msgs
  response = openai.ChatCompletion.create(model="gpt-3.5-turbo",
                                          messages=msgs)
  status_code = response["choices"][0]["finish_reason"]
  assert status_code == "stop", f"The status code was {status_code}."
  return response["choices"][0]["message"]["content"]
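The alternation between user and assistant roles is the least obvious part of the helper, so here is a sketch that isolates the message-building step and runs without calling the API. The function name `build_messages` is hypothetical. It raises `TypeError` instead of using `assert`, because assertions are skipped when Python runs with the `-O` flag; that is a design choice of this sketch, not something the helper above does.

```python
def build_messages(system, user_assistant):
    """Return the message list in the shape the chat helper sends."""
    if not isinstance(system, str):
        raise TypeError("`system` should be a string")
    if not isinstance(user_assistant, list):
        raise TypeError("`user_assistant` should be a list")
    messages = [{"role": "system", "content": system}]
    for i, content in enumerate(user_assistant):
        # Even positions are user turns, odd positions are assistant turns.
        role = "assistant" if i % 2 else "user"
        messages.append({"role": role, "content": content})
    return messages


# A three-item list: user question, earlier assistant reply, follow-up.
messages = build_messages(
    "You are a machine learning expert.",
    [
        "Explain what a neural network is.",
        "A neural network is a type of machine learning model.",
        "Give one example application.",
    ],
)
for message in messages:
    print(message["role"], "->", message["content"])
```

Passing a list with an odd number of items, as here, ends the conversation on a user message, which is the turn the model is expected to answer.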

An example of how to use the function:

response_fn_test = chat("You are a machine learning expert.",["Explain what a neural network is."])

display(Markdown(response_fn_test))

A neural network is a type of machine learning model that is inspired by the architecture of the human brain. It consists of layers of interconnected processing units, called neurons, that work together to process and analyze data.

Each neuron receives input from other neurons or from external sources, processes that input using a mathematical function, and then produces an output that is passed on to other neurons in the network.

The structure and behavior of a neural network can be adjusted by changing the weights and biases of the connections between neurons. During the training process, the network learns to recognize patterns and make predictions based on the input it receives.

Neural networks are often used for tasks such as image classification, speech recognition, and natural language processing, and have been shown to be highly effective at solving complex problems that are difficult to solve with traditional rule-based programming methods.