How I Save on OpenAI API Token Usage

Every month when I open my OpenAI API bill, I wonder whether an extra zero has been added. This article covers the techniques I use to keep token usage down: avoiding duplicate requests, providing clear and limited answer options, bundling multiple requests into one, asking the model to skip extra explanations, choosing the appropriate model, picking a splitting strategy for long articles, and testing on a small scale first.

Photo by Siyan Ren on StockSnap

Solutions

1. Prevent Sending Duplicate Content to the OpenAI API

When dealing with a large volume of articles, you might not notice that you are sending duplicate article content to the OpenAI API. Please refer to this article on how to find duplicate articles (text-based data) in MySQL (written in Traditional Chinese).

If the results of a task that calls the OpenAI API do not need to change frequently, consider storing them for reuse. For example, store the results of short article translations, and check whether a text has already been translated before calling the API again.
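
As a minimal sketch in Python, you could keep results in a local SQLite table keyed by a SHA-256 hash of the source text. The table name is my own choice, and `call_openai_translate` is a hypothetical stand-in for your actual API call:

```python
import hashlib
import sqlite3

# Local cache: maps a hash of the source text to the stored API result,
# so the same article is never sent to the API twice.
conn = sqlite3.connect("api_cache.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS translations (text_hash TEXT PRIMARY KEY, result TEXT)"
)

def translate_cached(text: str) -> str:
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT result FROM translations WHERE text_hash = ?", (key,)
    ).fetchone()
    if row:  # already translated: reuse the stored result, no API call
        return row[0]
    result = call_openai_translate(text)  # placeholder for your actual API call
    conn.execute(
        "INSERT INTO translations (text_hash, result) VALUES (?, ?)", (key, result)
    )
    conn.commit()
    return result
```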

2. Provide Clear and Limited Answer Options

While it's acceptable to pose open-ended questions to explore the capabilities of ChatGPT, keep in mind that such questions can lead to longer responses that might increase costs. To achieve concise and cost-effective answers, consider refining your question by providing specific and limited options for the AI to select from. For example:

Initial question for exploration:

Please offer five keywords for the following article:

```
Long text
```

Refined question:

Please select one of the following keywords: keyword1, keyword2, keyword3, keyword4, keyword5, for the following article:

```
Long text
```
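
A minimal sketch of sending the refined question with the openai Python package (v1 client); the model choice and the `max_tokens` cap are assumptions to tune for your task:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

KEYWORDS = "keyword1, keyword2, keyword3, keyword4, keyword5"

def pick_keyword(article: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Please select one of the following keywords: "
                       f"{KEYWORDS}, for the following article:\n"
                       f"```\n{article}\n```",
        }],
        max_tokens=10,  # a single keyword needs only a few output tokens
    )
    return response.choices[0].message.content.strip()
```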

3. Bundle Multiple Requests into One

If the article content is short, such as titles or user comments, I combine several pieces into one API request.

Refined prompt:

Each row is an article number and its content. For each article, select from the keywords: keyword1, keyword2, keyword3, keyword4, keyword5. Provide your answer in the CSV format: "article number", "comma_separated_keywords"

```
No1. short text of article No.1 (single line, no line breaks)
No2. short text of article No.2 (single line, no line breaks)
...
No5. short text of article No.5 (single line, no line breaks)
```
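
A sketch of the bundling step, with helper names of my own and assuming the model returns well-formed CSV (worth verifying with the small-scale tests in section 7):

```python
import csv
import io

def build_batch_prompt(articles: list[str]) -> str:
    # Number each short text on its own line, matching the prompt format above.
    rows = "\n".join(f"No{i}. {text}" for i, text in enumerate(articles, start=1))
    return (
        "Each row is an article number and its content. For each article, "
        "select from the keywords: keyword1, keyword2, keyword3, keyword4, "
        "keyword5. Provide your answer in the CSV format: "
        '"article number", "comma_separated_keywords"\n'
        f"```\n{rows}\n```"
    )

def parse_batch_answer(csv_text: str) -> dict[str, list[str]]:
    # Map each article number to its list of keywords.
    answers = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:
            answers[row[0].strip()] = [kw.strip() for kw in row[1].split(",")]
    return answers
```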

4. Ask for No Additional Explanation

GPT-4 often attempts to explain its answers. If you have already finished your exploratory runs and only need the answer itself, frame your questions so the model skips the elaboration. For example:

For the following article, please select from the keywords: keyword1, keyword2, keyword3, keyword4, keyword5. No further explanation is required.

```
Long text
```
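
One way to make this instruction stick across many requests is to put it in the system message. A sketch; the exact wording is an assumption to adjust for your task:

```python
long_text = "..."  # the article to classify

messages = [
    {
        "role": "system",
        # Standing instruction applied to every request in the conversation.
        "content": "Answer with the chosen keyword only. No further explanation.",
    },
    {
        "role": "user",
        "content": "Please select from the keywords: keyword1, keyword2, "
                   f"keyword3, keyword4, keyword5.\n```\n{long_text}\n```",
    },
]
```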

5. Choose the Appropriate Model

For complex tasks, GPT-4 is recommended, while simpler tasks like translation can use GPT-3.5. For more information, please refer to the following article: Models - OpenAI API.
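
A simple routing table can make this choice explicit in code. A sketch; the task names and model assignments are illustrative, and model pricing changes over time:

```python
# Route each task type to the cheapest model that handles it well.
MODEL_FOR_TASK = {
    "translation": "gpt-3.5-turbo",    # simple task: the cheaper model is enough
    "keyword_selection": "gpt-3.5-turbo",
    "complex_analysis": "gpt-4",       # reserve the expensive model for hard tasks
}

def model_for(task: str) -> str:
    return MODEL_FOR_TASK.get(task, "gpt-3.5-turbo")  # default to the cheap model
```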

6. Choose a Splitting Strategy for Long Articles (Chunking)

Different language models have different token limits that determine how much text they can process. The limit covers both input and output: the `max_tokens` API parameter only caps the completion, and prompt tokens count against the same context window. Models like gpt-3.5-turbo and gpt-4 have context limits of 4,097 and 8,192 tokens, respectively. Exceeding these limits requires you to split articles into smaller pieces for processing.

You can also choose not to split articles, but this restricts you to processing only portions of them, such as the beginning and the end, or just the final paragraphs.
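
If you go the portion-only route, you can truncate by token count with tiktoken. A sketch; the 3,000-token budget is an assumption that leaves headroom for the completion:

```python
import tiktoken

def truncate_to_budget(text: str, model: str = "gpt-3.5-turbo",
                       budget: int = 3000) -> str:
    # Keep only the first `budget` tokens so prompt plus completion
    # stays inside the model's context window.
    enc = tiktoken.encoding_for_model(model)
    tokens = enc.encode(text)
    # To keep the beginning and the end instead:
    # enc.decode(tokens[:head] + tokens[-tail:])
    return enc.decode(tokens[:budget])
```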

For article splitting, tools like LangChain's text-split-explorer can help, offering options for delimiters, chunk size, and chunk overlap.
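
A sketch using LangChain's RecursiveCharacterTextSplitter, with the same three knobs text-split-explorer lets you experiment with; the sizes here are assumptions, and depending on your LangChain version the import may live in `langchain_text_splitters` instead:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", " ", ""],  # try paragraphs, then lines, then words
    chunk_size=1000,      # maximum characters per chunk
    chunk_overlap=100,    # overlap preserves context across chunk boundaries
)

long_article = "..."  # the article to split
chunks = splitter.split_text(long_article)
```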

7. Test on a Small Scale

Prepare several sample texts and verify the API results to ensure they are as expected before processing a large number of articles.
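
A sketch of such a smoke test, reusing the hypothetical `pick_keyword` helper from section 2; the sample texts and expected keywords are placeholders:

```python
# A handful of samples with the keyword you expect the model to pick.
samples = [
    ("short text that is clearly about topic A", "keyword1"),
    ("short text that is clearly about topic B", "keyword3"),
]

for text, expected in samples:
    answer = pick_keyword(text)  # the helper sketched in section 2
    status = "OK" if expected in answer else "CHECK"
    print(f"{status}: expected={expected!r} got={answer!r}")
```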


Alternative version of this article

Wiki: How to optimize your OpenAI API token usage

I also published this article on the OpenAI API forum.
