November 23, 2023
O. Wolfson
This tutorial guides you through creating a Python script that processes Italian phrases from a JSON file, obtains explanations using OpenAI's GPT-3.5 model, and saves those explanations to another JSON file. A key aspect of this project is the use of a virtual environment for better dependency management and project isolation.
You will need Python installed, along with the built-in `venv` module or `virtualenv`.

Before starting, it's crucial to set up a virtual environment. This keeps your project dependencies separate from your global Python installation.
Create a Virtual Environment:
```bash
python -m venv openai-env
```
This command creates a new virtual environment named `openai-env`.
Activate the Virtual Environment:
On Windows:

```bash
openai-env\Scripts\activate
```

On macOS and Linux:

```bash
source openai-env/bin/activate
```
Install Required Packages:
With the environment activated, install the `openai` package, along with `tqdm`, which the script uses for its progress bar:

```bash
pip install openai tqdm
```
In your Python script, import the necessary libraries:
```python
import openai
import json
from tqdm import tqdm
import time
```
Set your OpenAI API key:
```python
openai.api_key = "your-api-key"
```
You can get an API key from OpenAI. Sign up for an account and create a new API key. This may require you to enter your credit card information and pay a small fee, depending on the account type and your usage.
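Hard-coding the key works for a quick local test, but it is safer to keep it out of your source code. A minimal sketch, assuming you have exported the key in an environment variable named `OPENAI_API_KEY` (that variable name is an assumption, not something set up earlier in this tutorial):

```python
import os
import openai

# Read the API key from the environment instead of hard-coding it in the script
openai.api_key = os.environ["OPENAI_API_KEY"]
```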
Load your JSON file containing the Italian phrases:
```python
with open("italian-language-phrases.json", "r") as file:
    phrases = json.load(file)
```
See the JSON data file used in this example.
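The data file itself isn't reproduced here, but the script expects each entry to have at least an `id` and a `phrase` field. A minimal sketch of that assumed structure (the phrases shown are placeholders, not the actual data):

```json
[
  { "id": 1, "phrase": "In bocca al lupo" },
  { "id": 2, "phrase": "Non vedo l'ora" }
]
```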
Check for an existing explanations file. If not found, create an empty list:
```python
try:
    with open("italian-phrase-explanations.json", "r") as file:
        explanations = json.load(file)
except FileNotFoundError:
    explanations = []
```
Utilize the tqdm library for a progress bar:
```python
total_phrases = len(phrases)
estimated_time = total_phrases * 30  # Assuming 30 seconds per phrase
print(f"Starting to process {total_phrases} phrases. Estimated time: {estimated_time//60} minutes {estimated_time%60} seconds.\n")

for item in tqdm(phrases, desc="Processing phrases", unit="phrase"):
    # ...processing logic here...
```
Inside the loop, use OpenAI to get explanations for each new phrase:
```python
for item in tqdm(phrases, desc="Processing phrases", unit="phrase"):
    # Skip already processed phrases
    if any(exp["id"] == item["id"] for exp in explanations):
        continue

    phrase = item["phrase"]
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Explain the Italian phrase: {phrase}"}],
    )
    explanation = {"id": item["id"], "explanation": completion.choices[0].message.content}
    explanations.append(explanation)

    with open("italian-phrase-explanations.json", "w") as file:
        json.dump(explanations, file, indent=4)
```
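Note that `openai.ChatCompletion.create` is the interface from `openai` package versions before 1.0. If `pip install openai` gives you version 1.0 or later, the equivalent call goes through a client object instead; a minimal sketch of that variant, not part of the original script:

```python
from openai import OpenAI

# Client-based interface used by openai >= 1.0
client = OpenAI(api_key="your-api-key")

completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": f"Explain the Italian phrase: {phrase}"}],
)
explanation_text = completion.choices[0].message.content
```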
Once all phrases are processed, output a completion message:
```python
print("COMPLETE! Explanations written to italian-phrase-explanations.json")
```
Run the script from your terminal:
```bash
# On Windows:
python process-phrases.py

# On macOS and Linux:
python3 process-phrases.py
```
Output:

Check out the phrase explanations here. Note that each object contains markdown that can be rendered as HTML.
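If you want to display those explanations on a web page, one option is to convert the markdown to HTML first. A minimal sketch using the third-party `markdown` package (`pip install markdown`), which is an assumption and not part of the original script:

```python
import json

import markdown  # third-party package, assumed here

with open("italian-phrase-explanations.json", "r") as file:
    explanations = json.load(file)

# Convert the first explanation's markdown into an HTML fragment
html = markdown.markdown(explanations[0]["explanation"])
print(html)
```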
Here is the complete script:

```python
import openai
import json
from tqdm import tqdm
import time

# OpenAI API key (replace with your own key)
openai.api_key = "your-api-key"

# Read the input JSON containing Italian phrases
with open("italian-language-phrases.json", "r") as file:
    phrases = json.load(file)

# Check if explanations file already exists, if not, initialize an empty list
try:
    with open("italian-phrase-explanations.json", "r") as file:
        explanations = json.load(file)
except FileNotFoundError:
    explanations = []

# Estimate time remaining
total_phrases = len(phrases)
estimated_time = total_phrases * 30  # Assuming 30 seconds per phrase
print(
    f"Starting to process {total_phrases} phrases. Estimated time: {estimated_time//60} minutes {estimated_time%60} seconds.\n"
)

# Progress bar using tqdm
for item in tqdm(phrases, desc="Processing phrases", unit="phrase"):
    # If the item's explanation already exists, skip it
    if any(exp["id"] == item["id"] for exp in explanations):
        continue

    # Printing progress updates
    i = phrases.index(item) + 1
    print(f"\nProcessing phrase {i} of {total_phrases}...")

    phrase = item["phrase"]
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {
                "role": "user",
                "content": f"Explain the Italian phrase: {phrase}",
            }
        ],
    )
    explanation = {
        "id": item["id"],
        "explanation": completion.choices[0].message.content,
    }
    explanations.append(explanation)

    # Save progress to file after each phrase
    with open("italian-phrase-explanations.json", "w") as file:
        json.dump(explanations, file, indent=4)

    # Brief pause between requests
    time.sleep(1)

print("\nCOMPLETE! Explanations written to italian-phrase-explanations.json")
```
By using a virtual environment, this script runs in a reliable, isolated setup for processing Italian phrases with OpenAI's API. This approach is essential for maintaining a clean and conflict-free development environment.

When you are finished working, deactivate the virtual environment by running `deactivate` in your terminal. Consider also creating a `requirements.txt` file for easy setup of the environment on different machines. Setting up a virtual environment for your Python project in this way ensures a more organized and efficient development process, especially when integrating powerful tools like OpenAI's API.
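A minimal sketch of those housekeeping steps, assuming the environment is still active and using the conventional file name `requirements.txt`:

```bash
# Record the exact package versions used in this project
pip freeze > requirements.txt

# Recreate the environment on another machine
pip install -r requirements.txt

# Leave the virtual environment when you are done
deactivate
```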