This tutorial guides you through creating a Python script that reads Italian phrases from a JSON file, obtains explanations for them using OpenAI's gpt-3.5-turbo model, and saves those explanations to another JSON file. A key aspect of this project is the use of a virtual environment for better dependency management and project isolation.
Prerequisites
Basic understanding of Python.
An OpenAI API key.
Python 3 installed (the built-in venv module is used to create the virtual environment).
A JSON file with Italian phrases.
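The script below expects each entry in the input file to be an object with an id and a phrase field. For example (the phrases here are placeholders):
json
[
    { "id": 1, "phrase": "Buongiorno, come stai?" },
    { "id": 2, "phrase": "Quanto costa questo?" }
]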
Step 1: Setting Up a Virtual Environment
Before starting, it’s crucial to set up a virtual environment. This keeps your project dependencies separate from your global Python installation.
Create a Virtual Environment:
bash
python -m venv openai-env
This command creates a new virtual environment named openai-env.
Activate the Virtual Environment:
On Windows:
bash
openai-env\Scripts\activate
On macOS and Linux:
bash
source openai-env/bin/activate
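Once activated, your prompt should show the (openai-env) prefix. If you want to confirm which interpreter is active, you can print the environment path:
bash
python -c "import sys; print(sys.prefix)"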
Install Required Packages:
With the environment activated, install the openai and tqdm packages. Note that this tutorial uses the pre-1.0 openai SDK interface (openai.ChatCompletion), so pin the version if you are installing a fresh copy today:
bash
pip install "openai<1" tqdm
Step 2: Importing Libraries
In your Python script, import the necessary libraries:
python
import openai
import json
from tqdm import tqdm
import time
Step 3: OpenAI API Key Configuration
Set your OpenAI API key:
python
openai.api_key = "your-api-key"
You can get an API key from OpenAI by signing up for an account and creating a new key. Depending on your account type and usage, this may require entering your credit card information and paying a small fee.
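Hardcoding the key is fine for a quick experiment, but a safer pattern is to read it from an environment variable so it never ends up in your source file or version control. A minimal sketch, assuming the key has been exported as OPENAI_API_KEY:
python
import os
import openai

# Read the key from the environment rather than hardcoding it in the script
openai.api_key = os.environ["OPENAI_API_KEY"]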
Step 4: Reading Input Data
Load your JSON file containing the Italian phrases:
python
withopen("italian-language-phrases.json", "r") as file:
phrases = json.load(file)
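Step 5: Loading Existing Explanations
So the script can resume where it left off if interrupted, load any explanations that were already saved; if the output file does not exist yet, start with an empty list. This mirrors the same check in the complete script at the end of the article:
python
# Check if the explanations file already exists; if not, initialize an empty list
try:
    with open("italian-phrase-explanations.json", "r") as file:
        explanations = json.load(file)
except FileNotFoundError:
    explanations = []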
Step 6: Estimating Time and Tracking Progress
Before the main loop, print a rough time estimate, then wrap the iteration in a tqdm progress bar:
python
total_phrases = len(phrases)
estimated_time = total_phrases * 30  # Assuming roughly 30 seconds per phrase
print(f"Starting to process {total_phrases} phrases. Estimated time: {estimated_time//60} minutes {estimated_time%60} seconds.\n")

for item in tqdm(phrases, desc="Processing phrases", unit="phrase"):
    ...  # processing logic goes here (filled in below)
Step 7: Processing Phrases and Storing Explanations
Inside the loop, use OpenAI to get explanations for each new phrase:
python
for item in tqdm(phrases, desc="Processing phrases", unit="phrase"):
    # Skip already processed phrases
    if any(exp["id"] == item["id"] for exp in explanations):
        continue

    phrase = item["phrase"]
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": f"Explain the Italian phrase: {phrase}"}],
    )
    explanation = {"id": item["id"], "explanation": completion.choices[0].message.content}
    explanations.append(explanation)

    # Write the current explanations to the JSON file immediately
    with open("italian-phrase-explanations.json", "w") as file:
        json.dump(explanations, file, indent=4)
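API calls can occasionally fail or hit rate limits. One way to make the loop more robust is to wrap the request in a small retry helper that uses the time module imported in Step 2. This is a sketch only, and get_explanation is a hypothetical helper, not part of the original script:
python
import time
import openai

def get_explanation(phrase, retries=3, delay=10):
    """Request an explanation, retrying a few times on transient API errors."""
    for attempt in range(retries):
        try:
            completion = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": f"Explain the Italian phrase: {phrase}"}],
            )
            return completion.choices[0].message.content
        except openai.error.OpenAIError:
            # Give up after the final attempt; otherwise wait and try again
            if attempt == retries - 1:
                raise
            time.sleep(delay)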
Step 8: Finalizing the Script
Once all phrases are processed, output a completion message:
python
print("COMPLETE! Explanations written to italian-phrase-explanations.json")
Step 9: Running the Script
Run the script from your terminal:
bash
# On Windows:
python process-phrases.py
# On macOS and Linux:
python3 process-phrases.py
Output:
Check out the phrase explanations here. Note that each object contains markdown that can be rendered as HTML.
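Each entry in italian-phrase-explanations.json pairs a phrase's id with the markdown explanation returned by the model; the values below are made up for illustration:
json
[
    {
        "id": 1,
        "explanation": "**Buongiorno, come stai?** means **good morning, how are you?** ..."
    }
]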
Complete Python Script Code
python
import openai
import json
from tqdm import tqdm
import time
# OpenAI API key
openai.api_key = "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"# Read the input JSON containing Italian phraseswithopen("italian-language-phrases.json", "r") as file:
phrases = json.load(file)
# Check if explanations file already exists, if not, initialize an empty listtry:
withopen("italian-phrase-explanations.json", "r") as file:
explanations = json.load(file)
except FileNotFoundError:
explanations = []
# Estimate Time Remaining
total_phrases = len(phrases)
estimated_time = total_phrases * 30# Assuming 30 seconds per phraseprint(
f"Starting to process {total_phrases} phrases. Estimated time: {estimated_time//60} minutes {estimated_time%60} seconds.\n"
)
# Progress bar using tqdmfor item in tqdm(phrases, desc="Processing phrases", unit="phrase"):
# If the item's explanation already exists, skip itifany(exp["id"] == item["id"] for exp in explanations):
continue# Printing progress updates
i = phrases.index(item) + 1print(f"\nProcessing phrase {i} of {total_phrases}...")
phrase = item["phrase"]
# Query OpenAI for an explanation
completion = openai.ChatCompletion.create(
model="gpt-3.5-turbo",
messages=[
{
"role": "user",
"content": f"Give me a brief linguistic / grammatical breakdown of the following Italian phrase (formatted in markdown): {phrase}. Please do not address the question or questioner in your response. Just deliver the explanation itself, as the entire text response will be used directly in some documentation. please format each explanation so that all Italian words are in **bold** and the corresponding English are *bold** words as well. The explanation should be in markdown format.",
}
],
)
# Append the explanation to the explanations list
explanation = {
"id": item["id"],
"explanation": completion.choices[0].message.content,
}
explanations.append(explanation)
# Write the current explanation to the JSON file immediatelywithopen("italian-phrase-explanations.json", "w") as file:
json.dump(explanations, file, indent=4)
print(f"Processed and saved explanation for phrase {i}.\n")
print(f"COMPLETE! Explanations written to italian-phrase-explanations.json")
Conclusion
Running this script inside a virtual environment gives you a reliable, isolated way to process Italian phrases with OpenAI's API while keeping your development environment clean and conflict-free.
Additional Tips
Deactivate your virtual environment when you're finished by typing deactivate in your terminal.
Consider maintaining a requirements.txt file so the environment can be recreated easily on other machines (see the example after this list).
Regularly update your dependencies to pick up the latest versions and security patches.
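A quick way to capture the environment's exact package versions into requirements.txt, and to recreate the environment elsewhere:
bash
# Record the packages installed in the active environment
pip freeze > requirements.txt

# Recreate the environment on another machine (inside a fresh venv)
pip install -r requirements.txt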
With the virtual environment in place, you have an organized and repeatable setup for this project, and for any future work that integrates tools like OpenAI's API.