Overview
OpenAI is one of the leading AI providers, offering powerful language models (GPT-4, GPT-3.5, o1), image generation (DALL-E), speech (Whisper, TTS), and more. Portkey provides full support for all OpenAI capabilities.
Base URL: https://api.openai.com/v1
Supported Features
✅ Chat Completions (including streaming)
✅ Completions (legacy)
✅ Embeddings
✅ Image Generation (DALL-E)
✅ Image Editing
✅ Text-to-Speech (TTS)
✅ Speech-to-Text (Whisper transcription)
✅ Audio Translation
✅ Realtime API (WebSocket)
✅ Function Calling & Tools
✅ Vision (GPT-4 Vision)
✅ Batch API
✅ Fine-tuning
✅ File Operations
Quick Start
Chat Completions
```python
from portkey_ai import Portkey

client = Portkey(
    provider="openai",
    Authorization="sk-***"  # Your OpenAI API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
```
Streaming Responses
```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Count from 1 to 5"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Popular Models
| Model | Context Window | Description | Best For |
|---|---|---|---|
| gpt-4o | 128K tokens | Latest GPT-4 Omni model | General purpose, multimodal |
| gpt-4o-mini | 128K tokens | Faster, cost-effective GPT-4 | High-volume tasks |
| gpt-4-turbo | 128K tokens | Enhanced GPT-4 | Complex reasoning |
| gpt-3.5-turbo | 16K tokens | Fast and efficient | Simple tasks, high throughput |
| o1-preview | 128K tokens | Advanced reasoning | Math, science, coding |
| o3-mini | 128K tokens | Efficient reasoning | Balanced performance |
| text-embedding-3-large | 8K tokens | Latest embeddings | Semantic search, RAG |
| dall-e-3 | N/A | Image generation | High-quality images |
| whisper-1 | N/A | Speech-to-text | Transcription |
| tts-1 | N/A | Text-to-speech | Voice generation |
Configuration Options
```python
client = Portkey(
    provider="openai",
    Authorization="sk-***",
    openai_organization="org-***",  # Optional: Organization ID
    openai_project="proj_***",      # Optional: Project ID
    openai_beta="assistants=v2"     # Optional: Beta features
)
```
| Header | Description | Required |
|---|---|---|
| Authorization | OpenAI API key (Bearer token) | Yes |
| openai_organization | Organization ID | No |
| openai_project | Project ID | No |
| openai_beta | Beta feature flags | No |
Advanced Features
Function Calling
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)
```
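When the model decides to call a tool, the response carries `tool_calls` whose arguments arrive as a JSON string; your code runs the function locally and returns the result as a `tool` message. A minimal dispatch sketch — the local `get_weather` implementation here is a hypothetical stand-in for your own logic:

```python
import json

# Hypothetical local implementation of the tool declared above.
def get_weather(location, unit="celsius"):
    return {"location": location, "temperature": 18, "unit": unit}

AVAILABLE_TOOLS = {"get_weather": get_weather}

def dispatch_tool_calls(tool_calls):
    """Execute each requested tool and build the follow-up 'tool' messages."""
    messages = []
    for call in tool_calls:
        fn = AVAILABLE_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments are a JSON string
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return messages
```

The returned messages are appended to the conversation and sent back in a second `chat.completions.create` call so the model can produce its final answer.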
Vision (GPT-4 Vision)
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.jpg"}
            }
        ]
    }]
)
```
Embeddings
```python
response = client.embeddings.create(
    model="text-embedding-3-large",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
```
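Embeddings are typically compared with cosine similarity, e.g. for semantic search or RAG over vectors returned above. A self-contained sketch using only the standard library:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Scores close to 1.0 indicate semantically similar texts; in production you would usually delegate this to a vector database rather than compute it in a loop.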
Image Generation (DALL-E)
```python
response = client.images.generate(
    model="dall-e-3",
    prompt="A futuristic city with flying cars at sunset",
    size="1024x1024",
    quality="hd",
    n=1
)

image_url = response.data[0].url
print(f"Generated image: {image_url}")
```
Text-to-Speech
```python
response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Hello! This is a text-to-speech example."
)

# Save the audio file
with open("output.mp3", "wb") as f:
    f.write(response.content)
```
Speech-to-Text (Whisper)
```python
with open("audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="en"
    )

print(response.text)
```
Fallback Configuration
Use Anthropic as a fallback for OpenAI:
```python
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4o"}
        },
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"}
        }
    ]
}

client = Portkey().with_options(config=config)

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}]
)
```
Load Balancing
Distribute requests between OpenAI and Azure OpenAI:
```python
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "openai",
            "api_key": "sk-***",
            "weight": 0.5
        },
        {
            "provider": "azure-openai",
            "api_key": "***",
            "resource_name": "my-resource",
            "deployment_id": "gpt-4",
            "api_version": "2024-02-15-preview",
            "weight": 0.5
        }
    ]
}

client = Portkey().with_options(config=config)
```
Batch API
```python
# Create a batch job
response = client.batches.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h"
)
batch_id = response.id

# Retrieve batch status
batch = client.batches.retrieve(batch_id)
print(f"Status: {batch.status}")
```
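Batch jobs complete asynchronously within the completion window, so callers usually poll until the job reaches a terminal status. A minimal polling sketch; the `retrieve` callable is injected so the loop works with any client, and the terminal status names assume the Batch API's `completed` / `failed` / `expired` / `cancelled` values:

```python
import time

# Terminal statuses assumed from the Batch API lifecycle.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(retrieve, batch_id, poll_interval=5.0, timeout=3600.0):
    """Poll retrieve(batch_id) until the batch reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        batch = retrieve(batch_id)
        # Real SDK objects expose .status; plain dicts use ["status"].
        status = batch["status"] if isinstance(batch, dict) else batch.status
        if status in TERMINAL_STATUSES:
            return batch
        time.sleep(poll_interval)
    raise TimeoutError(f"batch {batch_id} still running after {timeout}s")
```

Usage would be `wait_for_batch(client.batches.retrieve, batch_id)`, after which you download the output file referenced by the completed batch.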
Error Handling
```python
from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except APIError as e:
    print(f"API error: {e}")
```
Example Request
```json
{
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,
    "max_tokens": 150,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0
}
```
Example Response
```json
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-4o",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "Hello! How can I assist you today?"
        },
        "finish_reason": "stop"
    }],
    "usage": {
        "prompt_tokens": 20,
        "completion_tokens": 10,
        "total_tokens": 30
    }
}
```
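The `usage` block in the response is what cost tracking hooks into. A small helper that aggregates token usage across multiple responses, using the field names shown above:

```python
USAGE_FIELDS = ("prompt_tokens", "completion_tokens", "total_tokens")

def add_usage(total, usage):
    """Accumulate a usage dict (shaped like the response above) into a running total."""
    for key in USAGE_FIELDS:
        total[key] = total.get(key, 0) + usage.get(key, 0)
    return total
```

Feeding each response's usage into one running dict gives per-session token counts, which can then be multiplied by your model's per-token rates.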
Best Practices
- Use streaming for long responses to improve user experience
- Implement retry logic with exponential backoff for rate limits
- Cache embeddings to reduce costs and latency
- Use `gpt-4o-mini` for high-volume, simpler tasks
- Set `max_tokens` to control costs and response length
- Use system messages to guide model behavior consistently
- Implement fallbacks to other providers for reliability
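The retry advice above can be sketched as a small wrapper with exponential backoff and jitter. This catches a generic `Exception` for illustration; in practice you would catch `RateLimitError` specifically:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying on failure with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:  # in practice, catch RateLimitError specifically
            if attempt == max_retries - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay * 0.1))  # jitter avoids thundering herd
```

Usage would be `with_backoff(lambda: client.chat.completions.create(...))`.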
Pricing
For up-to-date OpenAI pricing, visit the OpenAI Pricing page, which lists detailed pricing for all OpenAI models.

Related Guides
- Azure OpenAI: use OpenAI models through Azure
- Fallback Routing: set up fallbacks from OpenAI
- Caching: cache OpenAI responses
- Function Calling: advanced function calling guide