
Documentation Index

Fetch the complete documentation index at: https://mintlify.com/portkey-AI/gateway/llms.txt

Use this file to discover all available pages before exploring further.

Overview

Amazon Bedrock provides access to foundation models from leading AI companies, including Anthropic, Meta, Mistral AI, Cohere, and Amazon, through a unified API with AWS security, compliance, and infrastructure. Services: bedrock (control plane, e.g. model management) and bedrock-runtime (inference).

Supported Features

  • ✅ Chat Completions (Converse API)
  • ✅ Streaming
  • ✅ Embeddings
  • ✅ Image Generation (Stable Diffusion, Titan)
  • ✅ Function Calling (via Converse API)
  • ✅ Batch Inference
  • ✅ Model Customization (Fine-tuning)
  • ✅ Guardrails
  • ✅ Multiple Authentication Methods

Quick Start

Basic Configuration

from portkey_ai import Portkey

client = Portkey(
    provider="bedrock",
    aws_access_key_id="AKIA***",
    aws_secret_access_key="***",
    aws_region="us-east-1"
)

response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {"role": "user", "content": "Explain AWS Bedrock in simple terms"}
    ]
)

print(response.choices[0].message.content)

Available Models

Anthropic Claude

| Model ID | Model | Context | Best For |
|---|---|---|---|
| anthropic.claude-3-5-sonnet-20241022-v2:0 | Claude 3.5 Sonnet | 200K | Most capable |
| anthropic.claude-3-5-haiku-20241022-v1:0 | Claude 3.5 Haiku | 200K | Fast, efficient |
| anthropic.claude-3-opus-20240229-v1:0 | Claude 3 Opus | 200K | Complex tasks |
| anthropic.claude-3-sonnet-20240229-v1:0 | Claude 3 Sonnet | 200K | Balanced |
| anthropic.claude-3-haiku-20240307-v1:0 | Claude 3 Haiku | 200K | Speed |

Meta Llama

| Model ID | Context | Description |
|---|---|---|
| meta.llama3-3-70b-instruct-v1:0 | 128K | Latest Llama 3.3 |
| meta.llama3-1-405b-instruct-v1:0 | 128K | Largest Llama 3.1 |
| meta.llama3-1-70b-instruct-v1:0 | 128K | Efficient Llama 3.1 |
| meta.llama3-1-8b-instruct-v1:0 | 128K | Fast, compact |

Mistral AI

| Model ID | Context | Description |
|---|---|---|
| mistral.mistral-large-2407-v1:0 | 128K | Most capable |
| mistral.mistral-large-2402-v1:0 | 32K | Previous generation |
| mistral.mistral-small-2402-v1:0 | 32K | Cost-effective |

Amazon Titan

| Model ID | Type | Description |
|---|---|---|
| amazon.titan-text-premier-v1:0 | Text | Premier text model |
| amazon.titan-text-express-v1 | Text | Fast generation |
| amazon.titan-embed-text-v2:0 | Embeddings | Text embeddings |
| amazon.titan-image-generator-v2:0 | Image | Image generation |

Cohere

| Model ID | Type | Description |
|---|---|---|
| cohere.command-r-plus-v1:0 | Chat | Most capable |
| cohere.command-r-v1:0 | Chat | Balanced |
| cohere.embed-english-v3 | Embeddings | English embeddings |
| cohere.embed-multilingual-v3 | Embeddings | Multilingual |

AI21 Labs

| Model ID | Description |
|---|---|
| ai21.jamba-1-5-large-v1:0 | Latest Jamba |
| ai21.jamba-1-5-mini-v1:0 | Compact Jamba |

Stability AI

| Model ID | Type | Description |
|---|---|---|
| stability.stable-diffusion-xl-v1 | Image | SDXL 1.0 |
| stability.sd3-large-v1:0 | Image | Stable Diffusion 3 |

Authentication Methods

1. Access Keys (Default)

client = Portkey(
    provider="bedrock",
    aws_access_key_id="AKIA***",
    aws_secret_access_key="***",
    aws_session_token="***",  # Optional for temporary credentials
    aws_region="us-east-1"
)

2. Assumed Role

client = Portkey(
    provider="bedrock",
    aws_auth_type="assumedRole",
    aws_role_arn="arn:aws:iam::123456789012:role/BedrockRole",
    aws_external_id="external-id",  # Optional
    aws_region="us-east-1"
)

3. IAM Role (EC2, ECS, Lambda)

# Automatically uses instance/container IAM role
client = Portkey(
    provider="bedrock",
    aws_region="us-east-1"
)

4. Environment Variables

export AWS_ACCESS_KEY_ID="AKIA***"
export AWS_SECRET_ACCESS_KEY="***"
export AWS_REGION="us-east-1"

# Credentials are read from the environment
client = Portkey(provider="bedrock")

Advanced Features

Streaming

stream = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Function Calling (Converse API)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)
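The request above only sends the tool definition; when the model decides to call `get_weather`, the response carries tool calls that your code must execute and return. A minimal dispatch sketch, assuming the OpenAI-compatible tool-call shape (`id`, `function.name`, `function.arguments` as a JSON string) that the gateway exposes; `get_weather` and `TOOLS` are hypothetical stand-ins for your own tool implementations:

```python
import json

def get_weather(location: str) -> dict:
    # Hypothetical local implementation of the tool.
    return {"location": location, "forecast": "sunny", "temp_c": 22}

TOOLS = {"get_weather": get_weather}

def run_tool_calls(tool_calls):
    """Execute each tool call and build `tool`-role messages to send back."""
    messages = []
    for call in tool_calls:
        fn = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        result = fn(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages
```

Append the assistant message and these `tool` messages to the conversation, then call `chat.completions.create` again so the model can produce its final answer.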

Embeddings

response = client.embeddings.create(
    model="amazon.titan-embed-text-v2:0",
    input="AWS Bedrock provides access to foundation models"
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
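A common next step is comparing embeddings for semantic similarity. A minimal cosine-similarity sketch over two embedding vectors (plain Python, no external dependencies):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

Embed two texts with `amazon.titan-embed-text-v2:0` as above and pass the two `.embedding` lists to `cosine_similarity`; values near 1.0 indicate similar meaning.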

Image Generation

response = client.images.generate(
    model="stability.sd3-large-v1:0",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024"
)

image_url = response.data[0].url

Batch Inference

Create batch jobs for cost-effective inference:

# Create batch job
response = client.batches.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    input_file_id="s3://my-bucket/input.jsonl",
    output_data_config={
        "s3OutputDataConfig": {
            "s3Uri": "s3://my-bucket/output/"
        }
    }
)

batch_id = response.id

# Check status
batch = client.batches.retrieve(batch_id)
print(f"Status: {batch.status}")
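Batch jobs are asynchronous, so most workflows poll until a terminal status. A small polling helper, sketched under the assumption that `retrieve` behaves like `client.batches.retrieve` and that the terminal status names match the OpenAI-compatible set; the callable is injected so the helper is testable without a live connection:

```python
import time

def wait_for_batch(retrieve, batch_id, poll_seconds=30,
                   timeout_seconds=3600, sleep=time.sleep):
    """Poll a batch job until it reaches a terminal status or times out."""
    terminal = {"completed", "failed", "expired", "cancelled"}
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        batch = retrieve(batch_id)
        if batch.status in terminal:
            return batch
        sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_id} still running after {timeout_seconds}s")
```

Usage: `batch = wait_for_batch(client.batches.retrieve, batch_id)`.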

Cross-Region Inference

Use inference profiles for cross-region routing:

response = client.chat.completions.create(
    model="us.anthropic.claude-3-5-sonnet-20241022-v2:0",  # Inference profile
    messages=[{"role": "user", "content": "Hello"}]
)

Multi-Region Configuration

Load balance across AWS regions:

config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-east-1",
            "weight": 0.5
        },
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-west-2",
            "weight": 0.5
        }
    ]
}

client = Portkey().with_options(config=config)

Fallback Configuration

Fallback from Bedrock Claude to Anthropic:

config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-east-1",
            "override_params": {"model": "anthropic.claude-3-5-sonnet-20241022-v2:0"}
        },
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"}
        }
    ]
}

client = Portkey().with_options(config=config)

Error Handling

from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="anthropic.claude-3-5-sonnet-20241022-v2:0",
        messages=[{"role": "user", "content": "Hello"}]
    )
except AuthenticationError as e:
    print(f"AWS credentials error: {e}")
except RateLimitError as e:
    print(f"Rate limit or quota exceeded: {e}")
except APIError as e:
    print(f"Bedrock API error: {e}")

Best Practices

  1. Use IAM roles - More secure than access keys
  2. Enable VPC endpoints - Private connectivity
  3. Request model access - Models require explicit access approval
  4. Use inference profiles - Better availability and routing
  5. Monitor with CloudWatch - Track usage and costs
  6. Set up guardrails - Content filtering and safety
  7. Use batch inference - Cost-effective for large workloads
  8. Implement retry logic - Handle throttling gracefully
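The retry advice in point 8 can be sketched as exponential backoff with jitter. A minimal, generic helper; in practice you would narrow `retryable` to throttling errors such as `RateLimitError` rather than the placeholder `Exception` used here:

```python
import random
import time

def with_retries(fn, max_attempts=5, base_delay=1.0,
                 retryable=(Exception,), sleep=time.sleep):
    """Call `fn`, retrying retryable errors with exponential backoff + jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Double the delay each attempt, randomized to avoid thundering herds.
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            sleep(delay)
```

Usage: `response = with_retries(lambda: client.chat.completions.create(...))`.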

Model Access

Before using models, request access in the AWS Console:
  1. Go to AWS Bedrock Console
  2. Navigate to Model access
  3. Click Manage model access
  4. Select models and request access
  5. Wait for approval (usually instant)
Models are region-specific. Request access in each region you plan to use.
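To see which models exist in a given region, you can query the Bedrock control plane directly. A sketch assuming a boto3-style `bedrock` client (note that a model appearing in this list does not guarantee your account has been granted access to it):

```python
def accessible_model_ids(bedrock_client):
    """Return the foundation-model IDs visible in the client's region."""
    resp = bedrock_client.list_foundation_models()
    return [m["modelId"] for m in resp["modelSummaries"]]

# Usage (requires boto3 and AWS credentials):
# import boto3
# ids = accessible_model_ids(boto3.client("bedrock", region_name="us-east-1"))
```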

Regional Availability

Bedrock is available in multiple AWS regions:
  • US: us-east-1, us-west-2
  • Europe: eu-central-1, eu-west-1, eu-west-3
  • Asia Pacific: ap-southeast-1, ap-northeast-1, ap-south-1
Model availability varies by region. Check the AWS Bedrock documentation for details.

Pricing

Bedrock pricing includes:
  • On-demand: Pay per request/token
  • Provisioned throughput: Reserved capacity
  • Model customization: Additional costs for fine-tuning

AWS Bedrock Pricing

View detailed Bedrock pricing

Anthropic

Direct Anthropic integration

Load Balancing

Multi-region load balancing

Guardrails

Content filtering

Batch Processing

Batch inference guide