Compatibilidade com a OpenAI

Os modelos Gemini são acessíveis através das bibliotecas OpenAI (Python e TypeScript/JavaScript), juntamente com a API REST. Apenas a Google Cloud autorização é suportada através da biblioteca OpenAI no Vertex AI. Se ainda não estiver a usar as bibliotecas da OpenAI, recomendamos que chame a API Gemini diretamente.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = "PROJECT_ID"
location = "us-central1"

# # Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.0-flash-001",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain to me how AI works"}
  ]
)

print(response.choices[0].message)

O que mudou?

  • api_key=credentials.token: para usar a autenticação Google Cloud , obtenha um Google Cloud token de autorização através do código de exemplo.

  • base_url: isto indica à biblioteca OpenAI para enviar pedidos para Google Cloud em vez do URL predefinido.

  • model="google/gemini-2.0-flash-001": escolha um modelo Gemini compatível entre os modelos alojados no Vertex.

A pensar

Os modelos Gemini 2.5 são preparados para ponderar problemas complexos, o que resulta num raciocínio significativamente melhorado. A API Gemini inclui um parâmetro"thinking budget" que oferece um controlo detalhado sobre o quanto o modelo vai pensar.

Ao contrário da API Gemini, a API OpenAI oferece três níveis de controlo de raciocínio: "baixo", "médio" e "alto", que são mapeados nos bastidores para orçamentos de tokens de raciocínio de 1000, 8000 e 24 000.

Para desativar o raciocínio, defina o esforço de raciocínio como None.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = PROJECT_ID
location = "us-central1"

# # Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

response = client.chat.completions.create(
  model="google/gemini-2.5-flash-preview-04-17",
  reasoning_effort="low",
  messages=[
      {"role": "system", "content": "You are a helpful assistant."},
      {
          "role": "user",
          "content": "Explain to me how AI works"
      }
  ]
)
print(response.choices[0].message)

Streaming

A API Gemini suporta respostas de streaming.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = PROJECT_ID
location = "us-central1"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)
response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello!"}
],
stream=True
)

for chunk in response:
  print(chunk.choices[0].delta)

Chamada de funções

A chamada de funções facilita a obtenção de resultados de dados estruturados a partir de modelos generativos e é suportada na API Gemini.

Python

import openai
from google.auth import default
import google.auth.transport.requests

# TODO(developer): Update and un-comment below lines
#project_id = PROJECT_ID
location = "us-central1"

credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token
)

tools = [
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Get the weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state, e.g. Chicago, IL",
        },
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
      },
      "required": ["location"],
    },
  }
}
]

messages = [{"role": "user", "content": "What's the weather like in Chicago today?"}]
response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=messages,
tools=tools,
tool_choice="auto"
)

print(response)

Compreensão de imagens

Os modelos Gemini são nativamente multimodais e oferecem o melhor desempenho da classe em muitas tarefas de visão comuns.

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
with open(image_path, "rb") as image_file:
  return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
#base64_image = encode_image("Path/to/image.jpeg")

response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What is in this image?",
      },
      {
        "type": "image_url",
        "image_url": {
          "url":  f"data:image/jpeg;base64,{base64_image}"
        },
      },
    ],
  }
],
)

print(response.choices[0])

Gerar imagem

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

# Function to encode the image
def encode_image(image_path):
with open(image_path, "rb") as image_file:
  return base64.b64encode(image_file.read()).decode('utf-8')

# Getting the base64 string
#base64_image = encode_image("Path/to/image.jpeg")
base64_image = encode_image("/content/wayfairsofa.jpg")

response = client.chat.completions.create(
model="google/gemini-2.0-flash",
messages=[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What is in this image?",
      },
      {
        "type": "image_url",
        "image_url": {
          "url":  f"data:image/jpeg;base64,{base64_image}"
        },
      },
    ],
  }
],
)

print(response.choices[0])

Compreensão de áudio

Analisar entrada de áudio:

Python

from google.auth import default
import google.auth.transport.requests

import base64
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

with open("/path/to/your/audio/file.wav", "rb") as audio_file:
base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')

response = client.chat.completions.create(
  model="gemini-2.0-flash",
  messages=[
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "Transcribe this audio",
      },
      {
            "type": "input_audio",
            "input_audio": {
              "data": base64_audio,
              "format": "wav"
        }
      }
    ],
  }
],
)

print(response.choices[0].message.content)

Resultados estruturados

Os modelos Gemini podem gerar objetos JSON em qualquer estrutura que definir.

Python

from google.auth import default
import google.auth.transport.requests

from pydantic import BaseModel
from openai import OpenAI

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"
location = "us-central1"

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(google.auth.transport.requests.Request())

# OpenAI Client
client = openai.OpenAI(
  base_url=f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{location}/endpoints/openapi",
  api_key=credentials.token,
)

class CalendarEvent(BaseModel):
  name: str
  date: str
  participants: list[str]

completion = client.beta.chat.completions.parse(
  model="google/gemini-2.0-flash",
  messages=[
      {"role": "system", "content": "Extract the event information."},
      {"role": "user", "content": "John and Susan are going to an AI conference on Friday."},
  ],
  response_format=CalendarEvent,
)

print(completion.choices[0].message.parsed)

Limitações atuais

  • As chaves de acesso têm uma duração de 1 hora por predefinição. Após a expiração, têm de ser atualizados. Consulte este exemplo de código para mais informações.

  • O suporte das bibliotecas da OpenAI ainda está em pré-visualização enquanto expandimos o suporte de funcionalidades. Para obter ajuda com perguntas ou problemas, publique na Google Cloud comunidade.

O que se segue?