Qwen models on Vertex AI offer fully managed and serverless models as APIs. To use a Qwen model on Vertex AI, send a request directly to the Vertex AI API endpoint. Because Qwen models use a managed API, there's no need to provision or manage infrastructure.
You can stream responses to reduce perceived end-user latency. A streamed response uses server-sent events (SSE) to return output incrementally as it is generated.
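A streaming request can be sketched as follows. This is a minimal sketch, not the definitive API surface: the endpoint path (the OpenAI-compatible chat completions route), the region, the project ID, and the model ID shown are illustrative assumptions; confirm the exact model identifier on its Model Garden model card.

```python
import json

# Build a streaming chat request for a Qwen model on Vertex AI.
# The region, project ID, and model ID below are illustrative; check
# the Model Garden model card for the exact model identifier.
def build_streaming_request(project_id, region, model_id, prompt):
    # Assumption: Qwen models are served behind Vertex AI's
    # OpenAI-compatible chat completions endpoint at this path.
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/"
        f"{project_id}/locations/{region}/endpoints/openapi/chat/completions"
    )
    payload = {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # ask the server to stream the response as SSE chunks
    }
    return url, json.dumps(payload)

url, body = build_streaming_request(
    "my-project", "us-central1", "qwen/qwen3-next-80b", "Hello!"
)
# POST `body` to `url` with an OAuth bearer token (for example via
# requests.post(url, data=body, headers=..., stream=True)) and read
# each `data: {...}` SSE line from the response as it arrives.
```

Because `"stream": True` is set, the server returns incremental SSE events rather than a single JSON body, so the client can render tokens as they arrive.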
Available Qwen models
The following models are available from Qwen to use in Vertex AI. To access a Qwen model, go to its Model Garden model card.
Qwen3-Next-80B Instruct
Qwen3-Next-80B Instruct is a language model from the Qwen3-Next family of models. It is designed for instruction following and for handling very long inputs. It uses a Mixture-of-Experts (MoE) architecture, which activates only a subset of its parameters to process each input, making it faster and more cost-effective to run than other models of its size.
The Instruct version is tuned for reliable, direct answers in chat and agent applications, and its large context window allows it to keep an entire conversation or large document in memory.
Go to the Qwen3-Next-80B Instruct model card
Qwen3-Next-80B Thinking
Qwen3-Next-80B Thinking is a language model from the Qwen3-Next family of models. It is specialized for complex problem-solving and deep reasoning. Its "thinking" mode generates a visible, step-by-step reasoning process alongside the final answer, making it ideal for tasks requiring transparent logic, like mathematical proofs, intricate code debugging, or multi-step agent planning.
Go to the Qwen3-Next-80B Thinking model card
Qwen3 Coder
Qwen3 Coder is a large-scale, open-weight model developed for advanced software development tasks. The model's key feature is its large context window, which allows it to process and understand large codebases comprehensively.
Go to the Qwen3 Coder model card
Qwen3 235B
Qwen3 235B is a large 235B-parameter model. The model is distinguished by its "hybrid thinking" capability, which lets users dynamically switch between a methodical, step-by-step "thinking" mode for complex tasks like mathematical reasoning and coding, and a rapid "non-thinking" mode for general-purpose conversation. Its large context window makes it suitable for use cases requiring deep reasoning and long-form comprehension.
Go to the Qwen3 235B model card
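The mode switch described above can be sketched as follows. Qwen3 models document a soft switch using `/think` and `/no_think` tags in the prompt; whether Vertex AI additionally exposes a dedicated request parameter for this is not confirmed here, so this sketch uses only the prompt-tag mechanism, and the model ID is an illustrative assumption.

```python
# Sketch: toggling Qwen3 235B between "thinking" and "non-thinking" modes
# via the soft-switch tags Qwen3 documents (/think and /no_think).
def build_chat_payload(prompt, thinking):
    tag = "/think" if thinking else "/no_think"
    return {
        "model": "qwen/qwen3-235b",  # illustrative model ID; check the model card
        "messages": [{"role": "user", "content": f"{prompt} {tag}"}],
    }

# Rapid non-thinking mode for a simple conversational query.
fast = build_chat_payload("What's the capital of France?", thinking=False)
# Step-by-step thinking mode for a reasoning-heavy task.
deep = build_chat_payload("Prove that sqrt(2) is irrational.", thinking=True)
```

The payloads differ only in the trailing tag, so an application can choose the mode per request without changing anything else about how it calls the model.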
Before you begin
To use Qwen models with Vertex AI, you must perform the following steps. The Vertex AI API (aiplatform.googleapis.com) must be enabled to use Vertex AI. If you already have an existing project with the Vertex AI API enabled, you can use that project instead of creating a new project.
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
Roles required to select or create a project
- Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
- Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Enable the Vertex AI API.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
- Go to one of the following Model Garden model cards, then click Enable.
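Once the steps above are complete, a first request can be sketched as follows. This is a minimal sketch, assuming the gcloud CLI is installed and authenticated (gcloud auth login); the endpoint path and model ID are illustrative assumptions to confirm on the model card.

```python
import json
import subprocess
import urllib.request

REGION = "us-central1"     # illustrative region
PROJECT = "my-project"     # replace with your project ID
MODEL = "qwen/qwen3-235b"  # illustrative model ID; check the model card

def endpoint_url(project, region):
    # Assumption: the OpenAI-compatible chat completions path on Vertex AI.
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/"
        f"{project}/locations/{region}/endpoints/openapi/chat/completions"
    )

def send_prompt(prompt):
    # Fetch an OAuth access token from the gcloud CLI.
    token = subprocess.check_output(
        ["gcloud", "auth", "print-access-token"], text=True
    ).strip()
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    })
    req = urllib.request.Request(
        endpoint_url(PROJECT, REGION),
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `send_prompt("Hello!")` with a real project ID returns the model's JSON response; because the API is fully managed, no infrastructure needs to be provisioned first.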