Explore

Fine-tune FLUX fast

Customize FLUX.1 [dev] with the fast FLUX trainer on Replicate

Train the model on a small set of example images so it can recognize and generate new concepts such as specific styles, characters, or objects. Training is fast (under 2 minutes) and cheap (under $2), and it gives you a warm, runnable model plus LoRA weights to download.

Featured models

minimax / hailuo-02

Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real-world physics.

22K runs

minimax / hailuo-02-fast

A low-cost, fast version of Hailuo 02. Generates 6s and 10s videos at 512p.

446 runs

bytedance / omni-human

Turns your audio/video/images into professional-quality animated videos

534 runs

google / veo-3-fast

A faster and cheaper version of Google’s Veo 3 video model, with audio

14.6K runs

google / veo-3

Sound on: Google’s flagship Veo 3 text-to-video model, with audio

129.4K runs

flux-kontext-apps / kontext-emoji-maker

Use kontext to turn any image into an emoji, using a lora by starsfriday

499 runs

wan-video / wan-2.2-t2v-fast

A very fast and cheap PrunaAI-optimized version of Wan 2.2 A14B text-to-video

5.2K runs

black-forest-labs / flux-krea-dev

An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".

13.8K runs

wan-video / wan-2.2-i2v-a14b

Image-to-video at 720p and 480p with Wan 2.2 A14B

2.9K runs

Official models

Official models are always on, maintained, and have predictable pricing.

minimax / hailuo-02

Generate videos and make videos from images

22K runs

bytedance / omni-human

Turns your audio/video/images into professional-quality animated videos

534 runs

openai / clip

Official CLIP models. Generates CLIP (clip-vit-large-patch14) text and image embeddings.

90 runs

ibm-granite / granite-speech-3.3-8b

Granite-speech-3.3-8b is a compact and efficient speech-language model, specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST).

733 runs

black-forest-labs / flux-krea-dev

An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".

13.8K runs

wan-video / wan-2.2-i2v-a14b

Generate videos, make videos from images, and make videos with Wan

2.9K runs

minimax / video-01

Generate videos and make videos from images

554.9K runs

ibm-granite / granite-3.3-8b-instruct

Use LLMs

857K runs

ibm-granite / granite-vision-3.3-2b

Granite-vision-3.3-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

5.3K runs

bytedance / seedance-1-lite

Generate videos and make videos from images

237.6K runs

bytedance / seedance-1-pro

Generate videos and make videos from images

168.2K runs

luma / photon-flash

Generate images

128.3K runs

luma / ray-2-540p

Generate videos and make videos from images

9.8K runs

luma / ray-2-720p

Generate videos and make videos from images

24.2K runs

luma / ray-flash-2-720p

Generate videos and make videos from images

25.5K runs

luma / reframe-image

Change the aspect ratio of any photo using AI (not cropping)

6.1K runs
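
As a rough illustration of what calling one of the official models listed above over HTTP might look like, here is a minimal Python sketch that builds a prediction request without sending it. The endpoint shape (`/v1/models/{owner}/{name}/predictions`), the bearer-token header, and the `prompt` input name are assumptions based on Replicate's public HTTP API and are not stated on this page; `build_prediction_request` is a hypothetical helper. Check each model's own API tab for its actual inputs.

```python
import json
import urllib.request

# Assumed base URL of Replicate's HTTP API (not stated on this page).
API_ROOT = "https://api.replicate.com/v1"


def build_prediction_request(model: str, model_input: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would start a prediction.

    `model` is an "owner/name" string as shown in the listing; spaces
    around the slash (e.g. "luma / photon-flash") are tolerated.
    """
    owner, name = (part.strip() for part in model.split("/", 1))
    url = f"{API_ROOT}/models/{owner}/{name}/predictions"
    body = json.dumps({"input": model_input}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # "prompt" is an assumed input name; real inputs vary per model.
    req = build_prediction_request(
        "black-forest-labs / flux-krea-dev",
        {"prompt": "a lighthouse at dusk"},
        token="YOUR_API_TOKEN",
    )
    print(req.get_method(), req.full_url)
```

To actually run a prediction you would pass the request to `urllib.request.urlopen` with a real API token. The sketch stops short of that so it can be inspected offline.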

I want to…

Generate images

Use AI To Generate Images & Photos with an API

Caption videos

Use AI To Caption Videos with an API

Generate speech

Convert text to speech

Use a face to make images

Make realistic images of people instantly

Generate videos

Use AI To Generate Videos with an API

Upscale images

Upscaling models that create high-quality images from low-quality images

Generate music

Use AI To Generate Music with an API

Edit images

Use AI To Edit Any Image with an API

Transcribe speech

Models that convert speech to text

Extract text from images

Optical character recognition (OCR) and text extraction

Remove backgrounds

Models that remove backgrounds from images and videos

Use the FLUX family of models

The FLUX family of text-to-image models from Black Forest Labs

Restore images

Models that improve or restore images by deblurring, colorization, and removing noise

Enhance videos

Upscaling models that create high-quality video from low-quality videos

Edit videos

Tools for editing videos.

Videos from images

Use AI To Generate Videos from images with an API

Make videos with Wan

Generate videos with Wan, the fastest and highest quality open-source video generation model.

Use Kontext fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate

Caption images

Use AI To Caption Images with an API

Chat with images

Ask language models about images

Use LLMs

Models that can understand and generate text

Make 3D stuff

Models that generate 3D objects, scenes, radiance fields, textures and multi-views.

Use handy tools

Toolbelt-type models for videos and images.

Control image generation

Guide image generation with more than just text. Use edge detection, depth maps, and sketches to get the results you want.

Sing with voices

Voice-to-voice cloning and musical prosody

Get embeddings

Models that generate embeddings from inputs

Try for free

Get started with these models without adding a credit card. Whether you're making videos, generating images, or upscaling photos, these are great starting points.

Use official models

Official models are always on, maintained, and have predictable pricing.

Detect objects

Models that detect or segment objects in images and videos.

Use FLUX fine-tunes

Browse the diverse range of fine-tunes the community has custom-trained on Replicate

Popular models

openai/whisper

Convert speech in audio to text

Updated 8 months, 1 week ago 109.9M runs

prunaai/flux.1-dev

This is the fastest Flux Dev endpoint in the world. Contact us for more at pruna.ai

Updated 1 week, 1 day ago 11.8M runs

turian/insanely-fast-whisper-with-video

whisper-large-v3, incredibly fast, with video transcription

Updated 1 year, 6 months ago 2.5M runs

andreasjansson/clip-features

Return CLIP features for the clip-vit-large-patch14 model

Updated 2 years, 4 months ago 97.3M runs

851-labs/background-remover

Remove backgrounds from images.

Updated 7 months, 2 weeks ago 5.4M runs

salesforce/blip

Generate image captions

Updated 2 years, 10 months ago 167.3M runs

xinntao/gfpgan

Practical face restoration algorithm for *old photos* or *AI-generated faces*

Updated 2 years, 10 months ago 36M runs

bytedance/sdxl-lightning-4step

SDXL-Lightning by ByteDance: a fast text-to-image model that makes high-quality images in 4 steps

Updated 4 months, 2 weeks ago 1B runs

Latest models

minimax/hailuo-02

Hailuo 2 is a text-to-video and image-to-video model that can make 6s or 10s videos at 768p (standard) or 1080p (pro). It excels at real-world physics.

Updated 19 hours ago 22K runs

minimax/hailuo-02-fast

A low-cost, fast version of Hailuo 02. Generates 6s and 10s videos at 512p.

Updated 20 hours ago 446 runs

bytedance/omni-human

Turns your audio/video/images into professional-quality animated videos

Updated 1 day, 11 hours ago 534 runs

google/veo-3-fast

A faster and cheaper version of Google’s Veo 3 video model, with audio

Updated 1 day, 12 hours ago 14.6K runs

google/veo-3

Sound on: Google’s flagship Veo 3 text-to-video model, with audio

Updated 1 day, 13 hours ago 129.4K runs

flux-kontext-apps/kontext-emoji-maker

Use kontext to turn any image into an emoji, using a lora by starsfriday

Updated 1 day, 14 hours ago 499 runs

wan-video/wan-2.2-t2v-fast

A very fast and cheap PrunaAI-optimized version of Wan 2.2 A14B text-to-video

Updated 1 day, 17 hours ago 5.2K runs

wan-video/wan-2.2-i2v-fast

A very fast and cheap PrunaAI-optimized version of Wan 2.2 A14B image-to-video

Updated 1 day, 17 hours ago 9.3K runs

fire/part-crafter

PartCrafter is a structured 3D mesh generation model that creates multiple parts and objects from a single RGB image.

Updated 2 days ago 18 runs

fire/flux

The image generation model tailored for local development and personal use

Updated 2 days, 2 hours ago 57 runs

fire/v-sekai.mediapipe-labeler

Mediapipe Blendshape Labeler - Predicts the blend shapes of an image.

Updated 2 days, 2 hours ago 210 runs

zsxkib/wan-2.2-with-sound

wan-video/wan-2.2 (all variants) + topazlabs/video-upscale + zsxkib/smart-thinksound

Updated 2 days, 6 hours ago 9 runs

lucataco/wan-2.2-i2v-audio

Wan 2.2 A14B image-to-video with MMaudio

Updated 2 days, 8 hours ago 33 runs

ethulia/chengyu-generator

Updated 2 days, 9 hours ago 40 runs

8w9ag/tom-cruise-runs

Updated 2 days, 11 hours ago 6 runs

bytedance/seededit-3.0

Text-guided image editing model that preserves original details while making targeted modifications like lighting changes, object removal, and style conversion

Updated 2 days, 11 hours ago 64.3K runs

openai/clip

Official CLIP models. Generates CLIP (clip-vit-large-patch14) text and image embeddings.

Updated 2 days, 13 hours ago 90 runs

ibm-granite/granite-speech-3.3-8b

Granite-speech-3.3-8b is a compact and efficient speech-language model, specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST).

Updated 2 days, 14 hours ago 733 runs

zsxkib/smart-thinksound

Automatically generates expert ThinkSound prompts by analyzing your video with Claude 4, so you no longer have to struggle with complex audio descriptions

Updated 2 days, 15 hours ago 328 runs

black-forest-labs/flux-krea-dev

An opinionated text-to-image model from Black Forest Labs in collaboration with Krea that excels in photorealism. Creates images that avoid the oversaturated "AI look".

Updated 2 days, 18 hours ago 13.8K runs

notdaniel/voxtral-small-24b-2507

Voxtral Small is an enhancement of Mistral Small 3 that incorporates state-of-the-art audio input capabilities and excels at speech transcription, translation and audio understanding.

Updated 3 days, 4 hours ago 22 runs

fofr/not-real-wan

Make a very realistic-looking real-world AI video via FLUX 1.1 Pro and Wan 2.2 i2v

Updated 3 days, 9 hours ago 45 runs

aihilums/sehatsanjha

Updated 3 days, 11 hours ago 35.7K runs

lucataco/seed-x-ppo

Seed-X-PPO-7B by ByteDance-Seed, a powerful series of open-source multilingual translation language models

Updated 3 days, 13 hours ago 16 runs

mattrothenberg/dithercam

Updated 3 days, 16 hours ago 15 runs

wan-video/wan-2.2-i2v-a14b

Image-to-video at 720p and 480p with Wan 2.2 A14B

Updated 4 days, 8 hours ago 2.9K runs

minimax/video-01

Generate 6s videos with prompts or images. (Also known as Hailuo). Use a subject reference to make a video with a character and the S2V-01 model.

Updated 4 days, 8 hours ago 554.9K runs

ibm-granite/granite-3.3-8b-instruct

Granite-3.3-8B-Instruct is an 8-billion-parameter language model with a 128K context length, fine-tuned for improved reasoning and instruction-following capabilities.

Updated 4 days, 9 hours ago 857K runs

ibm-granite/granite-vision-3.3-2b

Updated 4 days, 9 hours ago 5.3K runs

press1209/musicology-v1-beta

Updated 4 days, 10 hours ago 38 runs

pipeline-examples/upcase

Updated 4 days, 13 hours ago 96 runs

0xdino/cyberrealistic-pony-semireal-v36

Updated 4 days, 20 hours ago 108 runs

0xdino/cyberrealistic-pony-v125

Updated 4 days, 21 hours ago 803 runs

lucataco/higgs-audio-v2

Higgs Audio v2, a powerful text-to-speech audio foundation model that excels in expressive audio generation

Updated 5 days, 9 hours ago 693 runs

fofr/any-comfyui-workflow

Run any ComfyUI workflow. Guide: https://github.com/replicate/cog-comfyui

Updated 5 days, 13 hours ago 6.3M runs

nvidia/canary-qwen-2.5b

🎤The best open-source speech-to-text model as of Jul 2025, transcribing audio with record 5.63% WER and enabling AI tasks like summarization directly from speech✨

Updated 5 days, 16 hours ago 43 runs

flux-kontext-apps/in-scene

InScene is a LoRA by Peter O’Malley (POM) that's designed to generate images that maintain scene consistency with a source image. It is trained on top of Flux.1-Kontext.dev.

Updated 5 days, 21 hours ago 404 runs

bytedance/seedance-1-lite

A video generation model that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 720p resolution

Updated 6 days, 9 hours ago 237.6K runs

bytedance/seedance-1-pro

A pro version of Seedance that offers text-to-video and image-to-video support for 5s or 10s videos, at 480p and 1080p resolution

Updated 6 days, 9 hours ago 168.2K runs

ndreca/hunyuan3d-2-test

Updated 1 week ago 286 runs

ndreca/hunyuan3d-2.1-test

Updated 1 week ago 46 runs

spuuntries/ilearnmate-icts

Updated 1 week ago 13 runs

simbrams/ri

Realistic Inpainting with ControlNET (M-LSD + SEG)

Updated 1 week ago 556K runs

prunaai/hidream-e1.1

Edit an image with a prompt. This is the hidream-e1.1 model accelerated with the pruna optimisation engine.

Updated 1 week ago 23.1K runs

dessix/moss-ttsd

MOSS-TTSD (text to spoken dialogue) is an open-source bilingual spoken dialogue synthesis model that supports both Chinese and English. It can transform dialogue scripts between two speakers into natural, expressive conversational speech.

Updated 1 week, 1 day ago 51 runs

8w9ag/papercopy

Generate an image using the previously generated image as the input with a recursive prompt.

Updated 1 week, 1 day ago 17 runs

flux-kontext-apps/real-earth

Turn satellite imagery into professional-quality aerial shots

Updated 1 week, 1 day ago 101 runs

flux-kontext-apps/zoom-out

"Zoom out" with this FLUX Kontext LoRA

Updated 1 week, 1 day ago 240 runs

flux-kontext-apps/place-it-overlay

Overlay one image over another to merge them

Updated 1 week, 1 day ago 253 runs

luma/photon-flash

Accelerated variant of Photon prioritizing speed while maintaining quality

Updated 1 week, 1 day ago 128.3K runs
