Creative Tools | AI Tools Directory

ACE++

ACE++ is an instruction-based image creation and editing toolkit that uses context-aware content filling for tasks such as portrait generation, subject-driven image editing, and local editing. It supports diffusion-based models, ships with installation instructions, demos, and LoRA fine-tuning guides, and is hosted on Hugging Face.

Adobe Firefly

Adobe Firefly is an AI art generator developed by Adobe, enabling users to create images, audio, vectors, and videos from text prompts. It integrates with Adobe Creative Cloud, enhancing workflows with generative AI capabilities such as Text-to-Image, Generative Fill, and more.

AI Comic Factory

An AI tool that generates illustrated comic panels from text descriptions, enabling creative storytelling.

AI Image & Photo Restoration

A collection of AI-powered tools on Replicate for restoring and enhancing images, including CodeFormer and other models for upscaling, colorization, and noise removal.

AI Image Upscaler With Super Resolution

An image upscaling tool using Real-ESRGAN, designed to improve image resolution and quality, available on Replicate.

AI-WebTV

AI-WebTV is a live, automated video generation demonstration hosted on Hugging Face Spaces. It streams generated video content using a fine-tuned Modelscope-based model (producing outputs similar to the Zeroscope model) and draws its prompts from an automated database organized by theme. The project is a public demonstration for research use only, with guidelines that exclude violent or excessively gory content.

Clarity AI Upscaler

Clarity AI Upscaler is an image upscaling tool that uses Stable Diffusion to enhance and recreate details in images, letting users balance fidelity and creativity through parameters such as diffusion strength. It supports tiled diffusion for handling large images and incorporates ControlNet to preserve structure while enhancing details.
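
The tiling idea can be illustrated with a minimal sketch: a large image is cut into overlapping boxes so that each box can be processed separately and blended back at the seams. This is not Clarity AI Upscaler's code; the function names, the 512-pixel tile size, and the 64-pixel overlap are assumptions chosen for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles along one axis so that they cover [0, length)."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]  # a single tile already covers the axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def tile_boxes(width: int, height: int, tile: int = 512, overlap: int = 64) -> List[Box]:
    """Overlapping tile boxes in row-major order (hypothetical defaults)."""
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            boxes.append((left, top, right, bottom))
    return boxes


if __name__ == "__main__":
    for box in tile_boxes(1024, 768):
        print(box)
```

The overlap gives neighbouring tiles shared pixels, which a real pipeline would typically blend to hide seams; the blending step is left out of this sketch.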

ClearerVoice-Studio

An open-source, AI-powered speech processing toolkit offering state-of-the-art pretrained models and utilities for tasks such as speech enhancement, separation, super-resolution, and target speaker extraction.

CLIP Interrogator

A prompt engineering tool that leverages OpenAI's CLIP and Salesforce's BLIP to analyze an input image and generate optimized text prompts. These prompts can be used with text-to-image models like Stable Diffusion to produce creative art.

DeepBrain AI Studios

An AI tool for generating realistic AI avatars and creating text-to-video content tailored for creative projects.

DeepFaceLab

Industry-leading software for creating deepfakes, used widely by creators to swap faces and generate realistic video manipulations.

Deepgram

Developer-focused Voice AI platform offering high-accuracy, real-time speech-to-text APIs (e.g., Nova-3).

Depth Anything V2

An interactive Hugging Face Space that leverages deep learning to generate depth maps from images. This tool extracts depth information from 2D images, which can be used for creative 3D effects, image editing, or further computer vision tasks.

Easel AI

An AI tool that offers advanced face swap and avatar generation, preserving user likeness and enabling creative image manipulations.

FaceFusion

FaceFusion is an industry-leading face manipulation platform that enables advanced face swapping, deepfake creation, and lip-syncing. It features a command-line interface with various job management commands (batch-run, headless-run, etc.) and provides installers for Windows and macOS.

FLUX Kontext max - Multi-Image List

An AI tool that combines multiple images using FLUX Kontext Max, a premium image editing model from Black Forest Labs. It accepts a list of images to creatively merge them and produce enhanced, text-guided composite outputs. The tool is available on Replicate and is designed for versatile image editing tasks, including creative compositing and improved typography generation.

FLUX.1 Kontext – Text Removal

A dedicated application built on the FLUX.1 Kontext image editing model from Black Forest Labs that removes all text from an image. The tool is available on Replicate with API access and a playground for experimentation, showcasing its specialized text removal functionality.

fofr/color-matcher

A model hosted on Replicate that performs color matching and white balance correction for images via an API. It allows users to automatically adjust image colors to achieve better balance.

Fooocus

Fooocus is an open-source, offline image generation tool built on the Stable Diffusion XL architecture and Gradio. It streamlines image generation by cutting manual tweaking so that users can focus on prompts, and it needs as little as 4 GB of GPU memory.

ghibli-easycontrol

An open-source model hosted on Replicate that transforms input images with a Ghibli-style aesthetic, offering high-quality, fast, and cost-effective image translation via an API.

GPT-SoVITS

A few-shot voice cloning and text-to-speech WebUI that can train a TTS model with just 1 minute of voice data. It supports zero-shot and few-shot TTS, cross-lingual inference, and includes integrated tools for voice separation, dataset segmentation, and ASR, making it easier to build and deploy custom TTS models.

HeyGem

HeyGem is an open-source AI avatar project that enables offline video synthesis on Windows. It precisely clones your appearance and voice to generate ultra-realistic digital avatars, allowing users to create personalized videos without an internet connection.

img2prompt

An AI model that extracts approximate text prompts from input images, optimized for Stable Diffusion using a modified CLIP Interrogator method. It enables users to generate descriptive prompts that can be used to recreate or modify images.

inswapper

inswapper is an open-source, one-click face swapper and restoration tool powered by insightface. It uses ONNX Runtime for inference and integrates face restoration techniques (e.g., CodeFormer) to improve image quality and produce realistic face swaps.

Kling Lip Sync

Kling Lip Sync is an API that changes the lip movements of a person in a video to match supplied audio or text, so users can add lip-sync to any video. The model sends data from Replicate to Kuaishou, and pricing is based on the seconds of video generated.

krita-ai-tools

A collection of AI-powered tools designed as a plugin for Krita, enhancing digital painting workflows with advanced features like precise segmentation and mask generation using BiRefNet models. Built against Krita 5.2.x, it improves selection accuracy and performance for digital art creation.

LHM

LHM (Large Animatable Human Reconstruction Model from a Single Image in Seconds) is an open-source implementation for reconstructing and animating 3D human models from a single image. It offers GPU-optimized pipelines, Docker support, and integration with tools such as ComfyUI.

LuminaBrush

A creative ML app hosted on Hugging Face Spaces that lets users explore and generate artistic images using community-built AI models.

MagicQuill

MagicQuill is an intelligent interactive image editing system that enables precise image modification through AI-powered suggestions and a user-friendly interface, featuring functionalities like local editing and drag-and-drop support.

New Plant Disease Detection

An AI tool that analyzes uploaded plant leaf images to diagnose diseases and provide a confidence level along with visual highlights.

Photoshop Fusion Beta

An AI-powered beta extension for Photoshop aimed at enhancing digital creativity through generative image editing features.

Real-ESRGAN

An AI-powered image upscaling tool that enlarges images while enhancing details and reducing artifacts, often used for improving image resolution.

Replica

An AI tool capable of replicating human voice characteristics to generate expressive, high-quality speech from text.

Retrieval-based Voice Conversion WebUI

An open-source web UI that enables voice conversion using retrieval-based methods, offering configurable options and support for different models.

sparklearningstudiollc/nikujkakdiya-new-model

An AI image generation API model hosted on Replicate that supports text-to-image, image-to-image, and inpainting modes. It offers extensive configuration options including prompt strength, custom dimensions, aspect ratio, LoRA weight integration, and various output settings for generating images according to user prompts.

stoodioai/test-yash-model-4-new-2

A custom trained generative image model that produces unique fashion designs. It supports text-to-image and image-to-image (inpainting) modes via an API, with configurable parameters such as prompt, aspect ratio, model type, and output quality.

Submagic

An AI-powered video tool that automatically identifies the best moments in your videos and converts them into short clips intended for viral sharing.

test-yash-model-4-new-2

A custom diffusion-based model designed for generating unique fashion designs from text prompts. The API reference page provides detailed parameters for controlling aspects like prompt strength, aspect ratio, model selection, and output format.

topazlabs/image-upscale

An AI-powered, professional-grade image upscaling tool by Topaz Labs. It offers multiple enhancement models (Standard, Low Resolution, CGI, High Fidelity, Text Refine) to upscale images up to 6x with options for facial enhancement, making it ideal for improving various image types including digital art and text-heavy photos.

Ultimate SD Upscale with ControlNet Tile

An advanced image upscaling model that uses Stable Diffusion 1.5 and ControlNet Tile to enhance image quality. It is accessible via an API on Replicate and optimized to run on NVIDIA A100 GPUs.

Upscayl

Upscayl is a free and open-source AI-powered image upscaler that enlarges and enhances low-resolution images. It is available for Linux, macOS, and Windows, and requires a Vulkan-compatible GPU.

VCClient Real-time Voice Changer

An open-source, AI-powered real-time voice conversion tool that uses various models (e.g., RVC, Beatrice v1/v2) to transform voices dynamically. It supports multiple platforms (Windows, macOS, Linux, Google Colab) and offers both standalone and networked configurations.

Whisper French Demo

A Hugging Face Space demo of Whisper-based speech recognition tuned specifically for French. Users can upload or record French audio in the web app and receive a transcription, making it a practical tool for French ASR.

WhisperX

WhisperX is an Automatic Speech Recognition (ASR) tool that provides fast and accurate transcriptions with word-level timestamps and speaker diarization features, enhancing the capabilities of OpenAI's Whisper model.
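
To make the combination of word-level timestamps and speaker diarization concrete, the sketch below labels each timed word with the speaker turn it overlaps most. It is an illustration of the general idea only, not WhisperX's implementation or API; the `Word` and `SpeakerTurn` types, the `label_words` function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class SpeakerTurn:
    speaker: str
    start: float  # seconds
    end: float    # seconds


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_words(words: List[Word], turns: List[SpeakerTurn]) -> List[Tuple[str, Optional[str]]]:
    """Pair each word with the speaker whose turn overlaps it most.

    A word that overlaps no turn is paired with None.
    """
    labeled = []
    for word in words:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            shared = overlap(word.start, word.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labeled.append((word.text, best_speaker))
    return labeled


if __name__ == "__main__":
    # Hypothetical sample data, not real transcription output.
    words = [Word("hello", 0.0, 0.4), Word("there", 0.5, 0.9), Word("hi", 1.2, 1.5)]
    turns = [SpeakerTurn("A", 0.0, 1.0), SpeakerTurn("B", 1.0, 2.0)]
    print(label_words(words, turns))
```

Choosing the turn with the largest overlap, rather than the first turn that contains the word's start time, gives a sensible label to words that straddle a speaker change.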