Qwen-Image-Edit
Semantic and appearance editing powered by AI
Qwen-Image-Edit is an AI image editing model built on the 20B-parameter Qwen-Image foundation model. It combines semantic editing with precise appearance control, enabling both high-level creative transformations and detailed pixel-level modifications. The model is particularly strong at text editing: it renders both Chinese and English text while preserving the original fonts, sizes, and styles.
The system takes a dual-input approach: each input image is fed to both Qwen2.5-VL, for visual semantic control, and a VAE Encoder, for visual appearance control. This enables editing tasks such as IP character creation, object rotation, style transfer, and precise addition or removal of elements.
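To make the dual-input idea concrete, here is a minimal sketch of the data flow only, not the model's actual implementation. The names `EditConditioning`, `build_conditioning`, and both toy encoders are hypothetical, and passing the text instruction to the semantic path is an assumption made for illustration. The point is simply that one image produces two separate conditioning signals.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]  # toy stand-in: a grayscale pixel grid


@dataclass
class EditConditioning:
    semantic: List[float]    # high-level "what is in the image" signal
    appearance: List[float]  # low-level "what the pixels look like" signal


def build_conditioning(
    image: Image,
    instruction: str,
    semantic_encoder: Callable[[Image, str], List[float]],
    appearance_encoder: Callable[[Image], List[float]],
) -> EditConditioning:
    """Feed the same image through both paths and keep the results separate."""
    return EditConditioning(
        semantic=semantic_encoder(image, instruction),
        appearance=appearance_encoder(image),
    )


# Toy stand-ins (hypothetical) so the sketch runs without any model weights.
def toy_semantic_encoder(image: Image, instruction: str) -> List[float]:
    mean = sum(map(sum, image)) / sum(len(row) for row in image)
    return [mean, float(len(instruction.split()))]


def toy_appearance_encoder(image: Image) -> List[float]:
    return [float(p) for row in image for p in row]


cond = build_conditioning(
    [[0, 255], [255, 0]],
    "rotate the object",
    toy_semantic_encoder,
    toy_appearance_encoder,
)
```

In a real system the two callables would be replaced by the vision-language model and the VAE encoder, and both outputs would condition the image generator.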
Features:
🎨 Semantic Editing: Transform images while maintaining character consistency, enabling IP creation, novel view synthesis, and artistic style transfers.
🔧 Appearance Control: Add, remove, or modify specific elements while keeping other regions completely unchanged with pixel-perfect precision.
✏️ Precise Text Editing: Edit bilingual Chinese and English text directly in images while preserving original typography, fonts, and formatting.
🔄 Chained Editing: Apply step-by-step corrections to refine an image progressively, which suits complex edits such as calligraphy correction.
🎭 MBTI Character Creation: Built-in prompts for creating personality-based content, demonstrated through Capybara mascot variations.
🌟 SOTA Performance: Achieves state-of-the-art results on multiple public benchmarks for image editing tasks.
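The chained editing workflow listed above can be sketched as a simple loop in which each edit takes the previous result as its input. This is a rough illustration under stated assumptions: `chain_edits` and `toy_edit` are hypothetical names, and `edit_fn` stands in for whatever call actually runs the model.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def chain_edits(
    image: T,
    prompts: Sequence[str],
    edit_fn: Callable[[T, str], T],
) -> List[T]:
    """Apply prompts one after another and return every intermediate result."""
    history = [image]
    for prompt in prompts:
        # Each step edits the output of the previous step, not the original.
        history.append(edit_fn(history[-1], prompt))
    return history


# Toy edit function (hypothetical): "images" are strings, each edit appends a tag.
def toy_edit(image: str, prompt: str) -> str:
    return f"{image} -> [{prompt}]"


steps = chain_edits("draft", ["fix first stroke", "fix second stroke"], toy_edit)
```

Keeping the full history makes it easy to inspect each correction and to roll back to an earlier step if a later edit goes wrong.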