Pretraining is All You Need for Image-to-Image Translation

Tengfei Wang
HKUST
Ting Zhang
Microsoft Research Asia
Bo Zhang
Microsoft Research Asia
Hao Ouyang
HKUST

Dong Chen
Microsoft Research Asia
Qifeng Chen
HKUST
Fang Wen
Microsoft Research Asia

Diverse samples synthesized by our approach.

Abstract

We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling for high-quality generation of complex scenes, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a downstream task and introduce a simple and generic framework that adapts a pretrained diffusion model to accommodate various kinds of image-to-image translation. We also propose adversarial training to enhance the texture synthesis in the diffusion model training, in conjunction with normalized guidance sampling to improve the generation quality. We present extensive empirical comparison across various tasks on challenging benchmarks such as ADE20K, COCO-Stuff, and DIODE, showing the proposed pretraining-based image-to-image translation (PITI) is capable of synthesizing images of unprecedented realism and faithfulness.

Comaprison with other methods.

Approach

The overall framework. We can perform pretraining on huge data via different pretext tasks and learn a highly semantic latent space that models general and high-quality image statistics. For downstream tasks, we perform conditional finetuning to map the task-specific conditions to this pretrained semantic space. By leveraging the pretrained knowledge, our model renders plausible images based on different conditions.

Image Editing

Numerical Results

Additional Results

BibTeX

                        
@inproceedings{wang2022pretraining,
 title = {Pretraining is All You Need for Image-to-Image Translation},
  author = {Wang, Tengfei and Zhang, Ting and Zhang, Bo and Ouyang, Hao and Chen, Dong and Chen, Qifeng and Wen, Fang},
  booktitle = {arXiv},
  year = {2022},
}