PACGen: Generate Anything Anywhere in Any Scene

Yuheng Li, Haotian Liu, Yangming Wen, Yong Jae Lee
University of Wisconsin–Madison

  • Word Count: 2,683
  • Estimated Read Time: 9-12 minutes

Summary: The paper proposes PACGen, a method that combines DreamBooth-style personalization with GLIGEN-style location control to enable text-to-image generation that is both personalized and controllable. PACGen addresses an entanglement problem in existing personalized generative models, which tend to tie a learned concept to the location and size it had in the training images. Data augmentation during training lets PACGen associate a personalized concept solely with its identity, not its location or size. During inference, a regionally-guided sampling technique ensures high-quality generation while maintaining location control. Experimental results show that PACGen can generate personalized concepts with controllable location and size, achieving fidelity comparable to or better than alternative baselines. The authors envision potential applications for PACGen in art, advertising, and entertainment design.
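The augmentation idea in the summary can be illustrated with a minimal sketch. It assumes the augmentation amounts to randomly rescaling each training image and pasting it at a random position on a blank canvas, so that the concept appears at many sizes and locations. The function name `augment_size_and_location`, its parameters, the black canvas, and the nearest-neighbour resize are all hypothetical choices made for illustration; they are not taken from the authors' implementation, and the regionally-guided sampling step is not covered here.

```python
import numpy as np


def augment_size_and_location(image, canvas_size=64, min_scale=0.3,
                              max_scale=1.0, rng=None):
    """Randomly rescale `image` and paste it at a random spot on a blank canvas.

    Illustrative assumption only, not the authors' code.
    `image` is an (H, W, C) array. Returns the augmented canvas and the
    normalized bounding box (x0, y0, x1, y1) of the pasted object.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Pick a random target size (square, for simplicity).
    scale = rng.uniform(min_scale, max_scale)
    side = max(1, int(round(canvas_size * scale)))

    # Nearest-neighbour resize by sampling source rows and columns.
    rows = np.arange(side) * image.shape[0] // side
    cols = np.arange(side) * image.shape[1] // side
    resized = image[rows][:, cols]

    # Pick a random top-left corner that keeps the object inside the canvas.
    top = int(rng.integers(0, canvas_size - side + 1))
    left = int(rng.integers(0, canvas_size - side + 1))

    canvas = np.zeros((canvas_size, canvas_size, image.shape[2]),
                      dtype=image.dtype)
    canvas[top:top + side, left:left + side] = resized

    box = (left / canvas_size, top / canvas_size,
           (left + side) / canvas_size, (top + side) / canvas_size)
    return canvas, box


# Usage: augment a dummy all-white "concept" image.
dummy = np.full((32, 32, 3), 255, dtype=np.uint8)
augmented, box = augment_size_and_location(dummy, rng=np.random.default_rng(0))
```

The returned box is the kind of location signal a GLIGEN-style grounding input could consume during fine-tuning, which is one way the concept's identity could be learned separately from where and how large it appears.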

Evaluation: PACGen demonstrates significant potential to enable new creative applications by providing fine-grained control in personalized generative models. The ability to place a personalized concept at a chosen location and size within a desired context could improve AI creation tools for advertising designers, artists, and everyday users. However, potential misuse by bad actors to generate manipulated content remains a concern. While PACGen represents an important advance, further research is needed to address these issues and ensure the responsible development and use of such generative modeling techniques. Overall, PACGen is most directly applicable to text-to-image diffusion systems, the model family that DreamBooth and GLIGEN build on, where it adds location and size control to personalized generation.