Façade Design Support System with Control of Image Generation using GAN

Authors

  • Shosuke Haji, Meiji University
  • Kazuki Yamaji
  • Tomohiro Takagi
  • So Takahashi
  • Yukihiko Hayase
  • Yasuko Ebihara
  • Hiroshi Ito
  • Yoshiyuki Sakai
  • Tomoyuki Furukawa

DOI:

https://doi.org/10.52731/liir.v003.068

Keywords:

StyleGAN, façade generation, multimodal retrieval, image editing

Abstract

Designing the façade, the front face of a building, is a crucial yet time-consuming part of the architectural design process. Advances in image generation have produced generative models capable of creating creative, high-quality images. However, existing generative models are difficult to apply to façade design because the generated images must inspire the architect while also reflecting the designer’s knowledge and the building’s requirements, and existing models offer inadequate control over image generation. We therefore propose a system that supports designers in developing façade ideas by letting them intervene in the generation process. The system first selects a base image through text-to-image retrieval. It then alternates between generating diverse images with a generative adversarial network (GAN) and having the user select among them, enabling repeated divergence and convergence of ideas that provides the user with inspiration. Our experiments demonstrated that the proposed system can converge on a target idea while offering a variety of ideas through controlled generation.
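To make the workflow concrete, below is a minimal sketch of the retrieve-then-iterate loop the abstract describes. It is an illustration under stated assumptions, not the authors' implementation: it assumes OpenAI's CLIP as the text-to-image retriever and a hypothetical pretrained StyleGAN-like generator `G` whose latent codes can be perturbed to produce diverse candidates; `choose` stands in for the user's selection step.

```python
# Minimal sketch of the divergence/convergence loop described in the abstract.
# Assumptions (not from the paper): CLIP (github.com/openai/CLIP) as the
# text-to-image retriever, and a pretrained StyleGAN-like generator `G` whose
# latent codes can be perturbed to diversify images.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def retrieve_base_latent(query, gallery_latents, gallery_images):
    """Step 1: text-to-image retrieval. Return the latent code of the
    gallery image whose CLIP embedding best matches the text query."""
    with torch.no_grad():
        text = model.encode_text(clip.tokenize([query]).to(device))
        imgs = torch.stack([preprocess(im) for im in gallery_images]).to(device)
        feats = model.encode_image(imgs)
        text = text / text.norm(dim=-1, keepdim=True)
        feats = feats / feats.norm(dim=-1, keepdim=True)
        best = (feats @ text.T).squeeze(1).argmax().item()
    return gallery_latents[best]

def design_loop(G, w, choose, n_candidates=8, noise_scale=0.3, rounds=5):
    """Steps 2-3, repeated: diverge by perturbing the latent to get varied
    candidates, then converge by letting the user select one to continue from."""
    for _ in range(rounds):
        ws = [w + noise_scale * torch.randn_like(w) for _ in range(n_candidates)]
        images = [G(wi) for wi in ws]   # diverse façade candidates
        w = ws[choose(images)]          # user's pick becomes the next base
    return G(w)
```

In this sketch, `noise_scale` governs how far each round diverges from the chosen base; shrinking it over rounds would mimic gradual convergence toward a final design.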

Published

2023-02-17