Façade Design Support System with Control of Image Generation using GAN
DOI: https://doi.org/10.52731/liir.v003.068

Keywords: StyleGAN, façade generation, multimodal retrieval, image editing

Abstract
Designing the façade, the front face of a building, is a crucial yet time-consuming part of the architectural design process. Advances in image generation have produced generative models capable of creating high-quality, creative images. However, existing generative models are difficult to apply to façade design: the generated images must give the architect inspiration while also reflecting the designer's knowledge and the building's requirements, and existing models offer inadequate control over generation. We therefore propose a system that supports designers in developing façade ideas by letting them intervene in the image-generation process. The system first selects a base image via text-to-image retrieval. It then alternates between generating diverse images with a generative adversarial network and having the user select among them. This enables repeated divergence and convergence of ideas and provides the user with inspiration. Our experiments demonstrate that the proposed system can converge on a target idea while offering a variety of ideas through controlled generation.
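The pipeline the abstract describes (retrieve a base image from text, then alternate divergent generation with user selection) can be sketched as follows. This is a minimal, runnable illustration only: the real system would use a text-image encoder such as CLIP for retrieval and a StyleGAN generator for synthesis, so the random embeddings, the gallery size, and the `prefer` scoring that stands in for the user's choice are all hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for encoder outputs: in the real system these
# would come from a CLIP-style encoder and a StyleGAN latent space.
N_IMAGES, DIM = 50, 16
image_embeddings = rng.normal(size=(N_IMAGES, DIM))  # gallery of facade images
base_latents = rng.normal(size=(N_IMAGES, DIM))      # matching GAN latents

def retrieve_base(text_embedding, gallery):
    """Text-to-image retrieval: index of the gallery image whose
    embedding is most cosine-similar to the text query."""
    a = text_embedding / np.linalg.norm(text_embedding)
    b = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return int(np.argmax(b @ a))

def diversify(latent, n_variants=8, sigma=0.3):
    """Divergence step: perturb the base latent to propose diverse candidates."""
    return latent + sigma * rng.normal(size=(n_variants, latent.shape[0]))

def design_loop(text_embedding, n_rounds=3, prefer=None):
    """Alternate divergence (generate variants) and convergence (pick one).
    'prefer' models the user's taste; by default the candidate closest in
    direction to the text query is chosen."""
    latent = base_latents[retrieve_base(text_embedding, image_embeddings)]
    for _ in range(n_rounds):
        candidates = diversify(latent)
        scores = candidates @ (prefer if prefer is not None else text_embedding)
        latent = candidates[int(np.argmax(scores))]
    return latent

query = rng.normal(size=DIM)          # placeholder for an encoded text prompt
final_latent = design_loop(query)
print(final_latent.shape)             # one latent, ready to be decoded to an image
```

In the actual system the selection step is interactive rather than a scoring function, which is what lets the designer steer the divergence/convergence cycle.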