# Investigation of Facial Preference Using Gaussian Process Preference Learning and Generative Image Model
Masashi Komori Keito Shiroshita Kohyo Nakamura GAMISAN
## Abstract
In this study, we introduce a novel approach to investigating an intrinsic psychophysical function of human facial attractiveness using a sequential experimental design that combines Bayesian optimization (BO) and StyleGAN2. To estimate a facial attractiveness function from pairwise comparison data, we used a BO that incorporates Gaussian Process Preference Learning (GPPL). Fifty Japanese female university students provided facial photographs. We embedded each female facial image into the latent representation ($18 \times 512$ dimensions) of the StyleGAN2 network trained on the Flickr-Faces-HQ (FFHQ) dataset. Using PCA, the dimension of the latent representations was reduced to an 8-dimensional subspace, which we refer to here as the Japanese female face space. Nine participants took part in a pairwise comparison task in which they chose the more attractive of two facial images synthesized using StyleGAN2 in the face subspace, providing their evaluations in 100 trials. The stimuli for the first 80 trials were created from randomly generated parameters in the face subspace, while those for the remaining 20 trials were created from parameters calculated using the acquisition function. Based on the results, we estimated the facial parameters corresponding to the most attractive face, the least attractive face, and the 25th, 50th, and 75th percentile ranks of attractiveness, and reconstructed the faces. The results show that a combination of StyleGAN2 and GPPL is an effective way of elucidating human *kansei* evaluations of complex stimuli such as human faces.
## Introduction
We can intuitively evaluate, on a one-dimensional scale, stimuli that consist of complex physical features in multiple dimensions. A typical example is the perception of facial attractiveness. The characteristics of the human face are constituted by a large number of variables, and facial attractiveness is evaluated by assessing these multidimensional variables\cite{cunningham1986measuring,benson1993extracting}. A number of studies have investigated how facial attractiveness is influenced by facial characteristics. Clarifying such relationships between multidimensional physical features and one-dimensional psychological evaluations, that is, multidimensional psychophysical functions, is important for clarifying the mechanism of our intuitive judgment. However, it is difficult to collect psychological evaluations of multidimensional physical quantities in a brute-force manner because the number of required evaluations grows rapidly with the number of dimensions. To solve this problem, this study applies Bayesian optimization to the elucidation of multidimensional psychophysical functions.
Bayesian optimization is a method of global sequential optimization that searches for the maximum or minimum value of a black-box function or estimates an unknown function\cite{mockus1989bayesian}. It combines Gaussian process regression with the determination of the next search point by an acquisition function. Bayesian optimization estimates the unknown function by repeating (1) collection of responses, (2) Gaussian process regression, and (3) determination of the next search point.
However, there are problems in applying Bayesian optimization to the estimation of psychophysical functions. In a typical Bayesian optimization method, the response from the unknown function is expected to be a continuous quantity, but it is difficult for humans to respond in a continuous manner. For the estimation of psychophysical functions based on natural human responses, it is desirable to use discrete responses. In this study, we use Gaussian Process Preference Learning (GPPL)\cite{chu2005preference,brochu2010tutorial}, which can estimate nonlinear functions from sparse pairwise preference data sets.
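The three-step loop can be sketched in a few lines. The following is a minimal one-dimensional illustration, not the procedure used in this study: the objective `black_box`, the kernel hyperparameters, and the grid of candidate points are all hypothetical, and the upper confidence bound is used only as a simple stand-in acquisition function.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_on = rbf_kernel(x_obs, x_new)
    mean = k_on.T @ np.linalg.solve(k_oo, y_obs)
    var = np.diag(rbf_kernel(x_new, x_new)) - np.sum(
        k_on * np.linalg.solve(k_oo, k_on), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def black_box(x):
    """Hypothetical unknown function; its maximum is at x = 0.7."""
    return -(x - 0.7) ** 2

def bayes_opt(n_init=3, n_iter=15, kappa=2.0, seed=0):
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)           # candidate search points
    x_obs = rng.uniform(0.0, 1.0, n_init)       # (1) initial responses
    y_obs = black_box(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)    # (2) GP regression
        x_next = grid[np.argmax(mean + kappa * std)]    # (3) next search point
        x_obs = np.append(x_obs, x_next)                # collect the new response
        y_obs = np.append(y_obs, black_box(x_next))
    return x_obs[np.argmax(y_obs)], x_obs, y_obs

best_x, xs, ys = bayes_opt()
print(f"best observed x = {best_x:.3f}")
```

Each pass through the loop refits the Gaussian process to all responses collected so far, so the next search point always reflects the current state of knowledge about the function.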
### Gaussian Process Preference Learning
GPPL can be considered a combination of Thurstone's pairwise comparison model (probit model) and Gaussian process regression. We assume here that a participant has a certain intrinsic psychophysical utility function $f(\mathbf{x})$ that maps the multivariate input $\mathbf{x}$ to a scalar value representing the evaluation of a given stimulus.
Consider a set of $m$ observed pairwise preference relations on the stimuli, denoted as
$$
\mathcal{D}=\{v_k\succ u_k;k=1,\ldots,m\}
$$
where $v_k \succ u_k$ means the stimulus $v_k$ is preferred to $u_k$.
By assuming Gaussian noise $\mathcal{N}(\delta;0,\sigma_{\mathrm{noise}}^2)$ on the participant's responses, the likelihood of a pairwise comparison $v_k \succ u_k$ becomes:
$$
\begin{aligned}
&\mathcal{P}\left(v_k\succ u_k \mid f(v_k),f(u_k)\right)\\
&=\int\int\mathcal{P}_{\mathrm{ideal}}\left(v_k\succ u_k \mid f(v_k)+\delta_v,f(u_k)+\delta_u\right)
\mathcal{N}(\delta_v;0,\sigma_{\mathrm{noise}}^2)\mathcal{N}(\delta_u;0,\sigma_{\mathrm{noise}}^2)\,d\delta_v\,d\delta_u\\
&=\Phi(z_k)
\end{aligned}
$$
where $z_k=\frac{f(v_k)-f(u_k)}{\sqrt{2}\sigma_{\mathrm{noise}}}$, $\Phi(z)=\int^{z}_{-\infty}\mathcal{N}(\gamma;0,1)\,d\gamma$, and $\mathcal{P}_{\mathrm{ideal}}$ is 1 if the first noisy utility is at least as large as the second and 0 otherwise.
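As a numerical illustration of this likelihood, the sketch below evaluates $\Phi(z_k)$ in closed form and compares it with a simulation that adds independent Gaussian noise to both utilities. The utility values and the noise level are hypothetical and chosen only for illustration.

```python
import math
import random

def std_normal_cdf(z):
    """Phi(z), the standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def preference_probability(f_v, f_u, sigma_noise):
    """P(v preferred to u) = Phi((f(v) - f(u)) / (sqrt(2) * sigma_noise))."""
    z = (f_v - f_u) / (math.sqrt(2.0) * sigma_noise)
    return std_normal_cdf(z)

def simulated_probability(f_v, f_u, sigma_noise, n=100_000, seed=1):
    """Monte Carlo estimate with independent Gaussian noise on both utilities."""
    rng = random.Random(seed)
    wins = sum(
        f_v + rng.gauss(0.0, sigma_noise) > f_u + rng.gauss(0.0, sigma_noise)
        for _ in range(n)
    )
    return wins / n

# Hypothetical utilities and noise level.
p_closed = preference_probability(0.8, 0.3, sigma_noise=0.5)
p_mc = simulated_probability(0.8, 0.3, sigma_noise=0.5)
print(f"closed form: {p_closed:.3f}, simulation: {p_mc:.3f}")
```

The factor $\sqrt{2}$ in $z_k$ arises because the difference of two independent noise terms has variance $2\sigma_{\mathrm{noise}}^2$.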
We assume that the intrinsic psychophysical function $f$ follows a Gaussian process prior: $f \sim \mathcal{GP}(0,\mathcal{K})$, where $\mathcal{K}$ is a kernel function. Here, we used an RBF kernel with automatic relevance determination (ARD)\cite{mackay1996bayesian}. Following previous studies, the posterior distribution $\mathcal{P}(f\mid\mathcal{D})$ was approximated as a Gaussian using a Laplace approximation\cite{chu2005preference,brochu2010tutorial}.
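A compact sketch of this estimation step is given below, assuming fixed hyperparameters and plain Newton iterations to find the posterior mode. The function names, the length scales, the noise level, and the three-stimulus example are all hypothetical; this is not the implementation used in the study, which would also need to choose the hyperparameters.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def ard_rbf_kernel(xa, xb, length_scales, variance=1.0):
    """RBF kernel with a separate length scale per input dimension (ARD)."""
    diff = (xa[:, None, :] - xb[None, :, :]) / length_scales
    return variance * np.exp(-0.5 * np.sum(diff ** 2, axis=-1))

def nll_derivatives(f, win, lose, sigma_noise):
    """Gradient and Hessian of -sum_k log Phi(z_k) with respect to f."""
    scale = math.sqrt(2.0) * sigma_noise
    z = (f[win] - f[lose]) / scale
    ratio = norm_pdf(z) / np.clip(norm_cdf(z), 1e-12, None)
    grad = np.zeros(len(f))
    np.add.at(grad, win, -ratio / scale)
    np.add.at(grad, lose, ratio / scale)
    w = ratio * (ratio + z) / scale ** 2
    hess = np.zeros((len(f), len(f)))
    np.add.at(hess, (win, win), w)
    np.add.at(hess, (lose, lose), w)
    np.add.at(hess, (win, lose), -w)
    np.add.at(hess, (lose, win), -w)
    return grad, hess

def laplace_approximation(x, prefs, length_scales, sigma_noise=1.0, n_iter=100):
    """Posterior mode and Gaussian covariance of f at the stimuli x.

    prefs lists (preferred_index, other_index) pairs into the rows of x.
    """
    n = len(x)
    k = ard_rbf_kernel(x, x, length_scales) + 1e-8 * np.eye(n)
    win = np.array([p[0] for p in prefs])
    lose = np.array([p[1] for p in prefs])
    f = np.zeros(n)
    for _ in range(n_iter):
        grad, hess = nll_derivatives(f, win, lose, sigma_noise)
        # Newton step written without inverting K explicitly.
        f_new = np.linalg.solve(np.eye(n) + k @ hess, k @ (hess @ f - grad))
        converged = np.max(np.abs(f_new - f)) < 1e-10
        f = f_new
        if converged:
            break
    _, hess = nll_derivatives(f, win, lose, sigma_noise)
    cov = np.linalg.solve(np.eye(n) + k @ hess, k)   # (K^-1 + Hessian)^-1
    return f, cov

# Three hypothetical stimuli on one dimension: 2 beats 1, 1 beats 0, 2 beats 0.
x = np.array([[0.0], [1.0], [2.0]])
prefs = [(2, 1), (1, 0), (2, 0)]
f_map, cov = laplace_approximation(x, prefs, length_scales=np.array([1.0]))
print(np.round(f_map, 3))
```

In this toy example the estimated utilities are ordered in the same way as the stated preferences, and `cov` gives the Gaussian uncertainty around the mode that an acquisition function can draw on.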
## Facial Image Generation
### StyleGAN2
Previous studies of facial attractiveness have mainly relied on two techniques. The first is a method that uses digitally blended composite faces created by photographic superimposing techniques or 2D/3D morphing techniques \cite{langlois1990attractive,benson1993extracting,oosterhof2008functional}. The problem with these methods is that the expression of hair and facial texture differs greatly from natural facial photographs, and the unnaturalness of the photographs might affect the judgment of the participants. The other method is a measurement-based technique examining the relationship between the measured data of individual faces and evaluated attractiveness of individual facial photographs \cite{cunningham1986measuring}. This approach requires the use of facial photographs of real people, which raises the issue of violating the privacy of the face providers and the ethical issue of evaluating the attractiveness of real people's faces.
Recent developments in deep learning-based image generation models offer an effective way to solve the above problems. StyleGAN is a variant of the generative adversarial network (GAN) machine learning framework\cite{karras2019style}. It is a generator architecture that excels at generating images in which both local and global features are important, such as human faces, and can generate completely fake but realistic and convincing-looking face images. StyleGAN2 \cite{karras2020analyzing} is the second version of StyleGAN; it removes various artifacts of the original, including the water-splotch artifacts, and improves the quality of the generated images.
This study attempts to solve both the ethical problem of using real facial images and the ecological validity problem of using unnatural facial images, which have been problematic in conventional attractiveness research, by using StyleGAN2 to generate the face stimulus images. The relationship between the generated face images and the subjective attractiveness evaluations can be investigated with a Gaussian process regression model that takes the latent representations of StyleGAN2 as the explanatory variables and the subjective attractiveness evaluation as the response variable.
### Construction of Latent Facial Subspace
Karras et al.\cite{karras2019style}, the authors of StyleGAN and StyleGAN2, introduced a dataset of human faces called Flickr-Faces-HQ (FFHQ), which covers faces of various races and consists of 70,000 high-quality images at 1024×1024 resolution. We used the StyleGAN2 network\cite{karras2020analyzing} trained on FFHQ for our study.
In the StyleGAN2 latent face space, each vector has $18 \times 512$ dimensions, which is an enormous space to search with GPPL. In addition, the latent facial space contains male and female faces of various racial groups, so it is not suitable for a survey of female facial attractiveness in Japan. Therefore, we created a low-dimensional subspace of the latent space consisting only of Asian female faces using reference facial images.
Fifty Japanese female university students provided facial photographs at 1024×1024 resolution. A neutral expression of each face was captured using a digital camera. A gray screen was set up in the background. The foreheads of the models were exposed using a headband, after removing accessories such as eyeglasses.
First, the facial images of the 50 females were embedded into the latent space of StyleGAN2 with the method of Karras et al.\cite{karras2020analyzing}. Then, PCA was performed on the resulting $18 \times 512$ dimensional latent representations, and the components up to the 8th PC were retained. Changes of the facial image along each principal component are illustrated in Figure 1 to identify the facial features linked to each component. The 1st PC was found to be related to the height of the forehead and the shape of the eyes. The 2nd PC was linked to the thickness of the eyebrows and the length of the face. The 3rd, 4th, 5th, and 6th PCs were related to the shape of the cheekbones, eyes, eyelids, and jaw, respectively. The 7th PC was linked to the width of the face. The 8th PC was found to be related not only to the shape of the face but also to the color and the texture of the face.
In this study, the 8-dimensional space obtained by the above procedure is referred to as the face subspace. We use GPPL to regress the subjective attractiveness evaluations on the 8-dimensional facial vectors of this subspace.
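The construction of the subspace can be sketched as follows. Random numbers stand in for the 50 embedded latent codes, since the embedding itself requires the StyleGAN2 network; the helper names `to_subspace` and `from_subspace` and the use of SD units for the scores are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the 50 embedded faces: random numbers replace real latent codes.
n_faces, n_layers, n_channels = 50, 18, 512
latents = rng.normal(size=(n_faces, n_layers, n_channels))

# Flatten each 18 x 512 latent code into one 9216-dimensional row vector.
flat = latents.reshape(n_faces, -1)
mean_latent = flat.mean(axis=0)
centered = flat - mean_latent

# PCA via SVD; rows of vt are principal axes, ordered by explained variance.
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
n_components = 8
components = vt[:n_components]                        # shape (8, 9216)
score_sd = singular_values[:n_components] / np.sqrt(n_faces - 1)

def to_subspace(latent):
    """Project an 18 x 512 latent code to 8 scores in SD units."""
    return (latent.reshape(-1) - mean_latent) @ components.T / score_sd

def from_subspace(scores):
    """Map 8 scores in SD units back to an 18 x 512 latent code."""
    flat_latent = mean_latent + (np.asarray(scores) * score_sd) @ components
    return flat_latent.reshape(n_layers, n_channels)

# A point at +2 SD on the first component, as in the -2SD/+2SD variations.
w_plus = from_subspace([2.0, 0, 0, 0, 0, 0, 0, 0])
print(w_plus.shape, np.round(to_subspace(w_plus), 6))
```

A latent code returned by `from_subspace` would then be passed to the generator to synthesize the corresponding face image.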

Figure 1 Facial image variation along each dimension of the face subspace (-2SD/Mean/+2SD).
## Assessment of Attractiveness
### Methods
#### Participants
Nine participants (mean age = 21.11, SD = 0.57) took part in the experiment.
#### Procedure
Participants were instructed to select the more attractive of two face images in each of 100 trials. They were requested to respond in terms of aesthetics, not in terms of their preference for the model as a sexual partner or a companion, and according to their first impressions of the images, i.e., without contemplating their responses.
The stimulus images in the first 80 of the 100 trials were generated from parameters randomly selected from a range of ±2 SD in the eight-dimensional face subspace. In the remaining 20 trials, one image of each pair was generated from the parameter corresponding to the current maximum of the posterior mean of attractiveness. The other image was generated from the parameter corresponding to the maximum expected improvement (EI) \cite{brochu2010tutorial}.
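The two kinds of trials can be sketched as below. Uniform sampling within ±2 SD, the finite candidate set, and the exclusion of the current best point from the EI search (to avoid presenting the same image twice) are assumptions made for this illustration; the posterior means and standard deviations are hypothetical numbers rather than GPPL output.

```python
import math
import random

def expected_improvement(mean, std, best):
    """EI of a candidate with posterior mean/std over the current best mean."""
    if std <= 0.0:
        return 0.0
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def random_pair(rng, n_dims=8, limit=2.0):
    """Two random stimuli, each coordinate drawn uniformly from [-limit, limit]."""
    return tuple([rng.uniform(-limit, limit) for _ in range(n_dims)]
                 for _ in range(2))

def acquisition_pair(candidates, means, stds):
    """Pair the posterior-mean maximum with the maximum-EI candidate."""
    best_idx = max(range(len(candidates)), key=lambda i: means[i])
    best_mean = means[best_idx]
    others = [i for i in range(len(candidates)) if i != best_idx]
    ei_idx = max(others,
                 key=lambda i: expected_improvement(means[i], stds[i], best_mean))
    return candidates[best_idx], candidates[ei_idx]

# Random-phase trial: two points in the 8-dimensional face subspace.
left, right = random_pair(random.Random(0))

# Acquisition-phase trial with three hypothetical candidates.
candidates = [[0.0] * 8, [1.0] * 8, [-1.0] * 8]
pair = acquisition_pair(candidates, means=[0.2, 0.5, 0.4], stds=[0.1, 0.05, 0.6])
print(pair)
```

In the toy example, the third candidate is paired with the current best because its large posterior uncertainty outweighs its slightly lower mean, which is the exploratory behavior EI is designed to produce.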
One session was conducted for each participant. All stimuli were presented on an LCD monitor, and participants responded with a keyboard. The application used in the experiments was implemented in the PsychoPy environment\cite{peirce2007psychopy}.
### Results
Based on all pairwise preferences, the attractiveness function of each participant was estimated using GPPL. The latent variables corresponding to the maximum posterior mean of the attractiveness evaluations of each participant are shown in Table 1. The 1st and the 3rd principal components showed the least variability among the participants.
The 1st PC is the component related to the height of the forehead. The 3rd PC is the component related to the cheekbones. It is known that forehead shape is related to baby schema\cite{farkas1994anthropometry}, and cheekbone shape is related to sexual dimorphism \cite{enlow1966morphogenetic}. The similarity of preferences among participants for these PCs suggests that preferences of evolutionary origin strongly influenced the judgments concerning the 1st and 3rd principal components. On the other hand, the 2nd PC, which showed the highest variability, was the component related to eyebrow shape. The preference for eyebrow shape might be culturally, rather than evolutionarily, derived.
The predicted mean and variance over the whole subspace were computed for each participant on a grid with an interval of 0.25 in every dimension, resulting in 43,046,721 points. Figure 2 shows an example of the predicted mean map of Participant 1 on PC1 and PC2. Further, for each participant, the points corresponding to the maximum, third quartile, median, first quartile, and minimum of the predicted mean were determined, and the corresponding images were generated (Figure 3).
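As a check on the grid size, 43,046,721 equals $9^8$, that is, nine grid values along each of the eight dimensions. The sketch below confirms this count and illustrates the selection of the ranked points on a small three-dimensional grid. The function `posterior_mean` is a hypothetical stand-in for a participant's GPPL posterior mean, and the ±2 SD axis range is an assumption borrowed from the stimulus range, so the axis scaling behind the stated 0.25 interval may differ from the one assumed here.

```python
import itertools
import numpy as np

points_per_dim, n_dims = 9, 8
full_grid_size = points_per_dim ** n_dims
print(full_grid_size)   # 43046721

def posterior_mean(x):
    """Hypothetical stand-in for a participant's GPPL posterior mean."""
    return -np.sum((x - 0.5) ** 2, axis=1)

# Illustrative 3-D grid (9**3 = 729 points) so that the sketch runs instantly.
axis = np.linspace(-2.0, 2.0, points_per_dim)
grid = np.array(list(itertools.product(axis, repeat=3)))
mean = posterior_mean(grid)

# Rank all grid points and pick those at the 0, 25, 50, 75, 100 percentile ranks.
order = np.argsort(mean)
ranks = {"minimum": 0.0, "first quartile": 0.25, "median": 0.5,
         "third quartile": 0.75, "maximum": 1.0}
selected = {name: grid[order[int(round(q * (len(order) - 1)))]]
            for name, q in ranks.items()}
for name, point in selected.items():
    print(name, point)
```

Each selected point would then be mapped back to a StyleGAN2 latent code to generate the corresponding face image.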
Figure 3 shows that the faces corresponding to the maximum and minimum predicted mean values are similar among the participants, suggesting that there is a common trend in face attractiveness evaluation.
---
Table 1 Posterior mean maximum points, means and SDs
| | 1st PC | 2nd PC | 3rd PC | 4th PC | 5th PC | 6th PC | 7th PC | 8th PC |
|---------------|-------:|-------:|-------:|-------:|-------:|-------:|-------:|-------:|
| Participant 1 | -1.08 | -1.35 | 0.79 | 1.42 | 1.16 | 1.70 | 0.19 | 1.14 |
| Participant 2 | -0.60 | -0.65 | -0.48 | -0.28 | 1.25 | 0.96 | -0.21 | 1.14 |
| Participant 3 | -0.79 | 0.39 | 0.32 | 1.32 | 1.22 | 0.47 | 1.02 | 0.25 |
| Participant 4 | -0.69 | 0.67 | 0.70 | 0.60 | 1.23 | 1.11 | 1.08 | 0.28 |
| Participant 5 | -1.26 | 1.48 | 0.02 | 1.29 | 1.14 | 0.36 | -0.08 | -1.06 |
| Participant 6 | 0.08 | -0.75 | 0.31 | 0.59 | 0.57 | 0.06 | 1.02 | -0.11 |
| Participant 7 | -0.92 | 0.62 | 1.48 | 1.54 | 0.82 | 0.81 | 0.78 | 0.12 |
| Participant 8 | -1.15 | -0.12 | -0.96 | 0.14 | 0.98 | 1.21 | -0.30 | -0.84 |
| Participant 9 | -0.09 | 1.53 | 1.36 | 0.45 | 0.30 | 2.00 | 1.52 | 0.48 |
| Mean | -0.72 | 0.20 | 0.39 | 0.78 | 0.96 | 0.96 | 0.56 | 0.15 |
| S.D.          |   0.44 |   0.94 |   0.75 |   0.60 |   0.32 |   0.59 |   0.63 |   0.72 |
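
The summary rows of Table 1 can be recomputed from the participant rows. The sketch below assumes that the tabulated S.D. is the population standard deviation (`ddof=0`); under that assumption the recomputed values agree with the table to within rounding of the two-decimal entries.

```python
import numpy as np

# Participant rows of Table 1 (columns: 1st to 8th PC).
maxima = np.array([
    [-1.08, -1.35,  0.79,  1.42, 1.16, 1.70,  0.19,  1.14],
    [-0.60, -0.65, -0.48, -0.28, 1.25, 0.96, -0.21,  1.14],
    [-0.79,  0.39,  0.32,  1.32, 1.22, 0.47,  1.02,  0.25],
    [-0.69,  0.67,  0.70,  0.60, 1.23, 1.11,  1.08,  0.28],
    [-1.26,  1.48,  0.02,  1.29, 1.14, 0.36, -0.08, -1.06],
    [ 0.08, -0.75,  0.31,  0.59, 0.57, 0.06,  1.02, -0.11],
    [-0.92,  0.62,  1.48,  1.54, 0.82, 0.81,  0.78,  0.12],
    [-1.15, -0.12, -0.96,  0.14, 0.98, 1.21, -0.30, -0.84],
    [-0.09,  1.53,  1.36,  0.45, 0.30, 2.00,  1.52,  0.48],
])
col_mean = maxima.mean(axis=0)
col_sd = maxima.std(axis=0, ddof=0)   # population SD (assumed convention)
for pc, (m, s) in enumerate(zip(col_mean, col_sd), start=1):
    print(f"PC{pc}: mean = {m:+.2f}, SD = {s:.2f}")
```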


Figure 2 Examples of predicted mean contours of attractiveness on the PC1-PC2 and PC1-PC3. The height of each surface describes the strength of the attractiveness.

Figure 3 Face images corresponding to the ranking of the predicted mean value for each experimental participant (the least attractive/1st quartile/middle/3rd quartile/the most attractive)
---
## Discussion
This study provided a novel approach to clarifying an intrinsic psychophysical function of human facial attractiveness using a sequential experimental design that combines Bayesian optimization (BO) with GPPL and StyleGAN2. This is the first study on facial attractiveness that uses GPPL, and the results show that GPPL is an effective method for investigating complex human judgments of multivariate physical stimuli such as faces. We used StyleGAN2 to generate the face images in this study. Conventional facial attractiveness studies face the ethical problem of using real face images and the ecological validity problem of using unnatural face images; by using StyleGAN2, we addressed both problems. Furthermore, the dimensional reduction technique used in this study, which embeds the reference images into the latent space and then performs PCA, adequately reduced the enormous search space of StyleGAN2, indicating the effectiveness of our method in using trained deep learning networks for psychological studies. Although this study focuses on facial attractiveness, the proposed method can be applied to research in product design, experimental aesthetics, and other fields.
## References
@book{mockus1989bayesian,
title={Bayesian Approach to Global Optimization: Theory and Applications},
author={Mockus, Jonas},
year={1989},
publisher={Kluwer Academic Publishers}
}
@inproceedings{chu2005preference,
title={Preference learning with Gaussian processes},
author={Chu, Wei and Ghahramani, Zoubin},
booktitle={Proceedings of the 22nd international conference on Machine learning},
pages={137--144},
year={2005}
}
@article{brochu2010tutorial,
title={A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning},
author={Brochu, Eric and Cora, Vlad M and De Freitas, Nando},
journal={arXiv preprint arXiv:1012.2599},
year={2010}
}
@article{cunningham1986measuring,
title={Measuring the physical in physical attractiveness: quasi-experiments on the sociobiology of female facial beauty.},
author={Cunningham, Michael R},
journal={Journal of personality and social psychology},
volume={50},
number={5},
pages={925},
year={1986},
publisher={American Psychological Association}
}
@article{benson1993extracting,
title={Extracting prototypical facial images from exemplars},
author={Benson, Philip J and Perrett, David I},
journal={Perception},
volume={22},
number={3},
pages={257--262},
year={1993},
publisher={SAGE Publications Sage UK: London, England}
}
@article{langlois1990attractive,
title={Attractive faces are only average},
author={Langlois, Judith H and Roggman, Lori A},
journal={Psychological science},
volume={1},
number={2},
pages={115--121},
year={1990},
publisher={SAGE Publications Sage CA: Los Angeles, CA}
}
@article{oosterhof2008functional,
title={The functional basis of face evaluation},
author={Oosterhof, Nikolaas N and Todorov, Alexander},
journal={Proceedings of the National Academy of Sciences},
volume={105},
number={32},
pages={11087--11092},
year={2008},
publisher={National Acad Sciences}
}
@inproceedings{karras2019style,
title={A style-based generator architecture for generative adversarial networks},
author={Karras, Tero and Laine, Samuli and Aila, Timo},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={4401--4410},
year={2019}
}
@inproceedings{karras2020analyzing,
title={Analyzing and improving the image quality of stylegan},
author={Karras, Tero and Laine, Samuli and Aittala, Miika and Hellsten, Janne and Lehtinen, Jaakko and Aila, Timo},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={8110--8119},
year={2020}
}
@article{peirce2007psychopy,
title={PsychoPy—psychophysics software in Python},
author={Peirce, Jonathan W},
journal={Journal of neuroscience methods},
volume={162},
number={1-2},
pages={8--13},
year={2007},
publisher={Elsevier}
}
@article{enlow1966morphogenetic,
title={A morphogenetic analysis of facial growth},
author={Enlow, Donald H},
journal={American journal of orthodontics},
volume={52},
number={4},
pages={283--299},
year={1966},
publisher={Elsevier}
}
@book{farkas1994anthropometry,
title={Anthropometry of the Head and Face},
author={Farkas, Leslie G},
year={1994},
publisher={Lippincott Williams \& Wilkins}
}
@incollection{mackay1996bayesian,
title={Bayesian methods for backpropagation networks},
author={MacKay, David JC},
booktitle={Models of neural networks III},
pages={211--254},
year={1996},
publisher={Springer}
}