Figure 2 - available via license: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
Our LFS model also predicts subjective liking ratings for various kinds of photographs. (A). Example stimuli from the photography dataset. We took a wide range of images from the online photography (AVA) dataset (ref. 29) and ran a further online experiment with new M-Turk participants (n = 382) to obtain value ratings for these images. (B). A linear model with low-level features alone captured liking ratings for photography. This model, when trained on liking ratings for photography (from the current experiment), also captured liking ratings for paintings (from the previous experiment described in Figure 1), and the model trained on liking ratings for paintings could also predict liking ratings for photography. We note that in all cases the model was trained and tested on completely separate sets of participants. Significance was tested against a null distribution constructed from the analysis with permuted image labels. Bars indicate the mean and error bars the SEM.
Source publication
It is an open question whether preferences for visual art can be lawfully predicted from the basic constituent elements of a visual image. Moreover, little is known about how such preferences are actually constructed in the brain. Here we developed and tested a computational framework to gain an understanding of how the human brain constructs aesth...
Contexts in source publication
Context 1
... we found in a representational dissimilarity analysis over visual stimuli that low-level features seem to capture art genres, whereas high-level features go beyond the genres (Figure S2). We also found that we could reliably predict value ratings for online participants (Figure 1G), not only when training the model on the online participants' data (using leave-one-out cross-validation) but also when the model had been trained using in-lab participants' data. ...
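The cross-validation scheme described above can be illustrated with a minimal numpy-only sketch: a linear model is fit on the averaged ratings of all-but-one participant and evaluated on the held-out participant. All data, dimensions, and the averaging step here are synthetic stand-ins, not the study's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative stand-in data: one feature matrix over images and one
# rating vector per participant (sizes are assumptions, not the paper's).
n_imgs, n_feat, n_subj = 60, 5, 8
X = rng.normal(size=(n_imgs, n_feat))
w = rng.normal(size=n_feat)
ratings = np.stack([X @ w + rng.normal(scale=0.7, size=n_imgs)
                    for _ in range(n_subj)])  # shape (n_subj, n_imgs)

def ols(X, y):
    """Ordinary least squares with an intercept column."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

# Leave-one-participant-out: fit on the mean ratings of the remaining
# participants, then predict the held-out participant's ratings.
corrs = []
for s in range(n_subj):
    train_y = ratings[np.arange(n_subj) != s].mean(axis=0)
    coef = ols(X, train_y)
    pred = np.column_stack([np.ones(n_imgs), X]) @ coef
    corrs.append(np.corrcoef(pred, ratings[s])[0, 1])
mean_corr = float(np.mean(corrs))
```

With a shared linear signal across participants, the held-out correlation is well above chance, mirroring the generalization across participant groups described above.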
Context 2
... potential concern we had was that our ability to predict artwork rating scores using this linear model might be somehow idiosyncratic to specific properties of the stimuli used in our stimulus set. To address this, we investigated the extent to which our findings generalize to other kinds of visual images by using a new image database of 716 images (ref. 29; Figure 2A), this time involving photographs (as opposed to paintings) of various objects and scenes, including landscapes, animals, flowers, and pictures of food. We obtained ratings for these 716 images in a new M-Turk sample of 382 participants. ...
Context 3
... obtained ratings for these 716 images in a new M-Turk sample of 382 participants. Using the low-level attributes alone (these images were not annotated with high-level features), the linear integration model could reliably predict photograph ratings (Figure 2B). The model performed well when trained and tested on the photograph database, but to our surprise, the same model (as trained on photographs) could also predict the ratings for paintings that we collected in our first experiment, and vice versa (a model trained on the painting ratings could predict photograph ratings). ...
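The cross-dataset test and the permutation-based null described above can be sketched as follows; the feature matrices and ratings are synthetic stand-ins, and the ordinary-least-squares fit and 200 permutations are illustrative choices, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: low-level feature matrices for "paintings" and
# "photographs", with ratings generated from a shared linear rule.
n_paint, n_photo, n_feat = 200, 716, 10
w_true = rng.normal(size=n_feat)
X_paint = rng.normal(size=(n_paint, n_feat))
X_photo = rng.normal(size=(n_photo, n_feat))
y_paint = X_paint @ w_true + rng.normal(scale=0.5, size=n_paint)
y_photo = X_photo @ w_true + rng.normal(scale=0.5, size=n_photo)

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    Xb = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xb, y, rcond=None)[0]

def predict(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Train on paintings, test on photographs.
coef = fit_linear(X_paint, y_paint)
r_cross = np.corrcoef(predict(coef, X_photo), y_photo)[0, 1]

# Permutation null: shuffle the image labels (ratings) and refit.
null = []
for _ in range(200):
    c = fit_linear(X_paint, rng.permutation(y_paint))
    null.append(np.corrcoef(predict(c, X_photo), y_photo)[0, 1])
p_value = np.mean(np.abs(null) >= abs(r_cross))
```

Because the two synthetic datasets share one underlying linear rule, the cross-dataset correlation far exceeds the permutation null, which is the logic of the significance test described in the figure caption.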
Context 4
... also performed a control PPI analysis to test the specificity of the coupling to an experimental epoch by constructing a similar PPI regressor locked to the epoch of inter-trial intervals (ITIs). This analysis showed a dramatically altered coupling that did not involve the same PPC and PFC regions (Figure S12). These findings suggest that both PPC and lPFC are specifically involved in the integration of feature representations in order to compute the subjective value, and that this integration depends on connections between feature representations in PPC/lPFC and subjective value representations in mPFC. ...
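A PPI regressor of the kind described above is conventionally formed as the product of a (centred) seed timecourse and a psychological regressor; the toy GLM below sketches this under strong simplifying assumptions (no HRF convolution or deconvolution, white noise, arbitrary epoch lengths).

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy timecourses; a real analysis would use deconvolved BOLD signals
# and convolve with an HRF -- all of that is omitted in this sketch.
n_tr = 240
seed = rng.normal(size=n_tr)                   # seed-region timecourse
psych = np.zeros(n_tr)                         # psychological regressor
psych[(np.arange(n_tr) // 20) % 2 == 1] = 1.0  # alternating stimulus epochs

# PPI term: element-wise product of the centred seed and psychological regressors.
ppi = (seed - seed.mean()) * (psych - psych.mean())

# Target voxel that couples with the seed only during stimulus epochs.
voxel = 0.8 * ppi + 0.3 * seed + rng.normal(scale=0.5, size=n_tr)

# GLM with intercept, seed, psych, and PPI terms; the PPI beta measures
# epoch-specific coupling over and above the main effects.
design = np.column_stack([np.ones(n_tr), seed, psych, ppi])
betas = np.linalg.lstsq(design, voxel, rcond=None)[0]
ppi_beta = betas[3]
```

Swapping the stimulus epochs for ITI epochs in `psych` is the control analysis described above: if coupling is specific to stimulus evaluation, the ITI-locked PPI beta should not reproduce the effect.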
Context 5
... results of the functional coupling analysis show that features that are represented in the PPC and lPFC are coupled with the region in mPFC encoding subjective value. This result dramatically contrasts with our control analysis focusing on ITI instead of stimulus presentations (Figure S12). ...
Context 6
... coupling with seed (Figure S12): the same analysis as in Figure 7C, except that here the epochs of the ITIs are taken as the psychological regressor, as opposed to the epochs of presentation of the visual stimuli. In this situation, we did not observe robust coupling between mPFC value areas and lateral PFC and PPC, thereby supporting the possibility that increased coupling between lPFC, PPC and mPFC occurs specifically at the time of stimulus evaluation. ...
Similar publications
'Digital India' is the Government of India's initiative to make all utility services available to Indian citizens electronically. The Digital India programme was launched on 1 July 2015 by Prime Minister Narendra Modi with the vision of transforming India into a digitally empowered nation. Several initiatives have been taken under Digital India...
Citations
... This was done by manual annotation, but it can also be done with a human-detection algorithm (e.g., see ref. 96). We originally included this presence-of-a-person feature in the low-level feature set (ref. 97), though we found in our DCNN analysis that the feature shows the signature of a high-level feature (ref. 97). Therefore, in the current study, we included the presence-of-a-person feature in the high-level feature set. ...
... We thus sought to identify a minimal subset of features that are commonly used by participants. In ref. 97, we performed this analysis using the Matlab Sparse Gradient Descent Library (https://github.com/hiroyuki-kasai/SparseGDLibrary). For this, we first orthogonalized the features by sparse PCA (ref. 98). ...
... Decoding features from the deep neural network. We decoded the LFS model features from hidden layers by using linear (for continuous features) and logistic (for categorical features) regression models, as described in ref. 97. We considered the output activations of the ReLU layers (15 layers in total). ...
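A minimal sketch of this decoding step, using synthetic stand-ins for the layer activations: a continuous feature is decoded with least squares and a binary feature with a hand-rolled logistic regression. The cited work's exact models, regularization, and cross-validation may differ.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in "layer activations": in the cited work these would be ReLU
# outputs of a deep network; here they are random features carrying a
# linear trace of one continuous and one binary LFS-style feature.
n_imgs, n_units = 300, 50
acts = rng.normal(size=(n_imgs, n_units))
feat_cont = acts @ rng.normal(size=n_units) + rng.normal(scale=1.0, size=n_imgs)
feat_cat = (acts @ rng.normal(size=n_units) > 0).astype(float)

Xb = np.column_stack([np.ones(n_imgs), acts])  # add intercept column

# Continuous feature: linear regression (least squares).
coef = np.linalg.lstsq(Xb, feat_cont, rcond=None)[0]
r_cont = np.corrcoef(Xb @ coef, feat_cont)[0, 1]

# Categorical feature: logistic regression by plain gradient descent.
w = np.zeros(n_units + 1)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Xb @ w)))      # predicted probabilities
    w -= 0.1 * Xb.T @ (p - feat_cat) / n_imgs  # gradient of log-loss
acc = np.mean((1.0 / (1.0 + np.exp(-(Xb @ w))) > 0.5) == feat_cat)
```

In-sample decoding accuracy here only shows the mechanics; a faithful analysis would evaluate on held-out images, as the original study's cross-validated decoding does.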
Little is known about how the brain computes the perceived aesthetic value of complex stimuli such as visual art. Here, we used computational methods in combination with functional neuroimaging to provide evidence that the aesthetic value of a visual stimulus is computed in a hierarchical manner via a weighted integration over both low and high level stimulus features contained in early and late visual cortex, extending into parietal and lateral prefrontal cortices. Feature representations in parietal and lateral prefrontal cortex may in turn be utilized to produce an overall aesthetic value in the medial prefrontal cortex. Such brain-wide computations are not only consistent with a feature-based mechanism for value construction, but also resemble computations performed by a deep convolutional neural network. Our findings thus shed light on the existence of a general neurocomputational mechanism for rapidly and flexibly producing value judgements across an array of complex novel stimuli and situations.
... This procedure is common practice in machine learning (e.g. Battleday et al., 2020;Hebart et al., 2020;Iigaya et al., 2020) and is the basis for how deep neural networks (DNNs) such as VGG16 (Simonyan & Zisserman, 2015) process images. In theory, this feature vector can be a representation of the stimulus on any level, e.g., the raw pixel values of an image, or, at the opposite extreme, the penultimate layer of a DNN. ...
... The challenge is taken on by unsupervised learning: determining the relevant feature space, how it is learned, how it reflects the expected true distribution, and how these features are re-weighted during learning (Baldi, 2012;Barlow, 1989;Ghahramani, 2004;Hinton & Sejnowski, 1999). Recent work in machine learning has succeeded in creating simple vector representations from images that predict human category (Battleday et al., 2020), similarity (Hebart et al., 2020), and aesthetic judgments (Iigaya et al., 2020). These feature representations can and do include high-level features. ...
... A single feature can capture semantic information, such as animacy (Hebart et al., 2020). What is more, high-level image properties, such as concreteness and valence, can be constructed from low-level image features that are extracted directly from pixel values (Iigaya et al., 2020). Based on these findings, one could imagine finding a suitable feature space for our model by exploiting feature representations of the final layers of a deep neural net (Battleday et al., 2020;Iigaya et al., 2020) or ratings of independent observers (Hebart et al., 2020). ...
People invest precious time and resources on experiences such as watching movies or listening to music. Yet, we still have a poor understanding of how such sensed experiences gain aesthetic value. We propose a model of aesthetic value that integrates existing theories with literature on conventional primary and secondary rewards such as food and money. We assume that the states of observers' sensory and cognitive systems adapt to process stimuli effectively in both the present and the future. These system states collectively comprise a probabilistic generative model of stimuli in the environment. Two interlinked components generate value: immediate sensory reward and the change in expected future reward. An immediate sensory reward is taken as the fluency with which a stimulus is processed, quantified by the likelihood of that stimulus given an observer's state. The change in expected future reward is taken as the change in fluency with which likely future stimuli will be processed. It is quantified by the change in the divergence between the observer's system state and the distribution of stimuli that the observer expects to see over the long term. Simulations show that a simple version of the model can account for empirical data on the effects of exposure, complexity, and symmetry on aesthetic value judgments. Taken together, our model melds processing fluency theories (immediate reward) and learning theories (change in expected future reward). Its application offers insight as to how the interplay of immediate processing fluency and learning gives rise to aesthetic value judgments.
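A toy 1-D Gaussian instantiation of the two components described above: immediate reward as the log-likelihood of a stimulus under the observer's state, and future reward as the reduction in divergence between the (adapting) observer state and the expected long-run stimulus distribution. The Gaussian forms and the learning rate are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    """KL divergence between 1-D Gaussians N(mu0, var0) || N(mu1, var1)."""
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1)

def aesthetic_value(stim, mu, var, mu_env, var_env, lr=0.2):
    """Immediate fluency reward plus change in expected future reward."""
    # Immediate reward: log-likelihood of the stimulus under the
    # observer's current state (processing fluency).
    fluency = -0.5 * (np.log(2 * np.pi * var) + (stim - mu) ** 2 / var)
    # The observer's state adapts toward the stimulus.
    mu_new = mu + lr * (stim - mu)
    # Future reward: reduction in divergence between the observer's state
    # and the expected long-run stimulus distribution.
    learning = (kl_gauss(mu, var, mu_env, var_env)
                - kl_gauss(mu_new, var, mu_env, var_env))
    return fluency + learning, mu_new

# Two equally fluent stimuli: one pulls the observer toward the long-run
# environment distribution (mean 1.0), the other pulls it away.
v_toward, _ = aesthetic_value(stim=0.5, mu=0.0, var=1.0, mu_env=1.0, var_env=1.0)
v_away, _ = aesthetic_value(stim=-0.5, mu=0.0, var=1.0, mu_env=1.0, var_env=1.0)
```

Both stimuli are equidistant from the observer's current mean, so the immediate fluency terms match; the stimulus whose processing moves the observer toward the expected environment earns the extra learning reward, which is the model's second component.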
... how subjective value was formed through bottom-up processing (Iigaya et al., 2020;Pham et al., 2021) indicates that the extended BVS might serve as a gateway for domain-specific value computation or attribute value computation (Lim et al., 2013). A natural interpretation of the higher pleasantness-rating scores for young faces found in both age groups is that young faces are more rewarding than older faces, regardless of participants' age. ...
Own-age bias is a well-known bias reflecting the effects of age, and its role has been demonstrated, particularly, in face recognition. However, it remains unclear whether an own-age bias exists in facial impression formation. In the present study, we used three datasets from two published and one unpublished functional magnetic resonance imaging (fMRI) study that employed the same pleasantness rating task with fMRI scanning and preferential choice task after the fMRI to investigate whether healthy young and older participants showed own-age effects in face preference. Specifically, we employed a drift-diffusion model to elaborate the existence of own-age bias in the processes of preferential choice. The behavioral results showed higher rating scores and higher drift rate for young faces than for older faces, regardless of the ages of participants. We identified a young-age effect, but not an own-age effect. Neuroimaging results from aggregation analysis of the three datasets suggest a possibility that the ventromedial prefrontal cortex (vmPFC) was associated with evidence accumulation of own-age faces; however, no clear evidence was provided. Importantly, we found no age-related decline in the responsiveness of the vmPFC to subjective pleasantness of faces, and both young and older participants showed a contribution of the vmPFC to the parametric representation of the subjective value of face and functional coupling between the vmPFC and ventral visual area, which reflects face preference. These results suggest that the preferential choice of face is less susceptible to the own-age bias across the lifespan of individuals.
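The drift-diffusion account above can be illustrated with a minimal simulation in which a higher drift rate (as reported for young faces) yields a higher probability of reaching the "preferred" bound. The drift values, threshold, and time step are arbitrary illustrative choices, not parameters from the study.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_ddm(drift, threshold=1.0, dt=0.01, sigma=1.0, n_trials=1000):
    """Fraction of trials in which evidence hits the upper (preferred) bound."""
    upper_hits = 0
    for _ in range(n_trials):
        x = 0.0
        # Accumulate noisy evidence until either bound is crossed.
        while abs(x) < threshold:
            x += drift * dt + sigma * np.sqrt(dt) * rng.normal()
        upper_hits += x >= threshold
    return upper_hits / n_trials

p_young = simulate_ddm(drift=0.8)  # higher drift rate (young faces)
p_older = simulate_ddm(drift=0.2)  # lower drift rate (older faces)
```

The higher drift rate produces both more frequent upper-bound choices and, in a fuller simulation that records first-passage times, faster ones, matching the behavioral signature reported above.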
... In the case of the ventral visual pathway, this could include factors such as visual openness (the degree to which a scene provides a wide angle of view; Greene and Oliva, 2009) and concreteness (the degree to which an image depicts specific representational content; Chatterjee et al., 2010). Both of these factors have been shown to be positively correlated with aesthetic ratings (Franz et al., 2005;Biederman and Vessel, 2006;Vartanian et al., 2015;Iigaya et al., 2020) and to modulate neural activity in higher level visual regions (though for openness in the opposite direction) (Henderson et al., 2011;Iigaya et al., 2020). In the case of the dorsal visual pathway, this may include higher-order motion cues (such as that observed in clouds), responses to optic flow, object tracking, or the degree to which a landscape affords exploration. ...
During aesthetically appealing visual experiences, visual content provides a basis for computation of affectively tinged representations of aesthetic value. How this happens in the brain is largely unexplored. Using engaging video clips of natural landscapes, we tested whether cortical regions that respond to perceptual aspects of an environment (e.g., spatial layout, object content and motion) were directly modulated by rated aesthetic appeal. Twenty-four participants watched a series of videos of natural landscapes while being scanned using functional magnetic resonance imaging (fMRI) and reported both continuous ratings of enjoyment (during the videos) and overall aesthetic judgments (after each video). Although landscape videos engaged a greater expanse of high-level visual cortex compared to that observed for images of landscapes, independently localized category-selective visual regions (e.g., scene-selective parahippocampal place area and motion-selective hMT+) were not significantly modulated by aesthetic appeal. Rather, a whole-brain analysis revealed modulations by aesthetic appeal in ventral (collateral sulcus) and lateral (middle occipital sulcus, posterior middle temporal gyrus) clusters that were adjacent to scene and motion selective regions. These findings suggest that aesthetic appeal per se is not represented in well-characterized feature- and category-selective regions of visual cortex. Rather, we propose that the observed activations reflect a local transformation from a feature-based visual representation to a representation of “elemental affect,” computed through information-processing mechanisms that detect deviations from an observer’s expectations. Furthermore, we found modulation by aesthetic appeal in subcortical reward structures but not in regions of the default-mode network (DMN) nor orbitofrontal cortex, and only weak evidence for associated changes in functional connectivity. 
In contrast to other visual aesthetic domains, aesthetically appealing interactions with natural landscapes may rely more heavily on comparisons between ongoing stimulation and well-formed representations of the natural world, and less on top-down processes for resolving ambiguities or assessing self-relevance.
... Assuming such linear mechanisms is common even in Machine Learning to lighten computations and mathematical analysis (Chung et al., 2018). In addition, linear methods have also been well-explored theoretically (Tsitsiklis and Van Roy, 1997;Maei, 2011;Mahmood and Sutton, 2015;Iigaya et al., 2020) and empirically (Dann et al., 2014;White and White, 2016) in the Machine Learning literature. Finally, arguments have been made that linear rules perform comparably to deep neural networks when predicting subjective aesthetic values (Iigaya et al., 2020). ...
... In addition, linear methods have also been well-explored theoretically (Tsitsiklis and Van Roy, 1997;Maei, 2011;Mahmood and Sutton, 2015;Iigaya et al., 2020) and empirically (Dann et al., 2014;White and White, 2016) in the Machine Learning literature. Finally, arguments have been made that linear rules perform comparably to deep neural networks when predicting subjective aesthetic values (Iigaya et al., 2020). However, modeling nonlinear processes with linear approximations should produce errors, or equivalently, regret in Machine Learning terminology (Kaelbling et al., 1996;Sutton and Barto, 2018; formally, regret is the difference between an agent's performance and that of an agent that acts optimally). ...
A theoretical framework for the reinforcement learning of aesthetic biases was recently proposed based on brain circuitries revealed by neuroimaging. A model grounded on that framework accounted for interesting features of human aesthetic biases. These features included individuality, cultural predispositions, stochastic dynamics of learning and aesthetic biases, and the peak-shift effect. However, despite the success in explaining these features, a potential weakness was the linearity of the value function used to predict reward. This linearity meant that the learning process employed a value function that assumed a linear relationship between reward and sensory stimuli. Linearity is common in reinforcement learning in neuroscience. However, linearity can be problematic because neural mechanisms and the dependence of reward on sensory stimuli are typically nonlinear. Here, we analyze the learning performance with models including optimal nonlinear value functions. We also compare updating the free parameters of the value functions with the delta rule, which neuroscience models use frequently, vs. updating with a new Phi rule that considers the structure of the nonlinearities. Our computer simulations showed that optimal nonlinear value functions resulted in reductions of learning errors when the reward models were nonlinear. Similarly, the new Phi rule led to reductions in these errors. These improvements were accompanied by the straightening of the trajectories of the vector of free parameters in its phase space. This straightening meant that the process became more efficient in learning the prediction of reward. Surprisingly, however, this improved efficiency had a complex relationship with the rate of learning. Finally, the stochasticity arising from the probabilistic sampling of sensory stimuli, rewards, and motivations helped the learning process narrow the range of free parameters to nearly optimal outcomes.
Therefore, we suggest that value functions and update rules optimized for social and ecological constraints are ideal for learning aesthetic biases.
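The linear value function with delta-rule updating that serves as the baseline in the work above can be sketched as follows. In this toy world the reward really is linear in the stimulus features, so the delta rule drives the prediction error toward zero; the feature dimensionality and learning rate are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)

n_feat, n_steps, lr = 8, 5000, 0.05
w_true = rng.normal(size=n_feat)  # ground-truth linear reward rule
w = np.zeros(n_feat)              # learned value-function weights

errors = []
for t in range(n_steps):
    x = rng.normal(size=n_feat)   # sensory stimulus features
    r = w_true @ x                # reward, linear in this toy world
    v = w @ x                     # predicted value
    w += lr * (r - v) * x         # delta rule: w <- w + lr * (r - v) * x
    errors.append((r - v) ** 2)

early = float(np.mean(errors[:100]))
late = float(np.mean(errors[-100:]))
```

When the true reward rule is instead nonlinear in `x`, this same update converges to the best linear approximation and leaves a residual error, which is the regret that motivates the nonlinear value functions and the Phi rule discussed above.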
... This is largely a problem of unsupervised learning: How does an agent build useful representations of sensory data (Baldi, 2012;Barlow, 1989;Ghahramani, 2004;Hinton & Sejnowski, 1999)? Recent work in machine learning has succeeded in creating simple vector representations from images that predict human category (Battleday et al., 2020), similarity (Hebart et al., 2020), and aesthetic judgments (Iigaya et al., 2020). Similar to these authors, one could imagine finding a suitable feature space for our model by exploiting feature representations of the final layers of a deep neural net (Battleday et al., 2020;Iigaya et al., 2020) or ratings of independent observers (Hebart et al., 2020). ...
... Recent work in machine learning has succeeded in creating simple vector representations from images that predict human category (Battleday et al., 2020), similarity (Hebart et al., 2020), and aesthetic judgments (Iigaya et al., 2020). Similar to these authors, one could imagine finding a suitable feature space for our model by exploiting feature representations of the final layers of a deep neural net (Battleday et al., 2020;Iigaya et al., 2020) or ratings of independent observers (Hebart et al., 2020). Iigaya and colleagues (2020), for instance, showed that people's liking ratings of artworks can be predicted well above chance by re-training the last three layers of VGG16. ...
... Thus, closing word positions predict chills across individuals. Furthermore, using machine-learning algorithms, a large study in the visual domain demonstrated on both computational and neural levels that aesthetic preferences can be described and lawfully predicted across individuals from the physical features of paintings and photographs [48]. Seminal work on statistical image properties in visual arts has been conducted by Redies and colleagues [49,50]. ...
... In light of advances in the development of computational tools, artificial intelligence, and data collection from ever-larger sample sizes, we expect a significant increase of further evidence attesting to the value of feature-based approaches and their power to explain substantial amounts of variance in aesthetic experiences. Importantly, this still leaves room for variance that is bound to interindividual differences due to personality dispositions, personal history, and idiosyncratic weighting of particular stimulus features [4,48]. Moreover, the particular cognitive framing of an object (e.g., fiction versus documentary) can readily trigger top-down control mechanisms and modify aesthetic processing [53]. ...
Empirical aesthetics has found its way into mainstream cognitive science. Until now, most research has focused either on identifying the internal processes that underlie a perceiver’s aesthetic experience or on identifying the stimulus features that lead to a specific type of aesthetic experience. To progress, empirical aesthetics must integrate these approaches into a unified paradigm that encourages researchers to think in terms of temporal dynamics and interactions between: (i) the stimulus and the perceiver; (ii) different systems within the perceiver; and (iii) different layers of the stimulus. At this critical moment, empirical aesthetics must also clearly identify and define its key concepts, sketch out its agenda, and specify its approach to grow into a coherent and distinct discipline.
... Moreover, the use of pictures and words to represent food may help clarify the individual and collective roles of basic visual and semantic information in processing of nutritional content. Furthermore, computational modelling may be useful to clarify the respective contributions of each information processing level (Iigaya et al., 2020). ...
Many food decisions are made rapidly and without reflective processing. The ability to determine nutritional information accurately is a precursor of food decisions and is important for a healthy diet and weight management. However, little is known about the cognitive evaluation of food attributes based on visual information in relation to assessing nutritional content. We investigated the accuracy of visual encoding of nutritional information after brief and extended time exposures to food images. The following questions were addressed: (1) how accurately do people estimate energy and macronutrients after brief exposure to food images, and (2) how does estimation accuracy change with time exposure and the type of nutritional information? Participants were first asked to rate the energy density (calories) and macronutrient content (carbohydrates/fat/protein) of different sets of food images under three time conditions (97, 500 or 1000 ms) and then asked to perform the task with no time constraints. We calculated estimation accuracy by computing the correlations between estimated and actual nutritional information for each time exposure and compared estimation accuracy with respect to the type of nutritional information and the exposure time. The estimated and actual energy densities and individual macronutrient content were significantly correlated, even after a brief exposure time (97 ms). The degree of accuracy of the estimations did not differ with additional time exposure, suggesting that <100 ms was sufficient to predict the energy and macronutrients from food images. Additionally, carbohydrate estimates were less accurate than the estimates of other nutritional variables (proteins, fat and calories), regardless of the exposure time. These results revealed rapid and accurate assessment of food attributes based on visual information and the accuracy of visual encoding of nutritional information after brief and extended time exposure to food imagery.
... Visual and psychological processes related to art perception and processing have been proposed previously (Ramachandran and Hirstein, 1999;Leder et al., 2004), and neuroscientific studies assessed and localized brain activity in relation to esthetic value (Cela-Conde et al., 2004;Kawabata and Zeki, 2004;Vartanian and Skov, 2014;Lebreton et al., 2015). While esthetic value is considerably subjective, a recent study shows that (visual) esthetic value can be predicted by brain activity based on the integration and different weighting of (visual) features of the presented art image (Iigaya et al., 2020), including low-level (hue, saturation, lightness, color, brightness, blurring effects, edge detection) (Li and Chen, 2009) and high-level features (color temperature, depth, abstract, emotion, complexity) (Chatterjee et al., 2010). Thus, presumably, primary and secondary rewards are not "randomly" processed in the brain but have, at least to a certain extent, a common ground in human brain computations of stimulus features, which have most likely evolved to serve adaptive behaviors in different environments (Skov and Nadal, 2018;Skov and Skov, 2019). ...
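A few low-level features of the kind listed above (lightness, saturation, a crude edge measure) can be computed directly from pixel values. This numpy-only sketch uses simple stand-in definitions that may differ from those in the cited implementations.

```python
import numpy as np

rng = np.random.default_rng(6)

# Stand-in RGB image with values in [0, 1]; a real pipeline would load
# an actual painting or photograph here.
img = rng.random((64, 64, 3))

def low_level_features(img):
    mx, mn = img.max(axis=2), img.min(axis=2)
    lightness = (mx + mn) / 2.0                       # HSL lightness
    saturation = np.where(mx > 0, (mx - mn) / mx, 0)  # HSV-style saturation
    gray = img.mean(axis=2)
    # Edge strength via simple finite differences, a stand-in for a
    # proper edge detector such as Sobel or Canny.
    gx = np.abs(np.diff(gray, axis=1)).mean()
    gy = np.abs(np.diff(gray, axis=0)).mean()
    return {
        "mean_lightness": float(lightness.mean()),
        "mean_saturation": float(saturation.mean()),
        "edge_strength": float(gx + gy),
    }

feats = low_level_features(img)
```

Feature vectors of this kind are what the linear integration models discussed throughout this page take as input, with one learned weight per feature.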
... Others showed that fat and carbohydrate content elicit a supra-additive response for food valuation in the ventral striatum independent of liking (DiFeliceantonio et al., 2018), further highlighting that the brain's reward evaluation for food involves nutrient sensors in the gut (De Araujo et al., 2020). Considering art evaluation, a recent preprint suggests that feature integration of artistic stimuli might be ordered in a hierarchical way, from visual processing up to the integration of low- and high-level image features in the brain, in particular in higher-order areas such as parietal and prefrontal cortex (Iigaya et al., 2020). While it might seem counterintuitive to compare wanting art to wanting food, it has been argued that art objects, such as prints of art paintings or photographs, are often objects of desire, and not only for art collectors (Berridge and Kringelbach, 2008). ...
... However, knowledge about (secondary) reward-related neurobiology is still fragmentary, especially with regard to art and related value representations. The encoding of art viewing and experiencing seems to be multifaceted, and research on its neural correlates has only begun to uncover the specific brain signaling involved (Chatterjee and Vartanian, 2016;Iigaya et al., 2020). Viewing artworks, i.e., paintings, elicits, for example, activation of the default mode network (DMN) and in subcortical areas like the striatum in relation to the ratings of the painting (Vessel et al., 2012). ...
While art is omnipresent in human history, the neural mechanisms of how we perceive, value and differentiate art have only begun to be explored. Functional magnetic resonance imaging (fMRI) studies suggested that art acts as a secondary reward, involving brain activity in the ventral striatum and prefrontal cortices similar to primary rewards such as food. However, potential similarities or unique characteristics of art-related neuroscience (or neuroesthetics) remain elusive, also because of a lack of adequate experimental tools: the available collections of art stimuli often lack standard image definitions and normative ratings. Therefore, we here provide a large set of well-characterized, novel art images for use as visual stimuli in psychological and neuroimaging research. The stimuli were created using a deep learning algorithm that applied different styles of popular paintings (based on artists such as Klimt or Hundertwasser) on ordinary animal, plant and object images which were drawn from established visual stimuli databases. The novel stimuli represent mundane items with artistic properties with proposed reduced dimensionality and complexity compared to paintings. In total, 2,332 novel stimuli are available open access as "art.pics" database at https://osf.io/BTWNQ/ with standard image characteristics that are comparable to other common visual stimuli material in terms of size, variable color distribution, complexity, intensity and valence, measured by image software analysis and by ratings derived from a human experimental validation study [n = 1,296 (684f), age 30.2 ± 8.8 y.o.]. The experimental validation study further showed that the art.pics elicit a broad and significantly different variation in subjective value ratings (i.e., liking and wanting) as well as in recognizability, arousal and valence across different art styles and categories.
Researchers are encouraged to study the perception, processing and valuation of art images based on the art.pics database which also enables real reward remuneration of the rated stimuli (as art prints) and a direct comparison to other rewards from e.g., food or money.
Key Messages: We provide an open access, validated and large set of novel stimuli ( n = 2,332) of standardized art images including normative rating data to be used for experimental research. Reward remuneration in experimental settings can be easily implemented for the art.pics by e.g., handing out the stimuli to the participants (as print on premium paper or in a digital format), as done in the presented validation task. Experimental validation showed that the art.pics’ images elicit a broad and significantly different variation in subjective value ratings (i.e., liking, wanting) across different art styles and categories, while size, color and complexity characteristics remained comparable to other visual stimuli databases.
... Instead, we chose to limit the complexity of our theoretical framework at this first iteration to serve as a basic building block on which to incorporate the aforementioned factors. However, even a model based on low-level features can still be highly informative about the aesthetic preferences of individuals, as recently demonstrated by Iigaya et al. (2020). Additionally, a reinforcement learning circuit is easily amenable to additional factors; for example, Leong et al. (2017) incorporated attention directly into the reinforcement-learning circuitry computing subjective value, as we did with motivation. ...
... While we agree, we suggest that linearity is a suitable starting point. Recent work comparing a linear rule versus a deep neural network to predict subjective aesthetic value found that both fared comparably (Iigaya et al., 2020). We argue that this also applies for our theoretical framework for aesthetic learning. ...
How do we come to like the things that we do? Each one of us starts from a relatively similar state at birth, yet we end up with vastly different sets of aesthetic preferences. These preferences go on to define us both as individuals and as members of our cultures. Therefore, it is important to understand how aesthetic preferences form over our lifetimes. This poses a challenging problem: to understand this process, one must account for the many factors at play in the formation of aesthetic values and how these factors influence each other over time. A general framework based on basic neuroscientific principles that can also account for this process is needed. Here, we present such a framework and illustrate it through a model that accounts for the trajectories of aesthetic values over time. Our framework is inspired by meta-analytic data of neuroimaging studies of aesthetic appraisal. This framework incorporates effects of sensory inputs, rewards, and motivational states. Crucially, each one of these effects is probabilistic. We model their interactions under a reinforcement-learning circuitry. Simulations of this model and mathematical analysis of the framework lead to three main findings. First, different people may develop distinct weighing of aesthetic variables because of individual variability in motivation. Second, individuals from different cultures and environments may develop different aesthetic values because of unique sensory inputs and social rewards. Third, because learning is stochastic, stemming from probabilistic sensory inputs, motivations, and rewards, aesthetic values vary in time. These three theoretical findings account for different lines of empirical research. Through our study, we hope to provide a general and unifying framework for understanding the various aspects involved in the formation of aesthetic values over time.