<style>
h3 {color: black}
td {font-size: 16px}
.reveal {font-size: 20px}
.seafoam {color:#71eeb8}
.tiny {font-size: 10px; margin-top: 0; color: white}
.little {font-size: 15px; margin-top: 0; color: white}
</style>

## Intrinsic Rewards in Human Curiosity-Driven Exploration

Alex **Ten** (presenting)
Jacqueline **Gottlieb**
Pierre-Yves **Oudeyer**

|  |  |
| --- | --- |
| ![](https://i.imgur.com/7IuvQlZ.png =100x) | ![](https://i.imgur.com/WrP24cL.png =100x) |

Note:
Hello and thank you for joining me in this short presentation. My name is Alex and I am a PhD student at the INRIA research center of Bordeaux. In this presentation, I am going to share with you some of the work I've done with my supervisors Pierre-Yves OUDEYER and Jacqueline GOTTLIEB.

---

### What motivates people to explore *time-extended learning activities* when there are no external incentives?

![](https://i.imgur.com/Fzom3Zf.png =500x)

Note:
In this line of work, we are interested in **what determines how people explore time-extended learning activities when there are no extrinsic rewards**. To illustrate, imagine someone who is interested in programming. How does this person decide to engage in one activity and forgo other potential activities? Study choices like these require people to estimate how much they can learn over time by engaging in various tasks. An important part of such decisions is knowing that some activities would probably not lead to any learning.

---

### Learning Progress Hypothesis

* Evaluate competence
* Evaluate change in competence over time
* Engage in activities that result in the most learning

![](https://i.imgur.com/zR7aeQo.png)

1-Unlearnable / 2-Very challenging / 3-Somewhat challenging / 4-Trivial

Note:
These principles are captured by the learning progress hypothesis, which states that people follow an optimal strategy of choosing activities that result in maximal progress in learning.
One way to do this is to track trial-and-error performance and attend not only to the rates of successes OR errors, but also to how these rates change over time. OUR work shows some evidence for this idea: we present a computational model suggesting that the utility function underlying people's choices is composed of multiple components, and that some form of learning progress is likely to be one of them.

---

### Recently introduced experimental paradigm to study self-directed exploration of learning activities

![https://i.imgur.com/z7DZ5PB.png](https://i.imgur.com/z7DZ5PB.png =500x)

Note:
To show this empirically, we introduced an experimental paradigm where people were free to explore 4 different learning activities, represented as families of cartoon monster characters. On each trial of the task, people chose a family, tried to guess what a randomly sampled individual from that family liked to eat, and then received binary feedback on their guess. We designed the learning activities to be more or less challenging, and also included an activity that was essentially unlearnable.
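
The idea of tracking how success rates change over time, raised with the learning progress hypothesis, can be illustrated with a minimal sketch on binary feedback like the feedback in this task. This is not the model from the talk: the function name `learning_progress`, the fixed `window` size, and the toy feedback histories are all assumptions made for illustration.

```python
def learning_progress(feedback, window=4):
    """Recent success rate minus the success rate in the window before it.

    `feedback` is a list of 0/1 outcomes for one activity (1 = correct guess).
    The two-window comparison is an illustrative assumption, not the
    estimator used in the study.
    """
    if len(feedback) < 2 * window:
        return 0.0  # assumed default: too few trials to estimate a change
    recent = feedback[-window:]
    earlier = feedback[-2 * window:-window]
    return sum(recent) / window - sum(earlier) / window


# Hypothetical feedback histories for two activities.
histories = {
    "unlearnable": [0, 1, 0, 1, 0, 1, 0, 1],  # success rate stays flat
    "learnable": [0, 0, 0, 1, 0, 1, 1, 1],    # success rate rises
}
progress = {name: learning_progress(h) for name, h in histories.items()}
best = max(progress, key=progress.get)  # activity with the largest progress
```

Under this sketch, an activity whose success rate stays flat yields zero progress whether it is unlearnable or already mastered, which is why the rate of change carries information that the raw success rate does not.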

---

### Investigation of a large space of intrinsic motivations for self-directed exploration

Model of choice utility:

$$w_1 \color{gold}{\mathrm{M}}_{\alpha} + w_2\color{gold}{\mathrm{\Delta M}}_{\alpha, c} + w_3 \color{gold}{\mathrm{V}}_{\alpha} + w_4 \color{gold}{\mathrm{\Delta V}}_{\alpha, c}$$

|  |  |
|:---:| --- |
| ![](https://i.imgur.com/5disasx.png =350x) | $\color{gold}{\mathrm{M}}$ = mean competence (competence)<br/>$\color{gold}{\mathrm{V}}$ = competence variance (robustness)<br/>$\{\color{gold}{\mathrm{\Delta M}},~\color{gold}{\mathrm{\Delta V}}\}$ = two forms of learning progress: change in competence and change in competence variance over time<br/>$\{w_1,~w_2,~w_3,~w_4\}$ = utility parameters<br/>$\{\alpha,~c\}$ = temporal parameters |
| Model comparisons |  |

Note:
We then explored a linear utility model predicting individual choices from four distinct components: (1) estimated competence, (2) the variance of competence, and (3, 4) estimates of how each of these two quantities changes over time. In addition to estimating **the utility-function parameters**, we simultaneously fitted the **temporal parameters of these quantities**. These temporal parameters control the degree to which information from the past influences the current estimates of competence, competence variance, and their changes. We found that models with 3 to 4 components fit the observed choices better than models with fewer components.
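
To make the four-component utility concrete, here is a minimal sketch of how it could be computed for one activity from a history of binary feedback. The talk does not specify how the temporal parameters enter, so several things here are assumptions: `alpha` is treated as an exponential recency weight, `c` as a lag in trials for the two change terms, the initial estimates are arbitrary, and the names `running_stats` and `utility` are hypothetical.

```python
def running_stats(outcomes, alpha):
    """Exponentially weighted running mean and variance of 0/1 outcomes.

    Assumption: alpha in (0, 1] sets how quickly past trials are discounted.
    Returns two lists holding the estimate after each trial.
    """
    mean, var = 0.5, 0.0  # assumed initial estimates before any feedback
    means, variances = [], []
    for x in outcomes:
        delta = x - mean
        mean += alpha * delta
        var = (1 - alpha) * (var + alpha * delta * delta)
        means.append(mean)
        variances.append(var)
    return means, variances


def utility(outcomes, weights, alpha, c):
    """Linear utility w1*M + w2*dM + w3*V + w4*dV for a single activity.

    Assumption: dM and dV compare the current estimates with the
    estimates c trials earlier (clipped to the first trial).
    """
    if not outcomes:
        return 0.0  # assumed default for an activity with no history
    means, variances = running_stats(outcomes, alpha)
    past = max(len(means) - 1 - c, 0)
    m, v = means[-1], variances[-1]
    dm, dv = m - means[past], v - variances[past]
    w1, w2, w3, w4 = weights
    return w1 * m + w2 * dm + w3 * v + w4 * dv
```

Setting some weights to zero gives the reduced models with fewer components, for example `utility(history, (0.0, 1.0, 0.0, 0.0), alpha=0.5, c=1)` for a utility driven only by change in competence.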
We also show that objective improvement in our task could be predicted from the exploratory tendencies captured by individual utility models.

---

### To sum up

- We present empirical evidence that people follow learning progress to structure their exploration
- We demonstrate the diversity of potential mechanisms for self-directed learning
- We also demonstrate individual variability in exploratory strategies

Note:
In summary, we show some empirical support for the idea that humans follow some form of learning progress when deciding what to study next. Our research also highlights the diversity of potential mechanisms for intrinsically motivated exploration and the diversity of exploration strategies among individual learners.

---

## :v:

## See you at the Q\&A!

Note:
With this, let me thank you for your time. I hope to see you at the Q&A!