# To-Do DeepPlants

## Notes

* Since switching to the pixel system [1,84], the 1/distance reward has become very small (see the sketch below). Example from the hierarchy general setting: ![](https://i.imgur.com/zEBvGA2.png)
* Before, when the environment was on a [0,1] range, 1/distance gave rewards > 1.
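If the fix is simply a rescaling, a minimal sketch is below. It assumes the reward is the inverse of a tip-to-target distance now measured in pixels on an 84-pixel grid; the constant and function names are illustrative, not from the codebase.

```python
# Illustrative only: names are not from the growspace codebase.
RESOLUTION = 84  # distance is now measured in pixels on [1, 84]

def inverse_distance_reward(distance_px: float) -> float:
    """1/distance reward with the pixel distance normalized back to the
    old [0, 1] range, so rewards can again exceed 1 near the target."""
    distance_norm = max(distance_px / RESOLUTION, 1e-6)  # guard div-by-zero
    return 1.0 / distance_norm
```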
## Meeting notes

### to-do May 26:
- [ ] Rainbow, complete by this week
- [ ] Launch small tests while refactoring
- [ ] Implement comments
- [ ] Creative Commons (CC) or MIT license

### to-do May 19:
- [ ] Contextual bandits: https://github.com/RonyAbecidan/NeuralTS
- [ ] DEAP, CMA-ES specifically: https://github.com/DEAP/deap (will not do?)
- [x] Fix branching assert for out-of-boundary branching - temporary fix, need to ask Manuel
- [ ] Look at Atari hyperparameters for other baselines as a starting point

### to-do May 12:
- [x] Change reward for control, hierarchy and fairness
- [ ] Change reward for mnist:
    - [x] Sigmoid (launched on cluster, having issues)
    - [x] Log (tried, but still gave negative values)
    - [x] tried (could be promising but very long to run)
    - [x] tried sqrt(x)
- [ ] Make the mnist challenge load directly from the pytorch MNIST dataset instead of loading pictures.
- [ ] Rerun experiments:
    - [x] control ![](https://i.imgur.com/ytfM9l5.png)
    - [x] hierarchy ![](https://i.imgur.com/LgdHO0m.png)
    - [x] fairness ![](https://i.imgur.com/luAXXKS.png)
    - [ ] mnist

### to-do May 5:
- [ ] DO CONFIG FOR TESTING PEOPLE

### to-do April 28:
- [x] Add appendix to explain hyperparameters
- [x] IoU with and without discount % multiply on plant pixels
- [x] Try bigger resolution only for human rendering

### to-do April 22 (meeting minutes):
- [ ] Stable baselines to run all other baselines (I will)
- [x] Mean/variation of run time, pixel vs [0,1] (a timing sketch follows this list):
    * Matrix: reset times: 0.000573272705078125 +/- 0.00014614290525266836; step times: 0.01752043294906616 +/- 0.012189358080591066
    * Old version: reset times: 0.00019029617309570312 +/- 4.6077630014265054e-05; step times: 0.009691695785522461 +/- 0.008425047466218057
    * ![](https://i.imgur.com/5KPQDzi.png) where 1 = matrix, 2 = old version
- [x] Pick a generalized setting of PPO for all environments
- [x] Add plant pixels (will do in current branch, YH)
- [x] Matrix implementation across environments (started branch, YH)
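The reset/step means above can be collected with a loop along these lines; the environment id and episode count are placeholders, not the exact benchmark script we used.

```python
import time

import gym
import numpy as np

def time_env(env_id: str, episodes: int = 100) -> None:
    """Print mean +/- std of per-call reset and step times for a gym env."""
    env = gym.make(env_id)
    reset_times, step_times = [], []
    for _ in range(episodes):
        start = time.perf_counter()
        env.reset()
        reset_times.append(time.perf_counter() - start)
        done = False
        while not done:
            start = time.perf_counter()
            _, _, done, _ = env.step(env.action_space.sample())
            step_times.append(time.perf_counter() - start)
    print(f"reset times: {np.mean(reset_times)} +/- {np.std(reset_times)}")
    print(f"step times: {np.mean(step_times)} +/- {np.std(step_times)}")

# e.g. time_env("GrowSpaceEnv-Control-v0")  # hypothetical env id
```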
## "Timeline"

### Week April 19-26:
- [x] Finish matrix implementation, merge into master
- [x] Pick hyperparameters for all environments
- [x] Finalize paper transition into NeurIPS 2021 format

### Week April 26-May 3:
*All experiments 3 times*

Control:
- [x] Control: run experiments on easy case study, PPO **(Launched)**
- [x] Control: run experiments on hard case study, PPO **(Launched)**
- [ ] Control: run on standard setting PPO, stable-baselines, plot with oracle and random **(Launched PPO general, oracle and random)**

Hierarchy:
- [x] Hierarchy: run experiments on easy case study, PPO **(Launched)**
- [x] Hierarchy: run experiments on hard case study, PPO **(Launched)**
- [ ] Hierarchy: run on standard setting PPO, stable-baselines, plot with oracle and random **(Launched PPO general, oracle and random)**

Fairness:
- [x] Fairness: run experiments on target-middle case study **(Launched)**
- [x] Fairness: run experiments on target-above case study **(Launched)**
- [x] Fairness: run experiments on easy-plants case study **(Launched)**
- [ ] Fairness: run on standard setting PPO, stable-baselines, plot with oracle and random **(Launched PPO general and random, need to run oracle)**

Mnist: **(Currently running)**
- [ ] Mnist: run on standard setting (MnistMix) PPO, stable-baselines, plot with oracle and random **(Launched PPO general, oracle and random)**
- [x] Mnist: compare difficulties of digits 0-9, PPO **(Launched `run*_mnist*`)**
- [x] Box-and-whisker plot to see Mnist digits and difficulties
- [x] Curriculum ordering of digits from easy to hard, compare to standard? (Really not sure we will pull this off, but putting this here) **(made first curriculum)**

### Week May 3-10:
*Experiments from the previous week may still be ongoing*

- [ ] Write up Results section with results from experiments

**Control:** ![](https://i.imgur.com/II6i9zo.png)

**Hierarchy:** ![](https://i.imgur.com/MGU6bcV.png)
* Branches: ![](https://i.imgur.com/ra3Ude3.png)

**Fairness:** ![](https://i.imgur.com/X6HHbZ0.png)

**Mnist:**
* Comparison of digits: this results in an order of 3, 6, 2, 1, 4, 5, 7, 8, 9, 0
* First curriculum is launched (a table-driven sketch follows the snippet below):

```python
# Episode-based digit curriculum; boundaries made exclusive via elif so
# each episode falls in exactly one bucket.
if self.episode <= 2000:
    self.shape = self.path + '36' + '/'
elif self.episode <= 4000:
    self.shape = self.path + '362' + '/'
elif self.episode <= 6000:
    self.shape = self.path + '3621' + '/'
elif self.episode <= 8000:
    self.shape = self.path + '36214' + '/'
elif self.episode <= 10000:
    self.shape = self.path + '362145' + '/'
elif self.episode <= 12000:
    self.shape = self.path + '3621457' + '/'
elif self.episode <= 14000:
    self.shape = self.path + '36214578' + '/'
elif self.episode <= 20000:
    self.shape = self.path + 'partymix' + '/'
```

![](https://i.imgur.com/iCrDJ89.png)
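As a design note, the if-chain above could be collapsed into a schedule table; the thresholds and digit sets are copied from the snippet, while the function and constant names are illustrative:

```python
# Same schedule as the if-chain above, table-driven (illustrative names).
CURRICULUM = [
    (2000, '36'), (4000, '362'), (6000, '3621'), (8000, '36214'),
    (10000, '362145'), (12000, '3621457'), (14000, '36214578'),
    (20000, 'partymix'),
]

def curriculum_shape(path: str, episode: int) -> str:
    """Return the digit-set directory for the current episode."""
    for last_episode, digits in CURRICULUM:
        if episode <= last_episode:
            return path + digits + '/'
    return path + 'partymix' + '/'  # past the schedule: keep the full mix
```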
- [ ] Have growspace repo ready for sharing (May 7th)

### Week May 10-17:
- [ ] If curriculum learning was not completed, do so this week
- [ ] If other baselines (not PPO) were not completed for the first draft, do so

### Week May 17-24:
- [ ] Implement feedback from beta testers
- [ ] ....

### Week May 24-31:

## Hyperparameter tuning

### Best hyperparams (3 seeds - w.r.t. Reward Mean)

Final results: plots comparing all hyperparameters across all envs
![](https://i.imgur.com/oL9YDpv.png)
![](https://i.imgur.com/5gugyzq.png)

#### Control (reward mean: 21.319, name: `best_sweep_control_Apr17_02-06-27846077`)

    lr = 0.006697
    eps = 0.03068
    gamma = 0.8964
    use_gae = False
    gae_lambda = 0.3429
    entropy_coef = 0.1316
    value_loss_coef = 0.3638
    max_grad_norm = 0.3406
    num_steps = 4240
    optimizer = "adam"
    ppo_epoch = 15
    num_mini_batch = 25
    clip_param = 0.3758
    use_linear_lr_decay = True

#### Hierarchy (reward mean: 18.588, name: `hierarchy333_Apr20_18-05-09847823`)

    lr = 0.02214
    eps = 0.04762
    gamma = 0.9452
    use_gae = True
    gae_lambda = 0.6908
    entropy_coef = 0.08532
    value_loss_coef = 0.7231
    max_grad_norm = 0.2814
    num_steps = 2592
    optimizer = "adam"
    ppo_epoch = 11
    num_mini_batch = 65
    clip_param = 0.211
    use_linear_lr_decay = False

#### Fairness (reward mean: 3.837, name: `Fairness3seeds_Apr20_09-40-58847614`)

    lr = 0.02944
    eps = 0.0444
    gamma = 0.2065
    use_gae = False
    gae_lambda = 0.7383
    entropy_coef = 0.2854
    value_loss_coef = 0.2857
    max_grad_norm = 0.1301
    num_steps = 3480
    optimizer = "adam"
    ppo_epoch = 6
    num_mini_batch = 55
    clip_param = 0.2582
    use_linear_lr_decay = False

#### Mnist (reward mean: 9.865, name: `SpotlightSweep_Apr22_05-06-18847760`)

    lr = 0.03955
    eps = 0.03748
    gamma = 0.9013
    use_gae = True
    gae_lambda = 0.412
    entropy_coef = 0.04617
    value_loss_coef = 0.4661
    max_grad_norm = 0.5232
    num_steps = 2286
    optimizer = "adam"
    ppo_epoch = 4
    num_mini_batch = 31
    clip_param = 0.08368
    use_linear_lr_decay = True

#### Control vs. Continuous

![](https://i.imgur.com/HHNcCOs.png)

    lr = 0.02405
    eps = 0.03392
    gamma = 0.9325
    use_gae = True
    gae_lambda = 0.7988
    entropy_coef = 0.02783
    value_loss_coef = 0.4377
    max_grad_norm = 0.3353
    num_steps = 3846
    optimizer = "adam"
    ppo_epoch = 16
    num_mini_batch = 27
    clip_param = 0.09977
    use_linear_lr_decay = True

### Best hyperparams (3 seeds - w.r.t. Episode_Reward)

#### Control

    lr = 0.04747
    eps = 0.003255
    gamma = 0.2597
    use_gae = False
    gae_lambda = 0.3391
    entropy_coef = 0.3466
    value_loss_coef = 0.3693
    max_grad_norm = 0.1471
    num_steps = 3463
    optimizer = "adam"
    ppo_epoch = 15
    num_mini_batch = 46
    clip_param = 0.4327
    use_linear_lr_decay = True

#### Hierarchy

    lr = 0.06348
    eps = 0.03238
    gamma = 0.9805
    use_gae = True
    gae_lambda = 0.7463
    entropy_coef = 0.178
    value_loss_coef = 0.563
    max_grad_norm = 0.3398
    num_steps = 3244
    optimizer = "adam"
    ppo_epoch = 19
    num_mini_batch = 32
    clip_param = 0.08664
    use_linear_lr_decay = False

#### Fairness

    lr = 0.07827
    eps = 0.04517
    gamma = 0.1388
    use_gae = True
    gae_lambda = 0.3265
    entropy_coef = 0.308
    value_loss_coef = 0.2915
    max_grad_norm = 0.4387
    num_steps = 3285
    optimizer = "adam"
    ppo_epoch = 18
    num_mini_batch = 23
    clip_param = 0.09442
    use_linear_lr_decay = True

#### Mnist

    lr = 0.05616
    eps = 0.00107
    gamma = 0.5172
    use_gae = False
    gae_lambda = 0.8351
    entropy_coef = 0.4723
    value_loss_coef = 0.3531
    max_grad_norm = 0.3382
    num_steps = 4793
    optimizer = "adam"
    ppo_epoch = 7
    num_mini_batch = 23
    clip_param = 0.1206
    use_linear_lr_decay = False

#### Control vs. Continuous (the control run only)

    lr = 0.08721
    eps = 0.02147
    gamma = 0.3667
    use_gae = True
    gae_lambda = 0.8944
    entropy_coef = 0.3404
    value_loss_coef = 0.6391
    max_grad_norm = 0.3057
    num_steps = 1858
    optimizer = "adam"
    ppo_epoch = 15
    num_mini_batch = 84
    clip_param = 0.2914
    use_linear_lr_decay = True
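For reference, a hedged sketch of turning one of the records above into a launch command. The flag names assume an argparse interface for `main.py` in the style of common pytorch-a2c-ppo baselines and should be checked against the actual script.

```python
# Hypothetical launch helper; flag names are assumptions, values are the
# best Control sweep (w.r.t. Reward Mean) from above.
control_best = {
    "lr": 0.006697, "eps": 0.03068, "gamma": 0.8964,
    "gae-lambda": 0.3429, "entropy-coef": 0.1316,
    "value-loss-coef": 0.3638, "max-grad-norm": 0.3406,
    "num-steps": 4240, "ppo-epoch": 15, "num-mini-batch": 25,
    "clip-param": 0.3758,
}

cmd = ["python", "main.py", "--use-linear-lr-decay"]  # use_gae = False: omitted
for flag, value in control_best.items():
    cmd += [f"--{flag}", str(value)]
print(" ".join(cmd))
```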
### Chosen Environment Parameters

| Hyperparameters Growspace | Control, Hierarchy & Fairness | Mnist | |
| -------- | -------- | -------- | -------- |
| FIRST_BRANCH_HEIGHT | .24 | 0.05 | 0.5 |
| BRANCH_THICCNESS | 0.015 | 0.015 | 0.05 |
| BRANCH_LENGTH | 1/9 | 1/30 | 1/5 |
| MAX_BRANCHING | 10 | 1 | 20 |
| LIGHT_WIDTH | .25 | .1 | 1 |
| LIGHT_DIF | 250 | 100 | 400 |

### Sweep.yaml for PPO hyperparameters

| Parameter | Previous setting | Min | Max |
| -------- | -------- | -------- | -------- |
| lr | 2.5e-4 | 0.003 | 5e-6 |
| clip-param | 0.1 | Text | Text |
| value-loss-coef | 0.5 | Text | Text |
| num-processes | 1 | Text | Text |
| num-steps | 2000 | Text | Text |
| num-mini-batch | 4 | Text | Text |
| log-interval | 1 | Text | Text |
| entropy-coef | 0.01 | Text | Text |
| use-gae | 0.01 | Text | Text |

### Things done

I am thinking we can check the box when done:

- [x] Make different growspace environments a hyperparameter
- [x] Run sweep on multiple environments
- [x] Make pposweep.yml file
- [x] Pick parameters that best generate a growing plant
- [x] Run sweep on PPO hyperparameters for different reward structures (control & hierarchy)
- [x] Run sweep on PPO hyperparameters for fairness
- [x] Run sweep discrete versus continuous (sweep.yaml + pposweep.yaml)
- [x] Run sweep on PPO hyperparameters for MnistMix
- [x] Add action plots for wandb log
- [x] Fix episode rewards
- [x] Plot for lighting displacement
- [x] Plot continuous action space, how do we do this
- [x] Mancuso email

## List of things (March)

- ~~Add more digits for spotlight challenge~~
- ~~Remove initial stem from the similarity score~~
- Adding curriculum for hierarchical challenge and maybe to MNIST later, look here: https://lilianweng.github.io/lil-log/2020/01/29/curriculum-for-reinforcement-learning.html
- ~~Confirm shading by tomorrow~~
- ~~Sweep with buddy system~~
- ~~Normalize reward over the different digits tested on for the MNIST challenge (look at how they did it for Atari)~~
- Oracle MNIST
- Oracle review for all
- ~~Add digit 0 to MnistMix~~

## Testing on Spotlight

- ~~Run on a larger amount of 1s for reset~~
- ~~Run on the digit 7 for every reset~~
- ~~Alternate between 1 and 7 for every reset~~
- ~~Random digits for every reset~~
- ~~Look into a curriculum based off of the different tests mentioned above~~

## Add Configs for Buddy

| Hyperparameters Growspace | Current Value | Min | Max |
| -------- | -------- | -------- | -------- |
| FIRST_BRANCH_HEIGHT | .24 | 0.05 | 0.5 |
| BRANCH_THICCNESS | 0.015 | 0.015 | 0.05 |
| BRANCH_LENGTH | 1/9 | 1/30 | 1/5 |
| MAX_BRANCHING | 10 | 1 | 20 |
| LIGHT_WIDTH | .25 | .1 | 1 |
| LIGHT_DIF | 250 | 100 | 400 |

### Not sure if we should play around with these ones

- LIGHT_DISPLACEMENT = .1
- LIGHT_W_INCREMENT = .1
- DEFAULT_RES = 84
- MIN_LIGHT_WIDTH = .1
- MAX_LIGHT_WIDTH = .5

# Miscellaneous

## Other researchers who could be interested

* Heiko Hamann is Professor for Service Robotics at the University of Lübeck, Germany. Previous work includes "A robot to shape your natural plant".

# Future of Growspace

## Future Baselines

- Soft Actor-Critic
- TRPO
- https://github.com/hill-a/stable-baselines

## Future Work

- Water budget
- Root system (same branching pattern)
- DNA
- Different plants? Could refactoring allow for different plant models?

---

<details><summary><mark>Default Sweep (better name here?)</mark></summary>

    program: main.py
    method: random
    metric:
      goal: maximize
      name: Episode_Reward
    parameters:
      lr:  # MODIFIED
        distribution: uniform
        min: 1.0e-5
        max: 0.1
      eps:  # Leaving as is
        distribution: uniform
        min: 1e-7
        max: 0.05
      gamma:  # Leaving as is; values range from 0.18-0.96
        distribution: uniform
        min: 0.1
        max: 0.99
      use_gae:  # Leave as is
        distribution: categorical
        values: [True, False]
      use_linear_lr_decay:  # Leave as is
        distribution: categorical
        values: [True, False]
      gae_lambda:  # MODIFIED
        distribution: uniform
        min: 0.3
        max: 0.99
      entropy_coef:  # MODIFIED; do we want to try without it, as Simone proposes (because we have a small action space)?
        distribution: uniform
        min: 0.01
        max: 0.5
      value_loss_coef:  # MODIFIED
        distribution: uniform
        min: 0.25
        max: 0.75
      max_grad_norm:
        distribution: uniform
        min: 0.1
        max: 0.9
      num_steps:  # MODIFIED
        distribution: q_uniform
        min: 1000
        max: 5000
      ppo_epoch:  # Leave as is
        distribution: q_uniform
        min: 1
        max: 20
      num_mini_batch:  # Leave as is
        distribution: q_uniform
        min: 10
        max: 100
      clip_param:  # MODIFIED
        distribution: uniform
        min: 0.05
        max: 0.5
      seed:
        distribution: categorical
        values: [111, 222, 333]
      optimizer:
        distribution: categorical
        values: ["adam", "sgd"]
      momentum:
        distribution: uniform
        min: 0.95
        max: 0.999

</details>
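Assuming this is a Weights & Biases sweep config (the `program`/`method`/`metric`/`parameters` layout matches wandb's format), it would typically be registered with `wandb sweep <config>.yaml` and launched with `wandb agent <sweep-id>`; the file name and sweep id here are placeholders.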
