Rodrigo
# Not from scratch: predicting thermophysical properties through model-based transfer learning using graph convolutional networks

## Reviewer: 1

#### Comments:

In this paper, the author introduced a graph convolutional network for the prediction of thermophysical properties with transfer learning. While a certain level of novelty was shown in the paper, there are several major and minor issues to be revised before the paper is published.

<span style="color:blue">[Comments]</span> **This reviewer does not seem to have many complaints beyond the need for a stronger emphasis on the pretraining side. Adding a section with more details might suffice.**

<span style="color:blue">[Kiho Comments]</span> **I agree with your opinion. Reviewer #1 is quite positive about our paper.**

Major issues:

> 1. To ensure the reproducibility of the paper, the framework and parameter of the convolutional network should be clarified in the method part. While the author uses fig 3 and 4 to symbolically express the network framework, a more precise and rigorous algorithm or flowchart should be given to state the input dimension, convolutional parameter (kernel size, filter size and type of convolution), other network parameters (number of layers, dense layers, etc.)

<span style="color:green">[Add]</span> **We need detailed information about the network parameters (we can add more technical details now that the reviewers align better with an ML background).**

<span style="color:blue">[Kiho Comments]</span> **Including the network parameters and further explanation will make the reviewer happy.**

> 2. Transfer learning is supposed to be the highlight of the paper, however, the author didn’t explain enough details for their transfer learning framework. Apart from references to the history of transfer learning, very limited information was given in section 2.4. The author should explain the pre-trained model used for transfer learning, including input, output, training set, and feature used. Also, the author should clearly state the process to incorporate the pre-trained model with the new model. A clear framework flowchart or algorithm including network parameters should be given in this section.

<span style="color:green">[Add]</span> **We need to add a section dedicated to the details of the TL part.**

<span style="color:blue">[Kiho Comments]</span> **Also, the importance of TL can be emphasized in the Abstract and Introduction. (Same response as for reviewer #2, comment #1.)**

##### Minor issue:

> 1. In the result section, the author only reported mean absolute error and mean absolute percentage error. To better evaluate the similarity between predicted results and experiment results, the Pearson correlation $R^2$ should be calculated and reported in the result section.

<span style="color:red">[Discuss]</span> **Do we really need to add these results?**

<span style="color:blue">[Kiho Comments]</span> **It might be better to add $R^2$ to the results.**

> 2. Figure 2 and figure 6 are almost identical except for the input. The author should reorganize those two figures to avoid repetition and highlight the transfer learning framework.

<span style="color:purple">[Modify]</span> **We might need to combine both images, but that would also mean restructuring some of the sections.**

<span style="color:blue">[Kiho Comments]</span> **It might be better to combine these figures to avoid duplicating the image. In that case, restructuring the paper will be unavoidable.**

## Reviewer: 2

#### Comments:

This work by Hormazabal et. al. is a thorough study on building a new methodology using GCNNs and transfer learning for a wide variety of organic and inorganic molecules. This manuscript is good fit for JCIM after some points are addressed:

<span style="color:blue">[Comments]</span> **This reviewer asks for some extra analysis of the clusters that arise in the t-SNE projections. Probably just adding results for inorganic/organic separately would make him happy (table or figure?).**

<span style="color:blue">[Kiho Comments]</span> **You are right. The additional results might be necessary. I am not sure which format will be better, but any type of result (table or figure) will be acceptable. Reviewer #2 is also positive about our paper.**

> 1. Although the goal of the manuscript is the methodology, specifying which thermophysical properties are optimized (critical temp, pressure and volume) in the abstract and the introduction will be useful for readers.

<span style="color:purple">[Modify]</span> **Simple modification. Clarify which properties are being predicted when talking about thermophysical properties.**

<span style="color:blue">[Kiho Comments]</span> **Great!**

> 2. Figures 10 and 11 have a representation of the chemical space used tSNE but is missing chemical insights. What are the specific larger chemical groups (amides, alcohols, acids, etc) contained in those spaces? Which particular spaces are underrepresented? Are these spaces difficult to navigate computationally or experimentally? Additionally, there are formatting errors with Figures 10 and 11 with the size of the box and the outline around the figures.
> 3. Following comment 2, some more detail about the organic vs inorganic chemical space in the dataset is desirable. The authors claim that it generalizes across both groups would need some more substantiation like metrics across both groups as well as the data split for organics vs inorganics in the dataset.

<span style="color:green">[Add]</span>
* **Add more analysis to the t-SNE plots.**
* **Label the clusters that arise and add comments.**
* **Underrepresented parts of the data are hard to explore experimentally.**
* **Not sure about the formatting errors the reviewer refers to.**

<span style="color:blue">[Kiho Comments]</span> **I also cannot find the formatting error in the figures (size? outline?). In Figure 10, the same color (red) represents both the upper limit (900) and the lower limit (200). This might be confusing, and I recommend changing the color map.**

> 4. Would this methodology work for areas where the amount of experimental data is a very small fraction compared to the computational data? What is the minimum amount of experimental data required to make transfer learning meaningful vs only building a model on simulated data? Which properties would be ideal for this workflow vs not? Including this in the discussion will give new users a good sense for the applicability of this method for new work and new properties.

<span style="color:green">[Add]</span>
* **Ideal for properties where there is a strong correlation between graph structure (atom vicinity) and the property, since it would be easier to transfer learned representations from models to the experimental task (add a paragraph discussing this).**
* **Talk about the importance of data diversity vs size. Rather than the _"minimum amount of data"_, the more important question is _"how similar/dissimilar (statistically speaking, in terms of the support of the training data distribution) should the model data and the experimental data be for this to be effective"_.**

<span style="color:blue">[Kiho Comments]</span> **The reviewer wants us to add insight into the TL method to our paper. As the reviewer suggested, we can add a paragraph to the discussion explaining the applicability of the method to new properties.**

## Reviewer: 3

#### Comments:

This work developed a transfer learning framework (Fig. 6), where pre-training of a graph-convolutional neural network using the predictions from the best available GCM model helps to create an accurate prediction model for tasks with scarce experimental data.

<span style="color:blue">[Comments]</span> **This reviewer seems to be the one who best understands the paper's contributions and focus. All his comments seem adequate and would actually improve the paper. Thoroughly fixing these weaknesses is key.**

<span style="color:blue">[Kiho Comments]</span> **I agree with your opinion. The response to reviewer #3 looks to be the most important.**

#### Major concerns:

> 1. Section 2.2, “To carry out fair comparisons…it is not known which samples were originally available in the original regression training set”: Is it possible to use time-split method for selecting test set? i.e. Use the experimental data after the publication date of J. Marrero and R. Gani (ref. 40) as test set.

<span style="color:red">[Discuss]</span> **This is a really good idea. However, the predictions used for comparison in this work are not the ones directly reported in Marrero and Gani's original work. We still do not have access to the data used by ProPred. This point needs more detailed analysis (there might be a way to split the data by time of publication, although it might take effort).**

<span style="color:blue">[Kiho Comments]</span> **If collecting data published after that date is possible, using the time-split method is the best option. However, I think it is too time-consuming. In that case, we can choose another option: explain and acknowledge the current limitation of the dataset, and include some discussion of the time-split method by referring to other papers. Let me know if you have any other options.**

> 2. Section 2.2, “In this study, the training dataset is chosen depending on the prediction discrepancy of the GCM and the experimental data available”: This approach doesn’t ensure that the test set in this study was not used for developed the GCM from J. Marrero and R. Gani. Although the GCM predictions on test set are less accurate than training set, they may be much better than random guess (if the test set were used in the GCM model).

<span style="color:red">[Discuss]</span> **No approach can completely establish which data was used for training without direct access to it. To make fair comparisons we must compromise in some way, and this way we can at least ensure that we are improving on the parts that were originally hard for previous methods.**

<span style="color:blue">[Kiho Comments]</span> **Can you explain the structure of the dataset and why we have to select the method to classify the training dataset?**

> 3. Section 2.2 Page 15 “By only using experimental points that are predicted accurately by the GCM, the chance of using problematic experimental data in the training process can be minimized.” - Such a data filtering approach will amplify the biases in previous model. What if the data used for constructing the previous GCM is problematic?

<span style="color:purple">[Modify]</span> **This is 100% correct, and pointing it out in the manuscript is a great addition. However, we might not be able to get around these biases and must compromise in some way. Clarifying this is necessary.**

<span style="color:blue">[Kiho Comments]</span> **Yes. A clear classification might be necessary in the revised manuscript.**

> 4. Instead of training a deep learning model to utilize the knowledge from existing models, does it help if taking the predictions from the existing models as additional input features for the traditional ML models in section 3.1?

<span style="color:red">[Discuss]</span> **This is a good question that cannot be answered without proper testing. However, my intuition tells me that the model would tend to ignore structural information and overfit on the previous-prediction feature. Maybe reframe the task as the difference $|y_m - y_p|$? There is probably some literature on this, so we need to check. The pre-training task works as a regularizer too, and it can help filter out poor predictions. This kind of links with the reviewer's previous comment about enforcing biases.**

<span style="color:blue">[Kiho Comments]</span> **As you pointed out, we may find some relevant literature to support the current approach. Then we can answer the question and may include further discussion in the revised manuscript.**

> 5. Does the transfer learning procedures in this work appeared in previous machine learning literature? (Try to demonstrate the innovation in highlight)

<span style="color:red">[Discuss]</span> **Our case is rather unusual, since normally predictions are made for related tasks but not exactly the one of interest. There are some parallels to draw with other tasks, and discussing the differences might be useful (need to recheck the literature of the past 2 years).**

<span style="color:blue">[Kiho Comments]</span> **One more check of the literature might be needed. Then we can answer the question simply.**

#### Minor concerns:

> 6. <span style="color:green">[Add]</span> Page 7 first bullet point: Briefly describe/define GCMs here, since it’s the first time the “GCMs” appeared.
> 7. <span style="color:purple">[Modify]</span> Page 7 second bullet point: The claim of “completely replaces the static group creation” method is an exaggeration since the transfer-learning model in this work depends on the predictions from traditional models. And there is no comparison between “GCMs model from previous literature (Ref. 40) + GCN with transfer learning” and “updated GCMs model with new experimental data (training set)”.
> 8. <span style="color:purple">[Modify]</span> Page 15: Define “model data” in Table 2, Table 3 here; or referring to Figure 6.
> 9. <span style="color:green">[Add]</span> Could you include the results from the latest GCM in Figure 7 for comparison?
> 10. <span style="color:purple">[Modify]</span> Table 5: There is no need to show error on training set. And it’s helpful to include the best performing model (XGBoost) results mentioned in 3.1.
> 11. <span style="color:red">[Discuss]</span> Section 3.2.1 feature analysis: Analyzing the results from Graph Attention Network seems less relevant to this work.

**The reviewer is right that this seems a bit unrelated (we added it for the previous journal) and might draw attention away from transfer learning, but I don't think just eliminating it is a good idea, since it might have appealed to previous reviewers.**

<span style="color:blue">[Kiho Comments]</span> **I agree with your opinion.**

> 12. <span style="color:green">[Add]</span> Page 28 “This implies that either the original molecular structure fed to the model has errors, or the actual experimental data reported might have problems” – Activity cliffs could be another reason.
> 13. <span style="color:purple">[Modify]</span> Figure 12: The caption and figure doesn’t match.
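
Regarding Reviewer 1's minor issue 1 (reporting $R^2$ alongside MAE and MAPE), a minimal sketch of how the three quantities could be computed with NumPy follows. The function name `regression_metrics` and the sample values are placeholders for illustration only; they are not taken from the manuscript or its data. One assumption worth flagging: the squared Pearson correlation and the coefficient of determination are different quantities that are both often written as $R^2$, so the sketch returns both and we would still need to decide which one the reviewer means.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (%), squared Pearson correlation, and the
    coefficient of determination for two equal-length 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    # Mean absolute error, in the units of the property.
    mae = np.mean(np.abs(err))
    # Mean absolute percentage error; assumes no true value is zero.
    mape = 100.0 * np.mean(np.abs(err / y_true))

    # Squared Pearson correlation between predictions and experiment.
    pearson_r = np.corrcoef(y_true, y_pred)[0, 1]

    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)

    return {
        "mae": float(mae),
        "mape": float(mape),
        "pearson_r2": float(pearson_r ** 2),
        "r2": float(1.0 - ss_res / ss_tot),
    }


# Placeholder values only, not results from the paper.
example = regression_metrics([100.0, 200.0, 300.0, 400.0],
                             [110.0, 190.0, 310.0, 390.0])
print(example)
```

The squared Pearson correlation is insensitive to a constant offset or scale error in the predictions, while the coefficient of determination penalizes both, which may matter when comparing against the GCM baseline.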
