# What the reviewers care about

Note that one can submit a rebuttal PDF revision.

- Reviewer 1 (8). His complaints are mostly minor and can be addressed in an individualized reply.
- Reviewer 2 (6). His complaints are mostly about extensions of the algorithm to models more expressive than the Rasch model, such as 2PL, 3PL, or MIRT. He also has minor comments about the ethics discussion and some smaller comments that can be addressed in an individualized reply.
- Reviewer 3 (6). He also commented on extensions of the model, and has a small comment about joint estimation that can be addressed easily.
- Reviewer 4 (6). He commented that we should have compared our method to a Bayesian method and discussed our experimental results in more depth (why AUC and log-likelihood are similar while the spectral algorithm does better in terms of top-K accuracy). Perhaps the detailed experiment discussion can be addressed to this reviewer individually.

Thoughts after reviews:

- Maybe we can position the paper more towards applying the spectral algorithm to IRT, as in bringing attention to the utility of spectral algorithms for IRT.

# Main rebuttal

We thank the reviewers for all of their comments. We're glad to see that the reviewers appreciate the novel application of the spectral algorithm to the Rasch model and the theoretical guarantees obtained in our work. We will address here the main comments from the reviewers, mostly pertaining to our experiments and extensions of the spectral method. We will also address more specific comments from each reviewer individually.

## Extensions of the spectral algorithm

As our work proposes a new approach to IRT based on spectral methods, extensions of the algorithm to more expressive IRT models are certainly promising research directions.

To reviewer 3: the spectral algorithm can be extended to perform joint estimation of both user and item parameters. We first apply the spectral algorithm to obtain the item parameters $\hat\beta$.
Each individual user parameter can then be obtained by maximizing the log-likelihood

$$\hat\theta_l = \arg\max_{\theta_l} \sum_{i} X_{li}\log\frac{1}{1+\exp(-(\theta_l - \hat\beta_i))} + (1-X_{li})\log\frac{1}{1+\exp(-(\hat\beta_i-\theta_l))}.$$

This is a concave optimization problem in $\theta_l$ and can be solved efficiently.

To reviewers 2 and 3: we can also extend the spectral algorithm to account for population heterogeneity under the mixed Rasch model [1]. The mixed Rasch model is similar to the multivariate IRT model in that it aims to capture the idea pointed out by reviewer 2: a student might be good at one subject (arithmetic) while being bad at another (literature). In terms of modeling, each of these subjects corresponds to a mixture component; both the student abilities and the test difficulties differ from one mixture component to another. A mixture-learning algorithm can proceed as follows:

- Perform spectral clustering on the rows of the response matrix $X$ (such as via sparse PCA) to produce $K$ clusters of rows.
- Run the spectral algorithm on these $K$ clusters to obtain $K$ sets of parameters.
- Choose the value of $K$ using cross-validation or the Bayesian information criterion.

Given that the spectral algorithm runs significantly faster than other estimation algorithms, we see its utility in learning mixtures of Rasch models where $K$ is large. The learned parameter estimates can be used to study the different subpopulations of students and how the test difficulties vary among these subpopulations.

[1] J. Rost, C. Carstensen, M. von Davier. Applying the Mixed Rasch Model to Personality Questionnaires (1997).

## Comparison to a Bayesian method

We have performed additional experiments with a Bayesian estimation method whose implementation can be found at https://github.com/nd-ball/py-irt, based on the recent paper [2]. The Bayesian algorithm uses a hierarchical prior and variational inference for parameter estimation.
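For reference, the accuracy metrics reported in the comparison (AUC and average held-out log-likelihood of binary responses) can be computed from predicted response probabilities as in this sketch (scikit-learn assumed; function and variable names are hypothetical, not from the paper):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate(y_true, p_pred, eps=1e-12):
    """AUC and mean Bernoulli log-likelihood on held-out responses.

    y_true: binary responses X_{li} on the held-out set (0/1 array).
    p_pred: model probabilities P(X_{li} = 1), e.g. sigmoid(theta_l - beta_i).
    """
    p = np.clip(p_pred, eps, 1 - eps)  # guard against log(0)
    auc = roc_auc_score(y_true, p)
    loglik = np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    return auc, loglik

# Toy usage: well-separated predictions give AUC 1.0
auc, ll = evaluate(np.array([1, 0, 1, 0]), np.array([0.9, 0.1, 0.8, 0.2]))
```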
Given that the paper [2] is well cited, we believe that this is a strong baseline to compare to. The message is similar to the experimental results in our main paper: the spectral algorithm is competitive in terms of accuracy while being significantly more efficient than the Bayesian method. Furthermore, like the other methods shown in our experiments, the Bayesian method does not perform as well as the spectral method in terms of top-K ranking. The spectral method thus enjoys practical advantages in recommendation-system applications and large-scale education datasets.

| Dataset  | AUC (Bayesian) | AUC (Spectral) |
|:--------:|:--------------:|:--------------:|
| LSAT     | 0.706          | 0.707          |
| 3 Grades | 0.532          | 0.532          |
| UCI      | 0.565          | 0.565          |
| ML-100K  | 0.681          | 0.662          |

| Dataset  | LogLik (Bayesian) | LogLik (Spectral) |
|:--------:|:-----------------:|:-----------------:|
| LSAT     | -0.487            | -0.487            |
| 3 Grades | -0.681            | -0.687            |
| UCI      | -0.693            | -0.706            |
| ML-100K  | -0.635            | -0.646            |

| Dataset  | Top-K (Bayesian) | Top-K (Spectral) |
|:--------:|:----------------:|:----------------:|
| ML-100K  | 0; 0; 0.04       | 0.4; 0.6; 0.54   |

| Dataset  | Time (Bayesian) | Time (Spectral) |
|:--------:|:---------------:|:---------------:|
| LSAT     | 63              | 0.028           |
| 3 Grades | 27              | 0.015           |
| UCI      | 26              | 0.021           |
| ML-100K  | 2.8k            | 2               |

[2] Natesan et al. Bayesian Prior Choice in IRT Estimation Using MCMC and Variational Bayes.

<!-- ## Discussions of experiments -->

# Reviewer 1

In addition to our main reply above, we hope to address some of your specific comments here.

> "Theorems 3.5 and 3.6 appear to be missing something ..."

This is a Cramer-Rao lower bound, so the estimator $T$ is assumed to be unbiased: it has to output the true parameter given infinite data for any $\beta^*$. If it always outputs $T(X) = 0$ then it is no longer an unbiased estimator.
> "existence and uniqueness of the estimator, i.e., of the stationary distribution of (3)"

This is a good point. The existence and uniqueness of the stationary distribution can be guaranteed by making sure that the Markov chain is ergodic. Roughly speaking, this means that from every state one can reach every other state (after some time) with positive probability. This is guaranteed by our regularization scheme, which ensures that the pairwise transition probabilities are always positive and that the graph underlying the Markov chain is connected. If accepted, we will emphasize this point in the final version of the paper.

> "The similarities and differences to existing pairwise & spectral approaches"

This is a good suggestion as well. The main reason we do not discuss this in greater depth is lack of space. If accepted, we should be able to discuss the connection to other pairwise methods in the final version, which allows for more space.

> Typos and references

Thank you for pointing out these details. We will surely address them in the final version.

> Why is the MMLE so bad in the Top-K case, despite achieving good AUC and log-likelihood? (I don't understand the caption of Table 1.)

This seems to be a point of confusion for reviewer 4 (id: L9zk) as well. Please take a look at our reply to reviewer 4, where we clarify our setup and explain why the spectral method tends to outperform other methods in terms of top-K accuracy while AUC and log-likelihood tend to be similar among the methods.

> Can you comment to what extent the proofs of Thms 3.1 and 3.2 mirror the proofs in the Rank Centrality paper for the BTL model?

It is reasonable to see many similarities between our algorithm and Rank Centrality, as both construct a Markov chain and compute its stationary distribution to recover the parameter estimate.
As a starting point, the analyses of the two algorithms use similar tools (such as Lemma A.3 in our paper) to bound the error of the stationary distribution of a perturbed Markov chain. However, the pairwise transition probabilities in the two algorithms are defined differently. Our sampling model is also different from the Erdos-Renyi sampling model for the comparison graph in the analysis of Rank Centrality. Furthermore, the Rasch model has two sets of parameters while the BTL model has one. This difference plays out in our having two separate results for two regimes of $m$: when $m$ is a constant or small relative to $n$, versus when $m$ also grows with $n$.

# Reviewer 2

We hope that our main reply above shows how the spectral algorithm can be extended to more expressive IRT models such as the mixed Rasch model. As for your specific comments about the 2PL and 3PL models: the spectral algorithm is derived from certain properties that are characteristic of the 1PL model, so we are unsure what an extension to the 2PL and 3PL models would look like. The idea of a factored Markov chain sounds interesting, but we are not familiar with the topic so we cannot say much here.

> Potential negative social impacts ...

This is a very good point. We will certainly include a more thoughtful discussion of this in the final version of the paper.

> Minor points

Thanks for pointing these out, especially regarding our claims on item parameter estimation. Perhaps a better way of phrasing this would be "one-sided parameter estimation", whether it is estimating the students' abilities or the items' parameters in the context of recommendation systems.

> The authors cite several other spectral methods but do not make clear if any of the theoretical results come from those papers. Is Proposition 2.1 specific to the current paper or was it proved in one of the earlier works?

This proposition and its proof are new.
The pairwise transition probabilities of the Markov chain in our algorithm are different from those in Rank Centrality.

> For Theorem 3.3 (where m grows) the authors need to describe to this audience why JMLE fails in this case and what is different about their algorithm that makes it succeed.

We are not actually claiming that JMLE fails in this regime; this theorem refers only to the spectral method (the punchline is that when $m$ grows, the spectral algorithm enjoys a better, in fact optimal, error rate). However, it has been shown that JMLE fails when the number of items $m$ is a constant, whereas our algorithm is still consistent in that regime (Theorem 3.1).

> Related works and how our algorithm differs

You raised very good points about the related work section and we will incorporate the suggested changes in the final version. We also hope to clarify the connection between our algorithm and Rank Centrality, the closest spectral algorithm in the literature. Superficially they seem similar, as both construct a Markov chain and estimate the parameters from its stationary distribution. However, the constructions of the Markov chains and the pairwise transition probabilities are different. The resulting analysis is also different for our algorithm, as we have a different sampling model from the Erdos-Renyi graph sampling model in the Rank Centrality paper. Furthermore, the BTL model to which Rank Centrality is applied has one set of parameters whereas the Rasch model has two. This difference plays out in Theorems 3.1 and 3.3, where we have two different guarantees for two regimes of $m$; the message is that the relative size of $m$ and $n$ affects the estimation error. Similarly, the theorems and propositions in the paper are new and not simple extensions of the results for Rank Centrality.
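For concreteness, the mixture-learning recipe from our main reply (spectral clustering of the rows of $X$, then a per-cluster fit) could be sketched as follows. This is only an illustration, not the paper's algorithm: it uses k-means on a truncated-SVD embedding for the spectral step, and a crude logit-of-response-rate moment estimate as a stand-in for running the spectral estimator on each cluster. All names are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def fit_mixture_of_rasch(X, K, n_components=5, seed=0):
    """Sketch: cluster users spectrally, then fit item parameters per cluster.

    X: (n_users, n_items) binary response matrix.
    Returns cluster labels and per-cluster item-difficulty estimates
    (a placeholder for the paper's spectral estimator).
    """
    # Spectral step: embed rows with the top singular vectors of centered X.
    Xc = X - X.mean(axis=0, keepdims=True)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    r = min(n_components, U.shape[1])
    emb = U[:, :r] * S[:r]
    labels = KMeans(n_clusters=K, n_init=10, random_state=seed).fit_predict(emb)

    # Placeholder per-cluster fit: difficulty from empirical response rates.
    betas = []
    for k in range(K):
        p = X[labels == k].mean(axis=0).clip(0.05, 0.95)
        betas.append(np.log((1 - p) / p))  # logit-based moment estimate
    return labels, betas

# Toy usage: two planted subpopulations with very different abilities.
rng = np.random.default_rng(0)
strong = rng.random((50, 20)) < 0.9  # group A answers most items correctly
weak = rng.random((50, 20)) < 0.1    # group B answers few items correctly
labels, betas = fit_mixture_of_rasch(np.vstack([strong, weak]).astype(float), K=2)
```

In practice the per-cluster fit would be replaced by the spectral algorithm, and $K$ chosen by cross-validation or BIC as described above.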
# Reviewer 3

We hope that our main reply above addresses both of your comments about potential extensions of the spectral method and joint parameter estimation (learning both user and item parameters). For joint parameter estimation, one can also avoid using the MLE (say, due to computational concerns) by running the spectral algorithm twice, to estimate $\hat\beta$ and $\hat\theta$. However, since the spectral algorithm essentially learns the "differences" between the parameters and outputs normalized versions of the user and item parameters, one needs to align the two sets of parameters. Fortunately, this alignment comes down to a single scalar $a$. We estimate $\hat a$ by solving the following problem (keeping $\hat\beta$ and $\hat\theta$ fixed):

$$\hat a = \arg\max_a \sum_{l\in [n],\, i\in[m]} X_{li} \log\frac{1}{1+\exp(-(\hat\theta_l + a - \hat\beta_i))} + (1-X_{li}) \log\frac{1}{1+\exp(-(\hat\beta_i - \hat\theta_l - a))}.$$

This is a concave maximization problem in a single scalar variable and can be solved efficiently.

# Reviewer 4

We hope that the additional experiments in our main reply above address your question on how our method compares to a Bayesian method. The Bayesian algorithm that we compare the spectral method to is based on a recent and well-cited paper [2], so we believe that it is a strong baseline. Your comments on the discussion of our experimental results are well appreciated. If accepted, we will certainly discuss the experimental results in greater depth, as the final version allows more space. An important question that you posed, and which can be a really good discussion point, is:

> Why are AUC and log-likelihood similar while top-K accuracy differs significantly between the methods?

For the smaller-scale education datasets (UCI, 3 Grades, LSAT), the number of items (tests) is small and there are no missing responses: there is much data relative to the number of parameters.
As the algorithms, after all, operate on the same statistical model, they output very similar estimates and their performance is very close. The difference is more apparent in the large-scale datasets and in top-K accuracy (which is not defined for the education datasets), where the responses are sparse and there is a large number of items. To transform ratings data to binary response data, for each user we convert all ratings higher than that user's average to 0 and the rest to 1 (an item with a higher parameter value is better), as in [3]. The reference top-K set is the set of items with the highest average ratings. However, we have also removed items with very high but few ratings (<= 10) from this reference set, as we treat these ratings as noisy. Note that the values $K \in \{10, 25, 50\}$ are relatively small compared to the number of items in the large-scale datasets that we use. This is applicable in settings such as RecSys where we care about identifying the top few items.

We observe that the other methods tend to "overvalue" items with noisy responses, whereas our method, which uses pairwise differentials, is less susceptible to noisy ratings. This could explain why, on some datasets, some of the baseline algorithms fail to identify any of the top items. Perhaps this highlights the limitations of these classical IRT methods in applications like RecSys where there is much noisy data. On the other hand, AUC and log-likelihood are evaluated on a held-out dataset that contains a large number of items and users; the difference among the methods, when averaged over the users and items, is quite small, so the performance difference becomes less significant.

> Some minor points about presentation

These are all good points and we will certainly address them in the final version.

[2] Natesan et al. Bayesian Prior Choice in IRT Estimation Using MCMC and Variational Bayes.
[3] Lan et al. An Estimation and Analysis Framework for the Rasch Model (ICML '18).
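As a footnote to the joint-estimation replies above: the one-dimensional concave maximizations for a single user's ability $\hat\theta_l$ and for the alignment scalar $\hat a$ have the same form, and either can be solved with a generic bounded scalar optimizer. A minimal SciPy sketch (function names are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_user_ability(x_l, beta_hat):
    """MLE of one user's ability theta_l given item parameters beta_hat.

    Maximizes sum_i x_li * log sigmoid(theta - beta_i)
                  + (1 - x_li) * log sigmoid(beta_i - theta),
    a concave problem in the scalar theta. The alignment scalar `a` from the
    reply to reviewer 3 is found the same way, summing over all (l, i).
    """
    def neg_loglik(theta):
        p = sigmoid(theta - beta_hat).clip(1e-12, 1 - 1e-12)
        return -np.sum(x_l * np.log(p) + (1 - x_l) * np.log(1 - p))
    return minimize_scalar(neg_loglik, bounds=(-10, 10), method="bounded").x

# Toy usage: a user answering 3 of 4 items of difficulty 0 correctly;
# the MLE is logit(3/4) = log(3).
theta = fit_user_ability(np.array([1, 1, 1, 0]), np.zeros(4))
```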
