Yoom Lam
# [Investigate what occurs when Reader is refreshed](https://github.com/department-of-veterans-affairs/caseflow/issues/14298#) #14298

###### tags: `Echo`

There are 2 pages for Caseflow Reader:

* Document List page - shows a table of documents for the Veteran
* Document View page - shows one document at a time

To help diagnose problems, the backend calls for each page are described below, followed by a list of recent known and resolved problems.

## Document List page

Upon load or page refresh, the [Document List page](https://github.com/department-of-veterans-affairs/caseflow/wiki/Caseflow-Reader#document-list) makes 2 requests to the backend:

1. `Reader::AppealController#show` returns info about the current appeal.
2. `Reader::DocumentsController#index` returns an object with the following:
    * `documents`: a list of document records and associated document tags (aka "Issue Tags" in Reader). To get the documents, the controller calls `appeal.document_fetcher.`**find_or_create_documents!**
        * In this case, `document_fetcher` uses `EFolderService` for AMA and Legacy appeals (see [Integrations](https://github.com/department-of-veterans-affairs/caseflow/wiki/Caseflow-Reader#integrations)). Upon initialization, `document_fetcher`
            * retrieves document metadata from eFolder Express with **EFolderService.fetch_documents_for(appeal, user)** (described in a later section)
            * then sets `manifest_vbms_fetched_at` and `manifest_vva_fetched_at` (which are also sent back to the frontend)
        * Upon **find_or_create_documents!** being called, it ensures versions of `Document`s can be tracked by `series_id` as follows:
            * it calls `DocumentSeriesIdAssigner` to [ensure all known `Document`s have a `series_id`](https://github.com/department-of-veterans-affairs/caseflow/blob/7e38376d2cda2c0d5c8ffe9a5097c54d89aa6d7c/app/services/document_series_id_assigner.rb#L23)
            * and merges fetched documents with known `Document` records or, if unknown, creates a new `Document` (copying annotations/comments from the [previous doc with the same `series_id`](https://github.com/department-of-veterans-affairs/caseflow/blob/ef965424cc9d8670bac32193fa80870d5ee9fed4/app/services/document_fetcher.rb#L68))
    * `annotations` (aka document "Comments" in Reader)
        * also calls `appeal.document_fetcher.`**find_or_create_documents!**
    * `manifestVbmsFetchedAt`: timestamp indicating when documents were fetched from VBMS
    * `manifestVvaFetchedAt`: timestamp indicating when documents were fetched from VVA

### `series_id` and `vbms_document_id`

* A specific version of a document is referenced by its `vbms_document_id`.
* All versions of the same document have the same `series_id`.
* So there may be `Document` records that represent older versions of documents that are (correctly) not presented in the UI. As a result, `Document.where(file_number: vet.file_number).count` may not equal the `documents.size` returned from `Reader::DocumentsController`.

From [Reader's VBMS integration](https://github.com/department-of-veterans-affairs/caseflow/wiki/Caseflow-Reader#vbms):

> Each document has a `series_id` and a `version_id` (unfortunately we refer to `version_id` as `vbms_document_id` in most of the code). In VBMS a document may be uploaded with multiple versions. Each version of the document gets its own `version_id`, but will have the same `series_id`. Whenever we see a new document with the same `series_id` as an existing document, we copy over all the metadata (comments, tags, etc.) we'd associated with that first document.

### `EfolderService.fetch_documents_for(appeal, user)`

`EfolderService` is a client for the eFolder Express service (aka Caseflow eFolder). `EfolderService.fetch_documents_for` is used by Reader to download documents from VBMS and VVA.

* First it sends a POST request to `/api/v2/manifests` (see [Reader access to VBMS](https://github.com/department-of-veterans-affairs/caseflow/wiki/Caseflow-Reader#vbms))
    * In response to the POST request, eFolder Express (specifically `Api::V2::ManifestsController#start`) creates a `Manifest` (and a corresponding `FilesDownload` per current_user) for the Veteran. A `Manifest` typically has 2 `ManifestSource`s -- one for each of VBMS and VVA.
        * [Schema diagram of relevant eFolder Express classes](https://dbdiagram.io/d/5ed6741c39d18f555300202a)
    * It starts to retrieve documents for each `ManifestSource` using a high_priority `V2::DownloadManifestJob` parameterized by the `current_user`. `V2::DownloadManifestJob` does the following:
        * uses `ManifestFetcher` to [fetch a *list of documents* for all the "file numbers" known for the veteran using BGS info](https://github.com/department-of-veterans-affairs/caseflow-efolder/blob/7061bde7c9a2f919db122314f5f7e94f6d35cfb4/app/services/manifest_fetcher.rb#L23). The actual document-list fetching is done by calling `v2_fetch_documents_for(file_number)` on `VBMSService` and `VVAService`. A `DocumentCreator` is used to [delete and recreate all `Record`s associated with the `manifest_source`, after applying `DocumentFilter`](https://github.com/department-of-veterans-affairs/caseflow-efolder/blob/5edb1df749fd3bae0bf601ae8738ba3bae524ebb/app/services/document_creator.rb#L11).
        * then it starts a low_priority `V2::SaveFilesInS3Job` to retrieve the documents' *contents* and store them as files in S3: [`manifest_source.records.each(&:fetch!)`](https://github.com/department-of-veterans-affairs/caseflow-efolder/blob/a25a268ad641829addc890401583f2d5ee2dca8f/app/jobs/v2/save_files_in_s3_job.rb#L7)
            * A `Record` corresponds to a `Document` to be retrieved by `RecordFetcher`, which will [fetch the contents from S3 before trying VBMS/VVA](https://github.com/department-of-veterans-affairs/caseflow-efolder/blob/5b925d7ecba59a0f636c68d77ae975a2083cb45b/app/services/record_fetcher.rb#L15), convert images to PDF files if needed, and save to S3.
            * If conversion to PDF fails, the image is saved to S3 (however Reader can only show PDFs) and no alert is logged. **Should investigate a solution and at least log the error when `record.conversion_status==conversion_failed`.**
* Once all documents for the appeal are fetched, `EfolderService` sends a GET request to `/api/v2/manifests/#{manifest_id}` to return the retrieved documents.

## Document View page

From [Reader's Document View](https://github.com/department-of-veterans-affairs/caseflow/wiki/Caseflow-Reader#document-view):

> [The frontend] makes calls directly to the `/api/v2/records/:id` endpoint on eFolder Express to retrieve the content of a document. [...] the document contents should already be cached in S3.

1. With each document shown to the user, `DocumentController#pdf` is called for the current, next, and previous documents. (Note this is not the same `Reader::DocumentsController` used for the Document List page above.)
    * It serves up the [pdf file from directory `/tmp/pdfs/`](https://github.com/department-of-veterans-affairs/caseflow/blob/43562ed9e75b14c6f949802521da9df6a4214c2b/app/models/document.rb#L154). The pdf could come from [3 places](https://github.com/department-of-veterans-affairs/caseflow/blob/43562ed9e75b14c6f949802521da9df6a4214c2b/app/models/document.rb#L122):

      > Currently three levels of caching. Try to serve content from memory, then look to S3 if it's not in memory, and if it's not in S3 grab it from VBMS
      > Log where we get the file from for now for easy verification of S3 integration.

    * So if the document is not in S3 and comes from VVA, then Reader won't be able to show it. **Should investigate a solution.**
    * Can check in Rails logs for ["File #{vbms_document_id} fetched from VBMS"](https://github.com/department-of-veterans-affairs/caseflow/blob/43562ed9e75b14c6f949802521da9df6a4214c2b/app/models/document.rb#L130)
2. `DocumentController#mark_as_read` updates `DocumentView` records to capture when the user views the document
3. `Reader::DocumentsController#show` sets up the page content
4. `Reader::AppealController#show` returns info about the current appeal
5. `Metrics::V1::HistogramController#create` sends a histogram to DataDog about `pdf_page_render_time_in_ms` but values seem to always be 0: `[{"group":"front_end","name":"pdf_page_render_time_in_ms","value":0,"app_name":"Reader","attrs":{"overscan":6,"document_type":"VA Memo","page_count":4}}, ...]`

### Documents cached in S3

Reader pulls document files from S3, if they're available. [A RetrieveDocumentsForReaderJob caches documents in S3](https://github.com/department-of-veterans-affairs/caseflow/wiki/Caseflow-Reader#eagerly-caching-documents-in-s3):

* According to [serverless.yml](https://github.com/department-of-veterans-affairs/appeals-lambdas/blob/master/async-jobs-trigger/serverless.yml#L92), this job runs every 5 minutes for [active Reader users](https://github.com/department-of-veterans-affairs/caseflow/blob/master/app/queries/batch_users_for_reader_query.rb#L7).
* This job chooses up to 5 users who (1) logged in within the last week and (2) either have never used eFolder to fetch documents or have not done so within the last day.
* For the Legacy and AMA appeals these users are assigned to, the job calls [`appeal.document_fetcher.`**find_or_create_documents!**](https://github.com/department-of-veterans-affairs/caseflow/blob/a45cada3ddf04c1356f055c23afb5e54c4bdd7ca/app/workflows/fetch_documents_for_reader_job.rb#L32) -- same as on Reader's Document List page.

**Concerns**:

* 5 minutes may be too frequent. Could the same 5 users be chosen by consecutive jobs if the first job is still processing? Since `efolder_documents_fetched_at` is not set until a job finishes, if the first job takes longer than 5 minutes (e.g., 1000+ documents) then the next job would pick the same users. **Should investigate improvements to this.**
* How often is S3 used compared to document retrievals from VBMS/VVA? The intent of the job is to retrieve preferably all documents from S3. **Should measure how well this job is achieving this intent and improve it, while considering S3 file auto-deletions.**

#### When are these files auto-deleted in S3?

Asked Tango: [Slack convo](https://dsva.slack.com/archives/C3EAF3Q15/p1591118799353200)

Some digging reveals this:

```ruby
# List the lifecycle rules configured on Caseflow's S3 bucket
bucket = Caseflow::S3Service.init!
client = Aws::S3::Client.new
resp = client.get_bucket_lifecycle({ bucket: bucket.name })
pp resp.rules.pluck(:id, :prefix)
# => [["delete form 8s after 5 days", "form_8 "], ["delete documents after 5 days", "documents"]]
```

The earliest file in the S3 `documents` folder is from 5 days ago ([AWS S3 web UI shows folder contents](https://console.amazonaws-us-gov.com/s3/buckets/dsva-appeals-caseflow-prod/?region=us-gov-west-1&tab=overview)), so Reader documents are indeed deleted after 5 days.

## Doc counts in Reader

In the Reader UI, document counts are displayed to the user. They can be simulated as follows:

```ruby
appeal = Appeal.find_by(uuid: ...)
docs = Document.where(file_number: appeal.veteran_file_number)
# Document List page
page1resp = ExternalApi::EfolderService.document_count(appeal.veteran_file_number, user)
# Document View page
page2resp = ExternalApi::EfolderService.fetch_documents_for(appeal, user)
page2resp[:documents].size
```

These document counts can change over time. For example,

* 2 `Document` records were created and retrieved but are no longer retrievable by eFolder, possibly because new versions are available.
* eFolder has 1 new `Record` that Reader doesn't yet know about, possibly because a new document was uploaded to VBMS/VVA.
* The net document count change may be 1, but there are 3 differences. **Should investigate a better way to track documents.**

Some code for further investigation:

```ruby
docs = Document.where(file_number: appeal.veteran_file_number)
vbms_idsD = docs.pluck(:vbms_document_id)
df = appeal.document_fetcher # takes many seconds to complete
df.number_of_documents
# Document counts per day, by upload date and by received date
df.documents.group_by { |d| d.upload_date.beginning_of_day }.map { |k, v| [k, v.size] }.sort
df.documents.group_by { |d| d.received_at.beginning_of_day }.map { |k, v| [k, v.size] }.sort
vbms_idsR = df.documents.pluck(:vbms_document_id)
# IDs known to Reader but no longer returned by eFolder
vbms_idsD - vbms_idsR
# => ["{2605FFFC-C9C7-4EF8-BAB8-E1042CB7A92F}", "{50FB8137-8D01-431E-B71D-55F8A6BC7F09}"]
# IDs returned by eFolder but not yet known to Reader
vbms_idsR - vbms_idsD
# => ["{2B507FFF-0CF2-41DA-92A4-0394D3BBF52A}"]
```

## Doc counts in Queue page

Document counts are shown in the table on some Queue pages. `AppealsController#document_count` provides these document counts. It calls `EFolderService.document_count(appeal.veteran_file_number, current_user)`, which:

* checks Rails.cache `"Efolder-document-count-#{file_number}"`
* checks Rails.cache `"Efolder-document-count-bgjob-#{file_number}"` (`expires_in: 15.minutes`)
* starts background `FetchEfolderDocumentCountJob`, which checks Rails.cache `"Efolder-document-count-#{file_number}"` (`expires_in: 4.hours`) and sends a GET request to `/api/v2/document_counts`
    * In response, eFolder Express `api/v2/document_counts#index` checks its cache `veteran-doc-count-#{file_number}` (`expires_in: 2.hours`) and responds with `DocumentCounter.new(veteran_file_number: file_number)`
        * which calls `v2_fetch_documents_for(file_number)` for both VBMSService and VVAService (same as `ManifestFetcher` mentioned in the context of Reader's Document List page), and then returns a set of `document_ids`, which is counted.

## Known Problems

1. PDF version of TIFF from VVA not shown b/c the TIFF (not the PDF) is in S3 and cannot be immediately retrieved like a VBMS-sourced file. [#14193](https://github.com/department-of-veterans-affairs/caseflow/issues/14193)
    * Which documents fail conversion?
2. Why do document counts change over time? e.g., 421 + 5 more: [#14289](https://github.com/department-of-veterans-affairs/caseflow/issues/14289)
    * Why 425 vs 426? 2 gone + 1 added; VBMS's response changes over time
    * 6/2/2020: Now 440. `docs.pluck(:series_id).uniq.size => 440`
    * Added details at [14081 investigation Part-3](https://hackmd.io/9DYl3EwdTqKCpALVbIWnQg#Part-3)
    * Need to better synchronize documents with VBMS/VVA.
3. Is the same job submitted within the same time span? "active user" check and limited to 5 at a time
4. [Should no longer be a problem] Document count numbers are not the same in Queue and Reader ([Related resolved ticket due to VBMS pagination](https://github.com/department-of-veterans-affairs/caseflow/issues/13261))

# To diagnose

* Which documents (VBMS vs VVA) are retrieved by a call to ...
* `appeal.document_fetcher.`**find_or_create_documents!** is called by
    * `RetrieveDocumentsForReaderJob` for 5 active users
    * `Reader::DocumentsController#index` on the Document List page
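
One way to make the "2 gone + 1 added" observation from the doc-count sections easier to spot while diagnosing is to summarize both directions of the ID difference at once. The following is a minimal plain-Ruby sketch, not Caseflow code: `summarize_id_diff` is a hypothetical helper and the IDs are made up for illustration.

```ruby
# Hypothetical helper (not part of Caseflow): summarizes how two lists of
# document IDs differ, e.g. IDs from Document records vs. IDs from eFolder.
def summarize_id_diff(known_ids, fetched_ids)
  missing = known_ids - fetched_ids # known to Reader, no longer returned
  added = fetched_ids - known_ids   # returned by eFolder, unknown to Reader
  {
    missing: missing,
    added: added,
    net_change: fetched_ids.size - known_ids.size,
    total_differences: missing.size + added.size
  }
end

# Made-up IDs for illustration only
known = %w[doc-a doc-b doc-c doc-d]
fetched = %w[doc-a doc-b doc-e]

summary = summarize_id_diff(known, fetched)
puts summary
```

Here the net count changes by only 1 while 3 IDs actually differ, which mirrors the situation described under "Doc counts in Reader". In a Rails console, the same helper could presumably be given `vbms_idsD` and `vbms_idsR` from the investigation snippet.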
