feat(batch): Add inheritance/segregation input and calculate scores

Description

The initial batch processing implementation only looks up variant annotations via variant-linker and offers them for download. It does not let users specify inheritance or segregation per variant, and it does not calculate the Gene, Inheritance, or final NCS scores. This feature closes both gaps: users can provide an inheritance pattern and segregation probability alongside each variant in the input, and the corresponding scores are calculated.

Desired Behavior

To achieve the desired behavior, the following features should be implemented:

  • Allow users to provide inheritance pattern and segregation probability: Users should be able to optionally provide this information alongside each variant in the input (textarea or file upload).
  • Define a clear input format: A clear input format should be defined for this combined data, such as Tab-Separated Values (TSV): Variant<TAB>Inheritance<TAB>Segregation.
  • Parse extra information during batch processing: The inheritance and segregation values supplied for each row should be parsed as the batch is processed.
  • Calculate Gene Score: For each variant, the corresponding Gene Score should be fetched using the GeneCard logic.
  • Calculate Inheritance Score: The Inheritance Score should be calculated based on the provided pattern/segregation using the InheritanceCard logic.
  • Calculate Combined NCS Score: The combined NCS score should be calculated using the logic from CombinedScoreCard.
  • Include calculated scores in downloadable output: The calculated scores (Gene, Variant, Inheritance, Combined NCS) should be included in the downloadable output file (JSON, CSV, TSV).
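
As a rough illustration of the proposed Variant<TAB>Inheritance<TAB>Segregation format, the sketch below parses textarea or file content into row objects. The function name parseBatchInput, the default values ('unknown' and null), the header-row handling, and the variant strings in the usage note are all assumptions made for this example; the issue does not specify them.

```javascript
// Hypothetical defaults: the issue asks for defaults but does not define them.
const DEFAULT_INHERITANCE = 'unknown';
const DEFAULT_SEGREGATION = null;

/**
 * Parse batch input where each non-empty line is
 * Variant<TAB>Inheritance<TAB>Segregation (the last two columns are optional).
 * Returns one object per variant, keeping the 1-based source line number.
 */
function parseBatchInput(text) {
  const rows = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === '') return; // skip blank lines

    const [variant = '', inheritance = '', segregation = ''] = line
      .split('\t')
      .map((cell) => cell.trim());

    // Skip an optional header row such as "Variant<TAB>Inheritance<TAB>Segregation".
    if (rows.length === 0 && variant.toLowerCase() === 'variant') return;

    // Non-numeric or missing segregation falls back to the default.
    const parsed = segregation === '' ? NaN : Number(segregation);

    rows.push({
      line: index + 1,
      variant,
      inheritance: inheritance || DEFAULT_INHERITANCE,
      segregation: Number.isFinite(parsed) ? parsed : DEFAULT_SEGREGATION,
    });
  });
  return rows;
}
```

For example, parseBatchInput('1-12345-A-G\tAD\t0.9\n1-67890-C-T') would yield two rows, the second carrying the default inheritance and segregation values. Both variant strings are placeholders, not real variants.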

Implementation Ideas

To implement the desired behavior, the following ideas should be considered:

  • Update input parsing logic: The input parsing logic in BatchView.vue should be updated to handle multi-column input (e.g., TSV). Default values should be defined if inheritance/segregation are missing for a row.
  • Modify batch processing loop: The batch processing loop should be modified to:
    • Get the variant-linker annotation (which includes the Variant Score and Gene Symbol).
    • Call fetchGeneDetails to get the Gene Score.
    • Calculate the Inheritance Score using the parsed pattern/segregation and the logic from InheritanceCard.vue.
    • Calculate the Combined NCS score using the logic from CombinedScoreCard.vue.
  • Update queryVariant API wrapper: The queryVariant API wrapper (or batch equivalent) should be updated if necessary to ensure the Variant Score is returned correctly in batch JSON mode.
  • Modify output generation: The output generation (downloadFile logic in BatchView.vue) should be updated to include the new score columns in the correct format (JSON object properties or CSV/TSV columns).
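
The four steps of the modified loop could be sketched as follows. The names queryVariant and fetchGeneDetails come from the issue, but their arguments and return shapes (geneSymbol, variantScore, score) are assumptions. calcInheritanceScore and calcCombinedScore are hypothetical stand-ins for the logic currently in InheritanceCard.vue and CombinedScoreCard.vue; no scoring formula is implied here, which is why every helper is injected rather than implemented.

```javascript
/**
 * Score each parsed row in input order.
 * All helpers are injected because their real signatures live in the app
 * and are not shown in the issue (assumed shapes are noted inline).
 */
async function scoreBatch(rows, helpers) {
  const { queryVariant, fetchGeneDetails, calcInheritanceScore, calcCombinedScore } = helpers;
  const results = [];

  for (const row of rows) {
    // 1. Annotation lookup; assumed to return { geneSymbol, variantScore }.
    const annotation = await queryVariant(row.variant);

    // 2. Gene Score lookup by gene symbol; assumed to return { score }.
    const gene = await fetchGeneDetails(annotation.geneSymbol);

    // 3. Inheritance Score from the user-supplied pattern and segregation.
    const inheritanceScore = calcInheritanceScore(row.inheritance, row.segregation);

    // 4. Combined NCS from the three component scores.
    const combinedNcs = calcCombinedScore(gene.score, annotation.variantScore, inheritanceScore);

    results.push({
      variant: row.variant,
      geneSymbol: annotation.geneSymbol,
      geneScore: gene.score,
      variantScore: annotation.variantScore,
      inheritanceScore,
      combinedNcs,
    });
  }
  return results;
}
```

Injecting the helpers keeps the loop independent of the Vue components, which fits the suggestion to refactor the scoring logic out of the cards, and makes the loop easy to exercise with stubs.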

Tasks

The following tasks should be completed to implement the desired behavior:

  • Define and document expected input format: For example, TSV with headers Variant, Inheritance, Segregation.
  • Update parsing logic in BatchView.vue: Handle multi-column input (e.g., TSV).
  • Integrate calls to fetch Gene Score: Fetch the Gene Score for each variant within the batch loop.
  • Integrate Inheritance Score calculation logic: Potentially refactor the logic out of InheritanceCard.
  • Integrate Combined NCS calculation logic: Potentially refactor the logic out of CombinedScoreCard.
  • Update output generation: Include score columns for all formats (JSON, CSV, TSV).
  • Add error handling for score calculation failures: Handle score calculation failures for individual variants.
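
One possible shape for the per-variant error handling is sketched below: a failure while scoring one variant is recorded on that row and the batch continues. The helper name scoreEachSafely, the scoreOne callback, and the status and error fields are hypothetical; the issue does not say how failures should be reported.

```javascript
/**
 * Run scoreOne(row) for every row, isolating failures per variant.
 * A row whose scoring throws stays in the output with status 'failed'
 * and the error message, so the remaining variants are still processed.
 */
async function scoreEachSafely(rows, scoreOne) {
  const results = [];
  for (const row of rows) {
    try {
      const scores = await scoreOne(row);
      results.push({ ...row, ...scores, status: 'ok', error: null });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ ...row, status: 'failed', error: message });
    }
  }
  return results;
}
```

Keeping failed rows in the result, rather than dropping them, means the downloaded file can show which variants were not scored and why.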

Motivation

Q: What is the purpose of this feature?

A: The purpose of this feature is to enable users to provide inheritance pattern and segregation probability alongside each variant in the input and calculate the corresponding scores, including Gene, Inheritance, and Combined NCS scores.

Q: Why is this feature necessary?

A: This feature is necessary because the initial batch processing implementation only focuses on variant annotation lookup via variant-linker and download, without allowing users to specify inheritance/segregation per variant or calculate scores. This limitation hinders the utility of the batch processing feature for variant prioritization.

Q: What input format will be used for this feature?

A: The proposed input format is Tab-Separated Values (TSV), one variant per line: Variant<TAB>Inheritance<TAB>Segregation. The Inheritance and Segregation columns are optional, and default values apply when they are missing for a row.

Q: How will the extra information be parsed during batch processing?

A: During batch processing, the extra information provided by the user will be parsed using the updated input parsing logic in BatchView.vue. This logic will handle multi-column input (e.g., TSV) and define default values if inheritance/segregation are missing for a row.

Q: What scores will be calculated for each variant?

A: The following scores will be calculated for each variant:

  • Gene Score: using the GeneCard logic
  • Inheritance Score: using the parsed pattern/segregation and the logic from InheritanceCard.vue
  • Combined NCS Score: using the logic from CombinedScoreCard.vue

Q: How will the calculated scores be included in the downloadable output?

A: The calculated scores will be included in the downloadable output file (JSON, CSV, TSV) using the updated output generation (downloadFile logic in BatchView.vue).
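
A minimal sketch of how the score columns might be written in each format follows. The column names, the serializeResults function, and the quoting rule (RFC 4180 style) are assumptions for illustration and are not taken from the existing downloadFile logic.

```javascript
// Hypothetical column order for the CSV/TSV output.
const SCORE_COLUMNS = ['variant', 'geneScore', 'variantScore', 'inheritanceScore', 'combinedNcs'];

// Render one cell; null/undefined become empty, and cells containing the
// delimiter, a double quote, or a line break are quoted with quotes doubled.
function escapeCell(value, delimiter) {
  const text = value === null || value === undefined ? '' : String(value);
  if (text.includes(delimiter) || /["\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Serialize scored results as 'json', 'csv', or 'tsv'.
function serializeResults(results, format) {
  if (format === 'json') return JSON.stringify(results, null, 2);
  if (format !== 'csv' && format !== 'tsv') {
    throw new Error('Unsupported format: ' + format);
  }
  const delimiter = format === 'tsv' ? '\t' : ',';
  const header = SCORE_COLUMNS.join(delimiter);
  const lines = results.map((row) =>
    SCORE_COLUMNS.map((column) => escapeCell(row[column], delimiter)).join(delimiter)
  );
  return [header, ...lines].join('\n');
}
```

In JSON mode the scores are simply properties of each result object; in CSV and TSV mode they become fixed columns, with missing scores left as empty cells.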

Q: What tasks need to be completed to implement this feature?

A: The following tasks need to be completed to implement this feature:

  • Define and document the expected input format
  • Update parsing logic in BatchView.vue
  • Integrate calls to fetch Gene Score within the batch loop
  • Integrate Inheritance Score calculation logic
  • Integrate Combined NCS calculation logic
  • Update output generation to include score columns for all formats
  • Add error handling for score calculation failures

Q: What is the motivation behind this feature?

A: The motivation is to provide the full scoring context for batched variants, which significantly increases the utility of the batch processing feature for variant prioritization. With an inheritance pattern and segregation probability supplied per variant, each result carries the Gene, Inheritance, and Combined NCS scores rather than the annotation alone.

Q: What are the benefits of this feature?

A: The benefits of this feature include:

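  • Full scoring context (Gene, Variant, Inheritance, Combined NCS) for every variant in a batch
  • Results that reflect the inheritance pattern and segregation probability, not the annotation alone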
  • Increased utility of the batch processing feature for variant prioritization

Q: What are the potential challenges or limitations of this feature?

A: The potential challenges or limitations of this feature include:

  • Complexity of implementing the scoring logic
  • Potential errors in parsing the extra information
  • Limited flexibility in the input format

Q: How will this feature be tested and validated?

A: This feature will be tested and validated through a combination of unit tests, integration tests, and manual testing. The feature will be thoroughly reviewed to ensure that it meets the requirements and is free of errors.