Using the SolveBio Variant Explorer

The SolveBio Variant Explorer is a fast and easy way to explore known reference information about a specific sequence variant. You no longer need to trawl through several websites and manually enter variant accessions each time to get the same information. The Variant Explorer is automatically and dynamically generated for every possible variant, linking accessions, datasets and more.

Warning! Currently the Variant Explorer only supports GRCh37/hg19! Please email us at support@solvebio.com to request GRCh38/hg38/hg20.

Quick video overview

Walkthrough

Interacting with variants in the Variant Explorer is easy: just search for any genetic variant in the search bar.

Variant summary

The variant summary dashboard contains a summary of apps available for analysis of the variant. Apps include identification, beacons, germline analysis, somatic analysis, literature, notes and more. By clicking on each app you enter the full-screen view and get a more detailed analysis.

Variant identification

The variant identification application visualizes the sequence alteration at the reference genome level. The top part, variant identification, lists the following information:

  • Genome build (currently only GRCh37/hg19 is supported)
  • Chromosome
  • Start position of the variant relative to GRCh37
  • Stop/end position of the reference allele relative to GRCh37
  • Reference allele present at this chromosome, start, stop location
  • Alternate allele that is currently being analyzed
  • Type of variant: single nucleotide variant (SNV), insertion, deletion, or multiple nucleotide substitution (substitution).
  • If the variant is a deletion or insertion, the size of that deletion or insertion.
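
As a rough illustration of how the variant type and indel size listed above can be derived from the reference and alternate alleles, here is a minimal Python sketch. The function name and the exact rules are assumptions made for illustration; they are not necessarily the logic the Variant Explorer uses internally.

```python
def classify_variant(ref: str, alt: str) -> tuple[str, int]:
    """Return (variant_type, size) for a reference/alternate allele pair.

    An empty string or "-" stands for an absent allele. size is the number
    of inserted or deleted bases, and 0 for SNVs and substitutions.
    """
    ref = "" if ref == "-" else ref.upper()
    alt = "" if alt == "-" else alt.upper()
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    if len(ref) == 1 and len(alt) == 1:
        return "SNV", 0
    if alt.startswith(ref):
        # The alternate allele extends the reference allele: pure insertion.
        return "insertion", len(alt) - len(ref)
    if ref.startswith(alt):
        # The reference allele extends the alternate allele: pure deletion.
        return "deletion", len(ref) - len(alt)
    # Anything else replaces one run of bases with another.
    return "substitution", 0


print(classify_variant("C", "T"))    # single nucleotide variant
print(classify_variant("", "GA"))    # two-base insertion
print(classify_variant("ATG", "A"))  # two-base deletion
```

Alleles written with a shared leading base (as in VCF files) and alleles written without one are both handled, because the prefix checks cover either style.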

Additionally, the variant is visualized in the context of a gene model (if the variant lies within a gene) and also the chromosome.

We also link out to variant-specific external databases and information. Each link-out, whether it's searching Google or ClinVar, or pulling up the exact location/variant in UCSC or the ExAC browser, is already variant-specific for convenience.

We also generate a number of variant names and find dbSNP rsIDs for each variant. The c. and g. HGVS values are automatically generated through SolveBio's HGVS translation API. We are in the process of indexing historical names of variants to make it easier to find these in the future.

Effect predictor

The SolveBio effect predictor is a custom effect predictor that we built (have a look, it is open source), similar to SnpEff, VEP, and ANNOVAR. The effects tab displays the predicted effects of the variant on each transcript/gene in which it resides. Currently, the gene models supported on the SolveBio Variant Explorer are based on MapView files for RefSeq 104 & RefSeq 105 (files available via FTP for 104 and 105). If your variant is in multiple genes and/or multiple transcripts of the same gene, you can switch between the different transcripts using the dropdown menu.

The consequences/effects follow Sequence Ontology terms exactly. Impact categories and ranks correspond to the VCF annotation spec (PDF available here).
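
To show how impact categories can be used when a variant affects several transcripts, here is a small Python sketch that picks the most severe predicted effect. The four category names come from the VCF annotation spec; the record layout, field names, and transcript IDs are hypothetical and do not represent the effect predictor's actual output format.

```python
# Impact categories from most to least severe, as named in the VCF annotation spec.
IMPACT_ORDER = ["HIGH", "MODERATE", "LOW", "MODIFIER"]


def most_severe_effect(effects):
    """Return the effect record with the most severe impact category.

    Ties keep the first record in input order.
    """
    if not effects:
        raise ValueError("no effects to rank")
    return min(effects, key=lambda e: IMPACT_ORDER.index(e["impact"]))


# Hypothetical per-transcript predictions for one variant.
effects = [
    {"transcript": "TX_EXAMPLE_1", "term": "intron_variant", "impact": "MODIFIER"},
    {"transcript": "TX_EXAMPLE_2", "term": "missense_variant", "impact": "MODERATE"},
    {"transcript": "TX_EXAMPLE_3", "term": "synonymous_variant", "impact": "LOW"},
]
top = most_severe_effect(effects)
print(top["transcript"], top["term"], top["impact"])
```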

We also have convenient variant-specific link-outs to content from other sites, such as Ensembl's VEP API JSON output and the UCSC genome browser, so that you can independently verify the predicted effects.

Mendelian disease classifier

The SolveBio Mendelian disease classifier is flexible, automated variant classification software (API documentation for the classifier is also available). The classifier currently applies the ACMG/CAP guidelines for the interpretation of sequence variants (Richards 2015) but can be extended and customized for different variant classification rubrics. Contact SolveBio for information on custom classifiers.

The classifier runs through each rule in the guidelines that can be automated (by our professional judgement), and applies each rule to the variant with data from the SolveBio Data Library, as well as the output of the SolveBio Effect Predictor. Each rule is returned as "Met", "Not Met", or "To be Evaluated", with a long-form text message specific for each variant and each rule.

For example, for the variant NM_007294.3:c.528G>A (GRCH37-17-41251811-41251811-T), the classifier states for BP7 that this variant is "Synonymous variant that is not conserved (PhyloP 46-way score is -0.108669) and not in a splice region." This message is automatically generated and specific for this variant.
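
The shape of such a rule can be sketched in Python as follows. This is an illustration only, not the classifier's real implementation: the function name, the inputs, and especially the conservation cutoff of 0.0 are assumptions chosen so that the example above evaluates as "Met".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleResult:
    rule: str
    status: str   # "Met", "Not Met", or "To be Evaluated"
    message: str


def evaluate_bp7(effect: str,
                 phylop: Optional[float],
                 in_splice_region: Optional[bool],
                 conservation_cutoff: float = 0.0) -> RuleResult:
    """Evaluate a simplified BP7 rule (the cutoff is a hypothetical value)."""
    if effect != "synonymous_variant":
        return RuleResult("BP7", "Not Met", "Variant is not synonymous.")
    if phylop is None or in_splice_region is None:
        # Without the supporting data the rule cannot be automated.
        return RuleResult("BP7", "To be Evaluated",
                          "Conservation or splice region data is missing.")
    if phylop < conservation_cutoff and not in_splice_region:
        return RuleResult(
            "BP7", "Met",
            f"Synonymous variant that is not conserved (PhyloP 46-way score "
            f"is {phylop}) and not in a splice region.")
    return RuleResult("BP7", "Not Met",
                      "Synonymous variant is conserved or in a splice region.")


result = evaluate_bp7("synonymous_variant", -0.108669, False)
print(result.status, "-", result.message)
```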

Clinical evidence

If the variant is in ClinVar, we display the relevant information for each ClinVar submission as well as link-outs to the ClinVar record on the NCBI site.

In silico predictions

In-silico prediction algorithms can give insight into a variant when no experimental or clinical evidence exists.

  • SIFT & PolyPhen2 predict the effect of amino acid changes on protein structure and function and are derived from dbNSFP.
  • ada (adaptive boosting) and rf (random forest) scores are ensemble scores from dbscSNV for splice site predictions
  • InterPro protein domains are brought back from dbNSFP
  • RepeatMasker defines where repeat regions exist

We are constantly adding new in-silico predictors. Email us at support@solvebio.com if your favorite is not currently on our list.

Population allele frequencies

The population tab displays whether or not the variant has been seen in 1000 Genomes, ExAC, and ESP6500, and if so, allele frequencies and counts broken down by sub-population.
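
For reference, an allele frequency is the allele count divided by the total number of alleles genotyped in that population. The Python sketch below shows the calculation per sub-population; the population labels and counts are made up for illustration and are not taken from 1000 Genomes, ExAC, or ESP6500.

```python
def allele_frequencies(counts):
    """Map each population to allele_count / allele_number.

    counts maps a population label to (allele_count, allele_number).
    Populations with no genotyped alleles get None instead of a frequency.
    """
    freqs = {}
    for population, (allele_count, allele_number) in counts.items():
        freqs[population] = allele_count / allele_number if allele_number else None
    return freqs


# Hypothetical counts for two sub-populations.
example_counts = {"POP_A": (5, 1000), "POP_B": (0, 0)}
example_freqs = allele_frequencies(example_counts)
print(example_freqs)
```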

Somatic analysis

We've put together many of our commonly used databases & requested link-outs of somatic mutation information into one tab for convenience. There is a link-out to MSKCC's cBioPortal, all the times a variant has been described in COSMIC (pre-commercial licensing), CIViC, and counts of how many times the variant has been seen in TCGA.

Literature

If a paper (with a valid PubMed ID) has been linked to a variant, details about that paper (abstract, title, authors, link-outs to PubMed) are shown.

Currently, the literature tab shows papers from ClinVar and OMIM. We are actively expanding this source of information through curation.

Gene

Gene summary

The gene in which a variant appears links to the gene explorer summary, which combines a number of reference datasets to bring back gene-specific information. If the genetic variant is in several genes, or you are using the multi-gene browser, you can toggle which gene you're looking at with the purple bar on top.

Text summaries for each gene are brought in from RefSeqGene. If the gene does not currently have a RefSeq summary, there will not be one available (please let us know if this happens!). The conditions module has data coming in from the NHGRI CGD database. This module lists all the conditions that have been associated with this gene and details about the inheritance patterns.

Finally, we list many of the known identifiers for this gene symbol, with information indexed from HGNC/HUGO.

Premium Gene-Specific Datasets

Additionally, if you have a BiomarkerBase license from Amplion, you can see more information about gene-specific targets, biomarkers, diseases, FDA-approved tests, LDTs, clinical trials, and drugs! More information is available from our blog and from Amplion.

Frequently Asked Questions

What is the impact category and impact rank on the effects tab?

We followed the VCF annotation spec available here: http://snpeff.sourceforge.net/VCFannotationformat_v1.0.pdf to put together impact categories (putative ranks) and impact ranks (from annotation sort order). Where the spec was internally inconsistent, we used our best judgment to assign categories and ranks.

My gene is missing a gene summary, what is going on?

We currently access RefSeqGene to display a gene summary - if the gene does not have a summary in RefSeqGene, we do not provide a gene summary. We are working on a better system to be able to provide more comprehensive gene summaries.

The coordinates of my variant seem to be different from UCSC/dbSNP/other.

Please make sure you are checking the GRCh37/hg19 coordinates. If you are checking the NC_ based HGVS g. genomic coordinates on dbSNP, make sure you are looking at the one with the lower version number.

For example, in dbSNP for rs9534262, NC_000013.11:g.32362509T>C actually gives the GRCh38 coordinates. The GRCh37 coordinates are NC_000013.10:g.32936646T>C and correspond to the SolveBio variant GRCH37-13-32936646-32936646-C.
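
The mapping in this example can be sketched in Python. The variant ID layout (build, chromosome, start, stop, alternate allele) is inferred from the IDs quoted in this article, the function name is hypothetical, and the accession table holds only the two chromosome 13 accessions mentioned above; a real lookup would need an entry for every chromosome and build. Only single nucleotide variants are handled.

```python
import re

# Only the two accessions quoted in this article.
ACCESSION_BUILD = {"NC_000013.10": "GRCH37", "NC_000013.11": "GRCH38"}

# Matches SNVs such as NC_000013.10:g.32936646T>C
HGVS_G_SNV = re.compile(r"^(NC_0000(\d{2})\.\d+):g\.(\d+)([ACGT])>([ACGT])$")


def hgvs_g_to_variant_id(hgvs: str) -> str:
    """Convert an NC_-based HGVS g. SNV into a BUILD-CHROM-START-STOP-ALT ID."""
    match = HGVS_G_SNV.match(hgvs)
    if match is None:
        raise ValueError(f"not a supported HGVS g. SNV: {hgvs}")
    accession, chrom_number, position, _ref, alt = match.groups()
    if accession not in ACCESSION_BUILD:
        raise KeyError(f"unknown accession: {accession}")
    build = ACCESSION_BUILD[accession]
    # NC_000023 and NC_000024 are the X and Y chromosomes.
    chrom = {"23": "X", "24": "Y"}.get(chrom_number, str(int(chrom_number)))
    return f"{build}-{chrom}-{position}-{position}-{alt}"


print(hgvs_g_to_variant_id("NC_000013.10:g.32936646T>C"))
```

Running the same function on NC_000013.11:g.32362509T>C yields an ID that starts with GRCH38, which is a quick way to spot that coordinates were copied from the wrong build.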

If there still appears to be a discrepancy, please email us at support@solvebio.com and we will look into it immediately!

Can you please add ________?

We love adding new features! Email us your wishlist at support@solvebio.com.

Can we use your HGVS translator, annotator, effect predictor, or classifier?

Yes! They are all available as part of the SolveBio Genomic Web Services (GWS). Please contact us at contact@solvebio.com for more information regarding support and custom services.

What’s next for the Variant Explorer?

We have a long list of features we’re adding:

  • Better multi-transcript support (Ensembl, summary views of the effects tab)
  • More documentation overall
  • Support for GRCh38
  • A protein tab with visualizations of the amino acid changes
  • More in-silico predictors, particularly for splicing
  • Smarter automatic systems to link new papers to variants
  • Alerts about new information on a variant since your last visit

Is there something else you want in particular? Please email us at support@solvebio.com with your wish-list!
