Importing VCF and JSON Data

The SolveBio Platform is designed to be compatible with stringent security and compliance requirements. All dataset imports and modifications are linked to “commits” which track each change. Commits record what changes were made, when they were made, and who made them. We recommend that all users establish their own internal best practices and workflows for importing data into the platform.

SolveBio supports data import from JSON or VCF files. You can import data into any new or existing private SolveBio dataset. Before importing, make sure your private dataset has been created; you can use the web interface or the API to create and manage datasets. Also make sure you have the appropriate permissions on the data repository before submitting an import: you need the Contribute permission level or higher.

There are three steps to importing data:

Step 1: Upload the file to SolveBio

SolveBio currently accepts JSON or VCF files (version 4 and up). Files may be up to 2 gigabytes in size, although we recommend splitting large files up into multiple parts. Files can be compressed using gzip. The following file extensions are valid: .vcf, .vcf.gz, .json, and .json.gz.
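The limits above can be checked locally before uploading. The following is a minimal sketch, not part of any SolveBio client: the function name `check_upload_candidate` is hypothetical, and it assumes "2 gigabytes" means 2 GiB and that extensions are matched case-sensitively.

```python
import os

# Limits and extensions as stated in this article.
MAX_UPLOAD_BYTES = 2 * 1024**3  # assumption: "2 gigabytes" is read as 2 GiB
VALID_EXTENSIONS = (".vcf", ".vcf.gz", ".json", ".json.gz")


def check_upload_candidate(path, size_bytes=None):
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    # str.endswith accepts a tuple, so this tests all valid extensions at once.
    if not path.endswith(VALID_EXTENSIONS):
        problems.append("unsupported extension: " + os.path.basename(path))
    # Read the size from disk unless the caller supplied one.
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the 2 GB limit; consider splitting it")
    return problems
```

Calling `check_upload_candidate("variants.vcf.gz")` on a file of acceptable size would return an empty list; any returned messages describe what to fix before uploading.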

Important: JSON files should contain one JSON record per line. For example:

{"record_number": "1", "field_1": "value 1", "field_2": "value 2"}
{"record_number": "2", "field_1": "value 1", "field_2": "value 2"}
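To confirm that a file follows the one-record-per-line layout before uploading it, a local check along these lines may help. This is a sketch using only the Python standard library; the function name `find_invalid_json_lines` is hypothetical, and treating blank lines as harmless is an assumption of the sketch, not a documented platform behavior.

```python
import gzip
import json


def find_invalid_json_lines(path):
    """Return (line_number, message) pairs for lines that are not a single JSON object."""
    # Gzip-compressed files (.json.gz) are opened transparently in text mode.
    opener = gzip.open if path.endswith(".gz") else open
    errors = []
    with opener(path, "rt", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # assumption: blank lines are skipped rather than reported
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append((line_number, exc.msg))
                continue
            # Each line should hold one record, i.e. a JSON object.
            if not isinstance(record, dict):
                errors.append((line_number, "not a JSON object"))
    return errors
```

An empty result means every non-blank line parsed as a JSON object. A line such as `{"field_2", "value 2"}`, with a comma where a colon belongs, would be reported with its line number.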

Step 2: Start the import

An import requires a file (from step 1) and a destination (a private dataset). If the dataset is genomic (has the `genomic_coordinates` object field, or genomic coordinates in another representation), the appropriate genome build should be set.

If the uploaded file is a VCF file, it will be validated and parsed during this step and converted to a dataset commit. If it is a JSON file, it will be validated against the destination dataset to ensure field compatibility.
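The platform's own compatibility rules are not described in this article. As a rough local pre-check only, you could compare the field names in your JSON records against the fields you expect the destination dataset to have. In the sketch below, `find_unexpected_fields` and the `expected_fields` list are hypothetical and supplied by you; nothing here calls SolveBio.

```python
import json


def find_unexpected_fields(json_lines, expected_fields):
    """Map each field name not in expected_fields to the line numbers where it appears."""
    expected = set(expected_fields)
    unexpected = {}
    for line_number, line in enumerate(json_lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        record = json.loads(line)
        for name in record:
            if name not in expected:
                # Collect every line number on which the unknown field occurs.
                unexpected.setdefault(name, []).append(line_number)
    return unexpected
```

Passing an open file handle (or any iterable of lines) together with your list of dataset field names yields an empty dictionary when every record uses only known fields.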

By default, imports must be approved by a user with the Approve permission level or higher. Users with Approve permissions can set an import to auto-approve, which skips step 3 and runs the whole import process automatically.


Step 3 (optional): Approve the commit

If the import does not have auto-approve enabled, it must be approved by you or someone else on your team. The approval itself can be done via the API or through the web interface. The approval process gives you a way to ensure that the number of records to import is accurate, and that the fields are correct.
