Author: Matillion
Date Posted: Jul 9, 2024
Last Modified: Jul 23, 2024
TCGA Genomic Data Commons Data Portal
Extract case, exposure, diagnosis and gene expression quantification files from The Cancer Genome Atlas (TCGA)
This Data Productivity Cloud Custom Connector extracts and loads open access data from the Genomic Data Commons Data Portal for analysis.
Authentication
No authentication is required for open access data.
Endpoints
- cases - Retrieve the metadata associated with one or more cases, including all nested biospecimen entities
- geq_files - Search for gene expression quantification files using the “search” part of the GDC API’s “Search and Retrieval” functionality
Parameters
The cases endpoint can be configured by setting query parameters:
- from - the start point for paging. We recommend leaving this at its default value of 0
- size - the number of records per page, default 1000
- fields - a comma-separated list of fields to extract for every case
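As a rough illustration of how these three query parameters fit together, the sketch below builds a request URL for the cases endpoint without sending it. The base URL is an assumption (the public GDC API host), the helper name build_cases_url is hypothetical, and the field names are only examples; none of these are taken from the connector definition itself.

```python
from urllib.parse import urlencode

# Assumption: the public GDC API host. The connector may point elsewhere.
GDC_API_BASE = "https://api.gdc.cancer.gov"


def build_cases_url(fields, start=0, size=1000):
    """Build a cases request URL from the from, size and fields parameters.

    Hypothetical helper for illustration only.
    """
    params = {
        "from": start,               # start point for paging (default 0)
        "size": size,                # records per page (default 1000)
        "fields": ",".join(fields),  # comma-separated list of case fields
    }
    return f"{GDC_API_BASE}/cases?{urlencode(params)}"


# Example field names; substitute the case fields you actually need.
url = build_cases_url(["case_id", "submitter_id", "primary_site"])
print(url)
```

To fetch a later page by hand, the same helper could be called with start advanced by size (for example start=1000 for the second page of 1000 records).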
The geq_files endpoint is a “Search and Retrieval” request that extracts a list of gene expression quantification file names. Users are expected to subsequently download the files one by one or in bulk using the GDC API’s file download functionality. Filters and field selection can be changed by editing the POST body as per the documentation.
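The sketch below shows what an edited POST body for this kind of search might look like, and how a download URL could be formed for each returned file. It is not the body shipped in the connector: the filter on project, the chosen fields, the base URL, and the helper names build_geq_files_body and download_url are all assumptions made for illustration, following the filter structure described in the GDC API documentation.

```python
import json

# Assumption: the public GDC API host. The connector may point elsewhere.
GDC_API_BASE = "https://api.gdc.cancer.gov"


def build_geq_files_body(project_id, start=0, size=1000):
    """Build a POST body that searches for gene expression quantification files.

    Hypothetical helper; the project filter and field list are examples.
    """
    filters = {
        "op": "and",
        "content": [
            # Restrict the search to one project (example filter).
            {"op": "in",
             "content": {"field": "cases.project.project_id",
                         "value": [project_id]}},
            # Keep only gene expression quantification files.
            {"op": "in",
             "content": {"field": "data_type",
                         "value": ["Gene Expression Quantification"]}},
        ],
    }
    return {
        "filters": filters,
        "fields": "file_id,file_name",  # field selection, comma-separated
        "format": "JSON",
        "from": start,
        "size": size,
    }


def download_url(file_id):
    """Form a single-file download URL for a file UUID (assumed /data path)."""
    return f"{GDC_API_BASE}/data/{file_id}"


# "TCGA-BRCA" is used here only as an example project identifier.
body = build_geq_files_body("TCGA-BRCA")
print(json.dumps(body, indent=2))
```

Changing the filters or the fields string in this body corresponds to the kind of edit described above; each file_id returned by the search would then be passed to the download step.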
Downloads
Licensed under: Matillion Free Subscription License
- Download TCGA-GDC.json
- Type: Custom Connector