Demo of EurOPDX Research Infrastructure IT tools 20:30 - 22:00
EurOPDX Data Infrastructure and the Touch of Automation in the Process
Zdenka Dudová1, Dalibor Stuchlík1, Radim Peša1, Boris Jurič1, Zinaida Perova2, Csaba Halmagyi2, Luca Vezzadini3, Enzo Medico4,5, Aleš Křenek1
1Institute of Computer Science, Masaryk University, Brno, Czech Republic
2European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom
3Kairos3D, Turin, Italy
4Department of Oncology, University of Torino, Candiolo, Italy
5Candiolo Cancer Institute, FPO-IRCCS, Candiolo, Italy
Our team is responsible for developing and maintaining the IT infrastructure that supports the EurOPDX Research Infrastructure (RI). Since the beginning of the EDIReX project, we have collaborated closely with the EMBL-EBI project partners, who developed the PDX Finder and handled data collection and harmonization for PDX models established by members of the EurOPDX Consortium. We have integrated the search function of the PDX Finder into the newly developed EurOPDX Data Portal. Beyond this model repository, molecular and clinical data can be explored further, in the same harmonized form, through a EurOPDX instance of cBioPortal for Cancer Genomics and the innovative tool GenomeCruzer.
The EurOPDX Data Portal and its components run on OpenStack virtual machines (Data Portal: 8 GB RAM, 4 CPUs; Data Hub: 16 GB RAM, 4 CPUs; cBioPortal: 8 GB RAM, 4 CPUs) and are deployed as Docker containers. Molecular and clinical data are stored in Neo4j and MySQL databases.
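For readers planning a comparable deployment, the virtual machine sizes quoted above can be written down as a small inventory and summed. The following Python sketch is purely illustrative: the class and function names are hypothetical and not part of the EurOPDX code base, and only the RAM and CPU figures are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualMachine:
    """One OpenStack virtual machine, described by the sizes quoted in the text."""
    name: str
    ram_gb: int
    cpus: int


# RAM and CPU figures as stated in the abstract; names are the component names.
VMS = [
    VirtualMachine("Data Portal", ram_gb=8, cpus=4),
    VirtualMachine("Data Hub", ram_gb=16, cpus=4),
    VirtualMachine("cBioPortal", ram_gb=8, cpus=4),
]


def total_resources(vms):
    """Return the combined (RAM in GB, CPU count) over all listed machines."""
    return sum(vm.ram_gb for vm in vms), sum(vm.cpus for vm in vms)


if __name__ == "__main__":
    ram_gb, cpus = total_resources(VMS)
    print(f"Total: {ram_gb} GB RAM, {cpus} CPUs across {len(VMS)} virtual machines")
```

Under these figures the three machines together use 32 GB of RAM and 12 CPUs; storage and network requirements are not stated in the text and are therefore left out.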
We currently offer a set of tools covering the whole EurOPDX RI data workflow:
1. PDX data collection templates for clinical and technical information (“metadata”).
2. Data processing pipelines for sequencing data, providing results in a ready-to-use format for the EurOPDX Data Portal.
3. The so-called “Sandbox Data Portal”: the Data Portal with cBioPortal and relevant components packed into one virtual machine. The sandbox lets individual users test their own unpublished data.
4. EurOPDX cBioPortal instance containing all data from the Data Portal available to our users for molecular data browsing.
5. On-demand cBioPortal instance as part of the Sandbox Data Portal environment.
6. GenomeCruzer tool for deeper (3D) molecular data analysis – currently provided as a desktop application.
The tools and services above are intended to offer data browsing and management capabilities to potential users of the EurOPDX RI, in compliance with the FAIR principles. We would like to use the Workshop to show potential users what the EurOPDX data infrastructure offers and how its individual components work together. We will give a live demonstration during one of the poster sessions and will also be available at a dedicated desk throughout the Workshop.