How to use Python and R with RDF Data
How to use Python and R with RDF Data is a training developed within the Swiss Personalized Health Network (SPHN) initiative. It is part of a series of trainings centred around the SPHN Semantic Interoperability Framework, developed by the SPHN Data Coordination Center (DCC). The framework aims to facilitate collaborative research by providing a decentralized infrastructure for exchanging and storing data, sustained by a strong semantic layer (the SPHN Dataset) and RDF-based graph technology.
Storing health-related data in compliance with the SPHN RDF schema makes it possible to answer specific research questions with SPARQL queries. Building on this foundation, general-purpose languages such as Python and R let data scientists apply further data science methods to the retrieved data. In this training, we provide a short introduction to using Python and R to:
- Set up a connection to a SPARQL endpoint
- Run a SPARQL query and retrieve results
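The two steps above can be sketched in Python as follows. This is a minimal illustration, not the code used in the training: it relies only on the standard library and the SPARQL 1.1 Protocol, and the endpoint URL (GraphDB's default local port with a hypothetical repository id `sphn_mock`) and the query are assumptions to be replaced with your own values.

```python
import json
import urllib.parse
import urllib.request

# Assumed values: GraphDB's default local port and a hypothetical repository id.
ENDPOINT = "http://localhost:7200/repositories/sphn_mock"

# Placeholder query; replace it with a query against your own data.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 5
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol POST request that asks for JSON results."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def run_select(endpoint, query, timeout=30):
    """Send a SELECT query and return the list of result bindings."""
    request = build_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

With a triplestore running at the assumed address, `run_select(ENDPOINT, QUERY)` would return one dictionary per result row, keyed by variable name. Dedicated client libraries exist for both Python and R and may be more convenient than hand-built requests.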
Building on top of these basics, we look at how to combine results from different queries, as well as how to deal with various datatypes.
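As a rough sketch of what combining results and handling datatypes can look like in Python, the example below converts bindings in the standard SPARQL JSON results format into native Python values and joins two result sets on a shared variable. The variable names (`patient`, `birth_year`, `value`, `measured_at`) and the sample bindings are made up for illustration and are not taken from the SPHN schema; the helper functions are hypothetical, not part of any library.

```python
from datetime import datetime
from decimal import Decimal

XSD = "http://www.w3.org/2001/XMLSchema#"

# Map a few common XSD datatypes to Python constructors (not exhaustive).
CASTS = {
    XSD + "integer": int,
    XSD + "decimal": Decimal,
    XSD + "double": float,
    XSD + "boolean": lambda v: v in ("true", "1"),
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    XSD + "dateTime": lambda v: datetime.fromisoformat(v.replace("Z", "+00:00")),
}


def to_python(term):
    """Convert one SPARQL JSON term to a Python value."""
    if term["type"] != "literal":
        return term["value"]  # IRIs and blank nodes stay as strings
    cast = CASTS.get(term.get("datatype"), str)
    return cast(term["value"])


def to_rows(bindings):
    """Convert a list of bindings into plain dictionaries."""
    return [{var: to_python(term) for var, term in b.items()} for b in bindings]


def join_on(left, right, key):
    """Inner join of two row lists on a shared variable."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


# Made-up bindings, shaped like the results of two separate queries.
patients = [
    {"patient": {"type": "uri", "value": "http://example.org/patient/1"},
     "birth_year": {"type": "literal", "value": "1980",
                    "datatype": XSD + "integer"}},
]
measurements = [
    {"patient": {"type": "uri", "value": "http://example.org/patient/1"},
     "value": {"type": "literal", "value": "36.6",
               "datatype": XSD + "double"},
     "measured_at": {"type": "literal", "value": "2021-03-01T08:30:00Z",
                     "datatype": XSD + "dateTime"}},
]

combined = join_on(to_rows(patients), to_rows(measurements), "patient")
```

For larger result sets, a data frame library would typically replace the hand-written join; the plain-dictionary version is used here only to keep the sketch dependency-free.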
Prerequisites:
- Basic knowledge of R and Python
- Basic knowledge of RDF and SPARQL
This video assumes that your data is loaded into your triplestore (in our example, GraphDB), and that you are familiar with SPARQL. If you need instructions on loading the data into your triplestore, please watch our training RDF Schema and Data Visualization or read our user guide. If you need a reminder on SPARQL, please watch our training on Querying Data with SPARQL.
After the training you will be able to:
- Set up a connection to a SPARQL endpoint through R and Python
- Run a SPARQL query through R and Python to extract specific data
Resources:
All resources are available on the training's GitLab space
Licence: Creative Commons Attribution Share Alike 4.0 International
Keywords: Clinical data, SPARQL, Query data, RDF, Knowledge graph, Python, R, GraphDB
Target audience: Research Scientists, Data Managers, Biomedical Researchers, Bioinformaticians, Data Scientists
Resource type: Video, Training materials, Mock data, E-learning
Status: Active
Contributors: Sabine Österle, Vasundra Touré
Scientific topics: Computer science, Data management, FAIR data, Medical informatics
Operations: Query and retrieval, Data handling, Data retrieval