MCBDD 2022 Module II: Offline activity
Dear students of MCBDD,

Please find below the offline activity for Module II. You must complete it before May 1, 2022, in order to get credits.

The exercise is to use your favorite programming language (shell scripts, Python, or R, for instance) and the APIs (Application Programming Interfaces) of public databases to retrieve drugs and their associated targets.

Best regards,
David
Given Name *
Family Name *
Step 1: Retrieve all approved drugs from the ChEMBL database and sort them by approval year and name (a Python example is at https://github.com/chembl/chembl_webresource_client; documentation of the ChEMBL API can be found at https://www.ebi.ac.uk/chembl/api/data/docs). How many drugs did you get? *
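One possible starting point for step 1 is sketched below, using only the Python standard library. It rests on several assumptions that you should check against the ChEMBL API documentation: that "approved" corresponds to the filter max_phase=4 on the molecule endpoint, that results are paged through a page_meta.next field, and that each record carries first_approval and pref_name. The function names are illustrative and not part of any official client.

```python
import json
import urllib.parse
import urllib.request

CHEMBL_HOST = "https://www.ebi.ac.uk"


def fetch_approved_drugs(page_size=1000):
    """Page through the molecule endpoint (assumed filter: max_phase=4)."""
    query = urllib.parse.urlencode({"max_phase": 4, "limit": page_size})
    url = f"{CHEMBL_HOST}/chembl/api/data/molecule.json?{query}"
    drugs = []
    while url:
        with urllib.request.urlopen(url) as response:
            payload = json.load(response)
        drugs.extend(payload["molecules"])
        # Assumption: "next" is a relative path, or null on the last page.
        next_path = payload["page_meta"]["next"]
        url = CHEMBL_HOST + next_path if next_path else None
    return drugs


def sort_by_year_and_name(drugs):
    """Sort by approval year, then name; unknown years go last."""
    def key(drug):
        year = drug.get("first_approval")
        name = drug.get("pref_name") or ""
        return (year is None, year or 0, name)
    return sorted(drugs, key=key)


if __name__ == "__main__":
    approved = sort_by_year_and_name(fetch_approved_drugs())
    print(len(approved))
```

Placing records without an approval year at the end is a design choice of this sketch. Whether to count them at all is something to decide and state in your answer.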
Step 2: For each drug approved *since 2012* that you identified in step (1), retrieve the list of UniProt accession numbers of the protein targets associated with the drug. How many protein targets are typically associated with each compound? Report the median value here. *
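The bookkeeping for step 2 could look like the sketch below. It assumes you have already fetched, for each drug, the ChEMBL target records linked to it (for example through the mechanism and target endpoints), and that each target record has a target_components list whose entries carry an accession field. These field names, the helper names, and the choice to keep drugs with zero targets in the median are all assumptions of this sketch.

```python
from statistics import median


def approved_since(drugs, year=2012):
    """Keep drugs whose first_approval is known and not earlier than `year`."""
    return [d for d in drugs if (d.get("first_approval") or 0) >= year]


def accessions_of(target):
    """Collect the UniProt accessions listed in one target record."""
    components = target.get("target_components", [])
    return {c["accession"] for c in components if c.get("accession")}


def median_targets_per_drug(targets_by_drug):
    """Median number of distinct accessions per drug.

    `targets_by_drug` maps a drug identifier to a list of target records.
    Drugs with no targets count as zero here.
    """
    counts = []
    for targets in targets_by_drug.values():
        accessions = set()
        for target in targets:
            accessions |= accessions_of(target)
        counts.append(len(accessions))
    return median(counts)
```

Counting distinct accessions per drug avoids counting a protein twice when it appears in several target records for the same compound.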
Step 3: For each protein with a UniProt accession number that you identified in step (2), retrieve the UniProt keywords associated with it. You can use the UniProt API, documented at https://www.ebi.ac.uk/proteins/api/doc/#!/proteins/search. Python (https://pypi.org/project/uniprot_tools/) and R (https://github.com/lgatto/UniProt.REST) clients are also available. Which keyword(s) are associated with the most drugs approved since 2012? *
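For step 3, a minimal sketch of the keyword tally follows. It assumes that the Proteins API returns a JSON entry at /proteins/api/proteins/{accession} when asked with an Accept: application/json header, and that the entry has a keywords list of objects with a value field. Please verify both against the API documentation linked above. The function names are illustrative.

```python
import json
import urllib.request
from collections import Counter


def fetch_keywords(accession):
    """Fetch the keyword values for one UniProt accession (assumed layout)."""
    request = urllib.request.Request(
        f"https://www.ebi.ac.uk/proteins/api/proteins/{accession}",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        entry = json.load(response)
    return {kw["value"] for kw in entry.get("keywords", [])}


def keyword_drug_counts(accessions_by_drug, keywords_by_accession):
    """Count, for each keyword, the number of drugs it is linked to."""
    counts = Counter()
    for accessions in accessions_by_drug.values():
        drug_keywords = set()
        for accession in accessions:
            drug_keywords |= keywords_by_accession.get(accession, set())
        # Each drug contributes at most once per keyword.
        counts.update(drug_keywords)
    return counts


def top_keywords(counts):
    """Return every keyword tied for the highest drug count."""
    if not counts:
        return []
    best = max(counts.values())
    return sorted(k for k, n in counts.items() if n == best)
```

The tally is per drug rather than per protein, so a keyword shared by several targets of the same drug is counted once for that drug. This matches the wording of the question, which asks about drugs.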
What's your interpretation of the results?
Please put your code on GitHub, GitLab, or another code-hosting service and paste the link below.