SureChEMBL user survey
At the end of 2023 we introduced a new version of SureChEMBL (please see the announcement blog post for more details).

We are now considering what new functionalities to introduce, in addition to continuing to improve the core system. We have many ideas, but not enough time and resources to deliver them all immediately. This is why we would like to offer the SureChEMBL community a say on what we should implement first.

Please read the short descriptions below and then cast your votes!

Note that, due to the varying complexity of the different tasks, we cannot guarantee that the ideas receiving the most votes will be delivered first. The results will nonetheless show us what the SureChEMBL user community is looking for and will be taken into account in our planning.

You can also leave a comment in the section at the end of the form for any other ideas not already covered.

The survey is anonymous and will remain open until 31st January 2024.
Our ideas
  • Idea 1: improve compound structure quality from image extraction
Automated optical chemical structure identification has progressed recently. For example, we are currently testing DECIMER.AI, which seems to address some issues that were recently raised.
  • Idea 2: make available patents from China (CNIPA)
In 2022, China’s patent office received around 1.62 million patent applications according to the WIPO.
  • Idea 3: allow download of biomedical annotations (at first targets, diseases, mode of action)
We are now using Natural Language Processing (NLP) to annotate information about gene/protein targets, diseases and mode of action. For each patent, we will make the annotation results available for download, as is already done for the compounds.
  • Idea 4: integrate patent Core Chemical Structure (CCS) in the document view and deliver regular updates of SureChEMBLccs map files.
We will re-use the work of Falaguera et al., who developed a new filtering protocol to automatically select the core chemical structures best representing a congeneric series of pharmacologically relevant molecules in patents.
  • Idea 5: implement a query builder
This will enable users to build complex queries to retrieve specific patents without having to learn all the field names in our data index.
  • Idea 6: combine both text + structure searches
Implement an efficient way to query SureChEMBL combining both text and compound structure searches.
  • Idea 7: group patent results by patent family
This will make the patent result visualisation more efficient as all patents from the same family will be grouped together. Each patent will remain individually accessible. According to the EPO, a patent family is a collection of patent applications covering the same or similar technical content. The applications in a family are related to each other through priority claims.
  • Idea 8: offer filtering options before downloading patent molecules
This will let users choose which molecules they want to download from a patent.
  • Idea 9: add entity patent position in the annotation download files
This will allow users to easily retrieve the location of an annotation within a patent.
  • Idea 10: implement a patent filter in the UI that returns only the patents most likely to be biomedically relevant.
The filtering will be based on the CPC/IPC codes.
  • Idea 11: provide metadata, title and abstract downloads for patents in the map files.
This will open the way for deeper patent analysis.
Cast your vote
Please choose up to 3 options. If you have another idea, please write it down in the "other" choice.
Any comment or suggestion (optional)
Thank you for your contribution.

The SureChEMBL Team