This section is about data mining. By data, in this context, I mean chemical structures and their associated data from (mainly) public databases. In general I use the workflow creator “Knime” and combine it, as required, with Java, SQL and different APIs.
Since around 2019/20 I have been using more and more Python scripting. Many machine learning, and especially AI, methods are implemented in Python, which allows for more flexibility than Knime (for better or worse). Of course, Python can be incorporated into Knime if need be…
As time goes by, I am sharing some practical workflows (or parts of them) here on my page, which you may study, copy, use, whatever. A “developed by A. Minidis, Pharmakarma”, even if only in 6pt font, would be appreciated, though. Maybe even a mail to say that it helped you, or a mail if you have suggestions for improvement?
Most of these workflows could in principle also be converted to one of the many other platforms out there, e.g. Pipeline Pilot or Bioclipse, or to a script-based approach (Python, R), if that is more to your liking.
An edit to the above: newer Knime workflows are now shared via the Knime Hub, where a CC BY 4.0 license is applied automatically. Other code written in Python can be found on my GitHub with appropriate licensing.
You will find details in the Blog section; here are some direct links:
- Knime & External Tool for OCR of structures
- SpotRM+ & batch mode usage
- Part 3: What disease should I …. ? Knime workflows (and Part 1 & 2 linked within)
- Sweet, another publication! Machine Learning in Reaction predictions!
(this refers to a publication which includes Knime workflows)