AI made easy – an image classifier example for everyone

Artificial Intelligence (AI) is very much in vogue right now. For better or worse, it is here to stay. So why not have a look at it as a part of modern data science? A simple image classifier could do the trick!

It is almost too easy for anyone these days to work with AI or machine learning. Tools are aplenty, be it the graphical KNIME or one of the more common scripting languages such as Python. Combine that with popular libraries such as scikit-learn, PyTorch, etc., write a few lines of code, and you are done. Making a good AI model, though, even with all the available tools, is another story for another time…

Moving on. What is it you are asking? You can’t/don’t want to get into advanced “stuff”? AI sounds complicated? Too much programming and statistics or whatnot? Forget all that. Not necessary. May I suggest this online course/book by Jeremy Howard and Rachel Thomas from the University of San Francisco (no connections or perks exist between us, I simply like their approach). Do start at their blog, https://www.fast.ai/, and choose “Practical Deep Learning for Coders”. It introduces you to all prerequisites in an easy and simple manner, and even gives tips on free cloud services if you don’t have the required hardware. The video sessions go through the book as Python notebooks (Jupyter) and introduce you to some basic programming at the same time. All with the attitude that you don’t need a Ph.D. to do AI. (Although, while that is true, a certain level of education or “human intelligence” is necessary to make useful and “safe” models; otherwise you end up with scandals or abuse of models. Check out e.g. Thomas’s course on tech ethics: https://ethics.fast.ai/).

Drawing on this course, I present here a very simple AI for image recognition, specifically one that distinguishes (more or less well) between Bengal cats, “other cats”, and “cartoon cats”, because, why not. And since I have a Bengal myself… To test this, you won’t even need to install anything, simply use this MyBinder link:

Binder

This is a rather neat way to share code with others who can’t or don’t want to code, without having to jump through hoops to get it shared. One can even include a simplistic GUI by using something called ‘Voila’. It does have some drawbacks, but for the purpose of, e.g., this blog, it is perfect.

(Regarding model creation: I won’t go into the actual creation, though you will find a separate notebook in the Git repository of this model. It uses a simple 18-layer model, ResNet-18.

For deeper explanations, you are probably better off viewing the pro description of how to do this. I followed these two notebooks from the fastai book: https://github.com/fastai/fastbook/blob/master/05_pet_breeds.ipyn and https://github.com/fastai/fastbook/blob/master/06_multicat.ipynb . )

My Bengal Classifier

Anyway, the final code and output look like this, where the actual “AI” is strictly speaking only one line in paragraph 3 (learn.inf.predict(img)); everything else is preparation and output. Well, that, and the model that is loaded in paragraph 2 (load_learner(….)). This is the model created in the separate notebook mentioned above.
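For readers who prefer text to screenshots, here is a minimal sketch of what those two steps might look like with fastai. It is not the notebook from the repository: the file name export.pkl, the variable name learn_inf (borrowed from the fastai book), the helper top_label, and the example labels are all assumptions made for illustration.

```python
from pathlib import Path


def top_label(vocab, probs):
    """Return the (label, probability) pair with the highest probability."""
    probs = [float(p) for p in probs]
    best = max(range(len(probs)), key=lambda i: probs[i])
    return vocab[best], probs[best]


def classify(image_path, model_path="export.pkl"):
    """Load an exported fastai learner and classify a single image.

    Assumption: model_path points to a learner saved with learn.export().
    """
    # fastai is imported lazily so that top_label works without it installed
    from fastai.vision.all import PILImage, load_learner

    learn_inf = load_learner(Path(model_path))  # the "paragraph 2" step
    img = PILImage.create(image_path)
    _, _, probs = learn_inf.predict(img)        # the one "AI" line
    return top_label(list(learn_inf.dls.vocab), probs)


# Hypothetical labels and probabilities, only to show the helper in isolation
print(top_label(["bengal", "cartoon", "other"], [0.7, 0.1, 0.2]))
```

Picking only the single most probable label is a simplification; a multi-label model, as in the second fastai notebook, may return several labels for one image.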

You can have an even simpler view, if you use something called Voila (which is available in the referred notebook):

You can find all this on GitHub to test yourself. With MyBinder.org, though, you don’t need any local installation or know-how: simply click on the icon and wait for the (rather long) creation of the virtual image of this app (but hey, it’s free!). Or click directly on the Binder link here, without the hassle of going through GitHub:

Binder

You will see something like this in your browser (in that window, click the “show/hide” text next to “Build logs” to expand them and follow the (slow) startup status):

Once the build is done, you should have the notebook open and can run it there directly (click the run button multiple times, or choose the menu “Cell > Run all”; ignore the error messages).

Finally, upload an image via the Upload button.

Even simpler, if you don’t want to bother with code, or “Run”, click the “Voila” button (circled in red) and you will only see the text and the upload button (as shown above).

That’s it! Artificial Intelligence (AI) made easy! Although… I shouldn’t forget to at least touch upon something the mainstream usually fails to mention: AI isn’t that intelligent at all. It’s actually pretty stupid and depends on (the intelligence of?) the person(s) who set up the system…. Anyway….

Of course, since I myself am interested in molecules, I want to use AI for different purposes, but that is something for another time.

Thanks for reading, hope you enjoyed the intro to creating your own AI app!

Oh, and Happy New Year!

Something rare nowadays – a publication

As life continues after my years of research in R&D, there are and will be fewer and fewer publications. I am therefore all the more excited and happy when I can contribute to some great scientific work.

Alf Claesson, the main author, and I have published a “Perspective” in the ACS journal Chemical Research in Toxicology, titled “Systematic Approach to Organizing Structural Alerts for Reactive Metabolite Formation from Potential Drugs”.

We believe it should be a good tool especially for medicinal chemists who design new compounds, but also for metabolic biologists who work with reactive metabolites. It is related to the software SpotRM+ by Awametox, which is to a certain extent the engine behind this paper.

Here is the full citation:

Systematic Approach to Organizing Structural Alerts for Reactive Metabolite Formation from Potential Drugs

Alf Claesson and Alexander Minidis
Chemical Research in Toxicology 2018 31 (6), 389-411

DOI: 10.1021/acs.chemrestox.8b00046

And the link:
https://pubs.acs.org/doi/10.1021/acs.chemrestox.8b00046

Science@home for everyone – the quick and simple(st) way

Do you want to contribute to research but don’t have the time/nerve/know-how for any kind of deeper involvement? Of course you want to 😀 !
And yes, it is possible! The answer is – distributed or volunteer computing!

This is not a new phenomenon; it has been around for quite a long time now. One of the better-known projects is most likely SETI@home, where you help analyze radio signals from space in the search for extra-terrestrial life.
Today, the field of distributed computing encompasses all kinds of research areas, including drug discovery. One of many summaries on this subject can be found on this blog by the OpenScientist and of course Wikipedia, on Volunteer Computing.

Thus, by allowing your computer to calculate on behalf of the research in question, you indirectly contribute to that project without lifting a finger. The only thing you need to do is install a program and register yourself as a user (for some you can even just run anonymously), with the tiny caveat that you also “contribute” electricity. But hey, it’s for science, right? In addition, some projects include a fancy-looking screen saver!
Don’t want to have your computer on all the time? Don’t want to be bothered while you are using your own machine? No problem, AFAIK nearly all clients offer several ways to restrict CPU/GPU usage or the times at which they may run.

Can’t decide what to contribute to? Want to contribute to multiple projects but not have multiple clients to install and keep track of? Then I can recommend the World Community Grid, which supports 7 different projects out of the box. And if I am not mistaken, with a wee bit of manual work, you can make this client run other projects as well. And if you prefer doing something like this while playing a video game, even that is possible, for example in EVE Online or FoldIt (these, though, require a bit more “work” in the form of active input/analysis by the user and thus go beyond the idea of “simplistic” distributed computing).

Myself, I am supporting the OpenZika project, due to some personal interest in this field. Come join me and many, many others!

Click here to get started!
(Note: this includes my referral ID – don’t worry, there is no money involved, it simply gives out “badges” for “recruitment”. Use the above World Community Grid link instead if you don’t like this idea).

Docking & virtual screening @home – preview

Way too many things have been happening lately, so I haven’t had as much time as I’d like to write new entries; one of those things is a new job starting within the next few days 😀 [That’s a valid excuse, isn’t it?]

Anyway, a somewhat more complex and especially CPU/GPU-heavy task is docking and receptor modelling. It depends, though, on what you want to do:

If you just want to dock the occasional molecule(s) and maybe make a nice picture, then you should be fine with a low-spec configuration as described in my post Part 1 of Drug Research @home. If you intend to do high-throughput virtual screening of tens or even hundreds of thousands of compounds, you need either a lot of patience (in the range of days to weeks) or a lot of money for a cluster [I am not going into the possibility of using cloud services (yet), though that would probably be an option as well].

The system I will describe is AutoDock, or rather Vina, the simplest and most “open-sourced” docking software, and I will combine it with other free tools for visualization and preparation.

As a time/computational reference: docking a single molecule with Vina on an average modern i7 system takes ca. 20-30 seconds. That’s OK for several hundred molecules at once. When I previously had access to a Xeon-based Linux cluster, I screened 80k compounds on 12 CPUs in 10 or so days…. (well, it was a queue system shared with other users, though the way the system was set up it was more or less constantly calculating).
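To put these numbers into perspective, here is a small back-of-the-envelope sketch. It assumes perfect scaling across workers and one docking per worker at a time, which a real setup will not quite reach; the function names are made up for this example.

```python
SECONDS_PER_DAY = 86_400


def estimate_days(n_compounds, seconds_per_molecule, n_workers=1):
    """Idealized wall-clock days for a screen (assumes perfect scaling)."""
    return n_compounds * seconds_per_molecule / n_workers / SECONDS_PER_DAY


def implied_seconds_per_molecule(n_compounds, days, n_workers):
    """Back-calculate the average time per molecule from a finished screen."""
    return days * SECONDS_PER_DAY * n_workers / n_compounds


# 80k compounds at 25 s each (middle of the 20-30 s range), single worker
print(round(estimate_days(80_000, 25), 1))                  # 23.1
# Same screen spread over 12 workers
print(round(estimate_days(80_000, 25, n_workers=12), 1))    # 1.9
# The cluster run: 80k compounds, 12 CPUs, about 10 days
print(round(implied_seconds_per_molecule(80_000, 10, 12)))  # 130
```

So a single desktop would need roughly three weeks for 80k compounds at the i7 rate, while the cluster run works out to about 130 seconds per molecule per CPU. The shared queue is one possible reason for that gap, and docking settings or ligand size may be others.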

Now, using Vina isn’t new and there are descriptions out there, but few, if any, deal with automation. Furthermore, you have to pick bits and pieces from different places and combine them, which isn’t as obvious as one might think if you aren’t an expert (well, at least I don’t consider myself one in this particular field).
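As a small taste of what such automation could look like, here is a minimal sketch that only builds one Vina command line per ligand file. It is not the workflow I will describe later; all file and directory names are hypothetical, and it assumes that receptor and ligands are already prepared as PDBQT files and that the search box is defined in a Vina config file.

```python
from pathlib import Path


def build_vina_commands(receptor, ligands, config, out_dir, vina="vina"):
    """Build one Vina command (as an argument list) per ligand file.

    Nothing is executed here; the lists can later be handed to subprocess.run.
    """
    commands = []
    for ligand in ligands:
        ligand = Path(ligand)
        # Hypothetical naming scheme: results/<ligand name>_out.pdbqt
        out_file = Path(out_dir) / f"{ligand.stem}_out.pdbqt"
        commands.append([
            vina,
            "--receptor", str(receptor),
            "--ligand", str(ligand),
            "--config", str(config),
            "--out", str(out_file),
        ])
    return commands


cmds = build_vina_commands(
    "receptor.pdbqt",
    ["ligands/a.pdbqt", "ligands/b.pdbqt"],
    "conf.txt",
    "results",
)
for cmd in cmds:
    print(" ".join(cmd))
```

To actually run the dockings, each list could be passed to subprocess.run(cmd, check=True), provided the vina executable is on your PATH and the output directory exists.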

Until soon!