No big deal? Dual use of artificial-intelligence-powered drug discovery

A few months ago a paper was published discussing the abuse potential of modern AI-based drug-discovery models and how they could be used to deliberately design toxic compounds. The article itself can be found here: Dual use of artificial-intelligence-powered drug discovery | Nature (unfortunately behind a paywall).

(image created by Craiyon using the prompt “artificial intelligence creating drugs”)

I didn’t attend the conference from which this paper evidently originated, so I don’t know how the presentation or discussion that led to this article went. It would have been interesting to participate, though, because it’s rather surprising to me that the abuse potential should “all of a sudden” be so surprising.

“By inverting the use of our machine learning models, we had transformed our innocuous generative model from a helpful tool of medicine to a generator of likely deadly molecules.” (Ekins et al., Nature article)

Shortly after said publication, Lowe wrote a commentary in Science: Deliberately Optimizing for Harm | Science (fully accessible). He agrees that there is abuse potential in “inverting the model”, i.e., instead of finding “good”, treatment-oriented drugs, one uses it to find “bad”, toxic compounds, but he isn’t as worried (at this stage), and I concur – though perhaps for somewhat different reasons?

“…..I’m not surprised at all that computational methods can point the way to such things, although I can see how the authors of this work found demonstrating this to be an unsettling experience. To be honest, I’m not all that worried about new nerve agents, ………That is, I’m not sure that anyone needs to deploy a new compound in order to wreak havoc – they can save themselves a lot of trouble by just making Sarin or VX, God help us…..” (Lowe, Science)

Honestly, I think the whole publication is extremely exaggerated and not such a big deal, almost to the degree where I wonder if it was written more to create hype than awareness. Is it perhaps even an unnecessary publication? Perhaps not. But I digress –

Why isn’t this such a big deal? Well, simply because the tools that allow you to come up with toxic compounds, if so desired, have been out there for years – without the use of any advanced modelling techniques.

I myself made similar observations already in 2017, in a short blog about the abuse potential of open access: Abuse of open access tools and data. I wasn’t even the first to make these observations (I cite a Guardian article from 2013…), and I am certain this has been discussed and published elsewhere. And it’s perhaps what Lowe refers to with “just go and wreak havoc with a known compound”.

Let’s have a somewhat different look at this topic:

To be able to use these modern AI tools and actually turn them into anything, good or bad, you need a bit more knowledge than just downloading a model from GitHub. But should you have some chemical/pharmaceutical know-how, you are probably better off using open-access resources such as PubChem and looking for a target with known safety issues. If we use pharmaceutical discovery as a reference, there is a whole battery of known “no-no targets” that an evil person or organization could pick from. Any potent compound against such a target, especially one that hits several, is a potential poison – no need for modelling! More often than not, such a multi-target compound would even be “better” (well, worse, in this case). Finally, even if there are public models for a given target, someone with evil intent won’t consider them worth the effort. It might take a bit more effort if you have a specific delivery method in mind – in someone’s drink? A bullet from a home-brewed umbrella gun? Etc., etc.

The only exception I can see (perhaps naïvely?) would be in the field of sports and performance-enhancing drugs (i.e., doping), since there is a lot of money and advanced knowledge involved, in a constant race with the authorities. There you would want a “clean” drug, one that is not (easily) detectable, and then you are back to “classical” drug discovery and the time and resources it still takes.

If we want to go so far as to claim that governmental intelligence agencies are interested in toxins – well, yes, they too would have the means and resources, and could use AI systems.

What about illegal recreational drugs? Wouldn’t that be the same? I dare say no: producers usually don’t care about drug quality and would most likely prefer quick and dirty. A scraping of PubChem would probably do sufficiently. It seems unlikely that selectivity or potency would matter to them – as long as a certain desired effect is achievable cheaply.

Here, if any AI modelling were to be an issue, it would be the synthesis-planning tools: how to make a compound effectively in as few steps as possible (i.e., cheaply), or how to replace a banned starting material without much expert chemistry knowledge. Having said that, if they target a previously published compound, then the synthesis is already out there, most likely in a sufficiently “quick and dirty” form. For better or worse, with Sci-Hub you can access most relevant synthesis publications. You might not even need that – a lot is available in freely accessible supplementary material or patents, as an expert in the art would know. Whether downloading a paper is illegal or not would be the least of their worries in this context. To take another quote from Lowe regarding constraints by ethics or law: “….history demonstrates that anyone truly interested in using such things will care nothing for these constraints.”

To close off this topic, when it comes to the sole purpose of ill intent: those with such intent and sufficient resources have most likely already used their know-how for decades, even without AI. The (not so) far future, though – now that is admittedly a different story yet to be told.

AI made easy – an image classifier example for everyone

Artificial Intelligence (AI) is all in vogue right now. For better or worse, it is here to stay. So why not have a look at it as part of modern data science? A simple image classifier could do the trick!

It is almost too easy for anyone these days to work with AI or Machine Learning. Tools are aplenty, be it the graphical, workflow-based KNIME or one of the more common scripting languages such as Python. Combine that with popular libraries such as scikit-learn, PyTorch, etc., using only a few lines of code, and you are done. Making a good AI model, though, even with all the available tools – that’s another story for another time…
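To illustrate that “a few lines of code” is not an exaggeration, here is a minimal scikit-learn sketch (my own example, not the blog’s notebook; it uses the small handwritten-digits dataset that ships with scikit-learn rather than cat photos):

```python
# A "few lines" image classifier: train a simple linear model on the
# 8x8 handwritten-digit images bundled with scikit-learn.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

digits = load_digits()  # 1797 tiny grayscale images, 10 classes
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=2000)  # a plain linear classifier suffices here
clf.fit(X_train, y_train)

print(f"test accuracy: {accuracy_score(y_test, clf.predict(X_test)):.2f}")
```

That really is the whole pipeline – load data, split, fit, evaluate; whether the resulting model is any *good* is, as said, a different question.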

Moving on. What’s that you’re asking? You can’t/don’t want to get into advanced “stuff”? AI sounds complicated? Too much programming and statistics or whatnot? Forget all that. Not necessary. May I suggest this online course/book by Jeremy Howard and Rachel Thomas of the University of San Francisco (no connections or perks exist between us, I simply like their approach). Do start at their blog: https://www.fast.ai/ and choose “Practical Deep Learning for Coders”. It introduces you to all the prerequisites in an easy and simple manner, even including tips on free cloud services if you don’t have the required hardware. The video sessions go through the book as Python (Jupyter) notebooks and introduce you to some basic programming at the same time. All with the attitude that you don’t need a Ph.D. to do AI. (Although, while that is true, a certain level of education or “human intelligence” is necessary to make useful and “safe” models – otherwise you end up with scandals or abuse of models. Check out e.g. Thomas’s course on tech ethics: https://ethics.fast.ai/ .)

Drawing on this course, I present here a very simple AI for image recognition – specifically, one that distinguishes (more or less well) between Bengal cats, “other cats”, and “cartoon cats”, because, why not. And since I have a Bengal myself… To test this, you won’t even need to install anything; simply use this MyBinder link:

Binder

This is a rather neat way to share code with others who can’t (or don’t want to) code, without having to jump through whatever hoops to get it shared. One can even include a simplistic GUI using something called ‘Voila’. It does have some drawbacks, but for the purposes of, e.g., this blog, it is perfect.

(Regarding model creation: I won’t go into the actual training here, but you will find a separate notebook in the Git repository for this model. It uses a simple 18-layer model, ResNet-18.

For deeper explanations, you are probably better off following the pros’ description of how to do this – I followed these two notebooks from the FastAI book: https://github.com/fastai/fastbook/blob/master/05_pet_breeds.ipynb and https://github.com/fastai/fastbook/blob/master/06_multicat.ipynb .)
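For orientation, the core of such a training notebook boils down to just a few fastai (v2) calls. This is only a sketch following the fastbook chapters linked above, not my actual notebook; the folder layout (one subfolder per class under ./cats) and the export filename are assumptions:

```python
# Sketch of a fastai training notebook for the cat classifier.
# Assumes a ./cats folder with one subfolder per class,
# e.g. bengal/, other_cat/, cartoon_cat/ (hypothetical names).
from fastai.vision.all import *

path = Path('cats')
dls = ImageDataLoaders.from_folder(
    path, valid_pct=0.2, seed=42,   # hold out 20% of images for validation
    item_tfms=Resize(224))          # resize everything to ResNet input size

learn = cnn_learner(dls, resnet18, metrics=error_rate)  # the 18-layer ResNet
learn.fine_tune(4)                  # transfer learning: a few epochs suffice

learn.export('export.pkl')          # the saved model later loaded by the app
```

Everything heavy (pretrained weights, training loop, augmentation defaults) is hidden behind those calls – which is exactly the fast.ai “no Ph.D. required” point.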

My Bengal Classifier

Anyway, the final code and output look simply like this, where the actual “AI” is, strictly speaking, only one line (in cell 3, learn_inf.predict(img)); everything else is preparation and output. Well, that, and the model that is loaded in cell 2 (load_learner(….)) – the model created in the separate notebook mentioned above.
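Stripped of the notebook scaffolding, those cells amount to something like the following sketch (fastai v2 API; the model file 'export.pkl' and the image filename are illustrative assumptions):

```python
# Sketch of the inference cells: load the exported model, predict on one image.
from fastai.vision.all import *

learn_inf = load_learner('export.pkl')   # "cell 2": load the trained model

img = PILImage.create('my_cat.jpg')      # hypothetical uploaded image
pred, pred_idx, probs = learn_inf.predict(img)  # "cell 3": the one AI line

print(f"Prediction: {pred}; probability: {probs[pred_idx]:.3f}")
```

predict() returns the class label, its index, and the per-class probabilities, which is all the little GUI needs to display its answer.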

You can get an even simpler view if you use something called Voila (available in the referenced notebook):

You can find all of this on GitHub for testing yourself. Using MyBinder.org, though, you don’t need any local installation or know-how: simply click on the icon and wait for the (rather long) creation of the virtual image of this app (but hey, it’s free!). Or click directly on the Binder link here, without the hassle of going through GitHub:

Binder

You will see something like this in your browser (click the “show/hide” text under “Build logs” to expand it and watch the (slow) startup status):

Once the notebook is open, you can run it directly there (click the Run button multiple times, or choose the menu “Cell > Run All”; ignore the error messages).

Then upload an image via the Upload button.

Even simpler: if you don’t want to bother with code or “Run”, click the “Voila” button (circled in red) and you will see only the text and the upload button (as shown above).

That’s it! Artificial Intelligence (AI) made easy! Although… we shouldn’t forget to at least touch on something the mainstream usually neglects to mention: AI isn’t that intelligent at all. It’s actually pretty stupid, and depends on (the intelligence of?) the person(s) who set up the system…. Anyway….

Of course, since I myself am interested in molecules, I want to use AI for different purposes, but that is something for another time.

Thanks for reading, hope you enjoyed the intro to creating your own AI app!

Oh, and Happy New Year!