Machine Learning

Medical Text Corpora

The world’s biomedical knowledge is huge, and it keeps growing exponentially.

That’s great for humanity, right? Yes, but there is a catch. Most of it is in text form, and millions of new documents appear every year. Each of us can read only a tiny portion of all that literature. What’s the point of having all that knowledge if we cannot synthesize it and leverage it?

The AI that Reads Science

The abundance of knowledge that we have today is unprecedented. Our human brains are sadly limited in making sense of all this; machines, however, are not. We are building an AI platform that starts out as a Science Assistant, helping you find the science you need. Over time she will learn, slowly but surely becoming a Scientist herself.

Word Embeddings

We use an unsupervised learning algorithm for obtaining vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
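To make "word-word co-occurrence statistics" concrete, here is a minimal sketch of the counting step only, not the training itself. It assumes a GloVe-style setup in which each pair of words inside a context window is weighted by the inverse of their distance; the function name, the window size, and the toy corpus are illustrative assumptions, not part of our pipeline.

```python
from collections import defaultdict


def cooccurrence_counts(sentences, window=2):
    """Count symmetric word-word co-occurrences within a context window.

    Each pair is weighted by 1 / distance, so adjacent words count more
    than words that are further apart (an assumption borrowed from
    GloVe-style counting).
    """
    counts = defaultdict(float)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            # Look back at up to `window` preceding tokens.
            for j in range(max(0, i - window), i):
                context = tokens[j]
                weight = 1.0 / (i - j)
                counts[(word, context)] += weight
                counts[(context, word)] += weight
    return counts


# Hypothetical two-sentence corpus, already tokenized.
corpus = [
    "aspirin reduces fever".split(),
    "ibuprofen reduces fever".split(),
]
counts = cooccurrence_counts(corpus, window=2)
```

An embedding model would then be fitted so that dot products of word vectors approximate the logarithm of these counts; that fitting step is omitted here.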

Topic Modelling

A word is worth a thousand vectors. We apply vector-based toolkits to medical corpora to test their potential for improving the accessibility of medical knowledge. The end objective is to draw previously unseen correlations and to create and test hypotheses astonishingly quickly.

Recurrent Neural Networks

We train an RNN-based model to mimic a complex set of morphological and syntactic transformations applied by a state-of-the-art rule-based system. The model generalizes better than the rule-based system on concepts not present during training.
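For readers unfamiliar with recurrent networks, the sketch below shows the core recurrence of a plain (Elman-style) RNN: the same weights are applied at every time step, and each hidden state depends on the current input and the previous hidden state. This illustrates the general mechanism only; the weights, dimensions, and function names are made up for the example and do not describe our trained model.

```python
import math


def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: h = tanh(W_xh @ x + W_hh @ h_prev + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_xh[i][k] * x[k] for k in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


def run_rnn(inputs, W_xh, W_hh, b):
    """Apply the same step to each input in order, carrying the state."""
    h = [0.0] * len(b)  # initial hidden state
    states = []
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh, b)
        states.append(h)
    return states


# Toy, hand-picked weights (illustrative only): 2 inputs, 2 hidden units.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [0.0, 0.2]]
b = [0.0, 0.0]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
states = run_rnn(sequence, W_xh, W_hh, b)
```

In a real system the weights are learned from examples, here the input-output pairs produced by the rule-based system, rather than set by hand.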

Recursive Neural Networks

Similar to how recurrent neural networks are deep in time, recursive neural networks are deep in structure, because of the repeated application of recursive connections.  
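The idea of being "deep in structure" can be sketched as follows: one composition function is applied repeatedly, bottom-up, over a parse tree, so the depth of the computation follows the depth of the tree. This is a minimal illustration under assumed conventions (binary trees written as nested tuples, 2-dimensional toy vectors, hand-picked weights); none of these names or values come from our system.

```python
import math


def compose(left, right, W, b):
    """Combine two child vectors: parent = tanh(W @ [left; right] + b)."""
    children = left + right  # list concatenation
    return [
        math.tanh(b[i] + sum(W[i][k] * children[k] for k in range(len(children))))
        for i in range(len(b))
    ]


def encode(tree, leaf_vectors, W, b):
    """Recursively encode a binary tree given as nested tuples of words."""
    if isinstance(tree, str):
        return leaf_vectors[tree]  # leaf: look up the word vector
    left, right = tree
    return compose(
        encode(left, leaf_vectors, W, b),
        encode(right, leaf_vectors, W, b),
        W,
        b,
    )


# Toy leaf vectors and weights (illustrative only).
leaf_vectors = {
    "chronic": [0.1, 0.4],
    "kidney": [0.7, -0.2],
    "disease": [-0.3, 0.5],
}
W = [[0.5, 0.1, -0.2, 0.3], [0.0, 0.4, 0.6, -0.1]]
b = [0.0, 0.0]

# ((chronic kidney) disease): the same W is reused at every internal node.
tree = (("chronic", "kidney"), "disease")
phrase_vector = encode(tree, leaf_vectors, W, b)
```

Reusing the same weights at every node is what mirrors the weight sharing across time steps in a recurrent network.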

Helping you to find the Science you need!

We think that if we could only read and understand all of the scientific data humans have created, not to mention connecting the dots in that data, we’d have solutions to a number of pressing problems already!