The world’s biomedical knowledge is vast, and it keeps growing exponentially.
That’s great for humanity, right? Yes, but there is a catch: most of it is locked in text form, with millions of new documents published every year. The problem is that each of us can read only a tiny fraction of all that literature. What’s the point of having all that knowledge if we cannot synthesize and leverage it?
The abundance of knowledge we have today is unprecedented. Our human brains are sadly limited in making sense of all this — machines, however, are not. We are building an AI platform that starts out as a Science Assistant, helping you find the science you need. Over time it will learn, slowly but surely becoming a Scientist itself.
Attention is a powerful and efficient replacement for recurrent networks as a way of modeling dependencies in sequences. We are working with pretrained language models to achieve state-of-the-art results on a wide range of NLP tasks. Big changes are underway in the world of NLP.
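To make the idea concrete, here is a minimal NumPy sketch of scaled dot-product attention — the core operation that lets a model relate any two positions in a sequence directly, without stepping through a recurrence. The toy 3-token sequence and its random 4-dimensional representations are illustrative assumptions, not anything from our pipeline.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight each value by how well its key matches the query."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # attention-weighted values

# Toy sequence: 3 tokens, each a 4-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)         # self-attention
print(out.shape)  # (3, 4)
```

Every output position is a weighted mixture of all input positions, computed in one matrix product — which is why attention parallelizes so much better than a recurrent network.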
We use an unsupervised learning algorithm to obtain vector representations for words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations showcase interesting linear substructures of the word vector space.
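As a rough sketch of the idea — not our actual algorithm — the following builds a word-word co-occurrence matrix from a made-up three-sentence corpus and factors it with SVD to get dense word vectors. Words that appear in similar contexts end up with similar vectors; a real pipeline would of course use millions of documents and a dedicated toolkit.

```python
import numpy as np

# Hypothetical miniature corpus; a real run would use a full medical corpus.
corpus = [
    "aspirin treats pain",
    "ibuprofen treats pain",
    "aspirin treats fever",
]
vocab = sorted({w for line in corpus for w in line.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Aggregate global word-word co-occurrence counts (window = whole sentence).
X = np.zeros((len(vocab), len(vocab)))
for line in corpus:
    words = line.split()
    for a in words:
        for b in words:
            if a != b:
                X[idx[a], idx[b]] += 1

# Factor the log-smoothed co-occurrence matrix into dense word vectors.
U, S, _ = np.linalg.svd(np.log1p(X))
vectors = U[:, :2] * S[:2]   # 2-dimensional embeddings for illustration

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# "aspirin" and "ibuprofen" share contexts, so their vectors are comparable.
print(cosine(vectors[idx["aspirin"]], vectors[idx["ibuprofen"]]))
```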
A word is worth a thousand vectors. We apply vector-based toolkits to medical corpora to test their potential for improving the accessibility of medical knowledge. The end objective is to draw previously unseen correlations and to create and test hypotheses astonishingly quickly.
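One way such correlations surface is through vector arithmetic. The sketch below runs a classic analogy query over a handful of hand-made embeddings — the words and vectors are invented for illustration; in practice they would come from a toolkit such as word2vec or GloVe trained on a medical corpus.

```python
import numpy as np

# Hypothetical embeddings, hand-made for this example only.
vectors = {
    "insulin":     np.array([0.9, 0.1, 0.3]),
    "diabetes":    np.array([0.8, 0.2, 0.4]),
    "statin":      np.array([0.1, 0.9, 0.3]),
    "cholesterol": np.array([0.2, 0.8, 0.4]),
}

def nearest(query, exclude=()):
    """Return the vocabulary word whose vector is closest to `query`."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    candidates = [w for w in vectors if w not in exclude]
    return max(candidates, key=lambda w: cos(vectors[w], query))

# Analogy: insulin is to diabetes as statin is to ...?
q = vectors["diabetes"] - vectors["insulin"] + vectors["statin"]
print(nearest(q, exclude={"diabetes", "insulin", "statin"}))  # cholesterol
```

The same offset-arithmetic trick, run over real medical embeddings, is what lets these toolkits suggest drug–condition relationships that no single paper states outright.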
We train an RNN-based model to mimic a complex set of morphological and syntactic transformations applied by a state-of-the-art rule-based system, and it generalizes better than the rule-based system on concepts not seen during training.
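The training setup can be sketched as follows: the rule-based system labels a corpus, and those (input, output) pairs become the supervision for the sequence-to-sequence RNN. The toy pluralizer below is a made-up stand-in for the real rule system, which is far more complex; the RNN training itself is omitted.

```python
def rule_based_pluralize(noun: str) -> str:
    """Toy stand-in for a complex rule-based morphological system."""
    if noun.endswith("is"):                      # diagnosis -> diagnoses
        return noun[:-2] + "es"
    if noun.endswith(("s", "x", "ch", "sh")):    # virus -> viruses
        return noun + "es"
    return noun + "s"

# The rule system labels the corpus; these pairs are the RNN's training data.
nouns = ["diagnosis", "virus", "protein", "batch"]
training_pairs = [(n, rule_based_pluralize(n)) for n in nouns]
for src, tgt in training_pairs:
    print(f"{src} -> {tgt}")
```

Because the neural model learns a smooth function over character sequences rather than a fixed rule table, it can extrapolate to words the rules never anticipated.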
We think that if we could only read and understand all of the scientific data humans have created, not to mention connect the dots in that data, we would already have solutions to a number of pressing problems!