We are a group of cutting-edge scientists and engineers trying to solve very difficult problems at the interface of data science, machine learning, cloud-scale analytics, and medicinal chemistry. We have developed a computational platform that can predict how a potential drug will behave in the lab and the body. We use this platform to process large spaces of chemistry in a search for therapies for some of the world’s most important diseases, such as obesity, heart failure, and Alzheimer’s.
At the base of our stack is a proprietary distributed computational platform, which allows us to process many millions of virtual molecules at scale. We typically host this platform on Amazon Web Services infrastructure, often parallelizing our computations across more than 20,000 cores. At the next layer of our stack are custom services and libraries for machine learning, cheminformatics, and bioinformatics. At the very top of our stack are the analysis tools for dealing with these large spaces of data, including web applications, *NIX command-line tools, and custom plugins for third-party data analysis and chemistry applications. All of the layers use various storage types (MySQL, S3, Cassandra, custom-built). The code base is written in Java, JavaScript, Python, and some C++. All of these layers are being actively developed and improved.
Our software is used by our in-house scientists and by our industry and government partners. The demands on our software are very high, so we employ current best practices for development, such as continuous integration, unit and performance testing, Agile/Scrum methodologies, and code reviews.