Full-stack Software Engineer
- Software Development
- Full-time
- Hyderabad, IN
2023-02-04 21:25:53 UTC
We live in the era of the complex biologic drug – a multipart symphony of a designed active element, delivered via a targeted carrier, produced in an automated high-throughput process painstakingly optimised across hundreds of parameters. While the last decade has seen these kinds of biologics – including gene therapies, personalised cancer drugs and RNA-interference-based constructs – finally start to be deployed in the real world, these first-generation products are the result of slow and steady development, with each one typically taking many years to get from concept to the clinic.
The rapid development of mRNA vaccines against COVID-19, with Moderna and BioNTech-Pfizer taking scarcely a few months from the day that they designed their sequences to get their shots in arms, shows the possibility of rapid acceleration in these development cycles as platforms with hot-swappable components reach maturity.
PopVax aims to be even faster than that. We use machine-learning-driven computational techniques, many of which got good enough for our purposes just in the last few months, to generate thousands of diverse variations of each component of our mRNA vaccines and therapeutics. Unlike many other leaders in ML-driven bio, however, we do not stop there. We understand that nothing can substitute for real-world test data, so we actually produce and test these novel constructs both in the lab and, once validated, in animals, with a shorter iteration cycle from in silico to in vitro to in vivo, and back again, than anyone else we know of.
Current laboratory software in biology is focused on keeping track of experimental protocols, not running a high-throughput de novo design and validation loop. To power our bio-computational flywheel, we need bespoke software – software that can seamlessly track the relationships between each of the different components of our drugs as we make thousands of tiny changes and a few dozen big ones with each iteration cycle. We then need to be able to perform data analysis across custom subsets of hundreds of thousands of parameters across our entire database of designs and experiments. Over time, we hope to be able to use the data we collect to run an automated, perpetual cycle of fine-tuning our machine learning models against ground-truth data.
We started by building an extensible interface on top of a relational database that can intuitively capture the provenance of our designs and variations, and bidirectionally link the data from our experiments to the evolutionary tree of each PopVax-designed component. We’re now building tools on top of this to facilitate complex querying, one-click computational simulations, and procedural experiment design. Once we have these core extensions built, we will tie this system into our machine learning pipeline, so that any of our scientists can make the GPUs go brr at the push of a button.
We’re looking for exceptionally competent and insatiably curious full-stack software engineers who want to help us in our mission to eliminate deadly diseases an order of magnitude faster than would otherwise be possible.
The ideal candidate need not have existing knowledge of biology, but needs to be willing to learn quickly, build fast, and act independently in an environment with a non-trivial amount of uncertainty. They must have built substantial full-stack web applications end-to-end on their own in the past.
If you come and work with us, your software will not be used directly by the general public, but we hope that in short order, its impact will accelerate work that saves or improves millions, and potentially billions, of lives. There aren’t many places that can offer you that kind of deep positive impact, not to mention the opportunity to work at the frontier of synthetic biology’s computational revolution.
Interested? Apply below.
If you’d like to get a feel for what the job might be like, here’s an OPTIONAL task you can complete to round out the application:
Your task is to build a web application that lets someone upload a protein (in the form of a .pdb file). Once uploaded, the app should display both the sequence and 3D structure of this uploaded protein. Further, the user should be able to annotate arbitrary subsequences within the protein sequence with unique labels in different colours. Each file upload should generate a unique URL which the user can copy and share. Anyone who visits this unique URL should see the protein sequence and structure along with any annotations made by previous users. You do not need to implement any form of authentication. You must host this web app online and link to it in your application.
There are lots of free options to host your app, and plenty of open-source libraries to help with each component – you may use whatever you like.
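If you have not worked with .pdb files before, here is a minimal, illustrative sketch of one way the sequence might be read out of an uploaded file. It is not a reference solution and covers only the sequence step of the task. It assumes the standard fixed-column ATOM record layout of the PDB format, reads only the first model, uses one alpha-carbon (CA) atom per residue to define the sequence, and maps anything outside the 20 standard amino acids to "X". The function name `sequences_from_pdb` and the `THREE_TO_ONE` table are hypothetical names chosen for this example.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequences_from_pdb(pdb_text: str) -> dict[str, str]:
    """Return a {chain_id: one-letter sequence} mapping for a PDB file.

    Assumption: residues are taken from ATOM records in file order, so
    residues missing from the structure are silently absent from the result.
    """
    chains: dict[str, list[str]] = {}
    seen: set[tuple[str, str]] = set()
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # only read the first model of a multi-model file
        if not line.startswith("ATOM") or len(line) < 26:
            continue  # skips HETATM, headers, and truncated lines
        if line[12:16].strip() != "CA":
            continue  # one alpha-carbon per residue
        chain_id = line[21]
        residue_key = (chain_id, line[22:27])  # residue number + insertion code
        if residue_key in seen:
            continue  # alternate location of a residue already counted
        seen.add(residue_key)
        residue_name = line[17:20].strip()
        chains.setdefault(chain_id, []).append(THREE_TO_ONE.get(residue_name, "X"))
    return {chain: "".join(residues) for chain, residues in chains.items()}
```

An open-source parsing library would likely handle more edge cases (modified residues, SEQRES records, mmCIF files), so treat this only as a way to see what the file format contains.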
Good luck!