Ebola: The next big frontier for protease inhibitor therapies?

While I was on vacation this summer, the news was full of stories about the Ebola Virus outbreak in Africa and the health workers who had contracted the virus through working with the infected population there. Then, on the heels of all of this, comes a very timely paper, "High Content Image-Based Screening of a Protease Inhibitor Library Reveals Compounds Broadly Active against Rift Valley Fever Virus and Other Highly Pathogenic RNA Viruses", in the journal PLOS Neglected Tropical Diseases.

While the primary pathogen tested in the article is the Rift Valley Fever Virus, the library of protease inhibitors was also screened for efficacy against Ebola Virus and a range of other related RNA viruses, and shown to have activity against those pathogens as well.

Given the incredible successes we have seen with the use of protease inhibitors in other virally induced diseases like HIV/AIDS, it is tempting to wonder whether there might be a similarly promising new medical frontier for protease inhibitors in the treatment of these extremely dangerous viral hemorrhagic fevers.

Interestingly, most of the compound screening for these kinds of antiviral therapies in the last couple of years has been focused upon signaling molecules like kinases, phosphatases and G-Protein Coupled Receptors (GPCRs). The use of protease inhibitors as antiviral compounds therefore represents something of a departure from the mainstream in this research field. The authors of the current study, however, felt that the success of protease inhibitors in the treatment of other diseases was grounds to merit a study of their efficacy against RNA viruses.

When I think about all of the people who are still alive today thanks to the use of protease inhibitors to control their HIV/AIDS, the early signs, described in this new research article, of a similarly efficacious class of compounds for treating hemorrhagic fevers definitely give one hope for a future in which there are successful therapies to treat deadly (and really scary) diseases like Ebola.

© The Digital Biologist | All Rights Reserved

An excellent R&D team is more like a jazz ensemble than an orchestra

As digital biologists, the world of research and development is, for many of us, our professional arena. Here is an article I recently published on LinkedIn in which I talk about the “round hole” that managing an R&D program presents to the “square peg” that is the traditional management model typically applied to this problem, particularly in some of the larger R&D organizations.

© The Digital Biologist | All Rights Reserved

The limitations of deterministic modeling in biology

Over the three centuries that have elapsed since its invention, calculus has become the lingua franca for describing dynamic systems mathematically. Across a vast array of applications, from the behavior of a suspension bridge under load to the orbit of a communications satellite, calculus has often been the intellectual foundation upon which our science and technology have advanced. It was only natural, therefore, that researchers in the relatively new field of biology would eagerly embrace an approach that has yielded such transformative advances in other fields. The application of calculus to the deterministic modeling of biological systems, however, can be problematic for a number of different reasons.

The distinction typically drawn between biology and fields such as chemistry and physics is that biology is the study of living systems whereas the objects of interest to the chemist and the physicist are “dead” matter. This distinction is far from a clear one to say the least, especially when you consider that much of modern biological research is very much concerned with these “dead” objects from which living systems are constructed. In this light then, the term “molecular biology” could almost be considered an oxymoron.

From the perspective of the modern biologist, what demands to be understood are the emergent properties of the myriad interactions of all this “dead” matter, at that higher level of abstraction that can truly be called “biology”. Under the hood as it were, molecules diffuse, collide and react in a dance whose choreography can be described by the rules of physics and chemistry – but the living systems that are composed of these molecules, adapt, reproduce and respond to their ever changing environments in staggeringly subtle, complex and orchestrated ways, many of which we have barely even begun to understand.

The dilemma for the biologist is that the kind of deterministic models applied to such great effect in other fields are often a very poor description of the biological system being studied, particularly when it is “biological” insights that are being sought. On the other hand, after three centuries of application across almost every conceivable domain of dynamic analysis, calculus is a tried and tested tool whose analytical power is hard to ignore.

One of the challenges confronting biologists in their attempts to model the dynamic behaviors of biological systems is having too many moving parts to describe. This problem arises from the combinatorial explosion of species and states that confounds the mathematical biologist’s attempts to use differential equations to model anything but the simplest of cell signaling or metabolic networks. The two equally unappealing options available under these circumstances are to build a very descriptive model of one tiny corner of the system being studied, or to build a very low resolution model of the larger system, replete with simplifying assumptions, approximations and even wholesale omissions. I have previously characterized this situation as an Uncertainty Principle for Traditional Mathematical Approaches to Biological Modeling in which you are able to have scope or resolution, but not both at the same time.
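
To get a feel for the scale of this combinatorial explosion, consider a purely hypothetical back-of-the-envelope example (the numbers below are illustrative, not drawn from any particular pathway): a single protein with n independent binary modification sites can occupy 2^n distinct states, and a deterministic model needs a rate equation for every distinct species it tracks.

```python
# A purely illustrative back-of-the-envelope calculation of the combinatorial
# explosion that confronts ODE-based models of cell signaling. A protein with
# n independent binary modification sites (e.g. phosphorylation) can occupy
# 2**n distinct states; if each site can also be bound by one of k adaptor
# proteins (or be unbound), the count becomes (2 * (k + 1))**n. A deterministic
# model needs one rate equation per distinct species it tracks.

def species_count(n_sites: int, n_adaptors_per_site: int = 0) -> int:
    return (2 * (n_adaptors_per_site + 1)) ** n_sites

if __name__ == "__main__":
    for n in (2, 4, 8, 12):
        print(f"{n:>2} modification sites, no adaptors: {species_count(n):>10,} species")
    # Allowing a single adaptor to bind at each site squares the state space:
    print(f" 8 modification sites, 1 adaptor  : {species_count(8, 1):>10,} species")
```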

There is a potentially even greater danger in this scenario, one that poses a threat to the scientific integrity of the modeling study itself: decisions about what to simplify, aggregate or omit from the model in order to make its assembly and execution tractable require a set of a priori hypotheses about what are, and what are not, the important features of the system. In other words, it requires a decision about which subset of the model's components will determine its overall behavior before the model has ever been run.

Interestingly, there is also a potential upside to this rather glum scenario. The behavior of a model with such omissions and/or simplifications may fail even to approximate our observations in the laboratory. This could be evidence that the features of the model we omitted or simplified might actually be far more important to its behavior than we initially suspected. Of course, they might not be, and the disconnect between the model and the observations may be entirely due to other flaws in the model, in its underlying assumptions, or even in the observations themselves. The point worth noting here, however, and one that many researchers with minimal exposure to modeling often fail to appreciate, is that even incomplete or “wrong” models can be very useful and illuminating.

The role of stochasticity and noise in the function of biological systems is an area that is only just starting to be widely recognized and explored, and it is an aspect of biology that is not captured by the kind of modeling approaches, based upon the bulk properties of the system, that we are discussing here. A certain degree of noise is a characteristic of almost any complex system, and there is a tendency to think of it only in terms of its nuisance value – like the kind of unwanted noise that reduces the fidelity of an audio reproduction or a cellphone conversation. There is, however, evidence that biological noise might even be useful to organisms under certain circumstances – even something that can be exploited to confer an evolutionary advantage.

The low copy number of certain genes, for example, will yield noisy expression patterns in which the fluctuations in the rates of gene expression are significant relative to the overall gene expression “signal”. Certain microorganisms under conditions of biological stress can exploit the stochasticity inherent in the expression of stress-related genes with low copy numbers to essentially subdivide into smaller micro-populations differentiated by their stress responses. This is a form of hedging strategy that spreads and thereby mitigates risk, not unlike the kind of strategy used in the world of finance to reduce the risk of major losses in an investment portfolio.
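
To make this concrete, here is a minimal sketch (in Python) of a Gillespie-style stochastic simulation of a simple birth–death gene expression model. The rate constants are arbitrary illustrative values of my own choosing, not taken from any particular study; the point is simply that at low copy numbers, the fluctuations around the mean are large relative to the mean itself – something a deterministic, bulk-property model would never show.

```python
import math
import random

# A minimal Gillespie (stochastic simulation algorithm) sketch of a simple
# birth-death gene expression model: protein is produced at rate k_prod and
# each molecule is degraded at rate k_deg * count. All rate constants and the
# time span are arbitrary illustrative values.

def gillespie_birth_death(k_prod=1.0, k_deg=0.1, t_end=2000.0, seed=42):
    random.seed(seed)
    t, count = 0.0, 0
    trajectory = [(t, count)]
    while t < t_end:
        prod_propensity = k_prod           # zeroth-order production
        deg_propensity = k_deg * count     # first-order degradation
        total = prod_propensity + deg_propensity
        t += random.expovariate(total)     # exponentially distributed waiting time
        if random.random() * total < prod_propensity:
            count += 1                     # a protein molecule is produced
        else:
            count -= 1                     # a protein molecule is degraded
        trajectory.append((t, count))
    return trajectory

if __name__ == "__main__":
    traj = gillespie_birth_death()
    # Discard the initial transient; a crude, event-weighted average is
    # sufficient for illustration.
    counts = [c for _, c in traj[len(traj) // 2:]]
    mean = sum(counts) / len(counts)
    std = math.sqrt(sum((c - mean) ** 2 for c in counts) / len(counts))
    # A deterministic ODE model would simply report the steady state k_prod / k_deg;
    # the stochastic trajectory fluctuates substantially around it at low copy number.
    print(f"mean copy number ~ {mean:.1f}, standard deviation ~ {std:.1f}")
```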

In spite of all these shortcomings, I certainly don’t want to leave anybody with the impression that deterministic modeling is a poor approach, or that it has no place in computational biology. In the right context, it is an extremely powerful and useful approach to understanding the dynamic behaviors of systems. It is, however, important to recognize that many of the emergent properties (adaptation, for example) of the physico-chemical systems that organisms fundamentally are – properties that sit much further towards the biology end of the conceptual spectrum running from physics through chemistry to biology – are often ill-suited to analysis using the deterministic modeling approach. Given the nature of this intellectual roadblock, it may well be time for computational biologists to consider looking beyond differential equations as their modeling tool of choice, and to develop new approaches to biological modeling that are better suited to the task at hand.

© The Digital Biologist | All Rights Reserved

Creation of a bacterial cell controlled by a chemically synthesized genome

A new research article by Gibson et al. that appeared in Science this month describes, if not quite a wholly synthetic organism, then at least a big step towards one.

Here is the authors’ own abstract.

We report the design, synthesis, and assembly of the 1.08-mega-base pair Mycoplasma mycoides JCVI-syn1.0 genome starting from digitized genome sequence information and its transplantation into a M. capricolum recipient cell to create new M. mycoides cells that are controlled only by the synthetic chromosome. The only DNA in the cells is the designed synthetic DNA sequence, including “watermark” sequences and other designed gene deletions and polymorphisms, and mutations acquired during the building process. The new cells have expected phenotypic properties and are capable of continuous self-replication.

Inspires me to read some Aldous Huxley again :-)

© The Digital Biologist | All Rights Reserved

Not yet as popular as Grumpy Cat but …

… in the first quarter of 2014, “The Digital Biologist” was read in 70 countries around the world. So while it’s very unlikely that the high octane, rollercoaster world of computational biology will ever have the pulling power of internet memes like a comically sour-faced kitty or the lyrical stylings of Justin Bieber, what we lack in numbers we certainly make up for in geographical diversity. Can Grumpy Cat or Justin Bieber claim a following in Liechtenstein for example? Neither can we actually, but we can make that claim for Sierra Leone (if you can really call a single reader a “following” – thank you mysterious Sierra Leone Ranger, whoever you are :-).

Anyway, wherever you are visiting from, thank you all so much for your readership and don’t forget that you can also join the conversation via the LinkedIn Digital Biology Group and at our FaceBook page.

 © The Digital Biologist | All Rights Reserved 

Ten Simple Rules for Effective Computational Research

I have written at some length about what I feel is necessary to make computational modeling really practical and useful in the life sciences. In the article “Biologists Flirt With Models”, for example, which appeared in Drug Discovery World in 2009, and in the light-hearted video that I made for the Google Sci Foo Conference, I have argued for computational models that can be encoded in the kind of language that biologists themselves use to describe their systems of interest, and which deliver their results in a similarly intuitive fashion. It is clear that the great majority of biologists are interested in asking biological questions rather than solving theoretical problems in the field of computer science.

Similarly, it is important that these models can translate data (of which we typically have an abundance) into real knowledge (for which we are almost invariably starving). If Big Data is to live up to its big hype, it will need to deliver “Big Knowledge”, preferably in the form of actionable insights that can be tested in the laboratory. Beyond their ability to translate data into knowledge, models are also excellent vehicles for the collaborative exchange and communication of scientific ideas.

With this in mind, it is really gratifying to see researchers in the field of computational biology reaching out to the mainstream life science research community in an effort to address these kinds of issues, as in this article “Ten Simple Rules for Effective Computational Research” that appeared in a recent issue of PLOS Computational Biology. The 10 simple rules presented in the article touch upon many of the issues that we have discussed here, although they are, for the most part, much easier said than done. Rule 3 in the article, for example – “Make Your Code Understandable to Others (and Yourself)” – is something of a doozy that may ultimately require biologists to abandon the traditional mathematical approaches borrowed from other fields and create their own computational languages for describing living systems.

To be fair to the authors of the article however, recognizing that there is a problem is an invaluable first step in dealing with it, even if you don’t yet have a ready solution – and for that I salute them.

Postscript: Very much on topic, this article about the challenges facing the “Big Data” approach subsequently appeared in Wired on April 11th.

© The Digital Biologist | All Rights Reserved 

Biomarker Design: Lessons from Bayes’ Theorem

In the last article I posted on “The Digital Biologist”, I gave a very brief and simple introduction to Bayes’ Theorem, using cancer biomarkers as an example of one of the many ways in which the theorem can be applied to the evaluation of data and evidence in life science R&D. The power of the Bayesian approach was, I hope, evident in the analysis of the CA-125 biomarker for ovarian cancer that we considered, and I felt that it would be worthwhile in this follow-up to round out our discussion by looking in a little more detail at the practical, actionable insights that can be gained by the application of Bayesian analysis to the design of biomarkers. It is all too often that those of us in the field of computational biology are accused of generating models, simulations and algorithms that, while pretty or cool, are of little or no practical help to real world research problems. The sting of this accusation comes at least in part from the fact that all too often, this is actually true :-(

Bayesian analysis, by contrast, can be a really useful and practical computational tool in life science R&D, as I hope this brief discussion of its application to biomarker design will show. There are some valuable lessons for biomarker design that can be drawn from the kind of Bayesian analysis that we described in the first part of this discussion, when we considered its application to the use of CA-125 to diagnose ovarian cancer.

Let’s suppose that we are a company determined to develop a more reliable biomarker than CA-125 for the early detection of ovarian cancer. One direction we might pursue is to identify a biomarker that predicts disease in actual sufferers with a higher frequency, i.e. a biomarker with a better true positive hit rate. We saw in the previous article that CA-125 only predicts the disease in about 50% of sufferers for stage I ovarian cancer and about 80% of sufferers for stage II and beyond. One of the dilemmas faced by physicians working in the oncology field is that biomarkers like CA-125 can be poorly predictive of the disease in the early stages when the prognosis and options for treatment are better. It’s disheartening for both the patient and the physician to be able to get a reliable diagnosis only when the disease has already progressed to the point at which there are fewer good options for treatment.

I have previously used an analogy from the behavioral sciences to describe this situation: “broken glass and blood on the streets are the ‘markers’ of a riot already in progress but what you really need for successful intervention are the early signs of unrest in the crowd before any real damage is done”.

So imagine that our hypothetical biomarker company has made a heavy R&D investment in identifying a biomarker with a better rate of true positives. If our new biomarker has a true positive rate of 95% (a fairly significant improvement on our previous value of about 80%) and the same roughly 4% false positive test rate as previously, how much better off are we?

If we plug the numbers into our Bayesian model, the answer is “not much”.

The chances of a patient actually having ovarian cancer given a positive test result with our new biomarker are still less than 1 in 4. In fact, even if we were to identify a biomarker with a 99% true positive rate, we could still only declare a roughly 1 in 4 chance of disease given a positive test result.

What if instead of pursuing a better true positive hit rate, our company had invested in reducing the false positive test rate?

Without altering the true positive rate of about 80%, reducing the biomarker’s false positive rate from about 4% to 2%, increases the chance of the patient actually having the disease given a positive test result, to better than 1 in 3. If our hypothetical company can get the false positive rate down to 1%, there is actually a better than even chance of a positively-testing patient actually having the disease. Getting the false positive test rate down to 0.1% (approximately 40 times lower than the actual false positive rate for CA-125) means that the patient is very likely to have the disease given a positive test result, with a less than 1 in 10 chance of receiving a false positive diagnosis.
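
For readers who would like to reproduce these comparisons, here is a minimal sketch in Python of the simple Bayesian model being used here. The prevalence of 0.0134 (the roughly 1 in 72 lifetime risk quoted from the American Cancer Society) and the test rates are the approximate figures discussed in these articles; the function itself is just my own small re-statement of Bayes’ Theorem.

```python
# A minimal sketch of the Bayesian model used in this discussion. The
# prevalence (0.0134, i.e. roughly 1 in 72) and the test rates are the
# approximate figures quoted in these articles; everything else is my own
# illustrative scaffolding.

def posterior(true_positive_rate, false_positive_rate, prevalence=0.0134):
    """Probability of disease given a positive test result (Bayes' Theorem)."""
    p_positive = (true_positive_rate * prevalence
                  + false_positive_rate * (1.0 - prevalence))
    return true_positive_rate * prevalence / p_positive

if __name__ == "__main__":
    scenarios = [
        ("CA-125-like baseline:  80% TPR, 4% FPR  ", 0.80, 0.04),
        ("Better sensitivity:    95% TPR, 4% FPR  ", 0.95, 0.04),
        ("Better sensitivity:    99% TPR, 4% FPR  ", 0.99, 0.04),
        ("Fewer false positives: 80% TPR, 2% FPR  ", 0.80, 0.02),
        ("Fewer false positives: 80% TPR, 1% FPR  ", 0.80, 0.01),
        ("Fewer false positives: 80% TPR, 0.1% FPR", 0.80, 0.001),
    ]
    for label, tpr, fpr in scenarios:
        print(f"{label}: p(disease | positive) = {posterior(tpr, fpr):.2f}")
```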

The Bayesian model clearly tells us in the case of ovarian cancer, that our hypothetical company is much better off investing its R&D dollars in the pursuit of lower false positive test rates rather than higher true positive test rates. Even a 99% true positive test rate barely shifts the probabilities associated with a positive test result, whereas getting the false positive test rate down to 1% improves the probability of a true diagnosis from less than 1 in 4, to better than even. Even this scenario however, is far from ideal.

If you look at the actual numbers in the model with regard to the populations of tested patients with and without the disease, there is another valuable lesson to be learned, and it is one that illuminates the reason why improving the true positive test rate while ignoring the false positive test rate is what my countrymen would refer to as “a hiding to nothing”.

It is the overwhelmingly larger population of healthy patients versus those with the disease that is skewing the probability numbers against us, and the lower the incidence of the disease, the worse this problem will be.

If ovarian cancer had a higher incidence of, say, 1 in 10 women instead of 1 in 72 as is actually the case, a positive test result with CA-125 would correspond to an almost 70% probability of the patient actually having the disease. By contrast, if the ovarian cancer incidence were 1 in 1000 women, a positive test result with CA-125 would still correspond to less than 1 chance in 50 of the patient actually having the disease.

The lower the incidence of the disease you want to diagnose, the correspondingly lower your false positive test rate needs to be.
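
One way to see this quantitatively is to rearrange Bayes’ Theorem and ask what false positive rate a test can afford for a given disease prevalence and a desired level of confidence in a positive result. The short sketch below is my own rearrangement, offered purely as an illustration:

```python
# Rearranging Bayes' Theorem (my own rearrangement, for illustration only):
#
#   target <= s * pi / (s * pi + f * (1 - pi))
#     =>   f <= s * pi * (1 - target) / (target * (1 - pi))
#
# i.e. the largest false positive rate f that still gives
# p(disease | positive) >= target, for sensitivity s and prevalence pi.

def max_false_positive_rate(sensitivity, prevalence, target_posterior):
    return (sensitivity * prevalence * (1.0 - target_posterior)
            / (target_posterior * (1.0 - prevalence)))

if __name__ == "__main__":
    # An 80%-sensitive test that should confer at least even odds of disease:
    for prevalence in (1 / 10, 1 / 72, 1 / 1000):
        f = max_false_positive_rate(0.80, prevalence, 0.50)
        print(f"prevalence 1 in {round(1 / prevalence):>4}: "
              f"false positive rate must be <= {f:.4f}")
```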

Imagine, for example, the exigencies that a rare cancer like adrenocortical carcinoma, which affects only 1 or 2 people in a million, imposes on the development of any kind of diagnostic biomarker for that disease. In some rare diseases that have a genetic origin (such as Type II Glycogen Storage Disease, for example), there do exist definitive genetic tests for the disease that are essentially unequivocal insofar as they have a false positive rate that is effectively zero.

The Bayesian model presented here is an extremely simple but excellent example of the way in which models can provide intellectual frameworks with which data can be organized and reasoned about. It is this author’s opinion that the pharmaceutical and biotechnology industries could actually benefit enormously from a shift in their current emphasis on data, with more attention being paid to the kind of models that have the potential to explain these data, to synthesize useful knowledge from them, and to drive effective decision making based upon the underlying science.

 © The Digital Biologist | All Rights Reserved

A theorem for all seasons

As life scientists, it is seldom that we ever get to deal with anything resembling certainty. The systems that we work with are typically nonlinear and chaotic, heterogeneous, non-binary – in a word, messy. In the world of commercial life science, it is common that the real value of an R&D investment of hundreds of millions or even billions of dollars may ultimately ride on such a razor-thin edge between success and failure that it requires the calculation of something like a t-test to determine whether you really have a marketable product or just another placebo – or in the case of a diagnostic, a real indicator versus background noise.

The development of a new drug or diagnostic is in essence, a process of gathering evidence either for or against your working hypothesis that the use of your product will confer some net benefit over not using it. In such a case, the null hypothesis – that your product will confer no net benefit at all – is (and always should be) a core consideration in your approach.

With each new piece of data that we accumulate along our hopeful path to that blockbuster product, we are weighing the evidence for and against eventual success or failure. A big part of this process for a commercial life science company is the decision, based upon the current evidence, of whether or not it is worthwhile to continue the investment of time, money and resources in the product, or to pull the plug on it. All too often, and for all sorts of reasons, it can be painful and difficult for a company to admit that a product is a dead-end and walk away from its investment. Killing projects in a timely fashion is a particularly acute problem in the case of drug development, given the exponentially increasing cost of R&D as the product progresses through the successive phases of development.

This process of weighing the evidence was mathematically formalized during the 18th century by Thomas Bayes, an English Presbyterian minister who was fascinated with statistics and probability. As a small but amusing aside, it is ironic to reflect that there is some uncertainty about whether the only extant portrait of Thomas Bayes shown above actually depicts the right person! But whether or not the portrait is a true likeness of the great man, the now famous theorem to which he gave his name stands as a landmark in the history of probability theory. An understanding of the implications of Bayes’ Theorem, and of its application to the myriad problems of truth, belief and likelihood that our uncertain world challenges us with daily, is something that every scientist (biologist or not) can put to good use in his or her own work.

So why is Bayes’ Theorem so useful and what does it have to teach us as life scientists?

By way of a very simple and brief introduction to Bayes’ Theorem, let’s take a look at the development of biomarkers – an area of life science research that is directly concerned with issues of prediction and likelihood.

Let’s imagine that we are looking for a reliable early indicator for a disease which affects about 1.5% of the population and that our research has uncovered a biomarker whose presence is predictive of the disease in about 80% of sufferers. Sounds pretty good right? Most people would probably look at those numbers and conclude that a positive test for the biomarker was associated with something like an 80% chance of having the disease. Not too shabby.

Not so fast.

One very important question that remains unanswered is “How many people who do not have the disease would still get a positive test result with this biomarker?” A biomarker that produced a positive result (indicative of the presence of the disease) in 80% of all patients, with or without the disease, would obviously have no predictive power at all for signaling the presence of the disease. As with the inherent uncertainty that pervades most things in life, biomarkers are seldom if ever 100% reliable, but let’s say for the purposes of our story that the biomarker in question produces false positives in about 4% of patients without the disease (i.e. the test result indicates disease where none is actually present). Things seem to be looking up. Armed with these numbers, we might feel that this biomarker has a bright future in the clinic based upon the following reasoning – it will only fail to detect the disease in about 2 out of 10 sufferers and it will only produce a misdiagnosis of the disease in about 4 out of 100 healthy patients.

So now we’re in good shape right?

Once again – not so fast.

Let’s think about this biomarker’s performance from the perspective of a hypothetical population of 10,000 patients. Based upon the 1.5% incidence of this disease in the population, we would expect our population to have about 150 patients with the disease and therefore, about 9,850 without it. Of those 150 patients with the disease, we would expect about 120 to test positive for the biomarker based upon an 80% positive test rate amongst people with the disease. Amongst the 9,850 patients who do not have the disease, we would expect about 394 to test positive based upon a 4% false positive test rate for the biomarker.

Now put yourself in the position of one of those patients who just got a positive test result. The first question you’re going to ask is “What is the probability that I have the disease given that I tested positive for it?”

This is really the key question. What does the test result actually mean?

To answer that question, let’s look at the overall probability of getting a positive test result under any circumstances. We expect 120 patients with the disease to test positive and 394 without the disease to test positive. So out of a total of 514 positive tests, we expect 120 patients who test positive to actually have the disease, corresponding to a probability of about 23%. In other words, the answer to the question of the patient who had the positive test result is that they have only about 1 chance in 4 of actually having the disease, based upon the positive test result. Put another (and perhaps more optimistic) way, despite the positive test result, there are still about 3 chances in 4 that they do not have the disease. Or put in yet another way – despite the positive test result, the patient is still 3 times more likely not to have the disease than to have it.

In the light of this new analysis of the biomarker’s performance, would you still conclude that this biomarker is a useful clinical diagnostic for this disease? If you were a physician for example, would you schedule a potentially risky or expensive surgical procedure based upon the 1 in 4 chance of the disease indicated by the positive test result? Would you alternatively, recommend doing nothing at all despite the positive test result?

You might be really surprised to learn that this “hypothetical” disease biomarker example is based upon the real numbers for the CA-125 biomarker that is actually used as a diagnostic indicator for ovarian cancer. A wealth of statistics has been published both for ovarian cancer incidence and for the use of CA-125 as a diagnostic marker. All that remained for me to do was to plug these numbers into a Bayesian model.

According to the American Cancer Society, the lifetime risk of a woman developing ovarian cancer is 1 in 72 (0.0134). In a recent study involving more than 78,000 women, the use of CA-125 as a single indicator yielded 3,285 false positive results (~4%) in which healthy women were diagnosed as having ovarian cancer. Of these incorrectly diagnosed women, 1,080 actually underwent an unnecessary surgical biopsy procedure, of whom about 150 suffered severe complications as a result. As if this bad news were not already enough, the diagnostic use of CA-125 for ovarian cancer was shown to be of little use even in women who have already had ovarian cancer. In a study that examined the benefit of using elevated CA-125 levels as an early marker for ovarian cancer relapse, there was shown to be no survival benefit for women who were started early on chemotherapy based upon the CA-125 test results, versus those who waited until they exhibited the clinical symptoms of relapse.

It is worth noting, by the way, that the diagnostic probabilities obtained from the admittedly rather crude Bayesian model presented above do nonetheless correlate rather well with the actual statistics obtained for true and false positive tests from studies of women who were tested with CA-125 for ovarian cancer.

It should be clear from this example that it is important to weigh the evidence for the efficacy of diagnostic markers carefully. Failing to do so has the potential to add a great deal of unnecessary complication and expense to health care treatments. CA-125 by itself is a poor indicator of ovarian cancer, and medical decisions based upon the sole use of such an indicator can end up subjecting patients to unnecessary pain and suffering. Consider also the time and money that was wasted on the unnecessary medical treatment of the 1,080 women with false positive test results in the study cited above, let alone the costs incurred managing the severe surgical complications suffered by 150 of these women as a result of this unnecessary treatment. To be fair however, the shortcomings of CA-125 as a diagnostic marker are now well recognized, and the current standard of practice recommends the use of CA-125 with other indicators such as sonograms and pelvic exams, all of whose combined results are more reliable than the diagnostic use of CA-125 alone.

This CA-125 story also highlights the urgent need for more reliable biomarkers that can be used in the early detection of diseases like ovarian cancer. The statistical probabilities that I used were actually taken from the “best case” scenario (in terms of predictive accuracy) for the use of CA-125 to diagnose ovarian cancer. I used the values observed for women with stage II or later disease, in which CA-125 levels are typically more elevated but unfortunately the disease is harder to treat. Had I used instead the values for women with earlier, stage I disease, where the treatment options and prognosis are better, the true positive rate for diagnosis drops from around 80% to around 50%, and the probability of actually having the disease given a positive test result drops to about 1 in 7.

Incidentally, if you think I was exaggerating about the naivety of people’s interpretation of biomarker statistics – where, for example, a test that detected 80% of cases for diseased patients was equated in people’s minds with an 80% probability of having the disease if the test is positive – well unfortunately I was not. In repeated studies, it has been consistently shown that even the majority of physicians, whose job it is to interpret these kinds of statistical results for their patients, struggle with their interpretation, generally ascribing more confidence to their conclusions from them than is actually due.

The intuitive ‘algorithm’ that we used above to determine the probability of an event (the patient has a disease), given some prior evidence (the patient tested positive for the disease), can be captured more formally in an equation. The formal description of Bayes’ Theorem is typically presented as an equation of the form:

p(A | B) = p(B | A) * p(A) / p(B)

In the equation above, the syntax p(B|A) denotes the conditional probability of outcome B given outcome A. If we plug in the same numbers that we used in our intuitive approach in order to re-calculate the probability of having the disease given a positive test result, they look like this:

p(disease | positive) = p(positive | disease) * p(disease) / p(positive)

Note that p(positive) is the total probability for all of the circumstances under which a positive test could occur – in our case, it is the sum of the probabilities for getting a positive test with and without the disease.

p(positive) = p(positive | disease) * p(disease) + p(positive | no disease) * p(no disease)

p(positive) = 0.8 * 0.015 + 0.04 * 0.985 = 0.0514

Therefore: p(disease | positive) = 0.8 * 0.015 / 0.0514 = 0.233, which corresponds to the 23% probability we arrived at using our intuitive approach.
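
For anyone who prefers executable code to algebra, here is a minimal, self-contained sketch in Python of the same calculation, using the illustrative figures from this article (a 1.5% prevalence, an 80% true positive rate and a 4% false positive rate):

```python
# A minimal, self-contained sketch of the Bayes' Theorem calculation above.
# The prevalence, true positive and false positive rates are the illustrative
# figures used in this article, not precise clinical values.

p_disease = 0.015               # prior: ~1.5% of the population has the disease
p_pos_given_disease = 0.80      # true positive rate of the biomarker
p_pos_given_healthy = 0.04      # false positive rate of the biomarker

# Total probability of a positive test, with or without the disease.
p_positive = (p_pos_given_disease * p_disease
              + p_pos_given_healthy * (1 - p_disease))

# Bayes' Theorem: probability of disease given a positive test.
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(f"p(positive)           = {p_positive:.4f}")          # ~0.0514
print(f"p(disease | positive) = {p_disease_given_pos:.3f}")  # ~0.233
```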

As life scientists, the weighing of evidence is always an important component of our work. I hope that the example above makes it clear that, in the case of the biomedical sciences at least, weighing the evidence naively can have the potential to be extremely costly and even life-threatening. In the life sciences, Bayes’ Theorem has been successfully applied to a vast array of biological areas as diverse as bioinformatics and computational biology, next-generation sequencing, biological network analysis, and disease evolution and epidemiology, to name but a very few examples.

The fundamentals of Bayes’ Theorem are extremely easy to grasp, especially when dealing with the point probabilities and binary outcomes that were discussed here, but the applications of Bayes’ Theorem are vast, not only in the life sciences but in any sphere of activity in which our beliefs and decisions are shaped by weighing the evidence.

  © The Digital Biologist | All Rights Reserved