Meeting Announcement: The Future of Careers in Scholarship


If you’re interested in the future of research and scholarship and, like many in the field, you subscribe to the consensus view that the current system is broken, you won’t want to miss the Ronin Institute’s “The Future of Careers in Scholarship”, being held in Cambridge, MA on November 5th. The unconference format of the meeting will even allow you and other attendees to shape the agenda, so come prepared to be an active participant rather than just a spectator. The meeting will be hosted at The Democracy Center in Harvard Square, and you will have the chance to meet and network with an eclectic and forward-thinking group of people from a range of different research areas.

If you are really interested in the future of research and scholarship, and would like to get involved in the movement to advance beyond our current broken system and build new models for doing research, take a look at the Ronin Institute website. The Ronin Institute is devoted to facilitating and promoting scholarly research outside the confines of traditional academic research institutions.

© The Digital Biologist


Python For The Life Sciences: Update

After about a year of work, and barring a few last-minute tweaks to the layout, it looks like we finally have our book Python For The Life Sciences ready to publish. UPDATE 10/6/2016: THE BOOK IS NOW PUBLISHED AND AVAILABLE AT LEANPUB

I think it’s fair to say that we probably got a little carried away with the project, and the book has ended up somewhat larger than we thought it would be, coming in at over 300 pages. For sure, it’s not quite the concise, slim, quick introduction to biocomputing with Python that I think we had envisaged in the beginning.

The book does however cover an incredible range of life science research topics from biochemistry and gene sequencing, to molecular mechanics and agent-based models of complex systems. We hope that there’s something in it for anybody who’s a life scientist with little or no computer programming experience, but who would love to learn to code.

For the latest news on the book, including all the free revisions and updates that are included with any purchase of the book, sign up for the (zero spam) Python For The Life Sciences Mailing List

© The Digital Biologist

Python For Handling 96-Well Plate Data and Automation

There’s hardly a life science lab you can walk into these days without seeing a ton of 96-well plates and instruments that read and handle them. That’s why we’ve dedicated an entire chapter of our forthcoming book, Python For The Life Sciences, to the humble 96-well plate.

The chapter introduces the use of Python for handling laboratory assay plates of many different sizes and configurations. It shows the reader how to read plate assay data from files formatted as comma-separated values (CSV), how to implement basic row and column computations, how to plot multi-well plates with the wells color-coded by their properties, and even how to implement the high level code necessary for driving instruments and robots through devices like Arduinos.
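To give a flavor of what reading plate data and doing basic row and column computations can look like, here is a minimal sketch. It is not taken from the book. It assumes a hypothetical file layout of 8 CSV lines with 12 numeric readings each (real instrument exports vary a lot), and the names read_plate, row_means and column_means are made up for illustration.

```python
import csv
import io
from statistics import mean

# Assumed layout: a standard 96-well plate, rows A-H and columns 1-12.
ROWS = "ABCDEFGH"
N_COLS = 12


def read_plate(handle):
    """Read 8 CSV lines of 12 readings into a dict keyed by well name ("A1" ... "H12")."""
    plate = {}
    for row_label, record in zip(ROWS, csv.reader(handle)):
        for col, value in enumerate(record[:N_COLS], start=1):
            plate[f"{row_label}{col}"] = float(value)
    return plate


def row_means(plate):
    """Mean reading for each plate row (A-H)."""
    return {r: mean(plate[f"{r}{c}"] for c in range(1, N_COLS + 1)) for r in ROWS}


def column_means(plate):
    """Mean reading for each plate column (1-12)."""
    return {c: mean(plate[f"{r}{c}"] for r in ROWS) for c in range(1, N_COLS + 1)}


# Made-up demo data: each reading is (row index) + (column number / 100).
demo_csv = "\n".join(
    ",".join(f"{i + c / 100:.2f}" for c in range(1, N_COLS + 1))
    for i in range(len(ROWS))
)

# io.StringIO stands in for an open file so the sketch is self-contained.
plate = read_plate(io.StringIO(demo_csv))
print(plate["A1"], plate["H12"])
print(row_means(plate)["A"], column_means(plate)[1])
```

Keying the wells by name keeps lookups readable (plate["C7"]), and the same dictionary could feed a plotting routine that colors each well by its value.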

And this is just one of about 20 chapters designed to introduce the life scientist who wants to learn how to code, to the wonderful and versatile Python programming language.

Almost all of the code and examples in the book are biology-based. In addition to teaching the Python programming language, the book aims to inspire the life scientist reader to bring the power of computation to his or her research by demonstrating the application of Python using real-world examples from across a wide range of biological research disciplines.

The book includes code and examples covering next-generation sequencing, molecular modeling, biomarkers, systems biology, chemical kinetics, population dynamics, evolution and much more.

Python For The Life Sciences should be available as an eBook this fall (2016), so if you’re a life scientist interested in bringing a computational skill set to your research and your career, visit the book’s web page and sign up to our (no spam) mailing list for updates about the book’s progress and publication.

© The Digital Biologist

A Cartoon History Of The Theranos Scandal

Here at The Digital Biologist, we have reported at some length on the Theranos scandal, since it serves as an example of the way that biological data can be used (and misused) in the development of new healthcare technologies.

Now the KQED Science web site has published a wonderful cartoon history of the rise and fall of Theranos, along with links to the major articles that described this history as it was unfolding. It’s a kind of everything-you-ever-wanted-to-know-about-the-Theranos-scandal-but-were-afraid-to-ask, all presented on a single page. Enjoy 🙂

© The Digital Biologist

The Future of Research is now a thing

To be fair, it always was a ‘thing’, but now that The Future of Research has made it official by becoming a fully-fledged non-profit organization, it really is a ‘Thing’ (with a big T). This is no small feat for a group that started out as a loosely-knit assembly of grassroots activists who share a common interest in improving things for academic researchers. In only three years, they have built both national and international recognition for their movement. They were named 2015 People of the Year by Science Careers, and they landed a 2-year, $300,000 grant from the Open Philanthropy Project to support their work assisting junior scientists in grassroots efforts to change science policy.

So congratulations to the Future of Research on becoming a Thing with a big T. Anyone who cares about the future of academic research, and about the working conditions and job prospects of those who pursue careers in it, should consider joining the very important conversation that the Future of Research has started.

As history has taught us over and over, systems that are broken, dysfunctional and unfair rarely if ever transform or dismantle themselves from the top down, but rather from the bottom up. Those who benefit the most from such systems are also usually their gatekeepers, and they will generally strive (consciously or unconsciously) to preserve them, since they potentially have the most to lose from changing them. This is precisely why grassroots organizations like The Future of Research are so important as instruments of social change.

© The Digital Biologist

Big Data Does Not Equal Big Knowledge


… the life science Big Data scene is largely Big Hype. This is not because the data itself is not valuable, but rather because its real value is almost invariably buried under mountains of well-meaning but fruitless data analytics and data visualization. The fancy data dashboards that big pharmaceutical companies spend big bucks on for handling their big data are, for the most part, little more than eye candy whose colorful renderings convey an illusion of progress without the reality of it.

Read the full article on LinkedIn.

© The Digital Biologist

Python For The Life Sciences: Table of contents now available

Our book Python For The Life Sciences is now nearing publication – we anticipate a publication date sometime in the early summer of 2016. Quite a number of people have asked us to release a table of contents for the book, so without further ado, here is the first draft of the table of contents.

If you would like to receive updates about the book, please sign up for our book mailing list.

Python at the bench:
In which we introduce some Python fundamentals and show you how to ditch those calculators and spreadsheets and let Python relieve the drudgery of basic lab calculations (freeing up more valuable time to drink coffee and play Minecraft)

Building biological sequences:
In which we introduce basic Python string and character handling and demonstrate Python’s innate awesomeness for handling nucleic acid and protein sequences.

Of biomarkers and Bayes:
In which we discuss Bayes’ Theorem and implement it in Python, illustrating in the process why even your doctor might not always estimate your risk of cancer correctly.

Reading, parsing and handling biological sequence data files:
Did we already mention how great Python is for handling biological sequence data? In this chapter we expand our discussion to sequence file formats like FASTA.

Regular expressions for genomics:
In which we show how to search even the largest of biological sequences quickly and efficiently using Python Regular Expressions – and in the process, blow the lid off the myth that Python has to be slow because it is an interpreted language.

Biological sequences as Python objects:
Just when you thought you had heard the last about sequences, we explore the foundational concept of Object Oriented Programming in Python, and demonstrate a more advanced and robust approach to handling biological sequences using Python objects.

Slicing and dicing genomic data:
In which we demonstrate how easy it is to use Python to create a simple next-generation sequencing pipeline – and how it can be used to extract data from many kinds of genomic sources, up to and including whole genomes.

Managing plate assay data:
In which we use Python to manage data from that trusty workhorse of biological assays, the 96-well plate.

Python for structural biology and molecular modeling:
In which we demonstrate Python’s ability to implement three-dimensional mathematics and linear algebra for molecular mechanics. It’s nano but it’s still biology folks!

Modeling biochemical kinetics:
In which we use Python to recreate what happens in the biochemist’s beaker (minus the nasty smells) – as well as using Python to model the cooperative binding effects of allosteric proteins.

Systems biology data mining:
In which we demonstrate how to parse and interrogate network data using Python sets, and in the process, tame the complex network “hairball”.

Modeling cellular systems:
In which we introduce the Gillespie algorithm to model biological noise and switches in cells, and use Python to implement it and visualize the results – along with some pretty pictures to delight the eye.

Modeling development with cellular automata:
In which we use the power of cellular automata to grow some dandy leopard skin pants using Turing’s model of morphogenesis with Python 2D graphics. Note to our readers: no leopards were harmed in the writing of this chapter.

Modeling development with artificial life:
In which we introduce Lindenmayer systems to grow virtual plants and use Python’s implementation of Turtle LOGO. Don’t worry, these plants will not invade your garden (but they might take over your computer).

Predator-prey dynamics in ecology:
In which we let loose chickens and foxes into an ecosystem and let ‘em duke it out in a state-space that is visualized using Python’s animation features.

Modeling virus population dynamics with agent-based simulation:
In which we create a virtual zombie apocalypse with agents that have internal state and behaviors. These are definitely smarter-than-usual zombies, and they illustrate a style of modeling in which Python’s object-oriented programming really shines.

Modeling evolution:
In which we use the Wright-Fisher model to demonstrate natural selection in action, and show how being the “fittest” doesn’t always mean that you will “win”. Think Homer Simpson winning a game of musical chairs.
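
As a small taste of the kind of thing the “Of biomarkers and Bayes” chapter listed above is about, here is a minimal sketch of Bayes’ Theorem applied to a diagnostic test. It is not code from the book. The function name posterior and the prevalence, sensitivity and specificity figures are invented purely for illustration.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(disease | positive test) using Bayes' Theorem.

    prior       -- P(disease), e.g. the prevalence in the tested population
    sensitivity -- P(positive | disease)
    specificity -- P(negative | no disease)
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Illustrative numbers only: 1% prevalence, 90% sensitivity, 91% specificity.
p = posterior(prior=0.01, sensitivity=0.90, specificity=0.91)
print(f"Probability of disease given a positive test: {p:.1%}")
```

With these made-up numbers the answer comes out at roughly 9%, far lower than the 90% that intuition tends to suggest, because false positives from the large healthy population swamp the true positives.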

© The Digital Biologist

Theranos: A unicorn with real potential or a horse in a costume?

Theranos, the troubled healthcare startup (it feels faintly ridiculous to use that term for a company valued at $9 billion), is now at something of a crossroads with the regulatory agencies upon whose approval its entire business model ultimately depends. Following months of apparent obfuscation and stonewalling about how (and how well) its disruptive blood testing technology works, a scathing Wall Street Journal article exposed a degree of potential fraud and deceit surrounding this much-lauded technology that nobody outside Theranos could have imagined (especially since most of the lauding was coming from the company’s own PR machine).

Despite the clamor for Theranos to release for peer review some of the findings and data it has generated in the course of developing its blood-testing technology, which it claims can replace the use of hypodermic needles with a simple finger prick, the technology is still largely a black box to outsiders. Investors and industry observers want to know if the technology’s potential squares with the company’s stratospheric valuation but, more importantly, regulatory agencies, healthcare providers and their patients need to know if the medical diagnostic tests that use this technology actually work and are accurate and reliable.

There’s a great deal more than money at stake here.

This week’s regulatory call for Theranos could literally make or break the company, depending upon which way it goes. Failure to be in regulatory compliance could bring with it a host of new problems for the struggling company, including crippling fines and the inability to operate until such time as it can demonstrate that it has addressed the problems raised by the regulatory agencies. Most damaging of all, however, this would be yet another huge blow to its already strained credibility with investors and healthcare provider partners, some of whom have already withdrawn or suspended their relationships with Theranos over doubts about the efficacy and accuracy of its blood tests. Beyond the problems that this could create for Theranos itself, many industry observers fear the effects that it could have on the entire sector if a contagion of doubt and panic were to grip investors and financiers, potentially stemming the flow of biotechnology venture funding and capital.

© The Digital Biologist

A sample chapter from our forthcoming book “Python For The Life Sciences”

Together with my business partner Alex Lancaster, I am very excited to announce the early release of a sample chapter from our forthcoming book Python For The Life Sciences. This book is written primarily for life scientists with little or no experience writing computer code, who would like to develop enough programming knowledge to be able to create software and algorithms that they can use to advance or accelerate their own research. These are probably scientists who currently use spreadsheets and calculators to handle their data, but who have promised themselves that at some point, when the opportunity arises, they will learn to write code. If this pretty well describes your situation, then your wait is over and opportunity is knocking. This could very well be just the book you have been waiting for!

In short, this is the book that we would like to have read when we were learning computational biology.

The aim of this book

The aim of this book is to teach the working biologist enough Python that he or she can get started using this incredibly versatile programming language in their own research, whether in academia or in industry. It also aims to furnish a Python foundation upon which the biologist can build by extrapolating from the broad set of Python fundamentals that the book provides.

What this book is not

This book is not another comprehensive guide to the Python programming language, nor is it intended to be a Python language reference. There are already plenty of those out there, and easily accessible online. For this reason, you will find that there are many (many) aspects and areas of the Python language that are not covered. In a similar vein, this book is not intended to be a life science primer for programmers and computer scientists.

A tour of computational biology beyond bioinformatics

This book is all about using computational tools to jumpstart your biological imagination. We will show the reader the range of quantitative biology questions, drawn from across the life sciences, that can be addressed using just one language. The examples are deliberately eclectic, ranging from bioinformatics, structural biology and systems biology to the modeling of cellular dynamics, ecology, evolution and artificial life.

Like a good tour, these biological examples were deliberately chosen to be simple enough not to impede the reader’s ability to assimilate the Python coding principles being presented – but at the same time each scientific problem illustrates a simple, yet powerful principle or idea. By covering a wide variety of examples from different parts of biology, we also hope that the reader can identify common features between different kinds of models and data and encounter unfamiliar, yet useful ideas and approaches. We provide pointers and references to other code, software, books and papers where they can explore each area in greater depth.

We believe that exploring biological data and biological systems should be fun! We want to take you from the nuts-and-bolts of writing Python code to the cutting edge as quickly as possible, so that you can get up and running on your own creative scientific projects.

The sample chapter shows how to use Python to mine and understand data from transcription factor networks and you can get it here.
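
For readers who want a quick feel for the general idea before downloading, here is a minimal sketch of interrogating a regulatory network with Python sets. It is not taken from the sample chapter. The transcription factor and gene names are hypothetical, and a real network would be parsed from a data file rather than typed in.

```python
# Hypothetical edges: (transcription factor, target it regulates).
edges = [
    ("tfA", "gene1"),
    ("tfA", "gene2"),
    ("tfB", "gene2"),
    ("tfB", "gene3"),
    ("tfC", "tfA"),  # a transcription factor can itself be a target
]

# Map each transcription factor to the set of targets it regulates.
targets = {}
for tf, target in edges:
    targets.setdefault(tf, set()).add(target)

# Set intersection: targets co-regulated by both tfA and tfB.
shared = targets["tfA"] & targets["tfB"]

# Set difference: regulators that nothing else in the network regulates.
regulators = set(targets)
all_targets = set().union(*targets.values())
master = regulators - all_targets

print("Co-regulated by tfA and tfB:", shared)
print("Top-level regulators:", sorted(master))
```

Set operations such as intersection and difference answer questions like these in a single line, which is a large part of their appeal for taming network data.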

© The Digital Biologist