Several years ago I had the pleasure of working with some doctors at a local trauma hospital on a research project.  As with many organizations that excel at their core mission, the quality of the medical care this hospital offered was truly unparalleled, but many of the supporting functions were in need of optimization.  For example, operating rooms were scheduled with sticky notes on a whiteboard, and the way they kept track of the progress of an operation was to go peer into the OR window. Knowing if the current operation was going to end soon was important in determining when to get the next patient prepped and when to call the doctor who would be performing the operation.  We developed video analytics that would tell you if the OR was in use, being cleaned, or ready, and whether the current operation was wrapping up.

Operating room scheduling is one small part of hospital logistics, which includes staffing at all levels, managing bed space (which is a harder problem than you would think), scheduling the use of medical equipment, ensuring patients are seen by the right experts in a timely manner, and on and on.  When you look at a hospital as a whole, you start to realize that there are things (rooms, beds, equipment, patients, doctors, nurses) that stand in certain relations to one another through time (Mr. Jones was admitted to room 418, got a CT scan, and then saw Dr. Smith). A common representation for this kind of data is a graph, in which nodes (circles) stand for things and edges (arrows) stand for relations between the things at the two ends of an arrow.  Graphs have a number of advantages in general (we’ll return to logistics in a moment), such as being easy for people to understand visually, having lots of theory and algorithms that let you answer really interesting questions about them, and flexibility in encoding lots of different kinds of information (e.g., a “temperature” edge joining a “Mr. Jones” node and a “101.2” node can encode that Mr. Jones’s temperature is currently 101.2 degrees).
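To make the encoding concrete, here is a minimal sketch in Python of the Mr. Jones example as a list of labeled edges. The relation names (`admitted_to`, `received`, `seen_by`) and the `build_graph` helper are my own illustrative choices, not part of any particular library, and the sketch leaves out timestamps for brevity.

```python
from collections import defaultdict

# Hypothetical (source, relation, target) triples; all names are illustrative.
edges = [
    ("Mr. Jones", "admitted_to", "Room 418"),
    ("Mr. Jones", "received", "CT scan"),
    ("Mr. Jones", "seen_by", "Dr. Smith"),
    ("Mr. Jones", "temperature", "101.2"),
]


def build_graph(triples):
    """Map each source node to its outgoing (relation, target) pairs."""
    graph = defaultdict(list)
    for source, relation, target in triples:
        graph[source].append((relation, target))
    return graph


graph = build_graph(edges)

# Everything we know about Mr. Jones is one dictionary lookup away.
for relation, target in graph["Mr. Jones"]:
    print(f"Mr. Jones --{relation}--> {target}")
```

A real system would likely attach a timestamp to each edge so that the order of events is preserved, but the basic shape of the data stays the same.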

So what can we do with logistics data once it’s encoded as a graph?  One thing is to compute centrality to figure out which nodes are most “important”.  Suppose you start at a node in a graph, randomly choose an edge to follow to a neighboring node, and repeat.  That’s called a random walk, and nodes that are visited on many random walks are important (according to one definition of importance).  If, for example, the hospital’s CT scanner has high centrality, then a scanner failure would block lots of patient flows. Maybe the hospital should buy another scanner, or at least ensure that it is well-maintained to prevent failures.
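The random walk idea can be sketched in a few lines of Python. The toy patient-flow graph and the `random_walk_visits` function below are assumptions made up for illustration, and counting visits this way is only one rough stand-in for centrality (established measures such as PageRank refine the same intuition).

```python
import random
from collections import Counter

# Hypothetical patient-flow graph: node -> nodes reachable by following one edge.
flows = {
    "ER": ["CT scanner", "Room 418"],
    "Room 418": ["CT scanner", "Discharge"],
    "OR": ["CT scanner", "Room 418"],
    "CT scanner": ["Room 418", "OR", "Discharge"],
    "Discharge": [],
}


def random_walk_visits(graph, walks_per_node=1000, max_steps=10, seed=0):
    """Estimate importance as each node's share of all random-walk visits."""
    rng = random.Random(seed)
    visits = Counter()
    for start in graph:
        for _ in range(walks_per_node):
            node = start
            for _ in range(max_steps):
                neighbors = graph[node]
                if not neighbors:  # dead end, so this walk stops here
                    break
                node = rng.choice(neighbors)  # follow a random outgoing edge
                visits[node] += 1
    total = sum(visits.values())
    return {node: count / total for node, count in visits.items()}


centrality = random_walk_visits(flows)
for node, share in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}: {share:.3f}")
```

Nodes that no edge points to (here, the ER) never get visited and so do not appear in the result, which is a reminder that this simple measure only captures how often flows pass through or end at a node.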

Another thing we can do is find common subgraphs, where the goal is to understand problems and improve treatment efficacy or hospital efficiency.  Maybe there is a pattern that involves patients with high fever being seen in the ER on the overnight shift and then coming back to the ER a few days later.  There are graph mining algorithms that will expose these kinds of common patterns, perhaps with a restriction that the patterns include a return visit to the hospital or some other outcome that is important. We can then dig into the underlying data to see what’s going on.  What comorbidities are there? Which hospital staff see these patients?
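Full frequent-subgraph mining is more involved than fits here, but a simplified version gives the flavor. The sketch below, in Python, treats each patient's history as a path of events and counts how many patients share each run of consecutive events, optionally keeping only patterns that include a given outcome. The event labels, patient IDs, and the `frequent_paths` function are all hypothetical.

```python
from collections import Counter

# Hypothetical per-patient event paths, listed in time order.
visits = {
    "patient_a": ["ER overnight", "high fever", "discharged", "ER return"],
    "patient_b": ["ER overnight", "high fever", "discharged", "ER return"],
    "patient_c": ["ER day", "CT scan", "admitted"],
    "patient_d": ["ER overnight", "high fever", "admitted"],
}


def frequent_paths(paths, length, min_support, must_include=None):
    """Count runs of `length` consecutive events shared by enough patients."""
    support = Counter()
    for events in paths.values():
        # A set ensures each patient counts at most once per pattern.
        windows = {
            tuple(events[i:i + length])
            for i in range(len(events) - length + 1)
        }
        support.update(windows)
    return {
        pattern: count
        for pattern, count in support.items()
        if count >= min_support
        and (must_include is None or must_include in pattern)
    }


# Keep only patterns that end up including a return visit to the ER.
patterns = frequent_paths(visits, length=3, min_support=2, must_include="ER return")
for pattern, count in patterns.items():
    print(" -> ".join(pattern), f"({count} patients)")
```

Once a pattern like this surfaces, the patient IDs behind it are the starting point for the follow-up questions about comorbidities and staffing.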

We’ve barely scratched the surface here. But I hope you get the idea that encoding the flow of things in a system through time as a graph makes it easy to think about problems and opens up a large toolbox of graph-theoretic, artificial intelligence, and machine learning algorithms to tackle those problems.