Rather than taking a snapshot view of clinical trials, we track thousands of daily changes across multiple data sources to find the most crucial updates. Each of those events is classified and ranked for noteworthiness using criteria defined by STAT journalists and other expert sources.
By deriving and aggregating proprietary events and continuously examining these new data points longitudinally, our event detection algorithms surface trends that provide an up-to-date view of emerging activity that others may miss.
“The new tool gathers vital information and context on clinical trials far more efficiently than I can do myself, and the algorithm’s ability to surface otherwise invisible trends is truly remarkable,” said STAT News’ Kate Sheridan, one of the journalists who contributed to the effort.
For example, consider these two trials, both with recent status updates to 'Recruiting':
You’ll see that the same status update received two different classifications in these trials. That’s because the editorial classification considers the pattern of each trial’s status history, not just its latest status.
The first trial, experiencing a ‘Recruitment Strategy’ event, changed from ‘Enrolling by Invitation’ to ‘Recruiting.’ A change from a specific, invitation-based strategy to a broader, more general recruiting strategy may indicate that the trial is struggling to meet its enrollment goals, or that its intended population has been expanded to a larger group.
The second trial, experiencing a ‘Re-Entering Recruitment’ event, changed from ‘Active, not recruiting’ to ‘Recruiting.’ This may indicate that the data collected so far isn’t sufficient to demonstrate statistically significant efficacy, so the trial is enrolling more patients in hopes of improving these metrics.
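To make the idea concrete, the two transitions described above can be sketched as a simple lookup keyed on the previous and new status. This is only an illustrative sketch, not the production system: the table `TRANSITION_LABELS` and the function `classify_status_change` are hypothetical names, and only the two status pairs and their labels come from the examples above.

```python
from typing import Optional

# Hypothetical lookup table. Only these two (previous, current) pairs and
# their labels are taken from the examples in the text; a real system would
# cover many more transitions and signals.
TRANSITION_LABELS = {
    ("enrolling by invitation", "recruiting"): "Recruitment Strategy",
    ("active, not recruiting", "recruiting"): "Re-Entering Recruitment",
}


def classify_status_change(previous: str, current: str) -> Optional[str]:
    """Return an editorial label for a status transition, or None if unlisted.

    Statuses are compared case-insensitively with surrounding whitespace removed.
    """
    key = (previous.strip().lower(), current.strip().lower())
    return TRANSITION_LABELS.get(key)


if __name__ == "__main__":
    # Both trials end in 'Recruiting', but the prior status changes the label.
    print(classify_status_change("Enrolling by Invitation", "Recruiting"))
    print(classify_status_change("Active, not recruiting", "Recruiting"))
```

The point of the sketch is that the label depends on the pair of statuses rather than on the new status alone, which is why two updates to ‘Recruiting’ can be classified differently.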
This is how we use aggregate patterns to assign editorial labels. We can also mine unstructured data from clinical trial events to assign classifications. For example, a trial stopping early can have different implications based on the reason it stopped:
Once we’ve defined the parameters of an editorial classification or signal to be surfaced, models are trained to detect and classify events according to these editorial definitions at scale on an ongoing basis.