Background. A wealth of information on clinical trials has been provided by publicly accessible online registries. Information technology and data exchange standards enable rapid extraction, summarization, and visualization of information and derived knowledge from these data sets. Clinical trial data were extracted in XML format from the National Library of Medicine ClinicalTrials.gov site. These data include categories such as 'Summary of Purpose', 'Trial Sponsor', 'Phase of the Trial', 'Recruiting Status', and 'Location'. We focused on 645 clinical trials related to cancer vaccines. Additional facts on cancer types, including incidence and survival rates, were retrieved from the National Cancer Institute Surveillance data. Results. This application enables rapid extraction of information about institutions, diseases, clinical approaches, clinical trial dates, predominant cancer types in the trials, clinical opportunities, and pharmaceutical market coverage. Presentation of results is facilitated by visualization tools that summarize the landscape of ongoing and completed cancer vaccine trials. Our summaries show the number of cancer vaccine trials per cancer type, over time, by phase, and by lead sponsor, as well as trial activity relative to cancer type and survival data. We have also identified cancers that are neglected in the cancer vaccine field: bladder, liver, pancreatic, stomach, esophageal, and all of the low-incidence cancers. Conclusion. We have developed a data mining approach that enables rapid extraction of complex data from the major clinical trial repository. Summarization and visualization of these data represent a cost-effective means of making informed decisions about future cancer vaccine clinical trials.
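The extraction and summarization steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample records are invented, and the tag names (`clinical_study`, `phase`, `overall_status`, `condition`, `sponsors/lead_sponsor/agency`) are assumptions chosen to mirror the categories listed above, so they may not match the schema the authors actually parsed. The helper names `parse_trial` and `summarize` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample records for illustration only; the tag names are assumed,
# not taken from the paper.
SAMPLE_RECORDS = [
    """<clinical_study>
         <phase>Phase 1</phase>
         <overall_status>Recruiting</overall_status>
         <condition>Melanoma</condition>
         <sponsors><lead_sponsor><agency>Example University</agency></lead_sponsor></sponsors>
       </clinical_study>""",
    """<clinical_study>
         <phase>Phase 2</phase>
         <overall_status>Completed</overall_status>
         <condition>Melanoma</condition>
         <condition>Prostate Cancer</condition>
         <sponsors><lead_sponsor><agency>Example Pharma</agency></lead_sponsor></sponsors>
       </clinical_study>""",
    """<clinical_study>
         <phase>Phase 1</phase>
         <overall_status>Recruiting</overall_status>
         <condition>Breast Cancer</condition>
       </clinical_study>""",
]


def parse_trial(xml_text):
    """Extract a few fields from one trial record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return {
        # findtext returns the default when the element is absent.
        "phase": root.findtext("phase", default="Unknown"),
        "status": root.findtext("overall_status", default="Unknown"),
        "sponsor": root.findtext("sponsors/lead_sponsor/agency", default="Unknown"),
        # A trial may list several conditions (cancer types).
        "conditions": [c.text for c in root.findall("condition") if c.text],
    }


def summarize(records):
    """Count trials by phase, lead sponsor, and condition (hypothetical helper)."""
    trials = [parse_trial(r) for r in records]
    return {
        "by_phase": Counter(t["phase"] for t in trials),
        "by_sponsor": Counter(t["sponsor"] for t in trials),
        "by_condition": Counter(c for t in trials for c in t["conditions"]),
    }


summary = summarize(SAMPLE_RECORDS)
for name, counts in summary.items():
    print(name, dict(counts))
```

Counts of this kind could then be passed to a plotting library to produce per-phase or per-cancer-type charts, or joined with incidence and survival figures to flag cancer types with little trial activity.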
ASJC Scopus subject areas
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Applied Mathematics