The focus of the 2014 Clinical Decision Support Track will be the retrieval of biomedical articles relevant for answering generic clinical questions about medical records.
In this track, we will use short case reports, such as those published in biomedical articles, as idealized representations of actual medical records. A case report typically describes a challenging medical case and is often organized as a well-formed narrative summarizing the portions of a patient's medical record that are pertinent to the case.
Participants in the track will be challenged to retrieve, for a given case report, full-text biomedical articles that answer questions related to several types of clinical information needs. Each topic will consist of a case report and one of three generic clinical question types, such as "What is the patient's diagnosis?" Retrieved articles will be judged relevant if they provide information of the specified type that is pertinent to the given case. The evaluation of submissions will follow standard TREC evaluation procedures.
Date | Note |
---|---|
February, 2014 | Applications for participation in TREC 2014 due. |
March 21, 2014 | Document collection available for download. |
April 28, 2014 | Topics available for download. |
July 28, 2014 | Results submission deadline. |
October, 2014 | Relevance judgments and individual evaluation scores released. |
November 18–21, 2014 | TREC 2014 conference at NIST in Gaithersburg, MD, USA. |
The target document collection for the track is the Open Access Subset of PubMed Central (PMC). PMC is an online digital database of freely available full-text biomedical literature. Because documents are constantly being added to PMC, to ensure the consistency of the collection, we obtained a snapshot of the open access subset on January 21, 2014, which contained a total of 733,138 articles. The full text of each article in the open access subset is represented as an NXML file (XML encoded using the NLM Journal Archiving and Interchange Tag Library), and images and other supplemental materials are also available.
Each article in the collection is identified by a unique number (PMCID) that will be used for run submissions. The PMCID is specified by the <article-id> element within each article's NXML file. Please note that although each article is represented by multiple identifiers (e.g., PubMed, PMC, Publisher, etc.), we are only concerned with PMCIDs for this task. The various identifier types are specified using the pub-id-type attribute of the <article-id> element. Valid values of pub-id-type that indicate a PMCID include pmc and pmcid.
For example, the PMCID of article 3148967 may be specified in the article's NXML file as follows.
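```xml
<article-id pub-id-type="pmc">3148967</article-id>
```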
To make processing the documents easier, we have also renamed each article's NXML file according to the article's PMCID. For example, the document for article 3148967 is named 3148967.nxml.
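Participants who prefer to read the identifier directly from the XML can do so with a few lines of code. The minimal Python sketch below assumes the `<article-id>` elements are not namespaced, as is typical for NLM/JATS NXML.

```python
import xml.etree.ElementTree as ET

def get_pmcid(nxml_path):
    """Return the PMCID declared in an article's <article-id> elements, if any."""
    root = ET.parse(nxml_path).getroot()
    for elem in root.iter("article-id"):
        # Both "pmc" and "pmcid" values of pub-id-type indicate a PMCID.
        if elem.get("pub-id-type") in ("pmc", "pmcid"):
            return (elem.text or "").strip()
    return None

print(get_pmcid("3148967.nxml"))  # expected output: 3148967
```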
The document collection may be obtained in one of two ways. For participants who are only interested in indexing the text of the articles in the collection (most participants), we have prepared 4 bundles containing all 733,138 articles in the January 21, 2014 snapshot, which can be downloaded from the links below.
Each of the 4 files listed above is around 2–3 GB in size. The article NXMLs in each archive are split into multiple directories to allow for easy directory listings. Please note that the directory structure was created merely as a convenience and is not meant to convey any information about the articles.
Participants wishing to utilize additional media other than text, such as the images and videos included in the articles, can download the full document bundles directly from the PMC Open Access FTP Service. However, be aware that the size of the full collection is around 2 TB with the additional media and takes several days to completely download.
We have prepared a simple Python script for participants wishing to obtain the full collection. The script downloads only the articles present in the January 21, 2014 snapshot and can be obtained from the links below.
The script should work with most recent versions of Python and has been tested with versions 2.6, 2.7, and 3.3 on Linux, OS X, and Windows. Please let the organizers know if you encounter any trouble using it. Participants can use the above script to download the entire collection, including images and videos, by executing the following shell command.
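The invocation below is a sketch: it assumes the script is named download_collection.py (the actual filename may differ) and that it takes the file list and the target directory as arguments.

```sh
python download_collection.py file_list.txt.gz TREC-CDS
```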
In the command, file_list.txt.gz is a compressed list of the article archives included in the January 21, 2014 snapshot, and TREC-CDS is the local directory to which the collection will be downloaded. For additional usage information, please enter the following.
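```sh
# assuming the hypothetical script name used above; the actual help option may differ
python download_collection.py --help
```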
Interested participants are free to devise their own method for obtaining the full collection. However, please note that the articles listed in file_list.txt.gz constitute the definitive collection for the track. Because articles are added to the PMC Open Access Subset every day, you will still want to use file_list.txt.gz in order to restrict the downloaded files to those present in the January 21, 2014 snapshot.
Downloading the additional media associated with the full-text articles is entirely optional. None of the topics will require this information. However, we are providing this option for participants who may be interested in analyzing the medical images included in many of the articles as part of their retrieval strategies.
We have been made aware of duplicate documents in the collection. While the duplicates will likely not impact retrieval results, we will not consider them when conducting the relevance assessment. Participating groups have provided lists of the files that will not be judged because they are duplicates of other files remaining in the collection. The lists of files can be obtained from the link below.
The topics for the track are medical case narratives created by expert topic developers that will serve as idealized representations of actual medical records. The case narratives describe information such as a patient's medical history, the patient's current symptoms, tests performed by a physician to diagnose the patient's condition, the patient's eventual diagnosis, and finally, the steps taken by a physician to treat the patient.
There are many clinically relevant questions that can be asked of a given case narrative. In order to simulate the actual information needs of physicians, the topics are annotated according to the three most common generic clinical question types (Ely et al., 2000) shown in the table below. Participants will be tasked with retrieving biomedical articles useful for answering generic questions of the specified type about each case report.
Type | Generic Clinical Question | Number of Topics |
---|---|---|
Diagnosis | What is the patient's diagnosis? | 10 |
Test | What tests should the patient receive? | 10 |
Treatment | How should the patient be treated? | 10 |
For example, for a case report labeled "diagnosis," participants should retrieve PMC articles that a physician would find useful for determining the diagnosis of the patient described in the report. Similarly, for a case report labeled "treatment," participants should retrieve articles that suggest to a physician the best treatment plan for the condition exhibited by the patient described in the report. Finally, for case reports labeled "test," participants should retrieve articles that suggest relevant interventions a physician might undertake in diagnosing the patient.
In addition to annotating the topics according to the type of clinical information required, we are also providing two versions of the case narratives. The topic "descriptions" contain a complete account of the patients' visits, including details such as their vital statistics, drug dosages, etc., whereas the topic "summaries" are simplified versions of the narratives that contain less irrelevant information. A topic's description and its summary are functionally equivalent: the set of relevant documents is identical for each version. However, we are providing the summary versions for participants who are not interested in or equipped for processing the detailed descriptions.
In order to make the results of the track more meaningful, we require that any given run use either all topic descriptions or all topic summaries, not a mixture of the two. Participants are, of course, free to submit multiple runs so that they can experiment with the different representations. Participants will be required to indicate on the run submission form which version of the topics they used.
The table below shows an example of the kind of case-based topic we will be using for the track. The topic "summary" is of type "diagnosis," and the PMCIDs listed in the last column are relevant for the given case because they can assist a physician in determining the patient's diagnosis.
No. | Type | Summary | Relevant Articles |
---|---|---|---|
1. | Diagnosis | A woman in her mid-30s presented with dyspnea and hemoptysis. CT scan revealed a cystic mass in the right lower lobe. Before she received treatment, she developed right arm weakness and aphasia. She was treated, but four years later suffered another stroke. Follow-up CT scan showed multiple new cystic lesions. |
The topics are provided as XML and can be downloaded from the link below.
Topic numbers are specified using the number attribute of each <topic> element and topic types (i.e., diagnosis, test, and treatment) are specified with the type attribute. Topic descriptions are given in <description> elements and topic summaries are given in <summary> elements. Below is an example of the format.
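The sketch below illustrates the structure using the example topic above; the root element name and the description text are placeholders.

```xml
<topics>
  <topic number="1" type="diagnosis">
    <description>...</description>
    <summary>A woman in her mid-30s presented with dyspnea and hemoptysis.
      CT scan revealed a cystic mass in the right lower lobe. ...</summary>
  </topic>
</topics>
```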
Since this is the first year of the track, we do not have any development topics participants can use for training their systems. However, we do have permission to distribute the case-based retrieval topics used for the medical task of ImageCLEF 2013, which can be considered to be similar in style. The ImageCLEF topics can be found at the link below.
Please exercise caution when using the ImageCLEF topics. They are provided here only for reference. We have reformatted them somewhat to match this track's topic format, but differences remain. In particular, the ImageCLEF topics contain only the shorter <summary> tags, and all of the topics should be considered to be of type diagnosis.
The evaluation of the proposed track will follow standard TREC evaluation procedures for ad hoc retrieval tasks. Participants may submit a maximum of five automatic or manual runs, each consisting of a ranked list of up to one thousand PMCIDs. The highest ranked articles for each topic will be pooled and judged by medical librarians and physicians trained in medical informatics. Assessors will be instructed to judge articles as either "definitely relevant" for answering questions of the specified type about the given case report, "definitely not relevant," or "potentially relevant." The latter judgement may be used if an article is not immediately informative on its own, but the assessor believes it may be relevant in the context of a broader literature review. Because we plan to use a graded relevance scale, the performance of the retrieval submissions will be measured using normalized discounted cumulative gain (NDCG).
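As a rough illustration of how a graded measure like NDCG rewards placing "definitely relevant" articles ahead of "potentially relevant" ones, the minimal Python sketch below assumes gain values of 2, 1, and 0 for the three judgment levels; the official gain mapping and evaluation depth will be those used by the track's evaluation scripts.

```python
import math

def dcg(gains):
    """Discounted cumulative gain of a ranked list of graded gains."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(ranked_gains, judged_gains):
    """DCG of the submitted ranking divided by the DCG of an ideal ordering."""
    ideal = sorted(judged_gains, reverse=True)[:len(ranked_gains)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked_gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Assumed gains: 2 = definitely relevant, 1 = potentially relevant, 0 = not relevant.
print(ndcg([2, 0, 1, 2], [2, 2, 2, 1, 1, 0, 0]))
```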
As in past evaluations of medically-oriented TREC tracks, we are fortunate to have the assessment conducted by the Department of Medical Informatics of the Oregon Health and Science University (OHSU). We are extremely grateful for their participation.
The TREC run submission system is now open for the track. The submission form is linked from both the Tracks page in the Active Participants' part of the TREC web site and the Results Submission page (which in turn is linked from the main page of the Active Participants' section).
The submission deadline is July 28, 2014. This effectively means that you must successfully submit your run before NIST staff arrive at NIST on the morning of July 29 (approximately 7:00 am EDT).
To submit a run, fill in the submission form by answering questions that describe the run and specify the file to upload. After you click submit, the submission system will run a validation script that tests the submission file for various kinds of formatting errors. A pointer to this script is available both on the active participants' track page and in the 'Tools' section of the active participants' web site.

Over the years NIST has found that strict checking of the "sanity" of an input file leads to far fewer problems down the line because it catches many mistakes in the run at a time when the submitter can still correct them. You are strongly encouraged to use the script to test your submission file before submitting it to NIST. Invoke the script with the run file name as its argument; an error log file will be created. The error log will contain error messages if any errors exist and will otherwise report that the run was successfully processed. If the script finds any errors at the time the run is submitted, the submission system will reject the run. Rejected runs are not considered to be submitted; indeed, no information is retained about rejected runs.

Submitting a run through the submission system is the only acceptable way to send a run to NIST. In particular, do not email submission files to NIST; they will be deleted without being read.
Once you submit a run, you cannot delete it using the submission system. This means you cannot submit a "corrected" version of a run by using the same run tag. The prohibition against remote removal of runs is a safety precaution to ensure no one mistakenly (or deliberately!) overwrites someone else's run. If you need to correct a run, contact NIST with details of the problem. If you need to correct a run on the last night before the submission deadline, submit a new run with a different run tag, and send NIST email describing the problem and stating which run the new run should replace.
One field in the submission form is a list of organizations that have both applied to participate in TREC 2014 and have submitted the required "Dissemination of TREC Results" form to NIST. The list is sorted by Group ID (the ID you selected for your team when you applied). Make sure you are listed in that field, and contact NIST if you are not. You should make that check now since it could take some time to resolve the issue of why you are not already in the list and to get your group inserted into it. The submission deadline is a firm deadline.
The format for run submissions is standard trec_eval format. Each line of the submission file should follow the form:
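```
TOPIC_NO 0 PMCID RANK SCORE RUN_NAME
```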
where TOPIC_NO is the topic number (1–30), 0 is a required but ignored constant, PMCID is the PubMed Central identifier of the retrieved document, RANK is the rank (1–1000) of the retrieved document, SCORE is a floating point value representing the similarity score of the document, and RUN_NAME is an identifier for the run. The RUN_NAME is limited to 12 alphanumeric characters (no punctuation). The file is assumed to be sorted numerically by TOPIC_NO, and SCORE is assumed to be greater for documents that should be retrieved first. For example, the following would be a valid line of a run submission file:
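```
1 0 3148967 1 0.9999 my-run
```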
The above line indicates that the run named "my-run" retrieves for topic number 1 document 3148967 at rank 1 with a score of 0.9999.
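For reference, a small Python sketch along these lines (the function, run, and file names are illustrative only) produces output in this format, keeping topics in numerical order and ranks within 1 to 1000.

```python
def write_run(results, run_name, path):
    """Write retrieval results in the trec_eval format described above.

    results maps each topic number to a list of (pmcid, score) pairs,
    already ordered from highest to lowest score.
    """
    with open(path, "w") as out:
        for topic_no in sorted(results):
            for rank, (pmcid, score) in enumerate(results[topic_no][:1000], start=1):
                out.write("{} 0 {} {} {:.4f} {}\n".format(topic_no, pmcid, rank, score, run_name))

# Example: a single topic with one retrieved document.
write_run({1: [("3148967", 0.9999)]}, "myrun", "myrun.txt")
```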