The 2017 Multimedia for Medicine Task (Medico)

Task Results
Task overview: [Slides] [Presentation video]
Participant results: [Playlist of all presentation videos]

Task Description
The Medico Task tackles the challenge of predicting diseases based on multimedia data collected in hospitals.

The task differs from existing medical imaging tasks in that it uses only multimedia data (i.e., images and videos) and no medical imaging data (e.g., CT scans). A further innovation is its focus on two non-functional requirements: using as little training data as possible and being computationally efficient.

The task focuses on videos of the gastrointestinal (GI) tract recorded by an extremely small camera that the patient swallows like a pill, a process referred to as capsule endoscopy. The goal of the task is to develop approaches that can detect abnormalities and diseases in early stages.

The ultimate goal of the task is not only disease prediction, but also the generation of automatic text reports (summaries) of findings in multimedia content. The task works together with medical experts, who provide ground truth and help to develop it towards addressing the specific challenges of automatic text reports.

This year, participants in this task are offered three subtasks.

1) Detection: Detection of diseases with as few images in the training dataset as possible.

2) Efficient detection: Solve the classification problem in a fast and efficient way.

3) Report generation (Experimental): Automatically create a text report for a medical doctor for three video cases. A list of requirements will be provided to the participants before they submit their results.

Target group

The task can be addressed by leveraging techniques from multiple multimedia-related disciplines, such as machine learning (classification), multimedia content analysis and multimodal fusion. Further, we hope that it will be useful for medical experts, by using multimedia research to provide more sophisticated ways to improve health care services and patient survival.

Overall, this task is intended to encourage the multimedia community to help improve the health care system by applying their knowledge and methods to reach the next level of computer- and multimedia-assisted diagnosis, detection and interpretation of abnormalities.

Data
The dataset will include at least five different diseases of the human GI tract, with videos and images for each (at least 1000 images and ca. 10 videos per disease). The data will be split into training and test data, with the training data kept small (max. 100-200 images per class) compared to the test data. Pre-extracted visual features for all data will also be provided. The ground truth is collected from medical experts (specialists in GI endoscopy) who annotate the provided images and videos.
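
To experiment with small training sets, participants may want to cap the number of images used per class. The following is a minimal sketch in Python, not part of the official task tooling. The function name capped_training_subset, the (path, label) sample format and the default cap of 100 are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def capped_training_subset(samples, max_per_class=100, seed=0):
    """Return at most max_per_class (path, label) pairs for each label.

    samples is a list of (path, label) tuples; this format is hypothetical
    and not prescribed by the task.
    """
    # Group image paths by their class label.
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    subset = []
    for label in sorted(by_label):
        paths = sorted(by_label[label])
        rng.shuffle(paths)
        # Keep only the first max_per_class shuffled paths for this class.
        subset.extend((p, label) for p in paths[:max_per_class])
    return subset
```

Varying max_per_class (for example from 10 up to the full training set) gives a simple way to observe how detection quality changes with the amount of training data.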

Ground truth and evaluation
For the evaluation of detection we use the standard metrics Precision, Recall and weighted F1 score. We will also evaluate how much training data has been used to achieve good results.
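
As an illustration of these metrics, the sketch below computes precision, recall and F1 per class and averages each over the classes, weighted by the number of ground-truth samples per class. It is one common reading of "weighted F1" and is not the organizers' evaluation script. The function name weighted_prf and the class labels in the usage example are hypothetical.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Return support-weighted (precision, recall, f1) over all classes in y_true."""
    support = Counter(y_true)      # ground-truth samples per class
    predicted = Counter(y_pred)    # predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    w_precision = w_recall = w_f1 = 0.0
    for label in sorted(support):
        tp = true_pos[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / support[label]
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = support[label] / total  # class share of the ground truth
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1
    return w_precision, w_recall, w_f1


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = ["polyp", "polyp", "normal", "normal"]
    preds = ["polyp", "normal", "normal", "normal"]
    print(weighted_prf(truth, preds))
```

Weighting by support means that classes with more test images contribute more to the final score, which matters when the test classes are not the same size.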

For the evaluation of the processing time the organizers will run the code provided by the participants on the same hardware and measure the time from input to output weighted by accuracy of the output. Details about the hardware will be provided to the participants before submitting their code (it will be a multi core system with CUDA support).

The automatically generated report will be assessed manually by two of our medical partners in terms of how useful it is for them and whether it satisfies existing demands for documentation of endoscopic procedures. The assessment will follow a list of requirements that will be provided to the participants before they submit their results.

Recommended reading
[1] Riegler, Michael, et al. "Multimedia and Medicine: Teammates for Better Disease Detection and Survival." Proceedings of the 2016 ACM on Multimedia Conference. ACM, 2016.

[2] World Health Organization - International Agency for Research on Cancer. Estimated Cancer Incidence, Mortality and Prevalence Worldwide in 2012. 2012.

[3] Y. Wang, W. Tavanapong, J. Wong, J. H. Oh, and P. C. de Groen. Polyp-alert: Near real-time feedback during colonoscopy. Computer methods and programs in biomedicine, (3), 2015.

[4] Y. Wang, W. Tavanapong, J. Wong, J. Oh, and P. C. de Groen. Computer-aided detection of retroflexion in colonoscopy. In Proc. of CBMS, pages 1–6, 2011.

Task organizers
Michael Riegler, Simula Research Laboratory & University of Oslo, Norway (contact person) michael at
Pål Halvorsen, Simula Research Laboratory & University of Oslo, Norway
Konstantin Pogorelov, Simula Research Laboratory & University of Oslo, Norway
Thomas de Lange, Cancer Registry of Norway, Norway
Sigrun Losada Eskeland, Vestre Viken Hospital Trust, Norway
Kristin Ranheim Randel, Cancer Registry of Norway, Norway
Duc-Tien Dang-Nguyen, Dublin City University, Ireland
Mathias Lux, University of Klagenfurt, Austria
Concetto Spampinato, University of Catania, Italy

Task schedule
1 May: Development data release
1 June: Test data release
18 August: Run submission
28 August: Working notes paper deadline
13-15 Sept: MediaEval Workshop in Dublin