Content creation is increasingly a collective experience. People attending large social events (e.g., a soccer match, a concert), but also personal-scale ones (e.g., a wedding, a birthday party), collect dozens of photos and video clips with their smartphones, tablets, cameras, and, more recently, social cameras. Such information is later exchanged in a number of different ways, including shared repositories, clouds, social networks, etc.
Sharing this information makes it possible for any user who attended, or is simply interested in, the event to create their own view of it through summaries, stories, personalized albums, and soundtracks.
In this respect, a major issue is the need to align and present the media collection in a consistent way. As a matter of fact, the time and location information attached to the captured media (timestamp, GPS) can be wrong, inaccurate, or incomplete (e.g., due to a wrongly set clock/calendar, different time zones, or the modification or removal of tags), as happens in photo archives or with material that has undergone post-processing.
In the task scenario, we imagine a number of users (10+) attending the same event and taking photos and videos with different devices (smartphones, cameras, tablets) and at different levels of granularity.
Arranging this data into a single organized library implies creating a chronologically-ordered outline of the galleries. If correct timestamps were available for all media, the task would be trivial; otherwise, complex content/context analysis may be needed to understand the mutual relationships among media items captured by different users. Furthermore, factors such as perspective, zooming, image filtering, unwanted noise from people talking, soundtracks, etc., can make matching even more difficult.
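In the trivial case, where all timestamps are trusted, building the chronological outline reduces to a k-way merge of per-gallery timelines. A minimal sketch with made-up galleries (the tuple layout and media IDs are illustrative, not part of the task specification):

```python
from heapq import merge

# Each gallery is assumed to be pre-sorted by capture time: (timestamp_s, media_id).
gallery_a = [(100, "a1"), (300, "a2")]
gallery_b = [(150, "b1"), (250, "b2")]

# heapq.merge lazily merges pre-sorted iterables into one sorted timeline.
timeline = list(merge(gallery_a, gallery_b))
print(timeline)  # [(100, 'a1'), (150, 'b1'), (250, 'b2'), (300, 'a2')]
```

The interesting part of the task is precisely that this assumption fails, so the per-gallery clocks must first be brought into a common time frame.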
Given N media collections (e.g., galleries) taken by different users/devices at the same event, participants are required to find the best (relative) time alignment among them and to detect the significant sub-events across the whole gallery.
The working assumptions are as follows:
- each gallery may be composed of photos and video clips taken with the same device;
- each gallery will be consistent in terms of time and location information, when available;
- researchers can use any kind of available information related to the media items: tags, annotations, timestamps, GPS coordinates, and content, as well as possibly related information available on the network.
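To make the alignment goal concrete, here is a minimal sketch of one possible building block, not the required method: once content analysis has matched pairs of media items across two galleries (how the pairs are found is the hard part and is left open here), a robust estimate of the relative clock offset can be as simple as the median of the pairwise timestamp differences:

```python
from statistics import median

def estimate_offset(matched_pairs):
    """Estimate the relative clock offset (seconds) between two galleries.

    matched_pairs: list of (t_ref, t_other) timestamps in seconds for media
    items believed to depict the same moment. The median of the pairwise
    differences is robust to a few mismatched pairs.
    """
    return median(t_other - t_ref for t_ref, t_other in matched_pairs)

# Toy example: gallery B's clock runs roughly one hour ahead of gallery A's.
pairs = [(100, 3702), (250, 3848), (400, 4001)]
print(estimate_offset(pairs))  # → 3601
```

Applying the negated offset to gallery B's timestamps would place both galleries on a common timeline.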
The task is of interest to researchers in the areas of multimedia event analysis, indexing, and retrieval, although the dataset is sufficiently rich to appeal to other research communities as well.
The data used for the challenge consist of events involving many participants, as in the case of large music concerts, and are provided by the Flemish broadcaster VRT. Additional material to enrich the dataset is collected from social networking platforms.
Data will be released under a Creative Commons license (or a license allowing similar terms for research purposes).
Ground truth and evaluation
The organizers will provide an annotated dataset. The ground truth will be verified by human assessors, to check the consistency of the media timestamps.
System performance will be assessed in terms of the time synchronization error and the sub-event detection error.
Time synchronization is evaluated by counting the number of galleries whose synchronization error falls below a predefined threshold. Two measures are reported: precision, namely the ratio between the number of synchronized galleries and the total number of galleries, and accuracy, namely the average time lapse calculated over the synchronized galleries, normalized with respect to the maximum accepted time lapse.
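One plausible reading of these two measures can be sketched as follows (the function name, the strictness of the threshold comparison, and the handling of the no-synchronized-gallery case are assumptions, since the text does not pin them down):

```python
def sync_precision_accuracy(errors, max_lapse):
    """Evaluate time synchronization over a set of galleries.

    errors: absolute synchronization error (seconds), one per gallery.
    max_lapse: maximum accepted time lapse, i.e., the threshold in seconds.
    Returns (precision, accuracy): precision is the fraction of galleries
    whose error is below the threshold; accuracy is the mean error over
    those synchronized galleries, normalized by max_lapse.
    """
    synced = [e for e in errors if e < max_lapse]
    precision = len(synced) / len(errors)
    accuracy = (sum(synced) / len(synced)) / max_lapse if synced else None
    return precision, accuracy

# Toy run: three galleries, two of which fall under a 60 s threshold.
p, a = sync_precision_accuracy([10, 30, 200], max_lapse=60)
print(p, a)  # precision = 2/3, normalized accuracy = 20/60 ≈ 0.333
```

Note that under this reading lower normalized accuracy values indicate better alignment of the synchronized galleries.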
The quality of the sub-event detection (clustering) will be measured using the F-score.
Conci, N., De Natale, F., Mezaris, V.: Synchronization of Multi-User Event Media (SEM) at MediaEval 2014: Task Description, Datasets, and Evaluation. In Proceedings of MediaEval 2014.
Broilo, M., Boato, G., De Natale, F.: Content-based synchronization for multiple photos galleries. In Proceedings of the IEEE International Conference on Image Processing (ICIP), 2012, pp. 1945-1948.
Blakowski, G., Steinmetz, R.: A media synchronization survey: reference model, specification, and case studies. IEEE Journal on Selected Areas in Communications, vol. 14, no. 1, pp. 5-35, 1996.
Veenhuizen, A., van Brandenburg, R.: Frame accurate media synchronization of heterogeneous media sources in an HBB context. In Proceedings of the Media Synchronization Workshop, 2012.
Nicola Conci, University of Trento, Italy
Francesco G. B. De Natale, University of Trento, Italy
Vasileios Mezaris, ITI - CERTH, Greece
Mike Matton, VRT, Belgium
11 May Development data release (updated release date)
12 June Test data release (updated release date)
10 August Run submission
28 August Working notes paper deadline
14-15 September MediaEval 2015 Workshop