DroneProtect (New!)

The 2015 DroneProtect Task: Mini-drone Video Privacy Task (New!)
The number of drones deployed for civil applications and other non-military uses such as journalism, recreation, public safety, and precision agriculture is increasing. In particular, the deployment of highly mobile and versatile drones for aerial surveillance in urban policing and crowd management gives rise to new challenges for civil liberties, privacy, and safety.

The DroneProtect: Mini-drone Video Privacy Task aims to benchmark privacy filtering solutions for drone video related to public safety. The performance of solutions is judged by their ability to retain sufficient (frame-level) semantic information about activities and situations, while at the same time providing the required level of privacy for people appearing in the videos.

Task participants implement a combination of privacy filters to protect various personal information regions in a provided set of drone videos. Privacy filtering should be optimised to: i) obscure personally identifying information effectively, while ii) preserving the information that a human viewer needs in order to interpret the video at the level required to maintain security in the area monitored by the drone. Solutions should also attempt to preserve the overall visual acceptability-attractiveness of the resulting privacy filtered video frames, since these factors have a potential impact on interpretability and on the quality of the work experience for humans interpreting the videos. As a secondary goal, the task aims to investigate mixtures of reversible and irreversible privacy filters.

For this year’s task, the use-case scenario is Car Park Security. The use scenario is the factor that determines how much of which type of information must be retained in the video to support the goal of maintaining security. The video input for the privacy filtering process consists of drone video clips showing examples of:
Persons walking, running, or fighting in the car park area,
Persons attacking a driver, loitering, entering or leaving a particular car in the car park,
Wrongly parked cars, collision with cyclists.

The output of the privacy filtering process must preserve sufficient semantics for recognition of specific security-relevant events unfolding in the car park scenes whilst reversibly masking the following aspects:

Person’s face and silhouette,
Person’s gender and race (note this does not entail gender/race recognition but gender/race un-classification),
Personal accessories,
Vehicle make and model,
Vehicle license plate (if zoomed-in on).

The face and the car body have high personal identification potential, whereas the human body outline, particularly one that has been rendered gender-unclassified, has a low identification potential. Note that gait analysis is excluded from the formulation of the task. Accordingly, each of the image regions listed above needs to be masked with a filter strength of High (H), Medium (M), or Low (L) that corresponds to its identification potential, so as to maintain the appropriate privacy protection, intelligibility, and attractiveness-acceptability of the resulting privacy filtered video frame. This privacy filtering task therefore requires the detection of the human face-and-head zone within each bounding box that already delineates a person.
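The region-dependent filter strength described above can be sketched as follows. This is a minimal illustration only, not a reference solution for the task: pixelation stands in for whichever filter a participant chooses, the region labels, the assignment of Medium to accessories, and the block sizes are all assumptions, and frames are modelled as plain grayscale lists of rows.

```python
# Hypothetical mapping from region type to filter strength. The task text only
# states that faces and car bodies are high and the body outline is low; the
# "accessory" -> "M" entry is an assumption made for this sketch.
STRENGTH = {"face": "H", "vehicle": "H", "accessory": "M", "silhouette": "L"}

# Assumed pixelation block size (in pixels) for each strength level.
BLOCK = {"H": 8, "M": 4, "L": 2}


def pixelate(frame, box, block):
    """Return a copy of a grayscale frame with the box region pixelated.

    frame is a list of rows of integers; box is (x0, y0, x1, y1) with x1 and
    y1 exclusive, and is assumed to lie inside the frame.
    """
    x0, y0, x1, y1 = box
    out = [row[:] for row in frame]
    for by in range(y0, y1, block):
        for bx in range(x0, x1, block):
            ys = range(by, min(by + block, y1))
            xs = range(bx, min(bx + block, x1))
            # Replace every pixel in the block with the block's integer mean.
            mean = sum(frame[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def filter_frame(frame, regions):
    """Pixelate each (kind, box) region at the strength mapped to its kind.

    Weaker filters are applied first so that a strong filter on a nested
    region (for example a face inside a silhouette) is applied last.
    """
    for kind, box in sorted(regions, key=lambda r: BLOCK[STRENGTH[r[0]]]):
        frame = pixelate(frame, box, BLOCK[STRENGTH[kind]])
    return frame
```

Keeping the mapping in a table separate from the filter makes it straightforward to tune strengths per region type without touching the filtering code.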

Note that, as a secondary goal, the task is interested in solutions that deploy an appropriately managed mix of reversible and irreversible privacy filters. The filters must be responsive to the context of the events and persons’ behaviours occurring in the video. The responsiveness must enable the car park staff to reverse the privacy filtering to investigate any activities that may be related to a security incident within a given time frame, e.g., 7-30 days, after which all videos are usually deleted. As an additional challenge, a set of 5 un-annotated videos will be provided, which the participants can use to attempt blind privacy filtering; the results for this will be evaluated separately.
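To make the notion of a reversible filter concrete, the sketch below scrambles the pixels of a region with a key-derived permutation and restores them with the same key. It is an illustrative assumption rather than a method prescribed by the task: the function names are hypothetical, frames are modelled as grayscale lists of rows, and Python's random module is not cryptographically secure, so a real deployment would need a proper keyed cipher and key management.

```python
import random


def _permutation(n, key):
    """Derive a deterministic permutation of range(n) from a key."""
    order = list(range(n))
    random.Random(key).shuffle(order)
    return order


def _coords(box):
    """List the (y, x) pixel coordinates of box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return [(y, x) for y in range(y0, y1) for x in range(x0, x1)]


def scramble(frame, box, key):
    """Return a copy of frame with the pixels inside box permuted by key."""
    coords = _coords(box)
    out = [row[:] for row in frame]
    for (dy, dx), src in zip(coords, _permutation(len(coords), key)):
        sy, sx = coords[src]
        out[dy][dx] = frame[sy][sx]
    return out


def unscramble(frame, box, key):
    """Invert scramble() for the same box and key."""
    coords = _coords(box)
    out = [row[:] for row in frame]
    for (dy, dx), src in zip(coords, _permutation(len(coords), key)):
        sy, sx = coords[src]
        out[sy][sx] = frame[dy][dx]
    return out
```

A usage pattern consistent with the scenario would be to store the key alongside the clip for the retention period, for example `restored = unscramble(scramble(frame, box, key), box, key)`, and to discard the key when the video is deleted.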

Target group
Those working in image/video processing and video-analytics for privacy protection applications.

Data
The drone dataset to be provided comprises 38 video clips of about 20 seconds each, in full HD resolution, with a sufficient number of examples depicting different typical scenarios in a car park [3]. The bounding boxes for persons and cars are annotated. However, the detection of the face-head zone as a region of interest, as well as of a person-entering-a-car event, is regarded as part of the task, since the task is evaluated as intended for a real-life Car Park Security use-case scenario. This should also make the task sufficiently challenging, especially since region-specific privacy filtering has previously been benchmarked within the MediaEval 2014 Visual Privacy Task [2].

Ground truth and evaluation
The ground truth will consist of video frames annotated with bounding boxes and descriptions of the persons and cars in the video images, plus examples of alternative filtering approaches, the questionnaires used by the human viewers who evaluated them, and the final rankings achieved.

Privacy Solutions Evaluation: Participants are to submit privacy protected video clips using the testing subset. The submitted video clips will be evaluated based on:

The human-perceived level of privacy filtering, i.e., the obscuring of the High/Low regions of personally identifiable information as previously annotated in the dataset provided,
The human perception and interpretation of the resulting privacy filtered image as a whole in terms of the level of retained information, i.e., intelligibility,
The appropriateness (acceptability-attractiveness) of the privacy filtered image (also defined in the MediaEval 2012, MediaEval 2013, and MediaEval 2014 Privacy Task descriptions [1,2]).

Participants will each receive the results of the evaluations of their submission as well as the overall results and rankings for all the submitted entries. The rankings will be based on the application of different weightings to the results for each of the three criteria (privacy level, intelligibility, appropriateness), as calculated from the evaluation results given by each of three distinct communities of human evaluators, namely: a) crowd-sourced online communities, b) surveillance monitoring staff, and c) privacy filtering technology developers. The weightings will be agreed by the participants so as to reflect the relative importance of each of the three evaluation criteria as perceived by each of the three human evaluator groups.
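One way such a weighted ranking could be computed is sketched below. The actual aggregation formula and weightings are to be agreed by the participants and are not specified here, so the normalised weighted mean, the group labels, and the function names are all assumptions for illustration.

```python
CRITERIA = ("privacy", "intelligibility", "appropriateness")


def overall_score(scores, weights):
    """Weighted mean over evaluator groups and criteria.

    scores[group][criterion] is that group's mean rating of a submission;
    weights[group][criterion] is the agreed (non-negative) weighting.
    """
    total = sum(weights[g][c] for g in weights for c in CRITERIA)
    weighted = sum(weights[g][c] * scores[g][c] for g in weights for c in CRITERIA)
    return weighted / total


def rank(submissions, weights):
    """Return submission names ordered from best to worst overall score."""
    return sorted(
        submissions,
        key=lambda name: overall_score(submissions[name], weights),
        reverse=True,
    )


# Hypothetical example with equal weights and two submissions.
GROUPS = ("crowd", "staff", "developers")
weights = {g: {c: 1.0 for c in CRITERIA} for g in GROUPS}
submissions = {
    "run_A": {g: {c: 0.5 for c in CRITERIA} for g in GROUPS},
    "run_B": {g: {c: 0.25 for c in CRITERIA} for g in GROUPS},
}
ranking = rank(submissions, weights)
```

Allowing a separate weight per group and criterion reflects the statement that each evaluator community may perceive the relative importance of the criteria differently.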

References and Recommended Reading (a more extensive reference list is available if required)
[1] Badii, A., Al-Obaidi, A., and Einig, M., MediaEval 2013 Visual Privacy Task: Holistic Evaluation Framework for Privacy by Co-Design Impact Assessment. MediaEval 2013 Workshop. CEUR-WS.org, 1043, Barcelona, Spain, October 2013.

[2] Badii, A., Ebrahimi, T., Fedorczak, C., Korshunov, P., Piatrik, T., Eiselein, V., and Al-Obaidi, A., Overview of the MediaEval 2014 Visual Privacy Task, MediaEval 2014 Workshop, Barcelona, Spain, October 2014.

[3] Bonetto, M., Korshunov, P., Ramponi, G., and Ebrahimi, T., Privacy in Mini-drone Based Video Surveillance, Workshop on De-identification for privacy protection in multimedia, May 2015.

[4] Badii, A., Einig, M., Tiemann, M., Thiemert, D. and Lallah, C., Visual context identification for privacy-respecting video analytics, in IEEE 14th International Workshop on Multimedia Signal Processing (MMSP 2012), pp. 366-371, Banff, Canada, September 2012.

Task organizers
Atta Badii (UoR), Touradj Ebrahimi (EPFL), Pavel Korshunov (EPFL), Jean-Luc Dugelay (EURECOM), Christian Fedorczak (Thales Communications & Security), Tomas Piatrik (QMUL), Volker Eiselein (TUB), Hamid Oudi (UoR), Ahmed Al-Obaidi (UoR), Natcha Ruchaud (EURECOM).

Task schedule
15 May: Development data release
15 June: Test data release
20 July: Run submission
28 August: Working notes paper deadline
14-15 September: MediaEval 2015 Workshop

Acknowledgments
This task is organised by the EU FP7 project VideoSense: Virtual Centre of Excellence for Socio-ethically-guided and Privacy-respecting Video-Analytics in Security (videosense.eu).
