QA4SpokenWeb (New!)

Announcement of Data Release
Due to unforeseen circumstances, this task ultimately did not take place at MediaEval 2013.

The 2013 Question Answering Task for Spoken Web
The problem that we wish to explore is how best to build an information retrieval system in which both the queries and the content are spoken. The task has the goal of challenging the research community’s ability to build ranked retrieval systems for matching spoken questions with spoken answers based on topical matching.

This 2013 Question Answering Task for Spoken Web is a joint task designed to run synchronously in FIRE 2013 and MediaEval 2013. Information about the task (i.e., more detail beyond what appears on this page) can be found at: http://www.umiacs.umd.edu/~oard/qasw/

Target group
We expect QASW to be of interest to researchers working on speech recognition, information retrieval (including question answering), and information and communications technology for development (ICTD).

Data
The goal of the task is to match questions spoken in Gujarati to answers spoken in Gujarati. We are currently transcribing 200 questions (from the 2,285 that were asked in the operational system on the date we captured them). From these, we plan to select 50 for training, and at least 100 for evaluation. Our intent in selecting 100 evaluation questions is to make it likely that we will get a yield of at least 5 relevant documents for at least 50 of the questions. The collection to be searched will consist of 3,557 answers that were given in response to specific questions and 834 "announcements," general answers that were provided to address topics of general interest. We expect to complete transcription in early April, and to release the test collection on May 1, 2013.

Relevance judgments will be performed using depth-30 pooling (or deeper, if resources allow) using graded relevance judgments. Judgments will be pooled from both evaluations (FIRE and MediaEval), which requires synchronizing the tasks. Participating systems will be asked to submit depth-1000 results using the full question, and also using truncated versions of the question (truncated at 5 seconds, 10 seconds, 15 seconds, etc.). The principal evaluation measure will be mean NDCG for the full questions. Participating systems will also be asked to predict which truncation point maximizes a reward function that rewards DCG@1 and that penalizes duration (i.e., later truncation points). The goal of this measure is to encourage the design of systems that can determine when to "barge in" for the first time with a plausible answer to the question (in a real system, subsequent interaction would be possible, but that will not be modeled in FIRE or MediaEval in 2013).
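As a rough illustration of the evaluation described above, the following Python sketch shows depth-k pooling, NDCG over graded judgments, and the choice of a truncation point. It is not the organizers' scoring code. Several details are assumptions made only for illustration: the linear-gain form of DCG (the task description does not say which DCG variant is used), the linear per-second penalty in the reward, and all function and parameter names.

```python
import math
from typing import Dict, List, Sequence, Set


def pool(runs: List[Sequence[str]], depth: int = 30) -> Set[str]:
    """Depth-k pooling: union of the top-`depth` documents from every run."""
    pooled: Set[str] = set()
    for run in runs:
        pooled.update(run[:depth])
    return pooled


def dcg(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k graded judgments.

    Assumption: linear gain with a log2(rank + 1) discount.
    """
    return sum(g / math.log2(rank + 1)
               for rank, g in enumerate(gains[:k], start=1))


def ndcg(ranked_gains: Sequence[float], all_gains: Sequence[float], k: int) -> float:
    """DCG of the submitted ranking divided by the DCG of an ideal ranking."""
    ideal = dcg(sorted(all_gains, reverse=True), k)
    return dcg(ranked_gains, k) / ideal if ideal > 0 else 0.0


def mean_ndcg(per_question: Sequence[float]) -> float:
    """Average NDCG over all evaluation questions."""
    return sum(per_question) / len(per_question) if per_question else 0.0


def best_truncation(dcg_at_1: Dict[int, float], penalty_per_second: float) -> int:
    """Pick the truncation point (in seconds) with the highest reward.

    Assumption: reward = DCG@1 - penalty_per_second * duration. The real
    reward function is not specified in the task description.
    Ties are broken in favor of the earliest truncation point.
    """
    return max(sorted(dcg_at_1),
               key=lambda t: dcg_at_1[t] - penalty_per_second * t)


if __name__ == "__main__":
    # Two hypothetical runs for one question, pooled to depth 2.
    print(pool([["a1", "a2", "a3"], ["a2", "a4", "a5"]], depth=2))
    # Graded judgments (0-3) of a ranked list, scored against the ideal order.
    print(ndcg([0, 3], [3, 0], k=2))
    # DCG@1 observed at each truncation point, with a small duration penalty.
    print(best_truncation({5: 0.0, 10: 2.0, 15: 2.0}, penalty_per_second=0.05))
```

Under this assumed reward, waiting from 10 to 15 seconds gains nothing in DCG@1 but costs duration, so the earlier point is preferred. That is the "barge in" behavior the measure is meant to encourage.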

Recommended reading
[1] Douglas W. Oard. Query by Babbling: A Research Agenda. In Proceedings of the CIKM Workshop on Information and Knowledge Management for Developing Regions, 2012.
[2] F. Metze et al. The Spoken Web Search Task. In Proceedings of MediaEval, 2012.
[3] Aren Jansen and Benjamin Van Durme. Indexing Raw Acoustic Features for Scalable Zero Resource Search. In Proceedings of Interspeech, 2012.
[4] Nigel G. Ward and Steven D. Werner. Thirty-Two Sample Audio Search Tasks. UTEP Technical Report UTEP-CS-12-39.

Task organizers
Douglas W. Oard, University of Maryland, College Park, USA
Nitendra Rajput, IBM Research, India
Jerome White, IBM Research, India

Note that this task is a "Brave New Task" and 2013 is the first year that it is running in MediaEval. If you sign up for this task, you will be asked to keep in particularly close touch with the task organizers concerning the task goals and the task timeline.

Task schedule
For the schedule, please refer to: http://www.umiacs.umd.edu/~oard/qasw/

MediaEval Benchmarking Initiative for Multimedia Evaluation

The "multi" in multimedia: speech, audio, visual content, tags, users, context