COLING 2010 Debrief
Cong Duy Vu HOANG
1
Conference Organization
• Generally good.
• Reception was very good (food & performance), and free.
• Banquet was really bad (poor food & no performance), yet pricey at 600 RMB (~60 SGD).
2
Tour
• Visited a lot of places: Great Wall, Forbidden City, Summer Palace, Lama Temple, Tiananmen Square, Royal Tombs, Bird’s Nest Stadium, …
• Tried many kinds of food in Beijing :D
– Beijing Roast Duck
– BBQ & hot pot
– …
3
Conference sessions
• Oral Session
– Attendance: that depends.
– Room size: large
– Comments: only a few questions (14), even no questions for some talks; quite hard to understand the work from the presentation alone
• Poster
– Attendance: very crowded
– Room size: very small
– Comments: very noisy, but fun and interesting; quite easy to understand the work quickly; like Q&A
• RING (RefreshINGenious ideas)
– Attendance: extremely crowded
– Room size: large but not enough
– Comments: directed by prestigious professors (Aravind Joshi, Ed Hovy, Ken Church); very interesting and useful
• Should each paper have both an oral and a poster session??? Should include this.
4
My poster presentation
• Self-assessment: good enough; quite a lot of people came and asked me about it
– Prof. Dan Jurafsky (Stanford Univ.), Tomek Strzalkowski (State Univ. of New York), Cecile Paris (CSIRO ICT Centre), Minlie Huang (Tsinghua Univ.), …
5
Papers of interest
• Topic: analyzing & processing scientific texts
– Learning to Annotate Scientific Publications
(Minlie Huang)
• Aim: to annotate scientific publications with key words
and phrases, leveraging manual annotations
• Task:
– Input: a target document
– Output: a set of key words and phrases for that document
• Steps: 1) retrieve relevant docs (in db) for the input doc; 2) get an initial list of annotated entries from them; 3) rank the entries to annotate the target doc
• Data: 2 million documents from PubMed
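A minimal sketch of the retrieve / collect / rank steps listed above, assuming bag-of-words cosine similarity for retrieval and similarity-weighted voting for ranking. The slide names neither, so both are stand-ins, and the function names and toy database are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption: the real system is more careful).
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_annotations(target, database, k=2, top_n=3):
    """database: list of (text, manual_annotations) pairs."""
    target_vec = Counter(tokenize(target))
    # Step 1: retrieve the k most similar already-annotated documents.
    neighbours = sorted(
        ((cosine(target_vec, Counter(tokenize(text))), anns) for text, anns in database),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    # Step 2: collect their annotations as candidates.
    # Step 3: rank candidates by the summed similarity of the documents carrying them.
    votes = Counter()
    for sim, anns in neighbours:
        for ann in anns:
            votes[ann] += sim
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ann for ann, _ in ranked[:top_n]]


# Hypothetical toy database of annotated documents.
TOY_DB = [
    ("protein binding in yeast cells", ["Proteins", "Yeast"]),
    ("yeast protein expression analysis", ["Proteins", "Gene Expression"]),
    ("traffic flow in large cities", ["Transportation"]),
]
suggestions = suggest_annotations("protein expression in yeast", TOY_DB)
```

Annotations shared by several close neighbours float to the top, while annotations from unrelated documents never enter the candidate list.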
6
Papers of interest
(may be applied to ForeCiteNote)
• Topic: analyzing & processing scientific texts
– Towards Automatic Building of Document
Keywords (Joaquim Silva et al.)
• Aim: similar to the previous paper (Learning …), but without leveraging manual annotations
• Method:
– A language-independent approach
– An n-gram- and statistics-based approach to extract multi-word expressions (MWEs) as keywords
• Data: news articles, but the domain and corpus statistics are not clear
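A minimal sketch of what an n-gram and statistics based extractor can look like, assuming bigrams scored with the Dice coefficient. The slide does not say which statistics the paper uses, so Dice is a stand-in, and `bigram_keywords` is a hypothetical name. Only whitespace tokenization is used, which keeps it language-independent in the loose sense of needing no linguistic resources.

```python
from collections import Counter


def bigram_keywords(docs, min_count=2, top_n=5):
    """Return the top_n bigrams ranked by Dice coefficient."""
    unigrams = Counter()
    bigrams = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    scored = []
    for (w1, w2), n in bigrams.items():
        if n < min_count:
            continue  # ignore rare pairs, whose statistics are unreliable
        # Dice: how often the two words occur together vs. on their own.
        dice = 2 * n / (unigrams[w1] + unigrams[w2])
        scored.append((dice, f"{w1} {w2}"))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [phrase for _, phrase in scored[:top_n]]


# Hypothetical toy news snippets.
TOY_NEWS = [
    "the prime minister met the press",
    "the prime minister spoke about the budget",
    "the budget was praised by the press",
]
keywords = bigram_keywords(TOY_NEWS)
```

Pairs whose words almost always appear together score high, while pairs glued to a very frequent word such as "the" score lower.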
7
Papers of interest
• Topic: analyzing & processing scientific texts
– Unsupervised Synthesis of Multilingual Wikipedia
Articles
• Aim: to automatically synthesize Wikipedia articles in
multiple languages
• Task:
– Input: a Wikipedia article in a language (e.g. English)
– Output: a generated article in another language (e.g. Chinese)
• Method:
– Extract and translate keywords from input doc
– Query the web for translated keywords → candidate excerpts
– Rank excerpts to output
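The three method steps above can be sketched as a toy pipeline. Everything here is a hypothetical stand-in: the lexicon replaces a real translation step, the in-memory `WEB` list replaces a web search, and ranking by keyword overlap is an assumption, since the slide does not say how excerpts are ranked.

```python
from collections import Counter

# Hypothetical stand-ins for a translation resource and for web search results.
LEXICON = {"beijing": "北京", "capital": "首都", "china": "中国"}
WEB = [
    "北京是中国的首都",
    "上海是中国的城市",
    "今天天气很好",
]
STOPWORDS = {"is", "the", "of", "a"}


def extract_keywords(text, top_n=3):
    # Step 1a: most frequent non-stopword tokens of the input article.
    counts = Counter(t for t in text.lower().split() if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def translate(keywords):
    # Step 1b: dictionary lookup; keywords missing from the lexicon are dropped.
    return [LEXICON[k] for k in keywords if k in LEXICON]


def rank_excerpts(translated, excerpts):
    # Steps 2-3: keep excerpts mentioning at least one translated keyword,
    # ordered by how many keywords they mention.
    scored = [(sum(k in e for k in translated), e) for e in excerpts]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: -pair[0])
    return [e for _, e in scored]


def synthesize(article):
    return rank_excerpts(translate(extract_keywords(article)), WEB)


excerpts = synthesize("Beijing is the capital of China")
```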
8
Papers of interest
• Topic: summarization & generation
– Multi-Document Summarization via the Minimum Dominating Set (MDS) (Chao Shen et al.)
• A single framework for different summarization problems (generic, query, update, comparative, …)
• Based on a well-known graph problem (MDS), but the motivation for applying MDS to summarization is not clear to me.
• Obtained results comparable to state-of-the-art methods on DUC data
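To make the MDS idea concrete, here is a minimal sketch of the standard greedy approximation for a dominating set, assuming a graph whose nodes are sentences and whose edges link sufficiently similar sentences. This is not necessarily the paper's exact procedure, and the toy graph is hypothetical. The selected nodes form the summary: every sentence is either selected or similar to a selected one.

```python
def greedy_dominating_set(adjacency):
    """adjacency: dict mapping node -> set of neighbouring nodes (undirected)."""
    uncovered = set(adjacency)
    chosen = []
    while uncovered:
        # Pick the node that covers the most still-uncovered nodes
        # (itself plus its neighbours); ties go to the smallest node id.
        best = max(
            sorted(adjacency),
            key=lambda n: len(({n} | adjacency[n]) & uncovered),
        )
        chosen.append(best)
        uncovered -= {best} | adjacency[best]
    return chosen


# Hypothetical sentence graph: sentence 0 is similar to 1, 2 and 3; 3 is similar to 4.
TOY_GRAPH = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
summary_ids = greedy_dominating_set(TOY_GRAPH)
```

The greedy choice does not guarantee the minimum set, but it always returns a valid dominating set.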
9
Papers of interest
• Topic: summarization & generation
– Opinosis: A Graph Based Approach to Abstractive
Summarization of Highly Redundant Opinions
(Kavita Ganesan et al.)
• Motivation:
– the structured format of reviews is not enough
– reviews contain many redundant sentences
• Opinosis: a “shallow” abstractive summarization approach based on a graph representation + heuristics.
• No statistics given on the corpus used.
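A minimal sketch of why a graph representation suits highly redundant reviews, assuming a plain word-adjacency graph in which each edge records the sentences it appears in. The slide gives no details of the actual graph or heuristics, so this shows only the general idea, and the function names are hypothetical.

```python
from collections import defaultdict


def build_word_graph(sentences):
    # Map each directed edge (word, next_word) to the ids of the sentences containing it.
    edges = defaultdict(set)
    for sid, sentence in enumerate(sentences):
        tokens = sentence.lower().split()
        for w1, w2 in zip(tokens, tokens[1:]):
            edges[(w1, w2)].add(sid)
    return edges


def redundant_edges(edges, min_support=2):
    # Edges shared by several sentences mark redundant, summary-worthy wording.
    return sorted(e for e, sids in edges.items() if len(sids) >= min_support)


# Hypothetical toy reviews.
TOY_REVIEWS = [
    "the battery life is great",
    "battery life is very good",
    "the screen is great",
]
shared = redundant_edges(build_word_graph(TOY_REVIEWS))
```

Redundant sentences overlap in the graph, so well-supported edges can be chained into short phrases such as "battery life is" without deep linguistic analysis.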
10
Papers of interest
• Topic: information extraction
Aobo
Min
– An Empirical Study on Web Mining of Parallel Data
– Robust Measurement and Comparison of Context
Similarity for Finding Translation Pairs
– Mining Large-scale Comparable Corpora from
Chinese-English News Collections
– An Ontology-driven System for Detecting Global
Health Events (Collier N. et al.)
– Detection of Simple Plagiarism in Computer
Science Papers
11