| Title: | Parser-Based Retraining for Domain Adaptation of Probabilistic Generators |
| Authors: | Deirdre Hogan, Jennifer Foster, Joachim Wagner and Josef van Genabith, 2008 |
| Abstract: | While the effect of domain variation on Penn-Treebank-trained probabilistic parsers has been investigated in previous work, we study its effect on a Penn-Treebank-trained probabilistic generator. We show that applying the generator to data from the British National Corpus results in a performance drop (from a BLEU score of 0.66 on the standard WSJ test set to a BLEU score of 0.54 on our BNC test set). We develop a generator retraining method where the domain-specific training data is automatically produced using state-of-the-art parser output. The retraining method recovers a substantial portion of the performance drop, resulting in a generator which achieves a BLEU score of 0.61 on our BNC test data. |
| ICHEC Project: | TREC-2008 Blog Track - Parsing for Sentiment Analysis |
| Publication: | In Proceedings of the 5th International Natural Language Generation Conference (INLG08), Salt Fork Park, Ohio |
| URL: | http://rian.ie/en/item/view/30441.html |
| Keywords: | Machine translating; Penn-Treebank-trained probabilistic generator |
| Status: | Published |