Labeling Chinese Predicates with Semantic
Roles
Nianwen Xue∗
University of Colorado at Boulder
Driven by the availability of semantically annotated corpora such as the FrameNet and the
Proposition Bank, recent years have seen a revived interest in semantic parsing by applying
statistical and machine-learning methods to human-annotated corpora. So far much of the research
has been focused on English due to the lack of semantically annotated resources in other languages.
In this article we report work on Chinese Semantic Role Labeling (SRL), taking advantage of
two recently completed corpora, the Chinese Proposition Bank, a semantically annotated corpus
of Chinese verbs, and the Chinese Nombank, a companion corpus that annotates the predicate-
argument structure of nominalized predicates. Since the semantic role labels are assigned to
the constituents in a parse tree, we first report experiments in which Semantic Role Labels are
automatically assigned to hand-crafted parses in the Chinese Treebank. This gives us a measure
of the extent to which Semantic Role Labels can be bootstrapped from the syntactic annotation
provided in the treebank. We then report experiments using a fully automatic Chinese parser
that integrates word segmentation, POS tagging and parsing. These experiments gauge how
successfully Semantic Role Labeling for Chinese can be performed in realistic situations. Our results
show that when hand-crafted parses are used, SRL accuracy for Chinese is comparable to what
has been reported for the state-of-the-art English SRL systems trained and tested on the English
Proposition Bank, even though the Chinese Proposition Bank is significantly smaller in size.
When an automatic parser is used, however, the accuracy of our system is significantly lower
than the English state-of-the-art. This indicates that the improvement in Chinese parsing is
critical to high-performance Semantic Role Labeling for Chinese. Our results also show that, in
general, SRL accuracy is significantly higher for verbs than for nominalized predicates across
all experimental conditions. We believe that this is due to the fact that the mapping from
the syntactic structure to the predicate-argument structure is less transparent for nominalized
predicates than for verbs.
1. Introduction
Corpus-based approaches to semantic analysis have been an active area of research
in recent years, driven by the availability of semantically annotated corpora such as
the FrameNet (Baker, Fillmore, and Lowe 1998), the Proposition Bank (Palmer, Gildea,
and Kingsbury 2005) and Nombank (Meyers et al. 2004) projects for English, the tec-
verbs glossed "give more weight" and "make higher" all take two arguments: a theme that
undergoes a change of state and an external force or agent that brings about the change
of state. These verbs are uniformly annotated and they all have two numbered arguments,
with arg0 denoting the cause and arg1 denoting the theme. It would make sense
to group these verbs together into a class and use this information as features in the
semantic role labeling task. Membership in a particular class says something
about the predicate-argument structure of a verb. When a verb is absent from the training
data, a familiar sparse-data problem, the class information may tell the system
how to label the semantic roles of verbs belonging to that class.
Although to our knowledge no such classification exists for Chinese verbs based on
the predicate-argument structure, a rough classification can be automatically derived
from the frames files, which are created to guide the Propbank annotation. We classified
the verbs along three dimensions: the number of arguments, the number of framesets,
and selected syntactic alternations.
Number of arguments: Verbs in the Chinese Proposition Bank can have one to five
arguments, with the majority of them having one, two or three arguments. Verbs with
zero arguments are auxiliary verbs [3] glossed "will", "be able to", "should", "dare",
"may", "be willing to", "can" and "must", and some other light verbs. Verbs that have
five arguments are change-of-state verbs glossed "lengthen", "shorten", "lower",
"increase", "enlarge" and "make smaller". These verbs generally take as arguments a theme
that undergoes the change of state, the original state, the new state, the range of the
change, and the cause or agent that brings about the change.
Number of framesets: A frameset roughly corresponds to a major sense. This information
is used because it is common for the different framesets of a verb to have different
numbers of arguments. For example, the verb glossed "balance" can be used either as
a non-stative verb, in which case it means "balance", or as a stative verb, in which case
[3] One could say that the argument of the auxiliary verbs is the entire proposition, but in this phase of the Chinese Proposition Bank, auxiliary verbs are not annotated.
it means “balanced”. When it is used as a non-stative verb, it takes two arguments, the
thing or situation that is balanced and the balancer, the entity that maintains the balance.
When it is used as a stative verb, it takes only a single argument.
Syntactic alternations: We also represent certain types of syntactic alternations. One
salient type of syntactic alternation is the well-known "subject of the intransitive / object of
the transitive" alternation described in detail by Levin (1993). Chinese verbs that
demonstrate this alternation pattern include the verb glossed "publish". For example, the
phrase glossed "this-CL-book" plays the same semantic role whether it is the subject in
"this-CL-book publish-ASP" ("this book was published") or the object in
"this-CL-publishing-house publish-ASP this-CL-book" ("this publishing house published this book").
Thus each verb will belong to a class with a symbol representing each of the three
dimensions. For example, a given verb may belong to the class “C1C2a”, which means
that this verb has two framesets, with the first frameset having one argument and the
second having two arguments. The “a” in the second frameset represents a type of
syntactic alternation. Forty classes are derived in this manner.
Such a classification scheme is undoubtedly linguistically unsophisticated. Verbs that
have the same number of arguments may have different types of arguments, and the
current classification system does not capture these distinctions. However, our
experiments show that such a simple classification, when used as features, significantly
improves semantic role labeling performance.
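The class-symbol scheme described above can be sketched in code. This is an illustrative sketch, not the paper's implementation: the tuple representation of framesets and the function name are assumptions, and parsing the actual frames files is assumed to happen elsewhere.

```python
def verb_class(framesets):
    """Derive a verb-class symbol from its framesets.

    framesets: list of (num_args, has_alternation) tuples, one per
    frameset of the verb, in frames-file order. Each frameset
    contributes "C<n>" (n = number of numbered arguments), plus "a"
    if that frameset exhibits a marked syntactic alternation.
    """
    parts = []
    for num_args, has_alternation in framesets:
        symbol = f"C{num_args}"
        if has_alternation:
            symbol += "a"
        parts.append(symbol)
    return "".join(parts)

# The example class from the text: two framesets, the first with one
# argument, the second with two arguments and an alternation.
print(verb_class([(1, False), (2, True)]))  # -> C1C2a
```

The resulting class string (e.g. "C1C2a") can then be used directly as a categorical feature value for every instance of the verb.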
4.3 Using automatic parses
Previous work (Sun and Jurafsky 2004) on Chinese semantic role labeling uses a parser
that assumes correct (hand-crafted) segmentation. As word segmentation is a very
challenging problem that has attracted a large body of research by itself, it is still unclear
how well semantic role tagging in Chinese can be performed in realistic situations.
In our experiments, we use a fully automatic parser that integrates segmentation,
POS tagging and parsing. Our parser is similar to that of Luo (2003). The parser is trained on
CTB5.1, using the training data described in the previous section. Tested on the held-out
test data, the labeled precision and recall are 81.83% and 82.91% respectively for all
sentences. The results are comparable with those reported by Luo (2003), but they
cannot be directly compared with most of the results reported in the literature, where
correct segmentation is assumed. In addition, in order to account for the differences
in segmentation, each character has to be treated as a leaf of the parse tree. This is
in contrast with word-based parsers where words are terminals. Since semantic role
tagging is performed on the output of the parser, only constituents in the parse tree are
candidates. If there is no constituent in the parse tree that shares the same text span
with an argument in the manual annotation, the system cannot possibly get a correct
annotation. In other words, the best the system can do is to correctly label all arguments
that have a matching constituent with the same text span in the parse tree.
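This oracle upper bound can be computed with a simple span comparison. The sketch below is illustrative, not the paper's evaluation code: the (start, end) character-offset representation of spans and the function name are assumptions.

```python
def oracle_recall(gold_arguments, parse_constituents):
    """Fraction of gold argument spans that exactly match the span
    of some constituent in the automatic parse. Arguments without
    a matching constituent can never be labeled correctly, so this
    bounds the system's achievable recall from above.

    Both inputs are sequences of (start, end) character offsets.
    """
    constituent_spans = set(parse_constituents)
    matched = sum(1 for span in gold_arguments if span in constituent_spans)
    return matched / len(gold_arguments) if gold_arguments else 1.0

# Two of the three gold argument spans align with a parse constituent;
# the span (3, 7) was lost, e.g. to a segmentation or attachment error.
gold = [(0, 2), (3, 7), (8, 12)]
parse = [(0, 2), (0, 7), (8, 12), (3, 12)]
print(oracle_recall(gold, parse))  # -> 0.6666666666666666
```

Because the parser builds trees over characters rather than words, even a segmentation error can shift a constituent's span and remove a gold argument from the candidate set.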
4.4 Results and Discussion
4.4.1 Data. In all our experiments we use the Chinese Proposition Bank Version 1.0 [4].
This version of the Chinese Proposition Bank (Xue and Palmer 2003) consists of standoff
annotation on the first 760 articles (chtb_001.fid to chtb_931.fid) of the Chinese
Treebank [5]. This chunk of the data has 250K words and 10,364 sentences. The total
number of verb types in this chunk of the data is 4,854 [6]. Following the convention of
the English semantic role labeling experiments, we divide the training and test data by
[4] This data is publicly available through the Linguistic Data Consortium.
[5] The most current version (CTB5.1) of the Chinese Treebank has 507K words, 825K Chinese characters, 18,716 sentences and 890 articles.
[6] These include the so-called stative verbs, which roughly correspond to adjectives in English.
the number of articles, not by the verb instances. This virtually guarantees that there
will be unseen verbs in the test data. For all our experiments on semantic role labeling on
verbs, 661 files (chtb_100.fid to chtb_931.fid) are used as training data and the
other 99 files (chtb_001.fid to chtb_099.fid) are held out as test data. However,
our parser is trained on all the data in the most current version of the Chinese Treebank
except for the test data that has been set aside. That is, in addition to the training data
for the semantic role labeling experiments, it also uses the rest of the treebank which
has not yet been propbanked.
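The article-based split described above can be reproduced mechanically from the CTB file names. The sketch below is an assumption-laden illustration (the function name and the regex over the chtb_NNN.fid naming pattern are ours), not the paper's preprocessing code.

```python
import re

def split_by_article(filenames, test_max_id=99):
    """Partition Chinese Treebank files into (train, test) lists:
    chtb_001.fid through chtb_099.fid are held out as test data,
    chtb_100.fid through chtb_931.fid are used for training."""
    train, test = [], []
    for name in filenames:
        match = re.match(r"chtb_(\d+)\.fid$", name)
        if not match:
            continue  # skip anything not following the CTB naming pattern
        (test if int(match.group(1)) <= test_max_id else train).append(name)
    return train, test

train, test = split_by_article(
    ["chtb_001.fid", "chtb_099.fid", "chtb_100.fid", "chtb_931.fid"])
print(test)   # -> ['chtb_001.fid', 'chtb_099.fid']
print(train)  # -> ['chtb_100.fid', 'chtb_931.fid']
```

Splitting by article rather than by verb instance keeps whole documents on one side of the boundary, which is what guarantees unseen verbs in the test set.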
4.4.2 Results. The results of the semantic role labeling for both hand-crafted and auto-
matic parses are presented in Table 2 (Xue and Palmer 2005). To be used in real-world
natural language applications, a semantic role tagger has to use automatically produced
constituent boundaries either from a parser or by some other means, but experiments
with hand-crafted parses will help us evaluate how much of a challenge it is to map a
syntactic representation to a semantic representation, which may very well vary from
language to language. When hand-crafted parses in the Chinese Treebank are used as
input, our system achieved an f-measure of 93.9 percent. This accuracy is achieved when
verb class information is used as features. Without the verb class features, the accuracy
drops about one percentage point. Considering that we have used a very crude verb
classification technique, there is reason to believe that this accuracy could still be higher
with more refined verb classes. This accuracy is fairly high considering that the
state-of-the-art semantic role labeling systems trained on the English Proposition Bank
(Palmer, Gildea, and Kingsbury 2005) achieve less than 94 percent (Pradhan et al. 2004; Xue
and Palmer 2004), and the English Proposition Bank is a much larger corpus, with 1
million words. There are several facilitating factors for Chinese semantic role labeling
when hand-crafted parses are provided as input. First of all, Chinese verbs appear to be
less polysemous, at least the ones that occur in the Chinese Treebank. Of the 4854 verbs
in this version of the Chinese Proposition bank, only 62 verbs have 3 or more framesets.
In contrast, 294 verbs out of the 3300 verbs in the Penn English Proposition Bank have 3
or more framesets. When a verb is less polysemous, the arguments of the verb tend to be
realized in a more uniform manner in syntax. As a result, the argument labels are easier
to predict from their structure. Chinese seems to compensate for this fact by using a
larger number of verbs. This becomes obvious when we consider the fact that 4854 verbs
are from just 250K words and the 3300 verbs in the English Proposition Bank are from
one million words. A related fact is that adjectives in Chinese are traditionally counted
as verbs and they generally have only one argument with a much simpler syntactic
label sequence, core argument labels and phrase type sequence, repeated core argument labels
with phrase types, repeated core argument labels with phrase types and adjacency information.
We speculate that the lack of improvement is due to the fact that the constraint that
core (numbered) arguments should not have the same semantic role label for Chinese
nominalized predicates is not as rigid as it is for English verbs. The other possibility is
that for nominalized predicates fewer core arguments are realized in each proposition,
and this renders the linguistic constraint that joint learning attempts to address
irrelevant. However, further error analysis is needed to substantiate these speculations.
6. Related Work
Computational approaches to semantic interpretation have a long tradition, but the line
of research that this work follows is relatively young. Gildea and Jurafsky (2002) pro-
vided the early work on the semantic role labeling of English verbs, using the FrameNet
corpus as training and test material. Since then, there has been rapid improvement
in the SRL accuracy of English verbs, fueled by the development of the Proposition
Bank (Palmer, Gildea, and Kingsbury 2005), which annotates the verbs in the one-
million-word Penn Treebank with semantic role labels. A wide range of statistical and
machine learning techniques have been applied to the SRL of verbs, using the Propbank
as training and test material. The machine-learning techniques used include Support
Vector Machines (Pradhan et al. 2004; Tsai et al. 2005), Maximum Entropy (Xue and
Palmer 2004; Haghighi, Toutanova, and Manning 2005; Che et al. 2005; Yi and Palmer
2005), Conditional Random Fields (Cohn and Blunsom 2005), and many others. Since
semantic role labeling is a complex task based on a wide range of lower level nat-
ural language techniques, many different preprocessing, integration and combination
techniques have been explored. The relative merits of using a full syntactic parser that
provides hierarchical structures (Xue and Palmer 2004) versus a shallow chunker (Pradhan
et al. 2005; Hacioglu et al. 2004) have been studied extensively. Noting that parsing errors
are difficult or even impossible to recover from at the semantic role labeling stage, Yi and
Palmer (2005) experimented with integrating semantic role labeling with a Maximum Entropy-
based parser, effectively treating semantic role labels as function tags on the
constituents in a parse tree. Punyakanok, Roth, and Yih (2005a), Màrquez et al. (2005),
and Tsai et al. (2005) pursued alternative approaches to make their semantic role
labeling systems more robust by combining the output of multiple systems. Most of
the early systems consider each argument on its own when assigning the semantic role
labels, allowing the theoretical possibility that more than one core argument may share
the same semantic role label, violating the linguistic constraint that the same semantic
role label cannot be assigned to more than one core argument. Toutanova, Haghighi,
and Manning (2005) address this by using a joint-learning strategy to rule out such
conflicting argument labels.
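The effect of this constraint can be sketched with a brute-force joint decoder. This is an illustration of the general idea only, not any cited system: the label inventory, scores, and function name are assumptions, and real systems use much smarter inference than enumerating all assignments.

```python
from itertools import product

# Numbered (core) argument labels; ArgM-* adjunct labels may repeat freely.
CORE_LABELS = {"Arg0", "Arg1", "Arg2", "Arg3", "Arg4"}

def best_consistent_assignment(candidates):
    """candidates: one {label: score} dict per argument, scored
    independently. Returns the highest-scoring joint assignment in
    which no core label is used more than once (brute force)."""
    best, best_score = None, float("-inf")
    for assignment in product(*(c.items() for c in candidates)):
        labels = [label for label, _ in assignment]
        core = [l for l in labels if l in CORE_LABELS]
        if len(core) != len(set(core)):
            continue  # violates the "no duplicate core labels" constraint
        score = sum(score for _, score in assignment)
        if score > best_score:
            best, best_score = labels, score
    return best

# Independently, both arguments prefer Arg0; jointly, one must yield.
cands = [{"Arg0": 0.9, "Arg1": 0.5}, {"Arg0": 0.8, "ArgM-TMP": 0.6}]
print(best_consistent_assignment(cands))  # -> ['Arg0', 'ArgM-TMP']
```

A per-argument classifier would output Arg0 twice here; the joint constraint forces the globally best non-conflicting assignment instead.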
While there is some work on the statistical modeling of semantic relations of noun
phrases (Lapata 2002; Moldovan and Badulescu 2005), work on the semantic role
labeling of nominalized predicates is relatively scarce compared with the large body of
literature on the semantic role labeling of verbs, with the main hindrance being the lack
of linguistic resources annotated with the predicate-argument structure of nominalized
predicates. This is expected to change with the availability of resources such as the
Nombank (Meyers et al. 2004).
Work on Chinese semantic role labeling is still in its infancy. Lacking a
Chinese corpus annotated with semantic roles, Sun and Jurafsky (2004) did preliminary
work on the semantic role labeling of Chinese verbs by annotating a few selected
verbs, using a Support Vector Machine classifier. Pradhan et al. (2004) extended that
work to Chinese nominalizations, and reported preliminary work for analyzing the
predicate-argument structure of 630 propositions for 22 nominalizations taken from the
Chinese Treebank. As far as we know, the work reported here is the first to use sizable
Chinese semantically annotated corpora. The approach adopted in the present work
emphasizes the integration of linguistically informed heuristics and machine-learning
approaches, and the exploration of the underlying linguistic insights behind the
features used in machine-learning systems. We believe semantic role labeling provides
an ideal stage where linguistic observations can be formalized as features and fed into a
general machine-learning framework for testing and verification, and natural language
technologies can be advanced in the process.
7. Conclusions and Future Work
We have presented the first experimental results on Chinese semantic role labeling using the
Chinese Proposition Bank and the Chinese Nombank. We have shown that given Gold
Standard parses, Chinese semantic role labeling can be performed with considerable
accuracy on Chinese verbs. In fact, even though the Chinese Proposition Bank is a
significantly smaller corpus than the English Proposition Bank, we achieved results
that are comparable with the state-of-the-art English semantic role labeling systems.
We suggest three factors that are particularly conducive to the semantic role labeling
of Chinese verbs when the hand-crafted treebank parses are used as input. One is that
Chinese verbs tend to be less polysemous compared with English, which contributes
to a more uniform mapping between the predicate-argument structure and its syntactic
realization. Another facilitating factor is that stative verbs, which generally translate
into adjectives in English, account for a large proportion of all the verbs in the Chinese
Proposition Bank and they tend to have simple argument structures. Finally, we suggest
that the richer structure in the Chinese Treebank makes certain aspects of the semantic
role labeling simpler. One such example is that the clear structural distinction between
syntactic arguments and adjuncts makes it easier for the SRL system to differentiate core
arguments and adjuncts for Chinese verbs. These all translate into lower confusability
along the lines of (Erk and Padó 2005) in the mapping from the syntactic structure to
the semantic role labels.
When the semantic role labeling takes raw text as input, it cannot take advantage
of the rich syntactic structure of the treebank unless it can be reproduced with high
accuracy by an automatic parser. Even though our experiments using a fully automatic
parser yield promising initial results, the accuracy is significantly lower than the English
state-of-the-art. Our parsing accuracy is hampered by a significantly smaller training
set that is only half the size of the Penn Treebank. In addition, the Chinese Treebank
is almost evenly divided between two very different sources, Xinhua newswire from
mainland China and Sinorama magazine articles from Taiwan. Generally the non-
uniformity of data hurts parsing accuracy. We also suggest that there are a few inherent
linguistic properties of the Chinese language that make syntactic parsing a particularly
challenging task. The first has to do with the fact that Chinese text does not come
with word boundaries and our parser has to build structures from characters rather
than words. The second has to do with the fact that Chinese has very little inflectional
morphology that the parser can exploit when deciding the part-of-speech tags of the
words. Both word segmentation and POS tagging difficulties will lead to parsing errors
when larger phrase structures are built.
Our experimental results also show a substantial gap between system performance
on verbs and nominalized predicates, as graphically illustrated in Figure 2. This dif-
ference can be partially attributed to the smaller corpus size of the Chinese Nombank,
Figure 2: Comparison of hand-crafted vs. automatic parses, and verbs vs. nominalizations
with fewer instances of nominalized predicates than verbs in the underlying Chinese
Treebank, but we believe the main reason is that the semantic role labeling is more chal-
lenging for nominalized predicates than for verbs. This again can be explained in terms
of confusability in the mapping from syntactic structure to the predicate-argument
structure. In general, the NPs in the Chinese Treebank have flatter structures compared
with verbs. For example, there is no clear structural distinction between arguments
and adjuncts for nominalized predicates that is analogous to the argument/adjunct
distinction for verbs.
There are many directions one can go from here for future work. There are many
proven techniques that can be implemented for Chinese, the most important of which
is to make Chinese parsers more robust. One thing we plan to experiment with is the
combination of multiple parsers and multiple semantic role labeling systems. We also
believe that we have not settled on an "optimal" set of features for Chinese semantic role
labeling and more language-specific customization is necessary. We believe that joint
learning is still a promising avenue to pursue, especially for verbs, where generally more
core arguments are realized.
Acknowledgements
We would like to thank Martha Palmer for her comments on this manuscript and early versions of the paper, and more importantly for her steadfast support for this line of research. We also would like to thank Scott Cotton for providing a Propbank library that greatly simplifies our implementation. This work is supported by MDA904-02-C-0412.
References
Baker, Collin F., Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley FrameNet project. In Proceedings of COLING/ACL, pages 86–90, Montreal, Canada.
Burchardt, A., K. Erk, A. Frank, A. Kowalski, S. Pado, and M. Pinkal. 2006. The SALSA corpus: a German corpus resource for lexical semantics. In Proceedings of LREC 2006, Genoa, Italy.
Carreras, Xavier and Lluís Màrquez. 2004a. Hierarchical Recognition of Propositional Arguments with Perceptrons. In Proceedings of the Eighth Conference on Natural Language Learning, Boston, Massachusetts.
Carreras, Xavier and Lluís Màrquez. 2004b. Introduction to the CoNLL-2004 Shared Task: Semantic Role Labeling. In Proceedings of the Eighth Conference on Natural Language Learning, Boston, Massachusetts.
Carreras, Xavier and Lluís Màrquez. 2005. Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling. In Proceedings of the Ninth Conference on Natural Language Learning, Ann Arbor, Michigan.
Che, W., T. Liu, S. Li, Y. Hu, and H. Liu. 2005. Semantic Role Labeling with a Maximum Entropy Classifier.
Chen, Keh-Jiann, Chu-Ren Huang, Feng-Yi Chen, Chi-Ching Luo, Ming-Chung Chang, and Chao-Jan Chen. 2004. Sinica Treebank: Design Criteria, Representational Issues and Implementation. In Anne Abeillé, editor, Building and Using Parsed Corpora. Kluwer.
Cohn, Trevor and Philip Blunsom. 2005. Semantic Role Labeling with Tree Conditional Random Fields. In Proceedings of CoNLL-2005, Ann Arbor, Michigan.
Erk, K. and S. Padó. 2005. Analyzing models for semantic role assignment using confusability. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 891–898, Vancouver, British Columbia, Canada, October. Association for Computational Linguistics.
Gabbard, Ryan, Seth Kulick, and Mitchell Marcus. 2006. Fully parsing the Penn Treebank. In Proceedings of HLT-NAACL 2006, New York City.
Gildea, D. and D. Jurafsky. 2002. Automatic labeling for semantic roles. Computational Linguistics, 28(3):245–288.
Gildea, Dan and Martha Palmer. 2002. The Necessity of Parsing for Predicate Argument Recognition. In Proceedings of the 40th Meeting of the Association for Computational Linguistics, Philadelphia, PA.
Hacioglu, Kadri, Sameer Pradhan, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2003. Shallow Semantic Parsing Using Support Vector Machines. Technical Report CSLR-2003-1, Center for Spoken Language Research at the University of Colorado.
Hacioglu, Kadri, Sameer Pradhan, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2004. Semantic Role Labeling by Tagging Syntactic Chunks. In Proceedings of CoNLL-2004, Ann Arbor.
Haghighi, Aria, Kristina Toutanova, and Christopher Manning. 2005. A Joint Model for Semantic Role Labeling. In Proceedings of the Ninth Conference on Natural Language Learning, Ann Arbor, Michigan.
Lapata, Maria. 2002. The disambiguation of nominalizations. Computational Linguistics, 28(3):357–388.
Levin, Beth. 1993. English Verbs and Alternations: A Preliminary Investigation. Chicago: The University of Chicago Press.
Luo, Xiaoqiang. 2003. A Maximum Entropy Chinese Character-Based Parser. In Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP 2003), Sapporo, Japan.
Marcus, M., B. Santorini, and M. A. Marcinkiewicz. 1993. Building a Large Annotated Corpus of English: the Penn Treebank. Computational Linguistics.
Màrquez, Lluís, Mihai Surdeanu, Pere Comas, and Jordi Turmo. 2005. A Robust Combination Strategy for Semantic Role Labeling.
McCallum, Andrew Kachites. 2002. MALLET: A Machine Learning for Language Toolkit. http://mallet.cs.umass.edu.
Meyers, A., R. Reeves, C. Macleod, R. Szekely, V. Zielinska, B. Young, and R. Grishman. 2004. The NomBank Project: An Interim Report. In Proceedings of the NAACL/HLT Workshop on Frontiers in Corpus Annotation, Boston, Massachusetts.
Moldovan, Dan and Adriana Badulescu. 2005. A semantic scattering model for the automatic interpretation of genitives. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 891–898, Vancouver, British Columbia, Canada, October. Association for Computational Linguistics.
Palmer, Martha, Daniel Gildea, and Paul Kingsbury. 2005. The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics, 31(1):71–106.
Pradhan, Sameer, Kadri Hacioglu, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2003. Semantic Role Parsing: Adding Semantic Structure to Unstructured Text. In Proceedings of the International Conference on Data Mining (ICDM-2003).
Pradhan, Sameer, Honglin Sun, Wayne Ward, James H. Martin, and Daniel Jurafsky. 2004. Parsing Arguments of Nominalizations in English and Chinese. In Proceedings of NAACL-HLT 2004, Boston, Mass.
Pradhan, Sameer, Wayne Ward, Kadri Hacioglu, James H. Martin, and Dan Jurafsky. 2005. Semantic Role Labeling using different syntactic views. In Proceedings of ACL 2005, Ann Arbor, Michigan.
Pradhan, Sameer, Wayne Ward, Kadri Hacioglu, James H. Martin, and Daniel Jurafsky. 2004. Shallow Semantic Parsing Using Support Vector Machines. In Proceedings of NAACL-HLT 2004, Boston, Mass.
Punyakanok, Vasin, Dan Roth, and W. Yih. 2005a. Generalized Inference with Multiple Semantic Role Labeling Systems. In Proceedings of the Ninth Conference on Natural Language Learning, Ann Arbor, Michigan.
Punyakanok, Vasin, Dan Roth, and W. Yih. 2005b. The necessity of syntactic parsing for semantic role labeling. In Proceedings of IJCAI-2005, Edinburgh, UK.
Sgall, Petr, Jarmila Panevová, and Eva Hajicová. 2004. Deep Syntactic Annotation: Tectogrammatical Representation and Beyond. In A. Meyers, editor, Proceedings of the HLT-NAACL 2004 Workshop: Frontiers in Corpus Annotation, pages 32–38, Boston, Massachusetts, USA. Association for Computational Linguistics.
Shen, Libin and Aravind K. Joshi. 2004. Flexible Margin Selection for Reranking with Full Pairwise Samples. In Proceedings of IJCNLP-2004, pages 446–455.
Sun, Honglin and Daniel Jurafsky. 2004. Shallow Semantic Parsing of Chinese. In Proceedings of NAACL 2004, Boston, USA.
Toutanova, Kristina, Aria Haghighi, and Christopher Manning. 2005. Joint Learning Improves Semantic Role Labeling. In Proceedings of ACL-2005.
Tsai, Tzong-Han, Chia-Wei Wu, Yu-Chun Lin, and Wen-Lian Hsu. 2005. Exploiting full parsing information to label Semantic Roles using an ensemble of ME and SVM via Integer Linear Programming.
Xue, Nianwen. 2006a. Annotating the predicate-argument structure of Chinese nominalizations. In Proceedings of the Fifth International Conference on Language Resources and Evaluation, Genoa, Italy.
Xue, Nianwen. 2006b. Semantic role labeling of nominalized predicates in Chinese. In Proceedings of HLT-NAACL 2006, New York City, NY.
Xue, Nianwen and Martha Palmer. 2003. Annotating the Propositions in the Penn Chinese Treebank. In Proceedings of the 2nd SIGHAN Workshop on Chinese Language Processing, Sapporo, Japan.
Xue, Nianwen and Martha Palmer. 2004. Calibrating features for Semantic Role Labeling. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain.
Xue, Nianwen and Martha Palmer. 2005. Automatic Semantic Role Labeling for Chinese verbs. In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland.
Xue, Nianwen, Fei Xia, Fu-Dong Chiou, and Martha Palmer. 2005. The Penn Chinese TreeBank: Phrase Structure Annotation of a Large Corpus. Natural Language Engineering, 11(2):207–238.
Yi, S. and M. Palmer. 2005. The integration of syntactic parsing and semantic role labeling.