Research on Domain Adaptation in Learning to Rank
Abstract
With the widespread use of supervised machine learning techniques in many fields, researchers have come to recognize that the scarcity of training data in the target domain is one of the key obstacles to quickly deploying a learned model. In recent years, how to resolve this problem has become a hot topic in several communities, such as machine learning, natural language processing, information retrieval, and multimedia.
     Learning to rank is one of the key problems in information retrieval. To date, techniques based on supervised learning have been regarded as the best choice for learning to rank. However, as in traditional supervised learning, we face a similar problem in learning to rank, namely the lack of training data in the target domain. To address it, we investigated how to effectively use labeled data from related domains to learn a model for the target domain, which is referred to as domain adaptation.
     The main contributions of this thesis include:
     1. We proposed a framework based on document weighting for ranking adaptation. First, we estimate each source document's importance to the target domain using a domain separator; then, the document weights are transformed into document-pair weights, which can be integrated into pairwise ranking algorithms.
     2. We investigated the adaptation problem of RankBoost, a well-known ranking algorithm. Within the document-weighting framework, we proposed three versions of weight-based RankBoost and provided theoretical analysis and empirical comparisons for each.
     3. We proposed to estimate each source query's importance to the target domain directly at the query level. In learning to rank, the learning unit is a query, which consists of a set of retrieved documents with relevance labels. We estimate query importance from two distinct perspectives: (1) a query can be compressed into a feature vector, to which traditional weight-estimation approaches are then applied; (2) for each source query, we measure its similarity to every target query and combine these pairwise similarity values to estimate its importance.
     4. We proposed a domain adaptation algorithm based on active learning. To obtain target-domain-specific ranking knowledge, we adopt active learning techniques to select a small number of informative target queries for labeling. These queries provide domain-specific knowledge that is not contained in the source domain; simultaneously, we use them to estimate the importance weights of source queries so that the source training data can be reused.
     5. We applied domain adaptation to semantic entity detection and proposed to improve adaptation capability using domain-independent features. Traditionally, only short context features are used in entity detection, so performance degrades when the genre of the test documents differs from that of the training documents. To resolve this problem, we designed a framework combining CRF and SVM that effectively integrates short context features with domain-independent features, so that the learned model adapts well to the target domain.
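Contribution 1 above turns per-document domain weights into document-pair weights inside a pairwise loss. The following is a minimal sketch of that idea; the specific combination rule (the mean of the two documents' weights) and the hinge loss are illustrative assumptions, not necessarily the thesis's exact formulation.

```python
# Hypothetical sketch: convert per-document domain weights into
# document-pair weights and plug them into a pairwise ranking loss.
# The mean-of-weights rule below is an assumption for illustration.

def pair_weight(w_i, w_j):
    """Combine two document weights into one pair weight (assumed: mean)."""
    return 0.5 * (w_i + w_j)

def weighted_pairwise_hinge(scores, labels, doc_weights):
    """Weighted pairwise hinge loss over all ordered pairs.

    scores[i]      -- model score for document i
    labels[i]      -- relevance label for document i
    doc_weights[i] -- estimated importance of document i to the target domain
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:  # document i should rank above j
                margin = scores[i] - scores[j]
                loss += pair_weight(doc_weights[i], doc_weights[j]) \
                        * max(0.0, 1.0 - margin)
    return loss
```

Pairs built from low-weight source documents then contribute little to the loss, so the learned ranker is biased toward source data that resembles the target domain.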
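For contribution 2, one natural way to make RankBoost weight-aware is to initialize its pair distribution from the estimated pair weights instead of uniformly. The sketch below shows that flavour with single-feature threshold stumps as weak rankers; it is a simplified illustration under those assumptions, not the thesis's three algorithms.

```python
import math

# Hypothetical sketch of weight-based RankBoost: the initial pair
# distribution is proportional to the estimated pair weights rather
# than uniform. Weak rankers are single-feature threshold stumps.

def stump(x, feature, theta):
    return 1.0 if x[feature] > theta else 0.0

def weighted_rankboost(pairs, pair_weights, features, thresholds, rounds=5):
    """pairs: list of (x_neg, x_pos) where x_pos should rank above x_neg."""
    z = sum(pair_weights)
    dist = [w / z for w in pair_weights]  # weighted initial distribution
    ensemble = []  # list of (alpha, feature, theta)
    for _ in range(rounds):
        # Pick the stump maximising |r|, with r = E_D[h(x_pos) - h(x_neg)].
        best = None
        for f in features:
            for th in thresholds:
                r = sum(d * (stump(xp, f, th) - stump(xn, f, th))
                        for d, (xn, xp) in zip(dist, pairs))
                if best is None or abs(r) > abs(best[0]):
                    best = (r, f, th)
        r, f, th = best
        if abs(r) >= 1.0 - 1e-12:          # clamp to keep alpha finite
            r = math.copysign(1.0 - 1e-12, r)
        alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
        ensemble.append((alpha, f, th))
        # Raise mass on still mis-ordered pairs, lower it on correct ones.
        dist = [d * math.exp(alpha * (stump(xn, f, th) - stump(xp, f, th)))
                for d, (xn, xp) in zip(dist, pairs)]
        z = sum(dist)
        dist = [d / z for d in dist]
    return ensemble

def score(ensemble, x):
    return sum(a * stump(x, f, th) for a, f, th in ensemble)
```

Because the distribution starts from the domain weights, pairs from target-like source documents dominate the early weak-ranker choices.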
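Perspective (1) of contribution 3 compresses a query into a feature vector before weighting it. A minimal sketch, assuming the query vector is the mean of its documents' feature vectors and the weight is an RBF-style similarity to the target queries' centroid (a stand-in for a "traditional" instance-weighting method):

```python
import math

# Hypothetical sketch: compress each query into one feature vector
# (mean of its documents' features), then weight a source query by its
# closeness to the centroid of the target queries. Both the mean
# compression and the RBF weighting are illustrative assumptions.

def query_vector(doc_features):
    """Mean of the document feature vectors of one query."""
    n, d = len(doc_features), len(doc_features[0])
    return [sum(doc[k] for doc in doc_features) / n for k in range(d)]

def centroid(vectors):
    n, d = len(vectors), len(vectors[0])
    return [sum(v[k] for v in vectors) / n for k in range(d)]

def query_weight(src_query_docs, target_centroid, gamma=1.0):
    """Weight in (0, 1]; closer to the target centroid -> nearer 1."""
    v = query_vector(src_query_docs)
    sq = sum((a - b) ** 2 for a, b in zip(v, target_centroid))
    return math.exp(-gamma * sq)
```

Perspective (2) would instead compare each source query with every target query and aggregate the pairwise similarities, which avoids collapsing a query to a single point.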
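The selection step of contribution 4 can be sketched as follows. The informativeness measure here (a nearly flat score profile signals a query the current ranker cannot order, hence likely missing target-specific knowledge) is an assumption for illustration, not the thesis's exact criterion.

```python
# Hypothetical sketch of active query selection for ranking adaptation:
# rank unlabeled target queries by an uncertainty-style measure and
# label the top ones within the budget. The measure below (inverse of
# the average gap between sorted document scores) is illustrative.

def informativeness(doc_scores):
    """Smaller gaps between sorted scores -> harder to order -> higher value."""
    s = sorted(doc_scores, reverse=True)
    if len(s) < 2:
        return 1.0  # a single document gives no ordering evidence
    gaps = [s[i] - s[i + 1] for i in range(len(s) - 1)]
    avg_gap = sum(gaps) / len(gaps)
    return 1.0 / (1.0 + avg_gap)

def select_queries(query_scores, budget):
    """query_scores: {query_id: [doc scores]}; returns ids to label."""
    ranked = sorted(query_scores,
                    key=lambda q: informativeness(query_scores[q]),
                    reverse=True)
    return ranked[:budget]
```

The newly labeled queries then serve double duty, as in the text: they add target-specific training signal and act as anchors for re-estimating the source queries' importance weights.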
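The combination idea of contribution 5 can be illustrated schematically: a sequence model trained on short context features proposes candidate entities with confidence scores, and a second classifier built on domain-independent features re-scores them. Both scorers are stubbed as callables here, and the linear blending rule is an assumption; the actual CRF/SVM integration in the thesis may differ.

```python
# Hypothetical sketch of combining a context-feature scorer (e.g. a CRF)
# with a domain-independent-feature scorer (e.g. an SVM) for entity
# detection. The blend-and-threshold rule is illustrative only.

def combine(candidates, context_score, domain_indep_score,
            alpha=0.5, threshold=0.5):
    """Accept a candidate span if the blended confidence clears the threshold.

    context_score(span)      -- confidence from short context features
    domain_indep_score(span) -- confidence from domain-independent features
    alpha                    -- interpolation weight between the two scorers
    """
    accepted = []
    for span in candidates:
        blended = alpha * context_score(span) \
                  + (1.0 - alpha) * domain_indep_score(span)
        if blended >= threshold:
            accepted.append(span)
    return accepted
```

The point of the second scorer is that domain-independent evidence (e.g. gazetteer membership or capitalization patterns) degrades less than context features when the test genre differs from the training genre.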
     In this thesis, we studied the domain adaptation problem in learning to rank under different scenarios. When no labeled data are available in the target domain, we investigated weight-based adaptation for learning to rank from the instance-weighting point of view. When the target domain has a limited labeling budget, we explored ranking adaptation based on active learning. Additionally, we studied the application of domain adaptation to semantic entity detection: from the feature perspective, we explored adapting semantic entity detection using domain-independent features. The effectiveness of all proposed algorithms is evaluated on several benchmark datasets.
     In real applications such as multimedia news recommendation, hot-event detection, sentiment analysis, web search, and vertical search, the proposed domain adaptation algorithms allow existing labeled data in related domains to be used effectively to learn a model for the target domain, thereby reducing labeling cost.