Research on Automatic Acquisition Technology for Teaching Resources Based on an Ontology Knowledge Repository

Author: Tian Junhua (田俊华)
Degree level: Doctoral
Discipline: Educational Technology
Keywords: Topic Crawling Technology; Network Ecological Chain Algorithm; Educational Ontology Knowledge Repository; Automatic Text Categorization; Automatic Text Extraction; Web Teaching Resource; Topic-Web Group
Degree year: 2011
Supervisor: Li Yi (李艺)
Discipline code: 040110
Degree-granting institution: Nanjing Normal University (南京师范大学)
Submission date: 2011-03-15

Abstract

Web information resources are now extremely rich. Automatically collecting educational resources from the Web to build teaching resource repositories that supply information services for teaching activities would greatly advance the informatization of education. However, as the Web keeps growing and page structures become increasingly complex, how to collect teaching resources from the Internet efficiently under limited network resources and a limited acquisition scale is a question of both academic significance and practical value.
     This thesis systematically studies automatic acquisition technology for topic information resources. It discusses topic crawling, automatic text categorization, automatic text extraction, and ontology and ontology knowledge inference, and examines in depth how these technologies apply to the automatic acquisition of Web teaching resources.
     The thesis analyzes the distribution of Web topic resources from an ecological perspective, proposes the theory of the Network Ecological Chain, and designs the Network Ecological Chain Algorithm on that basis. It proposes a top-down method for acquiring topic information resources that combines judging the topic characteristics of a website with predicting specific link targets. Using the Network Ecological Chain Algorithm, assisted by automatic text categorization, automatic text extraction, and ontology knowledge inference, the method first discovers topic-web groups on the Web; then, drawing on the topic characteristics of the site, the page, and the text block adjacent to each link, it applies a topic crawling algorithm to collect specific link targets selectively. This effectively addresses the problem of losing direction during topic crawling and raises the harvest rate of topic information resource acquisition.
     To improve the prediction of link targets in topic crawling, the thesis focuses on ontology technology and its application to the automatic acquisition of Web teaching resources. It discusses ontology languages, ontology construction methods, and ontology development techniques; tentatively builds an educational ontology knowledge repository; develops an educational ontology knowledge inference engine; and explores specific applications of that engine. Because ontologies are open and standardized, the educational ontology knowledge repository can be built through co-construction and sharing, which enables knowledge reuse.
     Finally, a prototype system for the automatic acquisition of Web teaching resources is designed and developed, and the effectiveness of the techniques is verified using the automatic acquisition of moral education teaching resources as an example.
     The main work and contributions of this thesis are as follows: it systematically studies automatic acquisition technology for topic information resources; it proposes the Network Ecological Chain theory, designs the Network Ecological Chain Algorithm, and verifies its effectiveness with experimental data; and it applies ontology technology to the construction of an educational knowledge repository, tentatively develops an educational ontology knowledge inference engine, and explores its application in the automatic acquisition of Web teaching resources. The research can provide some theoretical guidance and technical support for the design and development of related systems.
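
The abstract describes a crawl that ranks each candidate link using evidence from the site, the page, and the text next to the link, and it measures success by harvest rate (the share of fetched pages that are on topic). The following is a minimal sketch of that general idea only, not the Network Ecological Chain Algorithm itself, whose details are not given here. The keyword-overlap scorer, the weights, the relevance threshold, the toy in-memory `WEB`, and every function name are assumptions made for illustration.

```python
import heapq
from urllib.parse import urlparse

# Hypothetical topic vocabulary (assumption); the thesis draws topic
# knowledge from an educational ontology rather than a fixed word list.
TOPIC_TERMS = {"moral", "education", "ethics", "teaching"}


def relevance(text):
    """Fraction of topic terms found in the text (stand-in for a real classifier)."""
    words = set(text.lower().split())
    return len(words & TOPIC_TERMS) / len(TOPIC_TERMS)


def link_score(site_score, page_score, anchor_text, weights=(0.3, 0.3, 0.4)):
    """Combine site, page, and link-context evidence; the weights are assumed."""
    w_site, w_page, w_anchor = weights
    return w_site * site_score + w_page * page_score + w_anchor * relevance(anchor_text)


def crawl(web, site_scores, seeds, budget, threshold=0.25):
    """Best-first crawl over an in-memory web.

    web maps url -> (page_text, [(anchor_text, target_url), ...]).
    site_scores maps host -> topic score of the site (assumed to come from
    an earlier site-discovery step).
    Returns the visit order and the harvest rate.
    """
    frontier = [(-1.0, url) for url in seeds]  # max-heap via negated priority
    heapq.heapify(frontier)
    seen = set(seeds)
    visited, relevant = [], 0
    while frontier and len(visited) < budget:
        _, url = heapq.heappop(frontier)
        if url not in web:
            continue  # unreachable page in the toy web
        text, links = web[url]
        visited.append(url)
        page_score = relevance(text)
        if page_score >= threshold:
            relevant += 1
        for anchor, target in links:
            if target in seen:
                continue
            seen.add(target)
            target_site = site_scores.get(urlparse(target).netloc, 0.0)
            priority = link_score(target_site, page_score, anchor)
            heapq.heappush(frontier, (-priority, target))
    harvest_rate = relevant / len(visited) if visited else 0.0
    return visited, harvest_rate


# Toy data (entirely made up) with one on-topic site and one off-topic site.
WEB = {
    "http://edu.example/": (
        "moral education teaching portal",
        [("ethics teaching plans", "http://edu.example/ethics"),
         ("sports news", "http://news.example/sports")],
    ),
    "http://edu.example/ethics": (
        "ethics and moral education lesson",
        [("celebrity gossip", "http://news.example/gossip")],
    ),
    "http://news.example/sports": ("football results today", []),
    "http://news.example/gossip": ("celebrity gossip column", []),
}
SITE_SCORES = {"edu.example": 1.0, "news.example": 0.0}

visited, rate = crawl(WEB, SITE_SCORES, ["http://edu.example/"], budget=2)
print(visited, rate)
```

With a budget of two pages, the sketch fetches the two pages of the on-topic site before any off-topic link, so every fetched page counts toward the harvest rate; raising the budget forces it onto the off-topic pages and lowers the rate.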
