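Mapping Knowledge Domains: Methods and Applications
Main Software Operations for Drawing Knowledge Maps

Dr. Hou Haiyan
School of Humanities and Social Sciences, Dalian University of Technology
Research Center for Science and Technology Ethics and Management
Webometrics, Scientometrics, Informetrics and Econometrics Laboratory (WISE Lab)
September 12, 2009

Lecture outline
  Main research methods of knowledge domain mapping
  Main data sources for knowledge domain mapping
  Main application software for knowledge domain mapping
  Demonstration of the software operations for drawing maps

Main research methods (Methods)
  Co-citation analysis
    Journal co-citation analysis (JCA)
    Author co-citation analysis (ACA)
    Document co-citation analysis (DCA)
  Co-word analysis
    Co-word analysis of keywords
    Co-word analysis of title words
  Multivariate statistical analysis
    Factor analysis (principal component analysis, PCA)
    Multidimensional scaling (MDS)
    Cluster analysis
  Frequency analysis of words
  Social network analysis

Main data sources (Data)
  Main data source: Web of Science
    Scientific literature: Science Citation Index (SCI), Social Sciences Citation Index (SSCI)
    Patent literature: Derwent Innovations Index
    International conference literature: Conference Proceedings Citation Index - Science (CPCI-S)
  Basic data units for analysis:
    Author
    Title
    Keywords
    Abstract
    Cited references (cited authors, cited journals, cited documents)
    Author addresses (institutional collaboration, country collaboration)

Main software: co-citation and co-word analysis software
  Bibexcel, http://www.umu.se/inforsk/Bibexcel/
    Freeware available online, developed by Olle Persson.
    Covers bibliometrics, citation analysis, co-citation analysis, bibliographic coupling, and cluster analysis.
  SPSS: multivariate statistical analysis and visualization software
    Correlation analysis, PCA (factor analysis), MDS, cluster analysis.
  Wordsmith Tools: word frequency analysis software
    Frequency analysis of words.

Example maps: the structure of scientometrics
  Main sub-disciplines of scientometrics and their representative figures
    [Figure 3-1: Knowledge map of the disciplinary structure of scientometrics, 1978-2004]
  Position of scientometrics within its parent discipline, the science of science
  Relationship between scientometrics and neighboring disciplines
    [Figure 3-4: Knowledge map of the relationship between scientometrics and neighboring disciplines]
  Knowledge map of the mainstream research areas of scientometrics, 1978-2004
  Evolution of the research areas of scientometrics
    [Figure 4-8: Knowledge map of the mainstream research areas of scientometrics, 1978-1986]
    [Figure 4-10: Knowledge map of the mainstream research areas of scientometrics, 1987-1995]
    [Figure 4-12: Knowledge map of the research fronts of scientometrics, 1996-2004]

Main software: social network analysis and visualization software
  Pajek: social network analysis
  UCINET: social network analysis

UCINET
  One of the most widely used social network analysis packages. It includes NetDraw for one-mode and two-mode data, and it integrates Pajek, a free program for analyzing large networks. http:/
  Each social network package has its own file format, but the formats can be converted freely. For example, UCINET can read text files as well as KrackPlot, Pajek, Negopy, and VNA files.

Pajek
  Pajek (Program for Large Network Analysis) was developed by Vladimir Batagelj and Andrej Mrvar of the University of Ljubljana, who formally released version 0.1 on January 15, 1997. It is a free, Windows-based social science program used mainly for social network analysis, and its distinguishing feature is visualization. It can be downloaded free of charge for non-commercial use from http://vlado.fmf.uni-lj.si/pub/networks/pajek/
  CreatPajek is a tool that converts Excel files into Pajek format.

Example maps: the scientometrics collaboration network
  Knowledge map of the scientometrics collaboration network, 1978-2004
  Evolution of the structure of the scientometrics collaboration network
    [Figure 5-4: Microstructure map of the scientometrics collaboration network, 1978-1986]
    [Figure 5-5: Microstructure map of the scientometrics collaboration network, 1987-1995]
    [Figure 5-6: Microstructure map of the scientometrics collaboration network, 1996-2004]

Main software: multi-perspective co-citation analysis and visualization software
  CiteSpace: freeware available online, developed by Chaomei Chen
  1. Find the key paths in the evolution of a field through citation network analysis.
  2. Find the pivotal documents in the evolution of a field (intellectual turning points).
  3. Analyze the research fronts and hot topics of a field.
  4. Detect the intellectual base of a field.

Example maps: international nanotechnology research
  [Document co-citation map]
  [Journal co-citation map: distribution of the main journals in international nanotechnology research. Groups: physics journals; chemistry journals, nano journals, and others]
  [Author co-citation map]
  [Author collaboration network map]
  [Figure 9: Knowledge map of hot topics in international nanotechnology research (co-word map)]
    Cluster 1: nano thin films, nanocrystals, nanowires, nanostructures, carbon nanotubes
    Cluster 2: nanoparticles
    Cluster 3: spectral analysis, gold nanoparticles, nanoclusters
    Cluster 4: nanotube adsorption, nano-silicon
    Cluster 5: nanocomposites, nanoscale transport, nanodevices, nanotechnology, nanoscale field-effect transistors
    Cluster 6: nanorod arrays, zinc oxide nanowires
    Cluster 7: field-emission mechanism of carbon nanotubes
    Cluster 8: nanosensors, protein nanotechnology

Introduction to CiteSpace
  Mapping disciplinary domains
  The CiteSpace Homepage: http://cluster.cis.drexel.edu/cchen/citespace
  Development of CiteSpace began on September 13, 2004, and the most recent update was on March 20, 2007 (CiteSpace 2.0.11b). It is developed by Professor Chaomei Chen of the College of Information Science and Technology, Drexel University (Philadelphia, USA).
  CiteSpace, which grew out of Professor Chen's research on information visualization, is regarded as one of the most distinctive and influential information visualization tools in information analysis in the United States in recent years. Since 2000, Professor Chen has published 65 research papers and 6 research monographs in this area. He currently serves as editor-in-chief of the journal Information Visualization.
  CiteSpace is a Java application and is free to obtain and use: http://cluster.cis.drexel.edu/cchen/citespace/download.html
  It requires JRE 1.4.2 or later as its runtime environment. CiteSpace can fetch additional information from PubMed and a number of web services, but an Internet connection is not required to use it.
  Input data files should be in the ISI export format, that is, the format downloaded from Web of Science. CiteSpace also ships with a data converter for data saved from the web.

Basic steps before running the software
  1. Prepare data for CiteSpace. Put the data files in one folder. Each data file name must begin with "download" and end with ".txt", for example "download-mass-extinction-2006.txt".
  2. Before creating a new project in CiteSpace, specify two paths: a data directory and a project directory. The saved maps and output files are written to the project directory.
  3. Adjust the time slicing to match the time periods you want to analyze.
  4. Press the "GO" button. The software starts running, and a new window pops up when the run finishes.

CiteSpace terminology
Thresholds: selection criteria used by CiteSpace. Items must have measures above the threshold values to be included in the modeling and visualization processes.
Time slicing: a divide-and-conquer strategy that divides a period of time into a series of smaller windows.
Betweenness centrality: a metric of a node in a network that measures how likely an arbitrary shortest path in the network is to go through the node.
Burst terms: single or multi-word phrases extracted from the title, abstract, or other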
fields of a bibliographic record whose frequency bursts, i.e. sharply increases, over a period of time.
Citation: an instance in which one publication references another publication.
Co-authors: authors who appear in the author field of the same bibliographic record.
Co-citation: an instance in which two items, such as authors, documents, or journals, are cited by the same publication.
Time-zone view: a restricted view in which the movement of nodes is limited to vertical time zones corresponding to the time of their publication.
Turning points: nodes of high betweenness centralities ( 1.00). Such nodes tend to be critical in network transitions from one time slice to another.
Cluster view: a view in which a network is laid out with a modified spring-embedder node placement algorithm.
Pathfinder network scaling: a network scaling algorithm that removes links that violate triangle inequality conditions, so as to simplify a network by retaining only its salient links and paths.

Brief CiteSpace operating steps

1. Access/Obtain CiteSpace and how to run
  The CiteSpace Homepage: http://cluster.cis.drexel.edu/cchen/citespace
  There are two ways to run CiteSpace:
  a. Use Java WebStart directly (launch over the network).
  b. Download the citespace.jar file (from the download page).
  Method a ensures that you always use the latest version, and it runs fast.

2. Prepare Bibliographic Data Files
  To retrieve and save data from Web of Science:
  a. Make a general search in Web of Science.
  b. Mark all search results.
  c. Save the records, including Cited References, in field tagged format.
  d. Name your files download*.txt, e.g. downloadScience1999a.txt, download2004.txt.
  e. Save all data files in a folder on your computer.

3. What information in bibliographic data is used by CiteSpace?
  A: Authors
  B: Title, Descriptors, Identifiers, Abstract
  C: Cited References
  D: Times Cited
  E: Year of Publication

Sample record in field tagged format. In this record, A is the AU field; B is the TI, ID, and AB fields; C is the CR field; D is the TC field; E is the PY field.

AU Galea, S
   Ahern, J
   Resnick, H
   Kilpatrick, D
   Bucuvalas, M
   Gold, J
   Vlahov, D
TI Psychological sequelae of the September 11 terrorist attacks in New York City.
SO NEW ENGLAND JOURNAL OF MEDICINE
LA English
DT Article
ID POSTTRAUMATIC-STRESS-DISORDER; NATIONAL COMORBIDITY SURVEY; MAJOR DEPRESSION; NATURAL DISASTER; SOCIAL SUPPORT; OKLAHOMA-CITY; PREVALENCE; PSYCHOPATHOLOGY; SURVIVORS; SYMPTOMS
AB Background: The scope of the terrorist attacks of September 11, 2001, was unprecedented in the United States. We assessed the prevalence and correlates of acute post-traumatic stress disorder (PTSD) and depression among residents of Manhattan five to eight weeks after the attacks. Methods: We used random-digit dialing to contact a representative sample of adults living south of 110th Street in Manhattan. Participants were asked about demographic characteristics, exposure to the events of September 11, and psychological symptoms after the attacks. Results: Among 1008 adults interviewed, 7.5 percent reported symptoms consistent with a diagnosis of current PTSD related to the attacks, and 9.7 percent reported symptoms consistent with current depression (with current defined as occurring within the previous 30 days). Among respondents who lived south of Canal Street (i.e., near the World Trade Center), the prevalence of PTSD was 20.0 percent.
C1 New York Acad Med, Ctr Urban Epidemiol Studies, New York, NY 10029 USA.
   Columbia Univ, Mailman Sch Publ Hlth, Dept Epidemiol, New York, NY USA.
   Med Univ S Carolina, Natl Crime Victims Res & Treatment Ctr, Charleston, SC 29425 USA.
   Schulman Ronca & Bucuvalas, New York, NY USA.
   Bellevue Hosp Ctr, New York, NY 10016 USA.
RP Galea, S, New York Acad Med, Ctr Urban Epidemiol Studies, Rm 556, 1216 5th Ave, New York, NY 10029 USA.
CR 2001, NY TIMES 1226, B2
   *AM PSYCH ASS, 1994, DIAGN STAT MAN MENT
   *DEP HLTH HUMAN SE, 1999, MENT HLTH REP SURG G
   *US BUR CENS, 2000, STF3A DEP COMM BUR C
   BLAZER DG, 1994, AM J PSYCHIAT, V151, P979
   EATON L, 2001, NY TIMES 1116, A1
   FOTHERGILL A, 1999, DISASTERS, V23, P156
   FULLERTON CS, 1999, AVIAT SPACE ENVIR MD, V70, P902
   GINEXI EM, 2000, AM J COMMUN PSYCHOL, V28, P495
   GOENJIAN AK, 2001, AM J PSYCHIAT, V158, P788
   GREEN BL, 1990, J APPL SOC PSYCHOL, V20, P1033
   HANSON RF, 1995, J CONSULT CLIN PSYCH, V63, P987
   HARVEY AG, 1999, J CONSULT CLIN PSYCH, V67, P985
   KAWACHI I, 2001, J URBAN HEALTH, V78, P458
   KESSLER RC, 1995, ARCH GEN PSYCHIAT, V52, P1048
   KILPATRICK DG, 1987, CRIME DELINQUENCY, V33, P479
   MADAKASIRA S, 1987, J NERV MENT DIS, V175, P286
   MAZURE CM, 2000, AM J PSYCHIAT, V157, P896
   NORTH CS, 1999, JAMA-J AM MED ASSOC, V282, P755
   ORTEGA AN, 2000, AM J PSYCHIAT, V157, P615
   POLE N, 2001, J NERV MENT DIS, V189, P442
   RESNICK H, 1999, J ANXIETY DISORD, V13, P359
   RESNICK HS, 1993, J CONSULT CLIN PSYCH, V61, P984
   ROTHBAUM BO, 1992, J TRAUMA STRESS, V5, P455
   RUBONIS AV, 1991, PSYCHOL BULL, V109, P384
   RUEF AM, 2000, CULTURAL DIVERSITY E, V6, P235
   SHAH B, 1997, SUDAAN USERS MANUAL
   SHALEV AY, 1998, AM J PSYCHIAT, V155, P630
   SHALEV AY, 2000, J CLIN PSYCHIAT S5, V61, P33
   SHERBOURNE CD, 1991, SOC SCI MED, V32, P705
   SHORE JH, 1989, J NERV MENT DIS, V177, P681
   TUCKER P, 2000, J BEHAV HEALTH SER R, V27, P406
NR 32
TC 179
PU MASSACHUSETTS MEDICAL SOC/NEJM
PI WALTHAM
PA WALTHAM WOODS CENTER, 860 WINTER ST, WALTHAM, MA 02451-1413 USA
SN 0028-4793
J9 N ENGL J MED
JI N. Engl. J. Med.
PD MAR 28
PY 2002
VL 346
IS 13
BP 982
EP 987
PG 6
SC Medicine, General & Internal
GA 534UY
UT ISI:000174608600006
ER

How the marked fields feed each type of analysis:
  A (AU): co-authorship
  B (TI, ID, AB): co-occurring burst terms
  C (CR): document co-citation, author co-citation, journal co-citation (ACA/DCA/JCA)
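To make the field layout of the sample record concrete, here is a minimal Python sketch that reads field tagged text and pulls out the five kinds of information marked A to E. It is not CiteSpace's own reader. The function names `parse_records` and `summarize` are hypothetical, and the sketch assumes that every field starts with a two-character tag at the beginning of a line, that continuation lines are indented, and that `ER` closes a record.

```python
import re

# A field starts with a two-character tag such as "AU ", "TC " or a bare "ER".
TAG = re.compile(r"^([A-Z][A-Z0-9])( |$)")

def parse_records(text):
    """Return a list of records; each record maps a tag to its list of lines."""
    records, current, tag = [], {}, None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        m = TAG.match(line)
        if m and m.group(1) == "ER":      # end-of-record marker
            records.append(current)
            current, tag = {}, None
        elif m:                           # a new field begins
            tag = m.group(1)
            current.setdefault(tag, []).append(line[3:].strip())
        elif tag is not None:             # indented continuation line
            current[tag].append(line.strip())
    return records

def summarize(rec):
    """Collect the fields marked A-E in the lecture (hypothetical helper)."""
    keywords = " ".join(rec.get("ID", [])).split(";")
    return {
        "authors": rec.get("AU", []),                           # A
        "title": " ".join(rec.get("TI", [])),                   # B
        "keywords": [k.strip() for k in keywords if k.strip()],  # B
        "cited_refs": rec.get("CR", []),                        # C
        "times_cited": int(rec["TC"][0]) if rec.get("TC") else None,  # D
        "year": int(rec["PY"][0]) if rec.get("PY") else None,         # E
    }

# Shortened excerpt of the sample record above.
SAMPLE = """\
AU Galea, S
   Ahern, J
TI Psychological sequelae of the September 11 terrorist attacks in New
   York City.
ID POSTTRAUMATIC-STRESS-DISORDER; MAJOR DEPRESSION
CR BLAZER DG, 1994, AM J PSYCHIAT, V151, P979
   EATON L, 2001, NY TIMES 1116, A1
TC 179
PY 2002
ER
"""

info = summarize(parse_records(SAMPLE)[0])
print(info["authors"], info["year"], info["times_cited"])
```

A real export may contain header tags and field variants that this sketch does not handle, so treat it only as a way to see which tag carries which piece of information.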
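4. Get started with CiteSpace
  Data import (data directory): select the folder that holds the data files directly. Be careful not to go inside the folder.

5. Choose Network Analysis
  Cited Author: Author Co-Citation Analysis (ACA)
  Cited Reference: Document Co-Citation Analysis (DCA)
  Cited Journals: Journal Co-Citation Analysis (JCA)
  Authors: Co-Authors
  Terms: Co-Terms
  [Example maps: ACA, DCA, Co-Authorship, JCA, Co-Term (Burst)]

CiteSpace features and principles
Features:
  1. The raw data do not need to be converted into matrix format. Raw data from WOS, PubMed, and similar databases can be imported directly into CiteSpace for computation and mapping.
  2. Several kinds of maps can be drawn from the same data sample, showing how the data evolve from different perspectives.
  3. The software marks nodes and links with different colors, which clearly shows how the literature data change over time.
  4. The colored tree-ring representation of a node clearly shows its citations in different time periods.
  5. The color of a link indicates the time at which that connection's co-citation count first reached the selected threshold.

Demonstration of the map-drawing operations

Thank you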
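Supplementary sketch for step 5. The three co-citation options differ only in which part of a cited reference is used as the node: the whole reference (DCA), the cited author (ACA), or the cited journal (JCA). The Python sketch below counts how often two nodes appear together in the same reference list. It is an illustration of the counting idea, not CiteSpace's implementation; the function names are hypothetical, the second reference list is invented for the example, and the sketch assumes cited references look like `AUTHOR, YEAR, SOURCE, Vnn, Pnn`. Thresholds and time slicing are left out.

```python
from collections import Counter
from itertools import combinations

def split_ref(cited_ref):
    """Return (author, source) from a cited reference string, or (None, None)."""
    parts = [p.strip() for p in cited_ref.split(",")]
    # The four-digit year anchors the layout; anonymous references start with it.
    year_idx = next((i for i, p in enumerate(parts)
                     if p.isdigit() and len(p) == 4), None)
    if year_idx is None:
        return None, None
    author = parts[0] if year_idx > 0 else None
    source = parts[year_idx + 1] if year_idx + 1 < len(parts) else None
    return author, source

def node_key(cited_ref, mode):
    """Choose the node label for one cited reference."""
    if mode == "DCA":
        return cited_ref.strip()
    author, source = split_ref(cited_ref)
    if mode == "ACA":
        return author
    if mode == "JCA":
        return source
    raise ValueError(f"unknown mode: {mode}")

def cocitation_counts(reference_lists, mode="DCA"):
    """Count, for every pair of nodes, how many citing papers cite both."""
    counts = Counter()
    for refs in reference_lists:
        keys = {node_key(r, mode) for r in refs}
        keys.discard(None)                      # skip references without that part
        for a, b in combinations(sorted(keys), 2):
            counts[(a, b)] += 1
    return counts

# First list: three references from the sample record. Second list: invented.
papers = [
    ["BLAZER DG, 1994, AM J PSYCHIAT, V151, P979",
     "KESSLER RC, 1995, ARCH GEN PSYCHIAT, V52, P1048",
     "SHALEV AY, 1998, AM J PSYCHIAT, V155, P630"],
    ["BLAZER DG, 1994, AM J PSYCHIAT, V151, P979",
     "KESSLER RC, 1995, ARCH GEN PSYCHIAT, V52, P1048"],
]

aca = cocitation_counts(papers, mode="ACA")
jca = cocitation_counts(papers, mode="JCA")
print(aca.most_common(1), jca.most_common(1))
```

Using a set per citing paper means a pair is counted once per paper, however many references share the same author or journal.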