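Web-based access to mouse models of human cancers: the Mouse Tumor Biology (MTB) Database
College of Life Sciences, Biotechnology, 2002 cohort; Cao Wenrui (曹文瑞); Student ID: 021402153

ABSTRACT
The Mouse Tumor Biology (MTB) Database is a detailed information resource on tumor genetics and pathology in genetically defined strains of mice (i.e., inbred, transgenic and targeted mutation strains). Data in the database come from the published scientific literature and from direct data submissions by the scientific community. Researchers access MTB through Web-based query forms and can use the database to answer questions such as "What tumors have been reported in transgenic mice created on a C57BL/6J background?", "What tumors in mice are associated with mutations in the Trp53 gene?" and "What pathology images are available for mammary gland tumors regardless of genetic background?" MTB has been available since 1998 from the Mouse Genome Informatics web site (http://www.informatics.jax.org). Recent enhancements to MTB include new query options, redesigned query forms and results pages for pathology and genetic data, and an electronic data submission and annotation tool for pathology data.

INTRODUCTION
The laboratory mouse has long been used as a model for human disease because it resembles humans physiologically and in genome content and organization, and because it is easy to modify genetically and to manipulate experimentally. In 1998 the National Cancer Institute recognized the development of mouse models that accurately reflect the genetics and histopathology of human cancers as an exceptional opportunity. Mouse models provide a means to explore the genetic and cellular changes that occur during disease progression and to test therapeutic strategies that might be applied clinically.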
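Different inbred strains of mice differ greatly in tumor susceptibility. Standard inbred mice are usually not appropriate models of human cancers because sporadic tumors in mice occur at low frequency and with late onset. Nevertheless, knowing the characteristic tumor profile of a particular genetic background is important for selecting an appropriate strain in which to develop transgenic or targeted mutation mice whose disease progression patterns are better suited to modeling the genetic and molecular aspects of a specific human disease. Much of the data on tumor susceptibility and resistance in genetically defined strains of mice (i.e., inbred, transgenic and targeted mutation strains) is not available in a form that allows researchers to compare different strains with one another, or to compare the cancer profile of a standard inbred strain with that of a transgenic or targeted mutation line created on the same inbred background. Integrating diverse data on the genetics and pathobiology of genetically defined mouse strains into a queryable database is the primary mission of MTB.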
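In a recent survey of Web-based resources for cancer genetics research, we identified more than 70 databases and information resources related to basic cancer genetics research. Most existing cancer-related resources and databases focus on single genes or specific cancer syndromes. Only a few of the sites we surveyed provide information about mouse models of human cancers, and fewer still provide detailed information about the pathobiology of laboratory mice. MTB is unique among existing resources in the scope and depth of the integrated data it offers on cancer genetics and pathology.

MTB has been accessible via the World Wide Web since 1998. The primary data types in MTB are tumor types, mouse strains, genetics, pathology and references (both published and unpublished references are included in the database). These areas correspond to the main Web-based forms used to query the database. MTB is an extension of the informatics infrastructure that the Mouse Genome Informatics (MGI) Group at The Jackson Laboratory established to represent genetic and biological information about laboratory mouse strains. The nomenclature used in MTB for mouse genes and strains follows the official mouse nomenclature described in the Mouse Genome Database (http://www.informatics.jax.org/mgihome/nomen). Anatomical terms used in the database come from the controlled vocabulary of mouse anatomy of the Gene Expression Database (GXD). Many of the pathology and diagnostic terms in MTB come from a standard textbook, Pathobiology of the Aging Mouse.

ENHANCEMENTS TO MTB IN 2000
Details of the design and implementation of MTB have been described elsewhere. The purpose of this report is to describe new features of and improvements to MTB. The most common request we received from database users over the past year was for additional query options and reports for pathology and genetic data. Users also asked us to redesign some of the data summary pages so that they would not have to follow so many hypertext links to reach the information they wanted. The changes made to the system in response to user feedback are described below. Web links illustrating these changes can be viewed in the online version of this article (Supplementary Material).

New query options for tumor type searches
We implemented two enhancements to the query options for tumor types. First, we added the ability to search the database by anatomical system rather than only by organ name. Users can now submit queries such as "Retrieve all information in MTB for tumors of the digestive system". Second, we added support for constraining queries by the metastatic status of a tumor. For example, it is now possible to search for mammary gland tumors that are known to metastasize to the lung or liver.

Enhancements to pathology queries
In earlier versions of MTB, users could query for and view photomicrographs and diagnostic descriptions only for specific strain/tumor combinations (e.g., "Show all mammary gland adenocarcinomas for FVB/N-TgN(MMTVPyVT)634Mul female transgenic mice"). In the October 2000 release of the database, we added new query forms that allow users to search pathology data by more general criteria, including organ name (e.g., "Retrieve all pathology images for tumors of the liver regardless of strain"), strain name (e.g., "Retrieve all available pathology images for tumors in BALB/c mice regardless of organ system or tumor type") and tumor type (e.g., "Retrieve all available pathology data for mammary gland adenocarcinomas regardless of strain"). With this enhancement, database users can obtain results for broad queries about pathology data in MTB with a single query instead of retrieving tumor/strain combinations one at a time.

A second enhancement to the presentation of pathology data in MTB concerns the display of histology images. In earlier versions of the database, users had to click on a thumbnail of a histology image on the pathology summary page to view a higher resolution image of the cellular features. The higher resolution image replaced the current window, so the user could no longer compare the image with its diagnostic text. We implemented a simple solution to this problem: by clicking a button labeled "View Large Image" below the thumbnail of the photomicrograph, the user can view the higher resolution image and diagnostic text in a separate window. This separate window can be resized and closed independently of the main web page.

Enhancements to gene queries
In the conceptual and logical design of the MTB database, we separated the concept of genetic changes in tumor cells from the genetics associated with the background of a particular mouse strain.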
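As a result, users interested in querying both strains and tumors in MTB by gene name or symbol had to search using two different query forms. We implemented a new query mechanism that allows users to search strain and tumor information simultaneously by gene symbol. For example, searching MTB with the gene symbol for the retinoblastoma 1 gene (Rb1) returns information both on strains that carry a targeted, induced or spontaneous mutation of the Rb1 gene and on tumors with reported genetic alterations (e.g., point mutations, deletions) in the Rb1 gene. The results of a gene symbol query are returned in two parts. First, a list of the alleles of the gene recorded in MTB is returned. Second, the associations of these alleles with tumors and/or strains are displayed, along with hypertext links to the appropriate detail pages.

Enhancements to the tumor frequency grid
The MTB tumor frequency grid was introduced in 1999 as a graphical means of querying and displaying information about the complex cancer profiles of families of inbred mouse strains. The grid includes most of the inbred strains being systematically characterized as part of a broad international collaboration (also known as the mouse "phenome" project) that aims to establish baseline phenotypic data for commonly used and genetically diverse inbred laboratory strains. We made two enhancements to the tumor frequency grid so that it provides more information to our users. First, we changed the grid from a three-color coding system for tumor frequency to a five-color system. The five-color display conveys tumor frequency more precisely and has the added benefit that relative frequencies remain easy to distinguish even when the grid is printed or displayed in black and white. Second, we reorganized the list of strains in the grid so that strains are grouped by the genealogical relationships of the inbred strains rather than sorted alphabetically. This organization makes it easier to find strains that are expected to have similar patterns of tumor susceptibility.

Electronic data submission of pathology information: JaxPath
MTB originally acquired information through regular review of the published scientific literature by expert biologists in tumor biology and mouse genetics. To provide an electronic submission route and a shared means of querying pathology information, we created JaxPath, a Web-based database that is easily reached from the MTB home page. JaxPath allows users to submit unpublished pathology images and data to MTB, or to add supplementary images that were not included in the original publication because of space or cost constraints. Users who submit data to JaxPath are assigned a password that lets them edit the descriptions and annotations of their submissions over the Web. Citations to contributors' images and annotations are displayed in the pathology data summaries on the web site.

Querying MTB by accession ID
As mentioned in a previous report on MTB, each tumor instance in the database is described by a combination of tumor name, strain, sex and organ of tumor origin. This way of organizing information in MTB reflects our basic premise that genetic background plays an important role in patterns of tumor development. Each tumor instance in MTB is automatically assigned an identifier that lets us specify that tumor unambiguously and build stable links to other databases (these accession IDs are displayed on many query results pages). Users who have already identified particular records of interest through a regular query can now query MTB directly using the appropriate MTB accession ID.

FUTURE DIRECTIONS
The National Cancer Institute recently established the Mouse Models of Human Cancers Consortium (MMHCC) to accelerate the development of mouse models of human cancers and to establish a consistent nomenclature for tumor names and diagnostic terms.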
The MMHCC is building a database of mouse models of human cancers that will include information on therapeutic drug testing and experimental protocols that is outside the scope of MTB, although it is not yet readily available to the public. Because many of the mouse strains that will be described in the MMHCC database will also be represented in MTB, a major task for our database group in the coming year is to link MTB to the MMHCC. Integrating the two databases will allow users to move seamlessly from basic cancer phenotype and genetic information to detailed descriptions of preclinical and clinical mouse models, experimental protocols and results of therapeutic trials.

REFERENCES
1. Paigen,K. (1995) A miracle enough: the power of mice. Nature Med., 1, 215–220.
2. DePinho,R.A. and Jacks,T. (1999) Mouse models of cancer: Introductory comments. Oncogene, 18, 5248.
3. Klausner,R. (1999) Studying cancer in the mouse. Oncogene, 18, 5249–5252.
4. Bult,C.J., Krupke,D.M. and Eppig,J.T. (1999) Electronic access to mouse tumor data: the Mouse Tumor Biology Database (MTB) project. Nucleic Acids Res., 27, 99–105.
5. Bult,C.J., Krupke,D.M., Sundberg,J.P. and Eppig,J.T. (2000) Mouse Tumor Biology Database (MTB): enhancements and current status. Nucleic Acids Res., 28, 112–114.
6. Bult,C.J., Krupke,D.M., Tennent,B.J. and Eppig,J.T. (1999) A survey of web resources for basic cancer genetics research. Genome Res., 9, 397–408.
7. Blake,J.A., Eppig,J.T., Richardson,J.E., Kadin,J.A., Bult,C.J. and the Mouse Genome Database Group. (2001) The Mouse Genome Database (MGD): Integration Nexus for the Laboratory Mouse. Nucleic Acids Res., 29, 91–94.
8. Ringwald,M., Eppig,J.T., Kadin,J.A., Richardson,J.E. and the Gene Expression Database Group. (2000) GXD: a gene expression database for the laboratory mouse: current status and recent enhancements. Nucleic Acids Res., 28, 115–119. Updated article in this issue: Nucleic Acids Res. (2001), 29, 98–101.
9. Mohr,U., Dungworth,D.L., Capen,C.C., Carlton,W.W., Sundberg,J.P. and Ward,J.M. (1996) Pathobiology of the Aging Mouse. International Life Science Institute, Washington, DC, USA, vols 1 and 2.
10. Paigen,K. and Eppig,J.T. (2000) A mouse phenome project. Mamm. Genome, 11, 715–717.
11. Beck,J.A., Lloyd,S., Hafezparast,M., Lennon-Pierce,M., Eppig,J.T., Festing,M.F.W. and Fisher,M.C. (2000) Genealogies of mouse inbred strains. Nature Genet., 24, 23–25.

Web-based access to mouse models of human cancers: the Mouse Tumor Biology (MTB) Database
Carol J. Bult*, Debra M. Krupke, Dieter Näf, John P. Sundberg and Janan T. Eppig
The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
Received October 3, 2000; Accepted October 4, 2000.

ABSTRACT
The Mouse Tumor Biology (MTB) Database serves as a curated, integrated resource
for information about tumor genetics and pathology in genetically defined strains of mice (i.e., inbred, transgenic and targeted mutation strains). Sources of information for the database include the published scientific literature and direct data submissions by the scientific community. Researchers access MTB using Web-based query forms and can use the database to answer such questions as "What tumors have been reported in transgenic mice created on a C57BL/6J background?", "What tumors in mice are associated with mutations in the Trp53 gene?" and "What pathology images are available for tumors of the mammary gland regardless of genetic background?" MTB has been available on the Web since 1998 from the Mouse Genome Informatics web site (http://www.informatics.jax.org). We have recently implemented a number of enhancements to MTB including new query options, redesigned query forms and results pages for pathology and genetic data, and the addition of an electronic data submission and annotation tool for pathology data.

INTRODUCTION
The laboratory mouse has long served as an important animal model for human disease because it is known to resemble humans physiologically, is highly similar to humans in both genome content and organization, is well characterized genetically and is easily manipulated experimentally (1). Developing mouse models that accurately reflect the genetics and histopathology of human cancers was recognized in 1998 as an exceptional opportunity by the National Cancer Institute (http://www.nci.nih.gov) (2). Mouse models provide the means to explore genetic and cellular aspects of disease progression and to test therapeutic strategies that might ultimately be used clinically in humans (2,3).

Different inbred strains of mice vary in their intrinsic tumor susceptibility. Standard inbred mice are not usually appropriate models for human cancers because of the relatively low frequency and late onset of sporadic cancers in mice. However, knowing the characteristic cancer profile of a particular genetic background is critical to the process of selecting the appropriate mouse strain for developing transgenic or targeted mutation mice whose disease progression patterns may be more useful for modeling genetic and molecular aspects of a specific human disease. Much of the data about tumor susceptibility and resistance in genetically defined strains of mice (i.e., inbred lines, transgenics, targeted mutation strains) are not available in a format that allows researchers to compare different strains of mice to one another or to compare the cancer profile of a standard inbred strain to that of a transgenic or targeted mutation line created on the same inbred background. Integrating diverse data about genetics and pathobiology for genetically defined strains of mice in a queryable database system is the primary mission of the Mouse Tumor Biology (MTB) Database (4,5).

In a recent survey of Web-based resources for cancer genetics research, we identified over 70 databases and information resources related to basic cancer genetics research (6). The majority of existing cancer-related resources and databases focus on single genes or specific cancer syndromes. Only a handful of the sites we surveyed provided information about mouse models of human cancers; even fewer sites provided detailed information about the pathobiology of laboratory mice. The MTB Database is unique among existing resources in both its scope and degree of integration of data about cancer genetics and pathology in laboratory mice.

The MTB Database has been accessible via the World Wide Web since 1998 (5). The primary data types represented in MTB are tumor types, mouse strain, genetics, pathology and references (both published and unpublished references are included in the database). These areas, in turn, represent the main Web-based forms that are used to query the database. MTB is an extension of the informatics infrastructure developed for representing genetic and biological information about the laboratory mouse established by the Mouse Genome Informatics (MGI) Group at The Jackson Laboratory (http://www.informatics.jax.org). The nomenclature used in MTB for genes and strains of mice comes from the official mouse nomenclature represented in the Mouse Genome Database (http://www.informatics.jax.org/mgihome/nomen) (7). Anatomical terms used in the database come from a controlled vocabulary of mouse anatomy supported by the Gene Expression Database (GXD) (8).
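As a rough illustration of why a controlled vocabulary helps keep records consistent, the following Python sketch normalizes free-text organ names against a small term list before a tumor record is created. Everything in it is an assumption made for illustration: the vocabulary entries, the `TumorRecord` fields and the function names are hypothetical and do not describe the actual MTB or GXD schema.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary mapping organ terms to anatomical systems.
# Illustrative entries only; not taken from GXD or MTB.
ANATOMY_VOCABULARY = {
    "liver": "digestive system",
    "colon": "digestive system",
    "lung": "respiratory system",
}


def normalize_organ(term: str) -> str:
    """Return the vocabulary form of an organ name, or raise ValueError."""
    key = " ".join(term.lower().split())  # lowercase and collapse whitespace
    if key not in ANATOMY_VOCABULARY:
        raise ValueError(f"{term!r} is not a controlled-vocabulary term")
    return key


@dataclass(frozen=True)
class TumorRecord:
    """Hypothetical record covering some of the data types named in the text."""
    tumor_type: str
    strain: str
    organ: str
    reference: str


def make_record(tumor_type: str, strain: str, organ: str, reference: str) -> TumorRecord:
    # Only the organ field is validated in this sketch.
    return TumorRecord(tumor_type, strain, normalize_organ(organ), reference)


record = make_record("adenoma", "BALB/c", "  Liver ", "unpublished submission")
print(record.organ, "->", ANATOMY_VOCABULARY[record.organ])
```

Rejecting terms that are not in the vocabulary, rather than storing them as typed, is one simple way to ensure that a later query by organ or by anatomical system finds every matching record.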
Much of the pathology and diagnostic terminology used in MTB comes from the Pathobiology of the Aging Mouse (9), a standard mouse pathology text.

ENHANCEMENTS TO MTB IN 2000
Details concerning the design and implementation of MTB have been described elsewhere (4,5). The purpose of this report is to describe new features and recent enhancements to MTB. The most common request we received from our database users during the past year was for additional query options and reports for pathology and genetic data. Users also requested that we redesign some of the data summary pages so that they did not have to follow as many hypertext links to retrieve the information they were seeking. The details of the changes to the system in response to user feedback are described below. Screen shots and web links illustrating these changes can be viewed in the online version of this article (Supplementary Material).

New query options for tumor type searches
We have implemented two enhancements to the query options for tumor types. First, we added the capacity to search the database by anatomical system instead of just by organ name. Users can now submit queries such as "Retrieve all information from MTB for tumors of the Digestive System". Second, we added support for constraints on queries based on the status of metastases of a tumor. It is now possible, for example, to search for tumors of the mammary gland that are known to metastasize to the lung or the liver.

Enhancements to pathology queries
In the previous versions of MTB, users could only query for and view photomicrographs and diagnostic descriptions for specific strain/tumor combinations (e.g., "Show me all mammary gland adenocarcinomas for FVB/N-TgN(MMTVPyVT)634Mul female transgenic mice"). In the October 2000 release of the database, we added new query forms to allow users to search for pathology data by more general criteria, including organ system (e.g., "Retrieve all pathology images for tumors of the liver regardless of strain"), strain name (e.g., "Retrieve all available pathology images for tumors in BALB/c mice regardless of organ system or type of tumor") and tumor type (e.g., "Retrieve all available pathology data for mammary gland adenocarcinomas regardless of strain"). With this enhancement, database users can now generate results for broad queries about the pathology data in MTB with a single query instead of having to retrieve tumor/strain combinations one at a time.

A second enhancement to the representation of pathology data in MTB relates to the display of the histology images themselves. In previous versions of the database, users clicked on a thumbnail version of a histology image in a pathology summary page to view a version of the image with higher resolution of the cellular features. The higher resolution image replaced the current window and the user could no longer compare the image with the diagnostic text provided for the image. We have implemented a simple solution to this dilemma in which the user can view the higher resolution image and diagnostic text in a separate window by clicking on a button labeled "View Large Image" that is below the thumbnail version of the photomicrograph. The separate window displaying the higher resolution image can be resized and closed independently of the main web page.

Enhancements to gene queries
In both the conceptual and logical design of the MTB database we separated the concept of genetic changes in tumor cells from the genetics associated with the background of a particular strain of mouse. As a result, users interested in querying both strains and tumors represented in MTB by gene name or symbol needed to search the database using two different query forms. We have implemented a new query mechanism that allows users to search strain and tumor information by gene symbol simultaneously. Now, for example, a search of MTB using the gene symbol for the retinoblastoma 1 gene (Rb1) will return information both on the strains that carry a targeted, induced or spontaneous allele of the Rb1 gene and on the tumors that have reported genetic alterations (e.g., point mutations, deletions, etc.) in the Rb1 gene. The query results for gene symbol searches are returned in two parts. First, a list of the alleles f