automatic annotation of gene lists from literature analysis.ppt

Automatic Annotation of Gene Lists from Literature Analysis
Xin He
Beespace Annual Workshop
05/21/2009

Annotating Gene Lists

Enrichment of Gene Ontology Terms
Enrichment test based on these numbers:
- In the given gene list
- In the background

Limitations of GO Analysis
- GO annotation of all genes involves substantial manual effort.
- Rapid growth of the literature constantly adds new functions to existing genes.
- Coverage is not even across all areas, e.g. ecology and behavior; medicine; anatomy and physiology; etc.

Literature-based Analysis
- Enrichment of terms: if a term is associated with many genes in the input list, this term is likely important for this list.
- Need to account for the term occurrences expected by chance: a term may occur in the documents of a gene without being important.
- Gene-term matrix: the count of each term in the documents of a gene.

Overview of Gene List Annotator

Document Retrieval for Genes
- Input: a list of gene identifiers
  - Yeast: SGD ids
  - Fruit fly: FlyBase ids
  - Mouse: MGI ids
- Mapping genes to synonyms: use the Entrez Gene database (manually created synonyms)
- Document collection: choose or create one from Beespace
- Retrieve the documents in the collection that match at least one synonym

Statistical Method (I)
Intuition:
- For a gene i, if the term count x_i is significantly higher than expected by chance (determined by λ_0 and d_i), then the term may be related to gene i.
- If there are many genes related to the term, then this term is enriched in the given gene list.

Statistical Method (II)
- Dataset distribution: Poisson(λ; d)
- Reference distribution: Poisson(λ_0; d)
- Model: whether a gene is related to the term is unknown, so assume the term count x_i follows a mixture of the two Poisson distributions.
- Likelihood ratio test on the observed term counts: mixture distribution vs. null distribution (reference distribution only)

Interactive Analysis (I)
- Output control
- Significant Concepts
- Relevant Statistics
- Information of Input Genes
- Choose concepts

Interactive Analysis (II)
- User-selected concepts
- Genes containing the selected concepts
- Term counts in genes, and link to documents

Applications
Test case 1: bee genes differentially expressed in brain in different species during behavior maturation
- Broadly consistent with the results from GO enrichment analysis
- Identifies interesting genes
Test case 2: bee genes up-regulated in brain by the methoprene treatment (inducing behavior maturation)
- GO enrichment analysis: no significant terms
- A theme about myosin is overrepresented: may suggest neuron growth and movement, or remodeling, during behavior maturation
See Beespace v4 Demo for details: 1pm, Friday

Summary
- Not limited to a controlled vocabulary (GO)
- Even for concepts covered by GO, a broader notion of term relevance (gene-term co-occurrence in literature)
- Possible to retrieve the supporting documents for further exploration
- Not meant to substitute for GO-based analysis, but to serve as a complementary tool

Acknowledgement
Bruce Schatz
Software support: Xu Ling, Jing Jiang, Brant Chee, David Arcoelo
Biological evaluation: Moushumi Sen Sarma, Amy Toth
Gene Robinson
Chengxiang Zhai
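
Appendix: a sketch of the mixture likelihood ratio test
The statistical method in the slides (term counts modeled as a mixture of a dataset Poisson and a reference Poisson, compared against the reference alone) can be illustrated with a small Python sketch. This is not the Beespace implementation. It assumes that Poisson(λ; d) means a Poisson with mean λ·d, that the reference rate is known from a background collection, and that the mixture is fitted by EM with the dataset rate constrained to be at least the reference rate. The names mixture_lrt, lam0, p_related and the toy numbers are hypothetical.

```python
import math


def poisson_logpmf(x, mean):
    """Log probability of observing count x under Poisson(mean), mean > 0."""
    return x * math.log(mean) - mean - math.lgamma(x + 1)


def _component_logs(x, d, lam0, lam, p):
    """Log of the weighted 'related' and 'reference' densities for one gene."""
    related = math.log(p) + poisson_logpmf(x, lam * d)
    reference = math.log(1.0 - p) + poisson_logpmf(x, lam0 * d)
    return related, reference


def _logsumexp2(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def mixture_lrt(counts, sizes, lam0, n_iter=200, eps=1e-9):
    """Likelihood ratio statistic: two-Poisson mixture vs. reference only.

    counts[i] is the term count x_i for gene i, sizes[i] is d_i (assumed
    here to be the amount of text retrieved for gene i), and lam0 is the
    reference rate (assumed known and positive).
    """
    # Null model: every gene follows the reference distribution.
    ll_null = sum(poisson_logpmf(x, lam0 * d) for x, d in zip(counts, sizes))

    # EM for the mixing weight p and the dataset rate lam.
    p = 0.5
    lam = max(2.0 * lam0, sum(counts) / sum(sizes))
    for _ in range(n_iter):
        # E-step: probability that each gene is related to the term.
        resp = []
        for x, d in zip(counts, sizes):
            a, b = _component_logs(x, d, lam0, lam, p)
            resp.append(math.exp(a - _logsumexp2(a, b)))
        # M-step: update p and lam, keeping lam >= lam0 (enrichment only).
        p = min(max(sum(resp) / len(resp), eps), 1.0 - eps)
        weight = sum(r * d for r, d in zip(resp, sizes))
        if weight > 0.0:
            lam = sum(r * x for r, x in zip(resp, counts)) / weight
        lam = max(lam, lam0)

    ll_mix = sum(
        _logsumexp2(*_component_logs(x, d, lam0, lam, p))
        for x, d in zip(counts, sizes)
    )
    return {
        "statistic": max(0.0, 2.0 * (ll_mix - ll_null)),
        "p_related": p,
        "lam": lam,
    }


if __name__ == "__main__":
    # Toy input: six genes with equal d_i; three have unusually high counts.
    print(mixture_lrt([12, 9, 1, 0, 15, 1], [100.0] * 6, lam0=0.01))
```

A larger statistic means the mixture explains the counts much better than the reference alone, which matches the intuition that many genes with unexpectedly high counts make a term enriched. The slides do not say how significance is calibrated, so the sketch stops at the statistic. Converting it to a p-value would need a further assumption, for example simulation under the reference model.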
