Authors
Imanishi Tadashi Itoh Takeshi Suzuki Yutaka O'Donovan Claire Fukuchi Satoshi Koyanagi Kanako O. Barrero Roberto A. Tamura Takuro Yamaguchi-Kabata Yumi Tanino Motohiko Yura Kei Miyazaki Satoru Ikeo Kazuho Homma Keiichi Kasprzyk Arek Nishikawa Tetsuo Hirakawa Mika Thierry-Mieg Jean Thierry-Mieg Danielle Ashurst Jennifer Jia Libin Nakao Mitsuteru Thomas Michael A. Mulder Nicola Karavidopoulou Youla Jin Lihua Kim Sangsoo Yasuda Tomohiro Lenhard Boris Eveno Eric Suzuki Yoshiyuki Yamasaki Chisato Takeda Jun-ichi Gough Craig Hilton Phillip Fujii Yasuyuki Sakai Hiroaki Tanaka Susumu Amid Clara Bellgard Matthew Bonaldo Maria de Fatima Bono Hidemasa Bromberg Susan K. Brookes Anthony J. Bruford Elspeth Carninci Piero Chelala Claude Couillault Christine Souza Sandro J. de Debily Marie-Anne Devignes Marie-Dominique Dubchak Inna Endo Toshinori Estreicher Anne Eyras Eduardo Fukami-Kobayashi Kaoru R. Gopinath Gopal Graudens Esther Hahn Yoonsoo Han Michael Han Ze-Guang Hanada Kousuke Hanaoka Hideki Harada Erimi Hashimoto Katsuyuki Hinz Ursula Hirai Momoki Hishiki Teruyoshi Hopkinson Ian Imbeaud Sandrine Inoko Hidetoshi Kanapin Alexander Kaneko Yayoi Kasukawa Takeya Kelso Janet Kersey Paul Kikuno Reiko Kimura Kouichi Korn Bernhard Kuryshev Vladimir Makalowska Izabela Makino Takashi Mano Shuhei Mariage-Samson Regine Mashima Jun Matsuda Hideo Mewes Hans-Werner Minoshima Shinsei Nagai Keiichi Nagasaki Hideki Nagata Naoki Nigam Rajni Ogasawara Osamu Ohara Osamu Ohtsubo Masafumi Okada Norihiro Okido Toshihisa Oota Satoshi Ota Motonori Ota Toshio Otsuki Tetsuji Piatier-Tonneau Dominique Poustka Annemarie Ren Shuang-Xi Saitou Naruya Sakai Katsunaga Sakamoto Shigetaka Sakate Ryuichi Schupp Ingo Servant Florence Sherry Stephen Shiba Rie Shimizu Nobuyoshi Shimoyama Mary Simpson Andrew J. Soares Bento Steward Charles Suwa Makiko Suzuki Mami Takahashi Aiko Tamiya Gen Tanaka Hiroshi Taylor Todd Terwilliger Joseph D. 
Unneberg Per Veeramachaneni Vamsi Watanabe Shinya Wilming Laurens Yasuda Norikazu Yoo Hyang-Sook Stodolsky Marvin Makalowski Wojciech Go Mitiko Nakai Kenta Takagi Toshihisa Kanehisa Minoru Sakaki Yoshiyuki Quackenbush John Okazaki Yasushi Hayashizaki Yoshihide Hide Winston Chakraborty Ranajit Nishikawa Ken Sugawara Hideaki Tateno Yoshio Chen Zhu Oishi Michio Tonellato Peter Apweiler Rolf Okubo Kousaku Wagner Lukas Wiemann Stefan Strausberg Robert L. Isogai Takao Auffray Charles Nomura Nobuo Gojobori Takashi Sugano Sumio
Publisher
Public Library of Science
Journal
PLoS Biology (ISSN:15449173)
Volume, issue, pages / publication date
vol.2, no.6, pp.856-875, 2004-06
Citation count
3 226

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
Authors
Yamasaki Chisato Murakami Katsuhiko Fujii Yasuyuki Sato Yoshiharu Harada Erimi Takeda Jun-ichi Taniya Takayuki Satake Ryuichi Kikugawa Shingo Shimada Makoto Tanino Motohiko Halligan Brian Shimoyama Mary Twigger Simon Yura Kei Kimura Kouichi Yasuda Tomohiro Nishikawa Tetsuo Akiyama Yutaka Motono Chie Mukai Yuri Shionyu Masafumi Nagasaki Hideki Suwa Makiko Horton Paul Kikuno Reiko Ohara Osamu Lancet Doron Eveno Eric Graudens Esther Imbeaud Sandrine Debily Marie Anne Jia Libin Hayashizaki Yoshihide Amid Clara Han Michael Osanger Andreas Endo Toshinori Thomas Michael A. Hirakawa Mika Makalowski Wojciech Nakao Mitsuteru Kim Nam-Soon Thierry-Mieg Danielle Yoo Hyang-Sook De Souza Sandro J. Bonaldo Maria de Fatima Niimura Yoshihito Kuryshev Vladimir Schupp Ingo Wiemann Stefan Bellgard Matthew Thierry-Mieg Jean Wagner Lukas Zhang Qinghua Go Mitiko Minoshima Shinsei Ohtsubo Masafumi Hanada Kousuke Koyanagi Kanako O. Tonellato Peter Isogai Takao Zhang Ji Lenhard Boris Kim Sangsoo Chen Zhu Hinz Ursula Estreicher Anne Nakai Kenta Makalowska Izabela Barrero Roberto A. Hide Winston Tiffin Nicola Wilming Laurens Chakraborty Ranajit Soares Marcelo Bento Chiusano Maria Luisa Suzuki Yutaka Auffray Charles Yamaguchi-Kabata Yumi Itoh Takeshi Gough Craig Hishiki Teruyoshi Fukuchi Satoshi Nishikawa Ken Sugano Sumio Nomura Nobuo Tateno Yoshio Imanishi Tadashi Gojobori Takashi Chun Hong-Woo Habara Takuya Hanaoka Hideki Hayakawa Yosuke Hilton Philip B. 
Kaneko Yayoi Kanno Masako Kawahara Yoshihiro Kawamura Toshiyuki Matsuya Akihiro Nagata Naoki Nishikata Kensaku Ogura Noda Akiko Nurimoto Shin Saichi Naomi Sakai Hiroaki Sanbonmatsu Ryoko Shiba Rie Suzuki Mami Takabayashi Kazuhiko Takahashi Aiko Tamura Takuro Tanaka Masayuki Tanaka Susumu Todokoro Fusano Yamaguchi Kaori Yamamoto Naoyuki Okido Toshihisa Mashima Jun Hashizume Aki Jin Lihua Lee Kyung-Bum Lin Yi-Chueh Nozaki Asami Sakai Katsunaga Tada Masahito Miyazaki Satoru Makino Takashi Ohyanagi Hajime Osato Naoki Tanaka Nobuhiko Suzuki Yoshiyuki Ikeo Kazuho Saitou Naruya Sugawara Hideaki O'Donovan Claire Kulikova Tamara Whitfield Eleanor
Publisher
Oxford University Press
Journal
Nucleic Acids Research (ISSN:03051048)
Volume, issue, pages / publication date
vol.36, no.1, pp.D793-D799, 2008-01
Citation count
12

Here we report the new features and improvements in our latest release of the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/), a comprehensive annotation resource for human genes and transcripts. H-InvDB, originally developed as an integrated database of the human transcriptome based on extensive annotation of large sets of full-length cDNA (FLcDNA) clones, now provides annotation for 120 558 human mRNAs extracted from the International Nucleotide Sequence Databases (INSD), in addition to 54 978 human FLcDNAs, in the latest release H-InvDB_4.6. We mapped those human transcripts onto the human genome sequences (NCBI build 36.1) and determined 34 699 human gene clusters, which could define 34 057 (98.1%) protein-coding and 642 (1.9%) non-protein-coding loci; 858 (2.5%) transcribed loci overlapped with predicted pseudogenes. For all these transcripts and genes, we provide comprehensive annotation including gene structures, gene functions, alternative splicing variants, functional non-protein-coding RNAs, functional domains, predicted subcellular localizations, metabolic pathways, predictions of protein 3D structure, mapping of SNPs and microsatellite repeat motifs, co-localization with orphan diseases, gene expression profiles, orthologous genes, protein–protein interactions (PPI) and annotation for gene families. The current H-InvDB annotation resources consist of two main views, Transcript view and Locus view, and eight sub-databases: the DiseaseInfo Viewer, H-ANGEL, the Clustering Viewer, G-integra, the TOPO Viewer, Evola, the PPI view and the Gene family/group.