Authors
So Nakagawa Toshiaki Katayama Lihua Jin Jiaqi Wu Kirill Kryukov Rise Oyachi Junko S Takeuchi Takatomo Fujisawa Satomi Asano Momoka Komatsu Jun-ichi Onami Takashi Abe Masanori Arita
Publisher
The Genetics Society of Japan
Journal
Genes & Genetic Systems (ISSN:13417568)
Volume, issue, pages, and publication date
pp.23-00085, (Released:2023-10-14)
Number of references
53

Since the early phase of the coronavirus disease 2019 (COVID-19) pandemic, a number of research institutes have been sequencing and sharing high-quality severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes to trace the route of infection in Japan. To provide insight into the spread of COVID-19, we developed a web platform named SARS-CoV-2 HaploGraph to visualize the emergence timing and geographical transmission of SARS-CoV-2 haplotypes. Using data from the GISAID EpiCoV database as of June 4, 2022, we created a haplotype naming system by determining the ancestral haplotype for each epidemic wave and showed prefecture- or region-specific haplotypes in each of four waves in Japan. The SARS-CoV-2 HaploGraph allows for interactive tracking of virus evolution and of geographical prevalence of haplotypes, and aids in developing effective public health control strategies during the global pandemic. The code and the data used for this study are publicly available at: https://github.com/ktym/covid19/.
Authors
Yasuhiro TANIZAWA Takatomo FUJISAWA Eli KAMINUMA Yasukazu NAKAMURA Masanori ARITA
Publisher
BMFH Press
Journal
Bioscience of Microbiota, Food and Health (ISSN:21863342)
Volume, issue, pages, and publication date
vol.35, no.4, pp.173-184, 2016 (Released:2016-10-28)
Number of references
49
Number of citations
188

Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
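The ANI-based species check described in this abstract can be illustrated with a minimal sketch. It assumes pairwise ANI values have already been computed by an external tool; the function names and any sample values are hypothetical, and only the 95% threshold comes from the abstract. It is not the DFAST implementation.

```python
# Minimal sketch: classify genomes from precomputed ANI values.
# Only the 95% threshold is taken from the abstract; all names here
# are illustrative assumptions, not part of DFAST or DAGA.

ANI_SPECIES_THRESHOLD = 95.0  # percent


def same_species(ani_percent, threshold=ANI_SPECIES_THRESHOLD):
    """Return True when the ANI value meets or exceeds the species threshold."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return ani_percent >= threshold


def flag_mislabeled(records, threshold=ANI_SPECIES_THRESHOLD):
    """Return names of genomes whose ANI against the reference genome of
    their labeled species falls below the threshold.

    `records` is an iterable of (genome_name, ani_percent) pairs.
    """
    return [name for name, ani in records if not same_species(ani, threshold)]
```

A genome flagged this way is only a candidate for relabeling; the abstract itself notes that some subgroups within a single named species differ by more than the 95% threshold.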
Authors
Eli Kaminuma Yukino Baba Masahiro Mochizuki Hirotaka Matsumoto Haruka Ozaki Toshitsugu Okayama Takuya Kato Shinya Oki Takatomo Fujisawa Yasukazu Nakamura Masanori Arita Osamu Ogasawara Hisashi Kashima Toshihisa Takagi
Publisher
The Genetics Society of Japan
Journal
Genes & Genetic Systems (ISSN:13417568)
Volume, issue, pages, and publication date
pp.19-00034, (Released:2020-03-26)
Number of references
37
Number of citations
3

Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One solution to this problem is to utilise the power of crowdsourcing. In this report, we describe how we investigated the potential of crowdsourced modelling for a life science task by conducting a machine learning competition, the DNA Data Bank of Japan (DDBJ) Data Analysis Challenge. In the challenge, participants predicted chromatin feature annotations from DNA sequences with competing models. The challenge engaged 38 participants, with a cumulative total of 360 model submissions. The performance of the top model resulted in an area under the curve (AUC) score of 0.95. Over the course of the competition, the overall performance of the submitted models improved by an AUC score of 0.30 from the first submitted model. Furthermore, the 1st- and 2nd-ranking models utilised external data such as genomic location and gene annotation information with specific domain knowledge. The effect of incorporating this domain knowledge led to improvements of approximately 5%–9%, as measured by the AUC scores. This report suggests that machine learning competitions will lead to the development of highly accurate machine learning models for use by experimental scientists unfamiliar with the complexities of data science.
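For readers unfamiliar with the AUC metric used to rank the submissions, the following is a minimal sketch of how an area under the ROC curve can be computed from binary labels and model scores. It uses the pairwise-ranking formulation (the fraction of positive/negative pairs in which the positive example scores higher, with ties counted as half); the function name is hypothetical and this is not the scoring code used in the challenge.

```python
# Minimal sketch of AUC via pairwise comparison of positives and negatives.
# This is an illustration of the metric, not the challenge's scoring code.


def auc_score(labels, scores):
    """Compute the ROC AUC for binary labels (1 = positive, 0 = negative)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

The quadratic loop is adequate for small examples; a rank-based computation would be preferable for large test sets.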
Authors
Yasuhiro TANIZAWA Hisami KOBAYASHI Eli KAMINUMA Mitsuo SAKAMOTO Moriya OHKUMA Yasukazu NAKAMURA Masanori ARITA Masanori TOHNO
Publisher
BMFH Press
Journal
Bioscience of Microbiota, Food and Health (ISSN:21863342)
Volume, issue, pages, and publication date
vol.36, no.3, pp.129-134, 2017 (Released:2017-07-13)
Number of references
28
Number of citations
2

Whole-genome sequencing was performed for Lactobacillus parakefiri JCM 8573T to confirm its hitherto controversial taxonomic position. Here, we report its first reliable reference genome. Genome-wide metrics, such as average nucleotide identity and digital DNA-DNA hybridization, and phylogenomic analysis based on multiple genes supported its taxonomic status as a distinct species in the genus Lactobacillus. The availability of a reliable genome sequence will aid future investigations on the industrial applications of L. parakefiri in functional foods such as kefir grains.
Authors
Yutaka Ueno Kiyoshi Asai Masanori Arita
Publisher
Japanese Society for Bioinformatics
Journal
Genome Informatics (ISSN:09199454)
Volume, issue, pages, and publication date
vol.10, pp.166-175, 1999 (Released:2011-07-11)
Number of references
16

We have constructed a general framework for integrating application programs with control through a local Web browser. This method is based on a simple inter-process message function from an external process to application programs. Commands to a target program are prepared in a script file, which is parsed by a message dispatcher program. When it is used as a helper application to a Web browser, these messages will be sent from the browser by clicking a hyperlink in a Web document. Our framework also supports pluggable extension modules for application programs by means of dynamic linking. A prototype system is implemented on our molecular structure-viewer program, MOSBY. It successfully featured a function to load an extension module required for the docking study of molecular fragments from a Web page. Our simple framework facilitates the concise configuration of Web software without complicated knowledge of network computation and security issues. It is also applicable to a wide range of network computations that process private data using a Web browser.
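The script-driven dispatch described in this abstract can be sketched roughly as follows. The script syntax (`target command args...`), the class name, and the handler signature are all assumptions made for illustration; the abstract does not specify them. Handlers here are in-process callables standing in for the inter-process messages of the original system.

```python
# Minimal sketch of a message dispatcher that parses a command script and
# forwards each command to a registered target. The script format and all
# names are hypothetical; real inter-process messaging is replaced by
# plain Python callables.
import shlex


class MessageDispatcher:
    def __init__(self):
        self._handlers = {}  # target name -> callable(command, args)

    def register(self, target, handler):
        """Associate a target program name with a handler callable."""
        self._handlers[target] = handler

    def dispatch_script(self, script_text):
        """Parse each non-empty, non-comment line and send it to its target.

        Returns the list of handler results in script order.
        """
        results = []
        for raw_line in script_text.splitlines():
            line = raw_line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            tokens = shlex.split(line)
            if len(tokens) < 2:
                raise ValueError(f"expected 'target command [args]': {line!r}")
            target, command, args = tokens[0], tokens[1], tokens[2:]
            if target not in self._handlers:
                raise KeyError(f"no handler registered for target {target!r}")
            results.append(self._handlers[target](command, args))
        return results
```

In a helper-application setup, the browser would hand the downloaded script file to such a dispatcher, which would then relay each command to the running viewer.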