Authors
Eli Kaminuma, Yukino Baba, Masahiro Mochizuki, Hirotaka Matsumoto, Haruka Ozaki, Toshitsugu Okayama, Takuya Kato, Shinya Oki, Takatomo Fujisawa, Yasukazu Nakamura, Masanori Arita, Osamu Ogasawara, Hisashi Kashima, Toshihisa Takagi
Publisher
The Genetics Society of Japan
Journal
Genes & Genetic Systems (ISSN:13417568)
Volume, issue, pages and publication date
pp.19-00034, (Released:2020-03-26)
Number of references
37
Number of citations
3

Recently, the prospect of applying machine learning tools to automate the annotation analysis of large-scale sequences from next-generation sequencers has attracted the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One solution to this problem is to utilise the power of crowdsourcing. In this report, we describe how we investigated the potential of crowdsourced modelling for a life science task by conducting a machine learning competition, the DNA Data Bank of Japan (DDBJ) Data Analysis Challenge. In the challenge, participants competed to build models that predict chromatin feature annotations from DNA sequences. The challenge engaged 38 participants, with a cumulative total of 360 model submissions. The top model achieved an area under the curve (AUC) score of 0.95. Over the course of the competition, the performance of the submitted models improved by 0.30 in AUC relative to the first submitted model. Furthermore, the 1st- and 2nd-ranking models utilised external data such as genomic location and gene annotation information, drawing on specific domain knowledge. Incorporating this domain knowledge led to improvements of approximately 5%–9%, as measured by the AUC scores. This report suggests that machine learning competitions can lead to the development of highly accurate machine learning models for use by experimental scientists unfamiliar with the complexities of data science.
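For readers unfamiliar with the metric quoted in the abstract above, the following is a minimal sketch of how an AUC score can be computed for a binary predictor. It assumes that the ROC curve is the curve meant, which the abstract does not state. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the challenge data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example
    receives the higher score; ties count as half a win.

    Assumption: labels are 1 (positive) or 0 (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two positives, two negatives.
example_labels = [1, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```

Under this pairwise definition, a score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.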
Authors
Takuya Kuwahara, Yukino Baba, Hisashi Kashima, Takeshi Kishikawa, Junichi Tsurumi, Tomoyuki Haga, Yoshihiro Ujiie, Takamitsu Sasaki, Hideki Matsushima
Publisher
Information Processing Society of Japan
Journal
Journal of Information Processing (ISSN:18826652)
Volume, issue, pages and publication date
vol.26, pp.306-313, 2018 (Released:2018-03-15)
Number of references
17
Number of citations
18

Modern vehicles are equipped with Electronic Control Units (ECUs) and external communication devices. The Controller Area Network (CAN), a widely used communication protocol for ECUs, does not have a security mechanism to detect improper packets; if attackers exploit the vulnerability of an ECU and manage to inject a malicious message, they are able to control other ECUs to cause improper operation of the vehicle. With the increasing popularity of connected cars, it has become an urgent matter to protect in-vehicle networks against security threats. In this paper, we study the applicability of statistical anomaly detection methods for identifying malicious CAN messages in in-vehicle networks. We focus on intrusion attacks that inject malicious messages. Because the occurrence of an intrusion attack certainly influences the message traffic, we use the number of messages observed in a fixed time window to detect intrusion attacks. We formalize features to represent a message sequence that incorporate the number of messages associated with each receiver ID. We collected CAN message data from an actual vehicle and conducted a quantitative analysis of the methods and the features in practical situations. The results of our experiments demonstrated that our proposed methods provide fast and accurate detection in various cases.
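The window-based feature described in the abstract above can be illustrated with a short sketch. This is not the authors' implementation; it only shows one plausible way to count messages per ID in fixed time windows. The function name `window_count_features`, the integer millisecond timestamps, the window length, and the example IDs are all assumptions made for illustration.

```python
from collections import Counter


def window_count_features(messages, window_ms, ids):
    """Turn a CAN message log into per-window count vectors.

    messages:  list of (timestamp_ms, can_id) tuples sorted by time
               (integer milliseconds are an assumption of this sketch).
    window_ms: length of each fixed time window in milliseconds.
    ids:       the IDs to report, in the order used for each vector.

    Returns one list of counts per window, including empty windows.
    """
    if not messages:
        return []

    start = messages[0][0]
    per_window = {}
    for timestamp, can_id in messages:
        index = (timestamp - start) // window_ms
        per_window.setdefault(index, Counter())[can_id] += 1

    n_windows = max(per_window) + 1
    empty = Counter()
    return [
        [per_window.get(i, empty)[can_id] for can_id in ids]
        for i in range(n_windows)
    ]


# Hypothetical log: ID 0x100 suddenly appears more often in the second window.
example_log = [
    (0, 0x100), (10, 0x200), (20, 0x100),
    (110, 0x100), (120, 0x100), (130, 0x100), (140, 0x100),
]
example_features = window_count_features(example_log, window_ms=100, ids=[0x100, 0x200])
```

Each resulting vector could then be passed to a statistical anomaly detector, with an unusually high count for some ID in one window being the kind of traffic change that an injected message stream would be expected to cause.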