Author
手良向 聡
Publisher
The Biometric Society of Japan
Journal
計量生物学 (ISSN:09184430)
Volume, issue, pages, and publication date
vol.44, no.1, pp.35-51, 2023-10-31 (Released:2023-12-06)
Number of references
51

Determining the number of subjects to include in a clinical trial is a crucial aspect of experimental design. The standard methodology for sample size determination (SSD) is based on a frequentist perspective, while the literature addressing the SSD problem from a Bayesian perspective has grown over the last 20 years. In this paper I discuss the basic concept of Bayesian SSD, with a specific focus on an inferential, performance-based (non-decision-theoretic) approach that uses two distinct prior distributions: an analysis prior and a design prior. The analysis prior formalizes pre-trial information and is used to obtain posterior distributions, while the design prior describes a scenario and is used to obtain prior predictive distributions. In practice, the specification of prior distributions is a key element of Bayesian inference. The prior information may be derived from either expert beliefs or relevant empirical data, and the subjective knowledge obtained from an expert elicitation procedure may be useful for defining a prior distribution when no data, or only limited data, from previous studies are available. In experimental design, the interplay between Bayesian and frequentist methodology is intrinsic. Whichever method is used for SSD, it is of paramount importance to distinguish between demands, as expressed in the range of equivalence, and expectations or beliefs, as represented by the prior information.
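The roles of the two priors can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a single-arm trial with a binary response, Beta priors with integer parameters, the success criterion "posterior probability that the response rate exceeds `theta0` is at least `lam`", and the requirement that the prior predictive probability of meeting that criterion is at least `gamma`. The analysis prior enters only the posterior calculation, and the design prior enters only the prior predictive (beta-binomial) calculation.

```python
from math import comb, exp, lgamma


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a and b (binomial-sum identity)."""
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a, m + 1))


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """Prior predictive probability of x responses in n subjects under Beta(a, b)."""
    return comb(n, x) * exp(log_beta(x + a, n - x + b) - log_beta(a, b))


def assurance(n, theta0, lam, analysis, design):
    """Prior predictive probability that the posterior criterion is met.

    analysis: (a, b) of the analysis prior, used only for the posterior.
    design:   (a, b) of the design prior, used only for the predictive weights.
    """
    a, b = analysis
    ad, bd = design
    total = 0.0
    for x in range(n + 1):
        # Posterior is Beta(a + x, b + n - x) under the analysis prior.
        post_prob = 1.0 - beta_cdf_int(theta0, a + x, b + n - x)
        if post_prob >= lam:
            total += beta_binomial_pmf(x, n, ad, bd)
    return total


def smallest_n(theta0, lam, gamma, analysis, design, max_n=300):
    """Smallest n (up to max_n) whose assurance reaches gamma, else None."""
    for n in range(1, max_n + 1):
        if assurance(n, theta0, lam, analysis, design) >= gamma:
            return n
    return None


# Hypothetical inputs: uniform analysis prior, design prior centred near 0.7.
n_required = smallest_n(0.5, 0.95, 0.8, analysis=(1, 1), design=(14, 6))
print(n_required)
```

Changing the design prior changes the scenario under which the trial is planned, while changing the analysis prior changes how the data will be interpreted once observed; the sketch keeps these two inputs separate for that reason.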
Author
Takashi Yanagawa
Publisher
The Biometric Society of Japan
Journal
計量生物学 (ISSN:09184430)
Volume, issue, pages, and publication date
vol.40, no.2, pp.69-79, 2020-06-01 (Released:2020-07-21)
Number of references
15

Reproducibility is the essence of scientific research. Focusing on two-sample problems, this paper discusses the reproducibility of statistical test results based on p-values. First, by demonstrating the large variability of p-values, we show that p-values lack reproducibility, particularly when sample sizes are insufficient. Second, a sample size formula is developed that assures a given level of the reproducibility probability of the p-value, assuming normal distributions with known variance. Finally, the sample size formula for reproducibility in a general framework is shown to be equivalent to the sample size formula developed for Neyman-Pearson hypothesis testing, which is based on the significance level and the power.
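For reference, the classical Neyman-Pearson sample size for a two-sided, two-sample test of means with known common variance can be computed as in the sketch below. This is the textbook formula n = 2σ²(z_{1-α/2} + z_{1-β})² / δ² per group, not the paper's reproducibility formula itself; the function name and the example inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def two_sample_n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-sample z-test.

    delta: difference in means to be detected (assumed value).
    sigma: known common standard deviation (assumed value).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile corresponding to the power
    return ceil(2 * (sigma * (z_alpha + z_power) / delta) ** 2)


# Example: detect a half-standard-deviation difference with 80% power.
print(two_sample_n_per_group(delta=0.5, sigma=1.0))  # prints 63
```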
Authors
Tasuku Okui Yutaka Matsuyama Shigeyuki Nakaji
Publisher
The Biometric Society of Japan
Journal
計量生物学 (ISSN:09184430)
Volume, issue, pages, and publication date
vol.39, no.2, pp.55-84, 2019-01-31 (Released:2019-05-11)
Number of references
59

Many methods that use 16S ribosomal RNA (16S rRNA) gene sequencing data have been proposed for the analysis of gut microbial compositional data. Statistically, 16S rRNA sequencing data are multivariate count data. When multivariate data analysis methods are used for association analysis with a disease, 16S rRNA sequencing data are generally normalized before the analysis models are fitted, because the total sequence read counts differ between subjects. However, proper methods for normalization have not yet been discussed or proposed. Rarefying is one such normalization method; it equalizes the total counts across subjects by subsampling a fixed number of counts from each subject. We expected that combining rarefying with ensemble learning would improve performance. We therefore proposed an association analysis method that combines rarefying with ensemble learning and evaluated its performance in a simulation experiment using several multivariate data analysis methods. The proposed method showed superior performance compared with other analysis methods in terms of its ability to identify response-associated variables and to classify a response variable. We also applied each evaluated method to the gut microbial data of Japanese people and compared the results.
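Rarefying as described above can be sketched as follows. This is a minimal illustration that assumes a count table given as one list of per-taxon read counts per subject and uses the smallest total count as the common depth; the function names are hypothetical, and the step where each rarefied replicate would be passed to a base learner of an ensemble is left out because the abstract does not specify it.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from one subject's counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the subject's total read count")
    # Expand the counts into one taxon label per read, then draw a subsample.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    drawn = Counter(rng.sample(reads, depth))
    return [drawn.get(taxon, 0) for taxon in range(len(counts))]


def rarefied_replicates(table, n_replicates, seed=0):
    """Return several independently rarefied copies of a count table.

    Every subject in every copy has the same total, equal to the smallest
    total count in the original table (an assumed choice of depth).
    """
    rng = random.Random(seed)
    depth = min(sum(row) for row in table)
    return [[rarefy(row, depth, rng) for row in table]
            for _ in range(n_replicates)]


# Toy table: three subjects, three taxa, very different sequencing depths.
table = [[50, 30, 20], [5, 3, 2], [100, 0, 400]]
for replicate in rarefied_replicates(table, n_replicates=2):
    print(replicate)
```

Because each replicate is a different random subsample, repeating the draw yields a set of normalized tables, which is what makes rarefying a natural candidate for combination with an ensemble of models.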
Authors
津本 周作 平野 章二 津本 優子
Publisher
The Biometric Society of Japan
Journal
計量生物学 (ISSN:09184430)
Volume, issue, pages, and publication date
vol.29, pp.S155-S176, 2008-12-15
Number of citing articles
1

A hospital information system (HIS) collects data from all departments in a hospital, including laboratory tests, physiological tests, and electronic patient records. An HIS can thus be viewed as a large heterogeneous database that stores chronological changes in patients' status. This paper gives an overview of three applications of data mining and statistical methods to HIS data. First, clustering of temporal sequences based on multiscale matching was applied to group chronic hepatitis cases. Second, a decision tree method was used to detect risk factors, and the results were successfully used to prevent nursing medication errors. Finally, several linear models were applied to hospital management data. These results show that data mining methods, including decision tree mining and temporal data mining, are useful for detecting risk factors in large distributed data such as HIS data, a process that can be called risk mining.