- Authors
-
Sai YAO
Daichi KITAHARA
Hiroki KURODA
Akira HIRABAYASHI
- Publisher
- The Institute of Electronics, Information and Communication Engineers
- Journal
- IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
- Volume, issue, pages, and publication date
- vol.E106-A, no.2, pp.106-123, 2023-02-01
The mean, median, and mode are usually calculated from univariate observations as the most basic representative values of a random variable. To measure the spread of the distribution, the standard deviation, interquartile range, and modal interval are also calculated. When we analyze continuous relations between a pair of random variables from bivariate observations, regression analysis is often used. By minimizing appropriate costs that evaluate regression errors, we estimate the conditional mean, median, and mode. The conditional standard deviation can be estimated if the bivariate observations are obtained from a Gaussian process. Moreover, the conditional interquartile range can be calculated for various distributions by quantile regression, which estimates any conditional quantile (percentile). Meanwhile, the study of modal interval regression is relatively new, and spline regression models, which are known as flexible models that are optimal in terms of smoothness for bivariate data, have not yet been used for it. In this paper, we propose a modal interval regression method based on spline quantile regression. The proposed method consists of two steps. In the first step, we divide the bivariate observations into bins for one random variable, then detect the modal interval for the other random variable as the lower and upper quantiles in each bin. In the second step, we estimate the conditional modal interval by constructing both lower and upper quantile curves as spline functions. Because it uses spline quantile regression, the proposed method is applicable to a wide range of distributions and is formulated as a convex optimization problem on the coefficient vectors of the lower and upper spline functions. Extensive experiments, including settings of the bin width, the smoothing parameter, and the weights in the cost function, show the effectiveness of the proposed modal interval regression in terms of accuracy and visual shape for synthetic data generated from various distributions. Experiments on real-world meteorological data also demonstrate good performance of the proposed method.
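The first step described in the abstract (binning one variable and detecting a modal interval of the other in each bin) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the modal interval is taken as the shortest interval containing a chosen fraction of the samples in a bin, and the function names `shortest_interval` and `binned_modal_intervals`, as well as the parameters `coverage` and `min_count`, are hypothetical. The second step (fitting lower and upper spline quantile curves by convex optimization) is not shown.

```python
import numpy as np


def shortest_interval(y, coverage=0.5):
    """Return (lower, upper) of the shortest interval holding about `coverage` of y.

    Assumption: this shortest-interval rule stands in for the modal interval.
    """
    y = np.sort(np.asarray(y, dtype=float))
    n = y.size
    # Number of consecutive sorted samples the interval must contain.
    k = min(n, max(2, int(np.ceil(coverage * n))))
    # Width of every window of k consecutive sorted samples.
    widths = y[k - 1:] - y[:n - k + 1]
    i = int(np.argmin(widths))
    return y[i], y[i + k - 1]


def binned_modal_intervals(x, y, bin_width, coverage=0.5, min_count=5):
    """Bin the observations along x and detect a modal interval of y in each bin.

    Returns arrays of bin centers, lower bounds, and upper bounds.
    Bins with fewer than `min_count` samples are skipped.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = x.min()
    # Index of the equal-width bin each observation falls into.
    idx = np.floor((x - x0) / bin_width).astype(int)
    centers, lowers, uppers = [], [], []
    for b in np.unique(idx):
        mask = idx == b
        if mask.sum() < min_count:
            continue
        lo, hi = shortest_interval(y[mask], coverage)
        centers.append(x0 + (b + 0.5) * bin_width)
        lowers.append(lo)
        uppers.append(hi)
    return np.array(centers), np.array(lowers), np.array(uppers)
```

The per-bin lower and upper bounds produced this way are only pointwise estimates. In the method summarized above, smooth conditional bounds are then obtained by fitting spline functions to the corresponding lower and upper quantiles.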