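Authors
山田 憲嗣 高橋 秀也 志水 英二
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 D (ISSN:09151923)
Volume, issue, pages, and publication date
vol.J80-D2, no.11, pp.2986-2994, 1997-11-25

This paper describes a method that applies the coded aperture technique, which is used as an imaging method in the short-wavelength region, to the visible-light region in order to detect the position of an object and the three-dimensional shape of its surface. The method exploits the property that the size of the aperture pattern projected onto the projection plane changes with the distance from the object to the aperture plane, and uses this property to detect the distance from the aperture plane to the object. Extending this distance detection to each pixel of the target object's surface yields the three-dimensional shape of the surface. We examine the conditions under which the coded aperture technique used in the short-wavelength region can be applied in the visible-light region, and we built a prototype system and detected the shape of three-dimensional objects. Unlike binocular (stereo) vision, the proposed method is a monocular method that uses only the light reflected from the object, so three-dimensional shape detection can be realized with a simple measurement system. Moreover, the method is not limited to light: it can be applied to anything with wave properties, visible or invisible, to perform three-dimensional shape detection.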
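The distance cue described in this abstract can be illustrated with simple shadow geometry. The sketch below assumes an idealized point source and purely geometric optics, which the abstract does not state; the function and parameter names are hypothetical. Under that assumption, an aperture of width `w` at distance `z` from the source casts a pattern of width `w * (z + d) / z` on a screen placed `d` behind the aperture, so the distance can be recovered from the measured magnification.

```python
def projected_width(aperture_width, object_distance, screen_distance):
    """Width of the aperture pattern on the screen for a point source.

    Assumes similar triangles: the source is object_distance in front of
    the aperture and the screen is screen_distance behind it.
    """
    return aperture_width * (object_distance + screen_distance) / object_distance


def object_distance_from_projection(aperture_width, projected, screen_distance):
    """Invert the relation above to recover the source-to-aperture distance."""
    magnification = projected / aperture_width
    if magnification <= 1.0:
        # A finite-distance point source always magnifies the pattern.
        raise ValueError("projected width must exceed the aperture width")
    return screen_distance / (magnification - 1.0)


if __name__ == "__main__":
    w_proj = projected_width(10.0, 200.0, 50.0)
    print(w_proj)
    print(object_distance_from_projection(10.0, w_proj, 50.0))
```

Repeating the inversion for every surface point is the idea behind extending the distance measurement to a full shape map; how the prototype system separates the overlapping patterns is not covered by this sketch.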
Authors
Daisuke SAITO Nobuaki MINEMATSU Keikichi HIROSE
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E103.D, no.6, pp.1395-1405, 2020-06-01 (Released:2020-06-01)
Number of references
28

This paper describes a novel approach to flexible control of speaker characteristics using a tensor representation of multiple Gaussian mixture models (GMMs). In voice conversion studies, realizing conversion from/to an arbitrary speaker's voice is one of the important objectives. For this purpose, eigenvoice conversion (EVC) based on an eigenvoice GMM (EV-GMM) was proposed. In EVC, a speaker space is constructed from GMM supervectors, which are high-dimensional vectors derived by concatenating the mean vectors of each speaker's GMM. In the speaker space, each speaker is represented by a small number of weight parameters of eigen-supervectors. In this paper, we revisit the construction of the speaker space by introducing tensor factor analysis of the training data set. In our approach, each speaker is represented as a matrix whose rows and columns correspond, respectively, to the dimensions of the mean vector and the Gaussian components. The speaker space is derived by tensor factor analysis of the set of these matrices. Our approach can solve an inherent problem of the supervector representation, and it improves the performance of voice conversion. In addition, the effects of speaker adaptive training before factorization are also investigated. Experimental results of one-to-many voice conversion demonstrate the effectiveness of the proposed approach.
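The difference between the two speaker representations named in this abstract can be made concrete with a small data-layout sketch. It only shows how the same GMM mean vectors are arranged as a supervector or as a matrix; the factor analysis itself is not implemented, and the function names are hypothetical.

```python
def to_supervector(means):
    """Concatenate the mean vectors of all Gaussian components.

    means: list of M mean vectors, each of dimension D.
    Returns a flat list of length M * D.
    """
    return [value for mean in means for value in mean]


def to_speaker_matrix(means):
    """Arrange the same means as a D x M matrix.

    Rows index the feature dimension and columns index the Gaussian
    component, matching the layout described in the abstract.
    """
    return [list(row) for row in zip(*means)]


if __name__ == "__main__":
    # Toy speaker with M = 2 components and D = 3 feature dimensions.
    speaker_means = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(to_supervector(speaker_means))
    print(to_speaker_matrix(speaker_means))
```

Stacking one such matrix per training speaker gives a third-order array (speaker, dimension, component), which is the kind of object a tensor factor analysis would operate on.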
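Authors
生川 菜々 小川 晃一
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 B (ISSN:13444697)
Volume, issue, pages, and publication date
vol.J103-B, no.2, pp.79-89, 2020-02-01

This paper studies a method for improving angle-measurement accuracy in a 32×32 MIMO (Multiple Input Multiple Output) and AOA (Angle of Arrival) composite antenna by loading reactance elements to mitigate the phase perturbation of the arrival-wave estimation function caused by electromagnetic coupling between the MIMO and AOA antennas. Because the phase-monopulse AOA antenna of the 32×32 MIMO system is surrounded by many MIMO subarrays, electromagnetic coupling occurs and makes direction-of-arrival estimation difficult. To solve this problem, we propose a method that controls the reactance value of the load impedance and show that it enables highly accurate direction-of-arrival estimation. The analysis shows that optimizing the load impedance yields an estimated-angle error of 0.2 degrees or less. We also show that an angle-measurement accuracy of 5 degrees or better can be achieved even in a Rician fading environment with a K-factor of -10 dB, where the direct-wave level is low. Furthermore, we loaded the load impedance and conducted OTA (Over-The-Air) experiments in a Rician fading environment. The results confirm that loading the reactance reduces the angle-measurement error. Finally, we discuss the mechanism of the error reduction in terms of the current distribution on the elements.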
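To see why a phase perturbation degrades angle estimation, it helps to write out the textbook two-element phase-comparison relation. This is a generic sketch, assuming a plane wave and two elements spaced `d` apart; it is not the antenna model of the paper, and the function names are hypothetical. The phase difference is `2*pi*d*sin(theta)/lambda`, so any coupling-induced phase offset maps directly into an angle error.

```python
import math


def phase_difference(angle_deg, spacing, wavelength):
    """Inter-element phase difference (radians) for a plane wave."""
    return 2.0 * math.pi * spacing * math.sin(math.radians(angle_deg)) / wavelength


def angle_of_arrival(delta_phase, spacing, wavelength):
    """Estimate the arrival angle (degrees) from a measured phase difference."""
    s = delta_phase * wavelength / (2.0 * math.pi * spacing)
    if abs(s) > 1.0:
        raise ValueError("phase difference is not consistent with this spacing")
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    wavelength = 1.0
    spacing = 0.5 * wavelength  # half-wavelength spacing, an assumption
    true_phase = phase_difference(30.0, spacing, wavelength)
    print(angle_of_arrival(true_phase, spacing, wavelength))
    # A hypothetical 5-degree phase perturbation shifts the estimate.
    print(angle_of_arrival(true_phase + math.radians(5.0), spacing, wavelength))
```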
Authors
Kanta MATSUURA
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
Volume, issue, pages, and publication date
vol.E102-A, no.1, pp.3-10, 2019-01-01
Number of citations
6

Financial Technology (FinTech) is considered a taxonomy that describes a wide range of ICT (information and communications technology) associated with financial transactions and related operations. Improvement of service quality is the main issue addressed in this taxonomy, and there are a large number of emerging technologies including blockchain-based cryptocurrencies and smart contracts. Due to its innovative nature in accounting, blockchain can also be used in lots of other FinTech contexts where token models play an important role for financial engineering. This paper revisits some of the key concepts accumulated behind this trend, and shows a generalized understanding of the technology using an adapted stochastic process. With a focus on financial instruments using blockchain, research directions toward stable applications are identified with the help of a newly proposed stabilizer: interpretation function of token valuation. The idea of adapted stochastic process is essential for the stabilizer, too.
Authors
Keisuke UTSU Osamu UCHIDA
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
Volume, issue, pages, and publication date
pp.2020EAL2040, (Released:2020-05-20)
Number of citations
5

The 2019 Typhoon Hagibis (No. 19) caused widespread destruction in eastern Japan. During the disaster, many tweets including rescue request hashtags such as #救助(meaning #Rescue) and #救助要請(meaning #Rescue_request) were posted on Twitter. An official disaster information account of the Nagano Prefectural Government asked the public to provide information in the form of damage reports and rescue requests using the hashtag #台風19号長野県被害(#Typhoon_No.19_Nagano_Prefecture_damage). As a result, many tweets were posted using this hashtag. Moreover, the account contacted the posters of tweets requesting rescue and delivered the information to the Fire Department. In this study, we analyze the circumstances of the above tweets.
Authors
Nobuyuki Shiraki Takefumi Hiraguri Isamu Shitara Naoki Honma
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Communications Express (ISSN:21870136)
Volume, issue, pages, and publication date
vol.9, no.3, pp.72-76, 2020 (Released:2020-03-01)
Number of references
3
Number of citations
1

This paper proposes a route optimization scheme for three-dimensional wireless mesh networks implemented by using formation flying of autonomous unmanned aerial vehicles, also known as drones. Inter-drone communication above the ground can achieve free space propagation as obstacles are eliminated. However, transmission efficiency can be degraded due to sneak path interference inside the drone [1]. In this paper, based on formation flight related work [2] suitable for drone communications, we theoretically analyze this interference by directional beams and propose an optimal route selection scheme with a high transmission rate.
Authors
Jin-Taek SEONG
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E103.D, no.5, pp.1164-1167, 2020-05-01 (Released:2020-05-01)
Number of references
15

The aim of this paper is to show an upper bound for finding defective samples in a group testing framework. To this end, we exploit minimization of Hamming weights in coding theory and define probability of error for our decoding scheme. We derive a new upper bound on the probability of error. We show that both upper and lower bounds coincide with each other at an optimal density ratio of a group matrix. We conclude that as defective rate increases, a group matrix should be sparser to find defective samples with only a small number of tests.
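The setting in this abstract can be illustrated with a toy group testing instance. The abstract says only that decoding exploits minimization of Hamming weights, so the brute-force decoder below is an illustrative assumption rather than the paper's scheme, and all names are hypothetical. Each row of the group matrix pools several samples, a test is positive if any pooled sample is defective, and the decoder returns the smallest defective set that explains the outcomes.

```python
from itertools import combinations


def run_tests(group_matrix, defective):
    """Return the 0/1 outcome of each pooled test.

    group_matrix[t][j] == 1 means sample j is included in test t.
    """
    return [int(any(row[j] for j in defective)) for row in group_matrix]


def decode_min_weight(group_matrix, outcomes):
    """Smallest defective set consistent with the outcomes (exhaustive search)."""
    n_samples = len(group_matrix[0])
    for weight in range(n_samples + 1):
        for candidate in combinations(range(n_samples), weight):
            if run_tests(group_matrix, candidate) == outcomes:
                return set(candidate)
    return None  # no consistent explanation exists


if __name__ == "__main__":
    # 3 tests over 4 samples; only sample 2 is defective.
    matrix = [
        [1, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 1, 1],
    ]
    observed = run_tests(matrix, {2})
    print(observed, decode_min_weight(matrix, observed))
```

The exhaustive search is exponential in the number of samples, so it is only practical for toy sizes; it is meant to show the decoding criterion, not an efficient algorithm.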
Authors
Mohammed Salah AL-RADHI Tamás Gábor CSAPÓ Géza NÉMETH
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E103-D, no.5, pp.1099-1107, 2020-05-01
Number of citations
2

In this article, we propose a method called “continuous noise masking (cNM)” that eliminates residual buzziness in a continuous vocoder, i.e., a vocoder in which all parameters are continuous and which offers a simple and flexible speech analysis and synthesis system. Traditional parametric vocoders generally show a perceptible deterioration in the quality of the synthesized speech due to different processing algorithms. Furthermore, inaccurate noise resynthesis (e.g., in breathiness or hoarseness) is also considered to be one of the main underlying causes of performance degradation, leading to noisy transients and temporal discontinuity in the synthesized speech. To overcome these issues, a new cNM is developed based on the phase distortion deviation in order to reduce the perceptual effect of the residual noise, allow a proper reconstruction of noise characteristics, and better model the creaky voice segments that may occur in natural speech. To this end, the cNM is designed to keep only voice components under a condition of the cNM threshold while discarding others. We evaluate the proposed approach and compare it with state-of-the-art vocoders using objective and subjective listening tests. Experimental results show that the proposed method can reduce the effect of residual noise and can reach the quality of other sophisticated approaches like STRAIGHT and log domain pulse model (PML).
Authors
Shinya HORIIKE Masanori MORISE
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E103-D, no.5, pp.1199-1202, 2020-05-01

To improve the likability of speech, we propose a voice conversion algorithm by controlling the fundamental frequency (F0) and the spectral envelope and carry out a subjective evaluation. The subjects can manipulate these two speech parameters. From the result, the subjects preferred speech with a parameter related to higher brightness.
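Authors
上瀧 剛 井尻 善久
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 D (ISSN:18804535)
Volume, issue, pages, and publication date
vol.J98-D, no.8, pp.1126-1141, 2015-08-01

Object detection is one of the fundamental techniques of image processing. Its performance can be described in terms of processing time, detection rate, and localization accuracy. There is a very large body of literature on object detection that pursues processing time and detection rate. Literature pursuing localization accuracy is also growing, but it seems to receive less attention in academia than its importance in practical applications warrants. This survey paper describes the importance of localization accuracy in practical applications, summarizes existing research, presents a benchmark evaluation of several methods, and then discusses future prospects.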
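Authors
小川 樹 森勢 将雅
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 D (ISSN:18804535)
Volume, issue, pages, and publication date
vol.J103-D, no.4, pp.205-214, 2020-04-01

A wide range of technologies for speech synthesis and voice conversion have been proposed, and many products are already used in diverse settings. With the spread of voice conversion technology, anyone can now easily modify speech. Speech modification widely uses methods that separately manipulate the three elements of sound: loudness, pitch, and timbre. Because loudness and pitch are modified through the sound pressure level and the fundamental frequency, the result of the modification is easy to predict. Timbre modification, however, has the problem that the degradation accompanying it is difficult to predict. This study addresses the problem with a perceptual model that measures the degradation caused by timbre modification of speech. We investigated the relationship between sound quality and various combinations of spectral measures and distance functions, and used the results to develop the perceptual model. We conducted a subjective evaluation experiment and tested whether the correlation coefficients with the subjective evaluation results differ significantly between the conventional method and the developed perceptual model. The test showed a significant difference at p<0.001, indicating that the developed perceptual model outperforms the conventional method.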
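Authors
児玉 謙太郎 牧野 遼作 清水 大地
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 A (ISSN:09135707)
Volume, issue, pages, and publication date
vol.J102-A, no.2, pp.26-34, 2019-02-01

This study experimentally examined how auditory information from the voice affects interpersonal bodily coordination and synchronization in janken (rock-paper-scissors), a multimodal form of communication. We took a dynamical systems approach and treated communication as a self-organizing phenomenon. We regarded the communication process as a real-time process of adjusting actions through perceptual information shared between participants, and evaluated bodily coordination using nonlinear time series analysis. The experiment compared a normal condition, in which participants call out 「最初はグーじゃんけんぽん」 (the standard janken chant), with a voiceless condition in which they do not. The time difference between the two participants' final downward hand swings showed no significant difference between conditions. In contrast, the stability and predictability of the participants' hand coordination over the course of the action, from start to finish, differed significantly: in the normal condition, hand coordination between participants was more stable and the movements were more predictable. These results suggest that even in communication lasting only a few seconds, 1) when auditory information is unavailable, participants use visual information in real time and behave flexibly so that synchronization is ultimately achieved, but 2) auditory information affects bodily coordination during the process of adjusting the action.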
Authors
Yudi ZHANG Debiao HE Xinyi HUANG Ding WANG Kim-Kwang Raymond CHOO Jing WANG
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E103.D, no.2, pp.188-195, 2020-02-01 (Released:2020-02-01)
Number of references
31
Number of citations
16

Unlike black-box cryptography, an adversary in a white-box security model has full access to the implementation of the cryptographic algorithm. Thus, white-box implementation of cryptographic algorithms is more practical. Nevertheless, in recent years, there has been no white-box implementation for public key cryptography. In this paper, we propose the first white-box implementation of the identity-based signature scheme in the IEEE P1363 standard. Our main idea is to hide the private key in multiple lookup tables, so that the private key cannot be leaked while the algorithm executes in an untrusted environment. We prove its security in both black-box and white-box models. We also evaluate the performance of our white-box implementations, in order to demonstrate utility for real-world applications.
Authors
Kazunori HAYASHI Masaaki NAGAHARA Toshiyuki TANAKA
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Communications (ISSN:09168516)
Volume, issue, pages, and publication date
vol.E96-B, no.3, pp.685-712, 2013-03-01
Number of citations
177

This survey provides a brief introduction to compressed sensing as well as several major algorithms to solve it and its various applications to communications systems. We first review linear simultaneous equations as ill-posed inverse problems, since the idea of compressed sensing could be best understood in the context of the linear equations. Then, we consider the problem of compressed sensing as an underdetermined linear system with prior information that the true solution is sparse, and explain the sparse signal recovery based on ℓ1 optimization, which plays the central role in compressed sensing, with some intuitive explanations on the optimization problem. Moreover, we introduce some important properties of the sensing matrix in order to establish the guarantee of the exact recovery of sparse signals from the underdetermined system. After summarizing several major algorithms to obtain a sparse solution focusing on the ℓ1 optimization and the greedy approaches, we introduce applications of compressed sensing to communications systems, such as wireless channel estimation, wireless sensor network, network tomography, cognitive radio, array signal processing, multiple access scheme, and networked control.
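A minimal example can show why an ℓ1 criterion favors sparse solutions of an underdetermined system. The sketch below takes the smallest possible case, one equation `a . x = y` with several unknowns, where both minimizers have closed forms; it illustrates the intuition only and is not one of the algorithms reviewed in the survey. The function names are hypothetical. The minimum-ℓ2 solution spreads energy over all coordinates, while a minimum-ℓ1 solution puts everything on the coordinate with the largest coefficient magnitude.

```python
def min_l2_solution(a, y):
    """Minimum Euclidean-norm solution of a . x = y (pseudo-inverse form)."""
    scale = y / sum(ai * ai for ai in a)
    return [ai * scale for ai in a]


def min_l1_solution(a, y):
    """A minimum l1-norm solution of a . x = y for a single equation.

    The optimum places all weight on the entry with the largest |a_i|,
    giving an l1 norm of |y| / max|a_i|.
    """
    k = max(range(len(a)), key=lambda i: abs(a[i]))
    x = [0.0] * len(a)
    x[k] = y / a[k]
    return x


if __name__ == "__main__":
    a, y = [1.0, 2.0], 2.0
    print(min_l2_solution(a, y))  # dense solution
    print(min_l1_solution(a, y))  # sparse solution
```

For systems with more than one equation the ℓ1 problem no longer has a closed form and is typically solved as a linear program or with iterative methods.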
Authors
Luis Rafael MARVAL-PÉREZ Koichi ITO Takafumi AOKI
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences (ISSN:09168508)
Volume, issue, pages, and publication date
vol.E102-A, no.10, pp.1351-1363, 2019-10-01
Number of citations
1

Access control and surveillance applications like walk-through security gates and immigration control points have a great demand for convenient and accurate biometric recognition in unconstrained scenarios with low user cooperation. The periocular region, which is a relatively new biometric trait, has been attracting much attention for recognition of an individual in such scenarios. This paper proposes a periocular recognition method that combines Phase-Based Correspondence Matching (PB-CM) with a texture enhancement technique. PB-CM has demonstrated high recognition performance in other biometric traits, e.g., face, palmprint and finger-knuckle-print. However, a major limitation for the periocular region is that the performance of PB-CM degrades when the periocular skin has poor texture. We address this problem by applying texture enhancement and find that variance normalization of texture significantly improves the performance of periocular recognition using PB-CM. Experimental evaluation using three public databases demonstrates the advantage of the proposed method compared with conventional methods.
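As a rough illustration of variance normalization, the sketch below rescales a small grayscale patch to zero mean and unit variance so that weak skin texture is amplified to the same contrast as strong texture. The abstract does not give the exact formulation (for example, the window size or whether it is applied locally), so the patch-wise form, the `eps` stabilizer, and the function name are assumptions.

```python
import math


def variance_normalize(patch, eps=1e-8):
    """Rescale a 2-D patch (list of rows) to zero mean and unit variance.

    eps avoids division by zero on perfectly flat patches.
    """
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    std = math.sqrt(variance + eps)
    return [[(p - mean) / std for p in row] for row in patch]


if __name__ == "__main__":
    # Low-contrast patch: intensities differ by only a few gray levels.
    weak_texture = [[100.0, 102.0], [98.0, 100.0]]
    for row in variance_normalize(weak_texture):
        print(row)
```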
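Authors
松永 悟行 大谷 大和 平原 達也
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 D (ISSN:18804535)
Volume, issue, pages, and publication date
vol.J102-D, no.10, pp.721-729, 2019-10-01

The basic configuration of speech synthesis using a deep neural network (DNN) takes as input linguistic features, which numerically represent the information that makes up a sentence, and outputs the features used to synthesize speech. These input and output features are often normalized or standardized using the training data so that they suit the DNN. When synthesizing speech from arbitrary text, however, the linguistic features may contain values that fall outside the normalization range or the standardization distribution. Because the DNN's extrapolation ability is insufficient, such outliers propagate without being properly interpolated and cause errors in the output features. To solve this outlier problem, this paper proposes a normalization method under a closed condition within a single utterance, and evaluates it on the fundamental frequency, one of the important elements in Japanese speech synthesis, in terms of prediction error and listening tests of synthesized speech. The results show that the proposed normalization does not produce the outliers that occurred with conventional normalization, and that because the normalized values are well suited to the fundamental frequency, the prediction error is smaller than before even with a small amount of training data, enabling stable prediction.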
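The outlier problem described in this abstract can be reproduced with a positional feature and min-max scaling. The sketch below is an assumption about what utterance-closed normalization could look like, not the paper's exact method; the feature (a position index) and the function names are hypothetical. Scaling with a range taken from training data can exceed 1 for a longer unseen utterance, whereas scaling with the utterance's own range always stays in [0, 1].

```python
def normalize_with_corpus_range(values, corpus_min, corpus_max):
    """Min-max scaling with a range estimated from training data."""
    span = corpus_max - corpus_min
    return [(v - corpus_min) / span for v in values]


def normalize_within_utterance(values):
    """Min-max scaling with the range of the current utterance only."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Positions in an utterance longer than anything in the training data,
    # where the largest training position was 20.
    positions = [0, 10, 20, 40]
    print(normalize_with_corpus_range(positions, 0, 20))  # contains a value above 1
    print(normalize_within_utterance(positions))          # bounded by construction
```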
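Authors
品川 政太朗 吉野 幸一郎 サクティ サクリアニ 鈴木 優 中村 哲
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 D (ISSN:18804535)
Volume, issue, pages, and publication date
vol.J102-D, no.8, pp.514-529, 2019-08-01

A system that generates the desired image from natural language is expected to be useful for supporting computer-aided design. Aiming at interactive image editing through natural language, this study proposes a framework that generates a new image closer to the user's intent by manipulating, according to a natural language instruction, the generated image that the system most recently shared with the user. Specifically, the framework takes as input a vector obtained by embedding the source image with convolutional neural networks (CNNs) and a vector obtained by embedding the natural language editing instruction with long short-term memory neural networks (LSTM), and generates the image edited as instructed through adversarial training. In the experiments, we analyzed how the proposed model behaves on the learned image editing task using a simple task based on a handwritten-digit manipulation dataset. We also edited avatar images using instructions written by humans and confirmed that the model can perform edits in line with the intent.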
Authors
Hiroki NAKAHARA Haruyoshi YONEKAWA Tomoya FUJII Masayuki SHIMODA Shimpei SATO
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E102.D, no.5, pp.1003-1011, 2019-05-01 (Released:2019-05-01)
Number of references
38
Number of citations
16

GUINNESS (GUI based binarized neural network synthesizer) is an open-source, GUI-based tool flow for implementing a binarized deep neural network on an FPGA, covering both training on the GPU and inference on the FPGA. Since all operations are done in the GUI, the software designer does not need to write any scripts to design the neural network structure or training behavior, and only needs to specify the values of the hyperparameters. After training finishes, it automatically generates C++ code to synthesize the bitstream using the Xilinx SDSoC system design tool flow. Thus, our tool flow is suitable for software programmers who are not familiar with FPGA design. In our tool flow, we modify both the training and the inference algorithms for binarized CNN hardware. Since the hardware has limited bit precision, it lacks minimal bias in training. Also, for inference on the hardware, the conventional batch normalization technique requires additional hardware. Our modifications solve these problems. We implemented the VGG-11 benchmark CNN on the Digilent Inc. Zedboard. Compared with conventional binarized implementations on an FPGA, the classification accuracy was almost the same, while the performance per power efficiency was 5.1 times better, the performance per area efficiency was 8.0 times better, and the performance per memory was 8.2 times better. We also compare the proposed FPGA design with CPU and GPU designs. Compared with the ARM Cortex-A57, it was 1776.3 times faster, dissipated 3.0 times lower power, and its performance per power efficiency was 5706.3 times better. Compared with the Maxwell GPU, it was 11.5 times faster, dissipated 7.3 times lower power, and its performance per power efficiency was 83.0 times better. The disadvantage of our FPGA-based design is that it requires additional time to synthesize the FPGA executable code. In the experiment, synthesis took more than three hours, and the total FPGA design took 75 hours. Since the training of the CNN is dominant, it is considerable.
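Binarized networks are attractive for FPGAs because a dot product over values in {-1, +1} reduces to XNOR and popcount on packed bits. The abstract does not describe the arithmetic GUINNESS generates, so the sketch below only illustrates this general technique; the bit encoding (1 for +1, 0 for -1) and the function names are assumptions.

```python
def binarize(values):
    """Pack the signs of a real-valued vector into an integer bit field.

    Bit i is 1 when values[i] >= 0 (encoding +1) and 0 otherwise (encoding -1).
    """
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    XNOR marks positions where the signs agree; each agreement adds +1 and
    each disagreement adds -1, so the result is 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agreements = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * agreements - n


if __name__ == "__main__":
    activations = binarize([0.5, -1.2, 0.3, -0.7])
    weights = binarize([1.0, 1.0, -1.0, -1.0])
    print(binary_dot(activations, weights, 4))
```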
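Authors
菊地 真人 川上 賢十 吉田 光男 梅村 恭司
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
電子情報通信学会論文誌 D (ISSN:18804535)
Volume, issue, pages, and publication date
vol.J102-D, no.4, pp.289-301, 2019-04-01

In problems that treat data probabilistically, estimating statistical measures plays a fundamental role in constructing methods and analyzing data. This paper addresses the problem of estimating the likelihood ratio, one such statistical measure, from frequencies observed in a discrete sample space. A naive approach follows the definition: estimate the two probability distributions that make up the likelihood ratio by maximum likelihood and take their ratio. When the likelihood ratio is computed from low frequencies, however, this approach can yield unreasonably high estimates. We therefore apply uLSIF, a direct likelihood ratio estimation method, and propose a method that estimates the likelihood ratio conservatively, that is, on the low side. The proposed method is a framework that adjusts the maximum-likelihood estimate of the likelihood ratio with a regularization parameter. Experiments clarify the behavior of the proposed method and demonstrate its effectiveness. Further experiments using bootstrapping in natural language processing demonstrate its practicality.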
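One way to see how a regularization parameter makes the estimate conservative is to add it to the denominator probability. This form follows from minimizing a uLSIF-style squared-error objective with one indicator basis function per discrete outcome, which is an assumption made here for illustration; the abstract does not give the paper's exact estimator. The function names and the value of `lam` are hypothetical.

```python
def mle_ratio(count_num, total_num, count_den, total_den):
    """Naive likelihood ratio: ratio of two maximum-likelihood probabilities."""
    return (count_num / total_num) / (count_den / total_den)


def conservative_ratio(count_num, total_num, count_den, total_den, lam):
    """Regularized ratio p_num / (p_den + lam) with lam >= 0.

    The penalty lam shrinks the estimate toward zero, most strongly when
    the denominator probability is small (low-frequency outcomes).
    """
    return (count_num / total_num) / (count_den / total_den + lam)


if __name__ == "__main__":
    # Low-frequency outcome: seen 2 times in 100 and 1 time in 1000.
    print(mle_ratio(2, 100, 1, 1000))
    print(conservative_ratio(2, 100, 1, 1000, lam=0.01))
    # High-frequency outcome: the same lam changes the estimate far less.
    print(mle_ratio(40, 100, 200, 1000))
    print(conservative_ratio(40, 100, 200, 1000, lam=0.01))
```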
Authors
Akinori HIDAKA Takio KURITA
Publisher
The Institute of Electronics, Information and Communication Engineers
Journal
IEICE Transactions on Information and Systems (ISSN:09168532)
Volume, issue, pages, and publication date
vol.E99.D, no.11, pp.2734-2744, 2016-11-01 (Released:2016-11-01)
Number of references
26
Number of citations
1

Kernel discriminant analysis (KDA) is the mainstream approach to nonlinear discriminant analysis (NDA). Since it uses the kernel trick, KDA does not consider its nonlinear discriminant mapping explicitly. In this paper, another NDA approach, in which the nonlinear discriminant mapping is given analytically, is developed. This study is based on the theory of optimal nonlinear discriminant analysis (ONDA), in which the nonlinear mapping is expressed exactly using the Bayesian posterior probability. This theory indicates that various NDA methods can be derived by estimating the Bayesian posterior probability in ONDA with various estimation methods. ONDA also provides insight into novel kernel functions, called discriminant kernels (DKs), which are likewise defined using the posterior probabilities. In this paper, several NDA methods and DKs derived from ONDA with several posterior probability estimators are developed and evaluated. Given good estimation methods for the Bayesian posterior probability, they give good discriminant spaces for visualization or classification.