Authors
Atsushi Hashimoto, Nagisa Ishiura
Publisher
Information Processing Society of Japan
Journal
IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date
vol.9, pp.21-29, 2016 (Released:2016-02-12)
References
17
Cited by
9

This paper presents new methods for detecting missed arithmetic optimization opportunities in C compilers by random testing. In each iteration of random testing, two equivalent programs are generated, where the arithmetic expressions in the second program are more optimized at the C program level. By comparing the two assembly codes compiled from the two C programs, a lack of optimization in either program is detected. This method is further extended to detect erroneous or insufficient optimization involving volatile variables. Two random programs differing only in the initial values of the volatile variables are generated, and the resulting assembly codes are compared. Random test systems implemented based on the proposed methods have detected missed optimization opportunities in several compilers, including the latest development versions of GCC-5.0.0 and LLVM/Clang-3.6.
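The pair-generation step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' generator: the tree representation, the restriction to +, -, * on 32-bit unsigned values, and all function names (gen, fold, to_c, evaluate, program, make_pair) are assumptions made for illustration. Constant folding stands in for "more optimized at the C program level".

```python
import random

MASK = 0xFFFFFFFF  # assumption: unsigned int is 32 bits wide

# Wrap-around arithmetic, matching C unsigned semantics under the assumption above.
OPS = {
    "+": lambda a, b: (a + b) & MASK,
    "-": lambda a, b: (a - b) & MASK,
    "*": lambda a, b: (a * b) & MASK,
}

def gen(rng, depth):
    """Random expression tree: an int constant, the variable 'x', or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.3 else rng.randrange(100)
    return (rng.choice(list(OPS)), gen(rng, depth - 1), gen(rng, depth - 1))

def fold(tree):
    """Fold every subtree that contains only constants into a single constant."""
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree[0], fold(tree[1]), fold(tree[2])
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    return (op, left, right)

def to_c(tree):
    """Render a tree as a fully parenthesized C expression."""
    if isinstance(tree, tuple):
        return f"({to_c(tree[1])} {tree[0]} {to_c(tree[2])})"
    return "x" if tree == "x" else f"{tree}u"

def evaluate(tree, x):
    """Reference evaluation, used only to confirm that the two programs are equivalent."""
    if isinstance(tree, tuple):
        return OPS[tree[0]](evaluate(tree[1], x), evaluate(tree[2], x))
    return x & MASK if tree == "x" else tree

def program(tree):
    """Wrap an expression in a one-function C translation unit."""
    return f"unsigned f(unsigned x) {{ return {to_c(tree)}; }}\n"

def make_pair(seed, depth=4):
    """Return (original tree, folded tree) for one test iteration."""
    tree = gen(random.Random(seed), depth)
    return tree, fold(tree)
```

In a full test loop, program(original) and program(folded) would each be compiled to assembly (for example with an optimizing build and the -S option) and the two outputs compared; a difference would flag a possible missed optimization. That comparison step is deliberately left out of the sketch.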
Authors
Motoki Amagasaki, Hiroki Oyama, Yuichiro Fujishiro, Masahiro Iida, Hiroaki Yasuda, Hiroto Ito
Publisher
Information Processing Society of Japan
Journal
IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date
vol.13, pp.69-71, 2020 (Released:2020-08-13)
References
7

Graph neural networks are a type of deep-learning model for classification in graph domains. To infer arithmetic functions in a netlist, we applied relational graph convolutional networks (R-GCN), which can directly treat relations between nodes and edges. However, because the original R-GCN supports only node-level labeling, it cannot be directly used to infer the set of functions in a netlist. In this paper, by considering the distribution of labels for each node, we show an R-GCN-based function inference method and a data augmentation technique for netlists having multiple functions. According to our results, 91.4% accuracy is obtained from 1,000 training data, demonstrating that R-GCN-based methods can be effective for graphs with multiple functions.
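For orientation, the layer update usually given for the original R-GCN is reproduced below. It is the standard formulation, not an equation taken from this abstract, and the paper's variant may differ. Here $h_i^{(l)}$ is the feature vector of node $i$ at layer $l$, $\mathcal{R}$ is the set of relation types, $\mathcal{N}_i^r$ is the set of neighbors of node $i$ under relation $r$, $c_{i,r}$ is a normalization constant (often $|\mathcal{N}_i^r|$), and $\sigma$ is an element-wise nonlinearity.

```latex
h_i^{(l+1)} = \sigma\!\left( \sum_{r \in \mathcal{R}} \sum_{j \in \mathcal{N}_i^r} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)} \right)
```

Because this update yields one representation per node, it naturally supports node-level labeling only, which is the limitation the abstract refers to.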
Authors
Matthias Jung, Christian Weis, Norbert Wehn
Publisher
Information Processing Society of Japan
Journal
IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date
vol.8, pp.63-74, 2015 (Released:2015-08-01)
References
53
Cited by
42

In systems ranging from mobile devices to servers, Dynamic Random Access Memories (DRAMs) have a large impact on performance and contribute a significant part of the total consumed power. Conventional DDR3-based solutions are stretched thin, as their maximum bandwidth is limited by the I/O count and interface speed. As new solutions come onto the market (JEDEC DDR4, JEDEC Wide I/O, Micron's Hybrid Memory Cube (HMC), or JEDEC's High Bandwidth Memory (HBM)), it is critical to evaluate the performance of these solutions and assess their suitability for specific applications. Furthermore, in systems with 3D stacking, the challenges of high power densities and thermal dissipation are exacerbated. It is crucial to have a flexible and holistic DRAM subsystem framework for exhaustive design space explorations, which can handle all these different types of memories, as well as the aspects of performance, power, and temperature.
Authors
Nobutaka Kito, Kazuyoshi Takagi, Naofumi Takagi
Publisher
Information Processing Society of Japan
Journal
IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date
vol.12, pp.78-80, 2019 (Released:2019-08-01)
References
11
Cited by
4

A method is proposed for converting a netlist consisting of conventional logic gates for realization as a superconducting rapid single flux quantum (RSFQ) circuit. The method detects OR gates that can be replaced with confluence buffers (CBs), which merge their input pulses into their outputs. The problem of detecting replaceable OR gates is treated as a SAT problem. Each OR gate requires a clock input in RSFQ circuits. By replacing OR gates with CBs, the wiring for clocking those OR gates is eliminated and the number of active devices, known as Josephson junctions, is reduced.
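The detection step can be illustrated with a small Python sketch. The abstract does not state the exact replaceability condition, so the sketch assumes one: an OR gate is a candidate if no primary-input assignment drives more than one of its inputs to 1 (timing effects are ignored). Exhaustive enumeration stands in for a SAT solver, and the example netlist and the names NETLIST, INPUTS, simulate, and replaceable_or_gates are hypothetical.

```python
from itertools import product

# Hypothetical netlist: gate name -> (gate type, input names). Primary inputs: a, b.
NETLIST = {
    "n1": ("AND", ("a", "b")),
    "n2": ("NOT", ("a",)),
    "n3": ("AND", ("n2", "b")),
    "g1": ("OR", ("n1", "n3")),  # inputs are mutually exclusive
    "g2": ("OR", ("a", "b")),    # inputs can both be 1
}
INPUTS = ("a", "b")

def simulate(netlist, assignment):
    """Evaluate every gate for one assignment of the primary inputs."""
    values = dict(assignment)

    def val(name):
        if name not in values:
            kind, ins = netlist[name]
            bits = [val(i) for i in ins]
            if kind == "AND":
                values[name] = all(bits)
            elif kind == "OR":
                values[name] = any(bits)
            elif kind == "NOT":
                values[name] = not bits[0]
            else:
                raise ValueError(f"unknown gate type: {kind}")
        return values[name]

    for name in netlist:
        val(name)
    return values

def replaceable_or_gates(netlist, inputs):
    """Return OR gates whose inputs are never 1 simultaneously (assumed condition)."""
    candidates = {g for g, (kind, _) in netlist.items() if kind == "OR"}
    for bits in product([False, True], repeat=len(inputs)):
        values = simulate(netlist, zip(inputs, bits))
        for gate in list(candidates):
            # A satisfying assignment with two inputs at 1 rules the gate out.
            if sum(values[i] for i in netlist[gate][1]) > 1:
                candidates.discard(gate)
    return sorted(candidates)
```

On this example, replaceable_or_gates(NETLIST, INPUTS) keeps only g1. For realistic netlists, each check would instead be posed as a SAT query ("is there an input assignment under which two inputs of this gate are both 1?"), with an unsatisfiable result marking the gate as a candidate.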
Authors
Johann Knechtel, Ozgur Sinanoglu, Ibrahim (Abe) M. Elfadel, Jens Lienig, Cliff C. N. Sze
Publisher
Information Processing Society of Japan
Journal
IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date
vol.10, pp.45-62, 2017 (Released:2017-08-02)
References
162
Cited by
33

Three-dimensional (3D) integration of electronic chips has been advocated by both industry and academia for many years. It is acknowledged as one of the most promising approaches to meet ever-increasing demands on performance, functionality, and power consumption. Furthermore, 3D integration has been shown to be most effective and efficient once large-scale integration is targeted. However, a multitude of challenges has thus far obstructed the mainstream transition from “classical 2D chips” to such large-scale 3D chips. In this paper, we survey all popular 3D integration options available and advocate that using an interposer as the system-level integration backbone would be the most practical for large-scale industrial applications and design reuse. We review major design (automation) challenges and related promising solutions for interposer-based 3D chips in particular, among the other 3D options. In doing so, we outline (i) the need for a unified workflow, especially once full-custom design is considered, (ii) the current design-automation solutions and future prospects for both classical (digital) and advanced (heterogeneous) interposer stacks, (iii) the state of the art and open challenges for testing of 3D chips, and (iv) the challenges of securing hardware in general and the prospects for large-scale and trustworthy 3D chips in particular.
Authors
Farhan Shafiq, Tsuyoshi Isshiki, Dongju Li
Publisher
Information Processing Society of Japan
Journal
IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date
vol.10, pp.13-27, 2017 (Released:2017-02-03)
References
38

Accurate and fast performance estimation methods for modern and future multi-core systems are the focal point of much research due to the complexity associated with such architectures. The communication architecture of such systems has a huge impact on the performance and power of the whole system. Architects need to explore many design possibilities by using performance estimation techniques at early stages of design to make design decisions earlier in the design cycle, while software developers need to develop and test applications for the target architecture and gather performance measurements as early in the design cycle as possible. Full system simulation techniques provide accurate performance values but are extremely time consuming. Static analysis techniques are fast but cannot capture the dynamic behavior associated with shared resource contention and arbitration. Moreover, synthetic traffic patterns have been used to analyze the communication architecture; however, such patterns are not realistic enough. We propose a statistics-based model to predict the dynamic cost of bus arbitration on the performance of a bus architecture. The proposed model uses workload traces of actual applications and benchmarks to capture the real application traffic behavior. Statistics on the traffic patterns are collected and input to the analytical model, which calculates performance values for the communication architecture under consideration. By knowing the performance measures, designers can avoid over- and under-design of the communication architecture. This paper builds upon a previously developed performance estimation model. The previous work modeled single and burst bus transfers; however, only one interfering bus master at a time was considered for each blocked bus request.
The proposed model, with improved accuracy, considers multiple interfering masters for each blocked request, thereby improving the estimation accuracy, especially for traffic-intensive applications and architectures with many processing elements (PEs). Experiments are performed for two different architectures, i.e., 4 processing elements connected via a shared bus and 8 processing elements connected via a shared bus. Results show no significant difference in accuracy compared to the previously developed model for the low-traffic applications SPARSE and ROBOT, but a notable accuracy improvement for traffic-intensive applications. Maximum estimation error is reduced from 1.75% to 0.6% for FPPPP and from 13.91% to 8.8% for FFT on the 4PE architecture. On the 8PE architecture, maximum estimation error is reduced from 11.8% to 2.7% for the FPPPP benchmark. Moreover, the simulation speed-up of the proposed technique over the simulation method is reported.