Article list: IPSJ Transactions on System LSI Design Methodology (journal)

Detecting Arithmetic Optimization Opportunities for C Compilers by Randomly Generated Equivalent Programs (Open Access)

Authors: Atsushi Hashimoto, Nagisa Ishiura
Publisher: Information Processing Society of Japan
Journal: IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date: vol.9, pp.21-29, 2016 (Released:2016-02-12)
References: 17
Citations: 10

This paper presents new methods of detecting missed arithmetic optimization opportunities in C compilers by random testing. In each iteration of random testing, two equivalent programs are generated, where the arithmetic expressions in the second program are more optimized at the C program level. By comparing the two assembly codes compiled from the two C programs, a lack of optimization in either program is detected. This method is further extended to detect erroneous or insufficient optimization involving volatile variables. Two random programs differing only in the initial values of volatile variables are generated, and the resulting assembly codes are compared. Random test systems implemented based on the proposed methods have detected missed optimization opportunities in several compilers, including the latest development versions of GCC-5.0.0 and LLVM/Clang-3.6.
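
The generate-and-compare loop described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' test system: the function names (`random_expr`, `make_pair`, `instructions`) are hypothetical, constant folding stands in for the paper's source-level optimizations, and the assembly filter assumes GCC-style x86 output in which `#` starts a comment and directives begin with a dot.

```python
import random

OPS = ["+", "-", "*"]


def random_expr(rng, depth):
    """Return (C expression text, its integer value) for a random constant expression."""
    if depth == 0:
        value = rng.randint(1, 9)
        return str(value), value
    op = rng.choice(OPS)
    left_text, left_value = random_expr(rng, depth - 1)
    right_text, right_value = random_expr(rng, depth - 1)
    value = {
        "+": left_value + right_value,
        "-": left_value - right_value,
        "*": left_value * right_value,
    }[op]
    return f"({left_text} {op} {right_text})", value


def make_pair(seed, depth=3):
    """Build two equivalent C programs; the second has the expression pre-folded.

    Assumption: constant folding is used here only as a stand-in for the
    source-level optimizations of the paper. With depth=3 and leaves in 1..9
    every intermediate value fits in a 32-bit int.
    """
    rng = random.Random(seed)
    expr, value = random_expr(rng, depth)
    template = "int f(int x) {{ return x + {e}; }}\n"
    return template.format(e=expr), template.format(e=f"({value})")


def instructions(asm):
    """Keep label and instruction lines; drop blanks, dot-directives, and # comments."""
    kept = []
    for line in asm.splitlines():
        line = line.split("#", 1)[0].strip()
        if line and not line.startswith("."):
            kept.append(line)
    return kept


if __name__ == "__main__":
    original, folded = make_pair(seed=1)
    print(original, folded, sep="")
```

To use the pair, one would compile both programs with the same compiler and flags (for example `gcc -O2 -S`) and compare the two `instructions(...)` lists; a difference suggests that the compiler missed an optimization on one of the programs.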

R-GCN Based Function Inference for Gate-level Netlist (Open Access)

Authors: Motoki Amagasaki, Hiroki Oyama, Yuichiro Fujishiro, Masahiro Iida, Hiroaki Yasuda, Hiroto Ito
Publisher: Information Processing Society of Japan
Journal: IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date: vol.13, pp.69-71, 2020 (Released:2020-08-13)
References: 7

Graph neural networks are a type of deep-learning model for classification in graph domains. To infer arithmetic functions in a netlist, we applied relational graph convolutional networks (R-GCN), which can directly treat relations between nodes and edges. However, because the original R-GCN supports only node-level labeling, it cannot be directly used to infer the set of functions in a netlist. In this paper, by considering the distribution of labels for each node, we show an R-GCN-based function inference method and a data augmentation technique for netlists with multiple functions. In our results, 91.4% accuracy is obtained from 1,000 training data, demonstrating that R-GCN-based methods can be effective for graphs with multiple functions.
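
For readers unfamiliar with R-GCN, the following is a minimal sketch of one propagation step in its commonly published form: each node combines its own transformed features with the per-relation mean of its neighbors' transformed features, followed by a ReLU. It is not the paper's implementation; the function names, the relation label `"in"`, and the toy netlist are assumptions made for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rgcn_layer(features, edges, rel_weights, self_weight):
    """One R-GCN step.

    features:    dict node -> feature vector
    edges:       list of (source, relation, destination) triples
    rel_weights: dict relation -> weight matrix
    self_weight: weight matrix applied to the node's own features
    """
    # Group incoming neighbors by (destination, relation).
    incoming = {}
    for src, rel, dst in edges:
        incoming.setdefault((dst, rel), []).append(src)

    updated = {}
    for node, feat in features.items():
        acc = matvec(self_weight, feat)
        for (dst, rel), sources in incoming.items():
            if dst != node:
                continue
            for src in sources:
                message = matvec(rel_weights[rel], features[src])
                # Normalize by the number of neighbors under this relation.
                acc = [a + m / len(sources) for a, m in zip(acc, message)]
        updated[node] = [max(0.0, a) for a in acc]  # ReLU
    return updated


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    feats = {"a": [1.0, 0.0], "b": [0.0, 2.0], "gate": [0.0, 0.0]}
    wires = [("a", "in", "gate"), ("b", "in", "gate")]
    print(rgcn_layer(feats, wires, {"in": identity}, identity))
```

Stacking such layers and ending with a per-node classifier gives node-level labels only, which is the limitation the abstract refers to.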

DRAMSys: A Flexible DRAM Subsystem Design Space Exploration Framework (Open Access)

Authors: Matthias Jung, Christian Weis, Norbert Wehn
Publisher: Information Processing Society of Japan
Journal: IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date: vol.8, pp.63-74, 2015 (Released:2015-08-01)
References: 53
Citations: 44

In systems ranging from mobile devices to servers, Dynamic Random Access Memories (DRAMs) have a big impact on performance and contribute a significant part of the total consumed power. Conventional DDR3-based solutions are stretched thin, as their maximum bandwidth is limited by the I/O count and interface speed. As new solutions are coming onto the market (JEDEC DDR4, JEDEC WIDE I/O, Micron's Hybrid Memory Cube (HMC), or JEDEC's High Bandwidth Memory (HBM)), it is critical to evaluate the performance of these solutions and assess their suitability for specific applications. Furthermore, in systems with 3D stacking, the challenges of high power densities and thermal dissipation are exacerbated. It is crucial to have a flexible and holistic DRAM subsystem framework for exhaustive design space explorations, which can handle all these different types of memories, as well as the aspects of performance, power, and temperature.
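
The bandwidth limit mentioned in this abstract follows from simple arithmetic: peak bandwidth is the number of data I/Os multiplied by the per-pin transfer rate. The helper below is an illustrative sketch only (the function name is hypothetical, and it is unrelated to the DRAMSys code base). It uses the standard DDR3-1600 figures of a 64-bit channel at 1600 MT/s, and compares them with a hypothetical 512-bit interface running at 200 MT/s.

```python
def peak_bandwidth_mb_s(data_pins, mega_transfers_per_s):
    """Peak bandwidth in MB/s: one bit per pin per transfer, 8 bits per byte."""
    return data_pins * mega_transfers_per_s // 8


if __name__ == "__main__":
    # 64-bit DDR3-1600 channel.
    print(peak_bandwidth_mb_s(64, 1600))   # 12800
    # Hypothetical wide interface: 8x the pins at 1/8 of the transfer rate.
    print(peak_bandwidth_mb_s(512, 200))   # 12800
```

Both configurations reach the same peak, which shows why a wider interface can trade pin speed for I/O count.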

Conversion of Logic Gates in Netlists for Rapid Single Flux Quantum Circuits Utilizing Confluence of Pulses (Open Access)

Authors: Nobutaka Kito, Kazuyoshi Takagi, Naofumi Takagi
Publisher: Information Processing Society of Japan
Journal: IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date: vol.12, pp.78-80, 2019 (Released:2019-08-01)
References: 11
Citations: 4

A method is proposed for converting a netlist consisting of conventional logic gates for realization as a superconducting rapid single flux quantum (RSFQ) circuit. The method detects OR gates that can be replaced with confluence buffers (CBs), which converge their input pulses into their outputs. The problem of detecting replaceable OR gates is treated as a SAT problem. Each OR gate requires a clock input in RSFQ circuits. By replacing OR gates with CBs, the wiring for clocking those OR gates is eliminated and the number of active devices, known as Josephson junctions, is reduced.
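
How a gate-replacement question can be posed as a satisfiability query is sketched below. The abstract does not state the replaceability condition, so the sketch assumes a stand-in one: an OR gate is treated as replaceable when no primary-input assignment drives both of its inputs to 1 at once. The function name and the example signals are hypothetical, and exhaustive enumeration takes the place of a real SAT solver.

```python
from itertools import product


def find_conflict(input_names, signal_a, signal_b):
    """Brute-force satisfiability check of (signal_a AND signal_b).

    signal_a and signal_b are functions of a dict mapping primary-input
    names to booleans. Returns a satisfying assignment, or None if the
    conjunction is unsatisfiable.
    """
    for bits in product([False, True], repeat=len(input_names)):
        env = dict(zip(input_names, bits))
        if signal_a(env) and signal_b(env):
            return env
    return None


if __name__ == "__main__":
    names = ["x", "y"]
    # Inputs of the OR gate: (x AND y) and (x AND NOT y) are never both 1.
    a = lambda e: e["x"] and e["y"]
    b = lambda e: e["x"] and not e["y"]
    print(find_conflict(names, a, b))  # None: replaceable under the assumed condition

    # Inputs x and y can both be 1, so a conflict exists.
    print(find_conflict(names, lambda e: e["x"], lambda e: e["y"]))
```

An unsatisfiable query (result `None`) marks the gate as a replacement candidate under the assumed condition; a satisfying assignment is a counterexample.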

Large-Scale 3D Chips: Challenges and Solutions for Design Automation, Testing, and Trustworthy Integration (Open Access)

Authors: Johann Knechtel, Ozgur Sinanoglu, Ibrahim (Abe) M. Elfadel, Jens Lienig, Cliff C. N. Sze
Publisher: Information Processing Society of Japan
Journal: IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date: vol.10, pp.45-62, 2017 (Released:2017-08-02)
References: 162
Citations: 34

Three-dimensional (3D) integration of electronic chips has been advocated by both industry and academia for many years. It is acknowledged as one of the most promising approaches to meet ever-increasing demands on performance, functionality, and power consumption. Furthermore, 3D integration has been shown to be most effective and efficient once large-scale integration is targeted. However, a multitude of challenges has thus far obstructed the mainstream transition from “classical 2D chips” to such large-scale 3D chips. In this paper, we survey all popular 3D integration options available and advocate that using an interposer as the system-level integration backbone would be the most practical for large-scale industrial applications and design reuse. We review major design (automation) challenges and related promising solutions for interposer-based 3D chips in particular, among the other 3D options. Thereby we outline (i) the need for a unified workflow, especially once full-custom design is considered, (ii) the current design-automation solutions and future prospects for both classical (digital) and advanced (heterogeneous) interposer stacks, (iii) the state of the art and open challenges for testing of 3D chips, and (iv) the challenges of securing hardware in general and the prospects for large-scale and trustworthy 3D chips in particular.

An Accurate and Fast Trace-aware Performance Estimation Model For Prioritized MPSoC Bus With Multiple Interfering Bus-Masters (Open Access)

Authors: Farhan Shafiq, Tsuyoshi Isshiki, Dongju Li
Publisher: Information Processing Society of Japan
Journal: IPSJ Transactions on System LSI Design Methodology (ISSN:18826687)
Volume, pages, publication date: vol.10, pp.13-27, 2017 (Released:2017-02-03)
References: 38

Accurate and fast performance estimation methods for modern and future multi-core systems are the focal point of much research due to the complexity associated with such architectures. The communication architecture of such systems has a huge impact on the performance and power of the whole system. Architects need to explore many design possibilities by using performance estimation techniques at early stages of design to make design decisions earlier in the design cycle, while software developers need to develop and test applications for the target architecture and gather performance measurements as early in the design cycle as possible. Full system simulation techniques provide accurate performance values but are extremely time consuming. Static analysis techniques are fast but cannot capture the dynamic behavior associated with shared resource contention and arbitration. Moreover, synthetic traffic patterns have been used to analyze the communication architecture; however, such patterns are not realistic enough. We propose a statistics-based model to predict the dynamic cost of bus arbitration on the performance of a bus architecture. The proposed model uses workload traces of actual applications and benchmarks to capture the real application traffic behavior. Statistics on the traffic patterns are collected and input to the analytical model, which calculates performance values for the communication architecture under consideration. By knowing the performance measures, designers can avoid over- and under-design of the communication architecture. This paper builds on a previously developed performance estimation model. The previous work modeled single and burst bus transfers; however, only one interfering bus master at a time was considered for each blocked bus request. The proposed improved-accuracy model considers multiple interfering masters for each blocked request, improving the estimation accuracy, especially for traffic-intensive applications and architectures with many PEs. Experiments are performed for two architectures: 4 processing elements (PEs) connected via a shared bus, and 8 PEs connected via a shared bus. Results show no significant difference in accuracy compared to the previously developed model for the low-traffic applications SPARSE and ROBOT, but a notable accuracy improvement for traffic-intensive applications. Maximum estimation error is reduced from 1.75% to 0.6% for FPPPP and from 13.91% to 8.8% for FFT on the 4PE architecture. On the 8PE architecture, maximum estimation error is reduced from 11.8% to 2.7% for the FPPPP benchmark. Moreover, the simulation speed-up of the proposed technique over the simulation method is reported.
