Real-time waveform matching with a digitizer at 10 GS/s. - In: IEEE Xplore digital library, ISSN 2473-2001, (2022), S. 94-100
Side-Channel Analysis (SCA) requires the detection of the specific time frame within which Cryptographic Operations (COs) take place in the side-channel signal. In laboratory conditions with full control over the Device under Test (DuT), dedicated trigger signals can be implemented to indicate the start and end of COs. For real-world scenarios, waveform-matching techniques have been established which compare the side-channel signal with a template of the CO's pattern in real time to detect the CO in the side channel. State-of-the-art approaches are implemented on Field-Programmable Gate Arrays (FPGAs). However, current waveform-matching designs process the samples from Analog-to-Digital Converters (ADCs) sequentially and can only work with low sampling rates due to the limited clock speed of FPGAs. This makes it increasingly difficult to apply existing techniques on modern DuTs that operate with clock speeds in the GHz range. In this paper, we present a parallel waveform-matching architecture that is capable of performing waveform matching at the speed of fast ADCs. We implement the proposed architecture in a high-end FPGA-based digitizer and deploy it to detect AES COs from the side channel of a single-board computer operating at 1 GHz. Our implementation allows for waveform matching at 10 GS/s with high accuracy, thus offering a speedup of 50× compared to the fastest state-of-the-art implementation known to us.
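The idea of comparing a sample stream against a template at several alignments at once can be sketched in software as follows. This is a minimal illustration, not the architecture from the paper: the sum of absolute differences as distance metric, the `lanes` parameter, and all function names are assumptions made for this example.

```python
def sad(window, template):
    """Sum of absolute differences between a signal window and the template."""
    return sum(abs(w - t) for w, t in zip(window, template))


def match_sequential(signal, template, threshold):
    """Check one alignment after another (one new sample per step)."""
    n = len(template)
    return [i for i in range(len(signal) - n + 1)
            if sad(signal[i:i + n], template) <= threshold]


def match_parallel(signal, template, threshold, lanes=4):
    """Check `lanes` consecutive alignments per step.

    Assumption: each outer iteration models one FPGA clock cycle in which
    `lanes` new ADC samples arrive; the inner loop would be `lanes`
    independent comparators working concurrently in hardware.
    """
    n = len(template)
    last = len(signal) - n
    hits = []
    for base in range(0, last + 1, lanes):
        for lane in range(lanes):
            start = base + lane
            if start > last:
                break
            if sad(signal[start:start + n], template) <= threshold:
                hits.append(start)
    return hits


# Hypothetical toy data: the pattern appears exactly at index 2 and,
# with one sample off by one, at index 9.
signal = [0, 0, 1, 5, 9, 5, 1, 0, 0, 1, 5, 8, 5, 1, 0]
template = [1, 5, 9, 5, 1]
hits = match_parallel(signal, template, threshold=1)
```

Both functions report the same alignments; the parallel variant only changes how many alignments are evaluated per step, which is what lets the matching rate exceed the clock rate in hardware.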
Using look up table content as signatures to identify IP cores in modern FPGAs. - In: Architecture of computing systems, (2022), S. 132-147
The increasing amount of logic resources in FPGA architectures has enabled the realization of larger and more complex designs. Today, most large-scale designs rely heavily on off-the-shelf Intellectual Property Cores (IP Cores) to ease their development. This dependency raises an important issue: the unlicensed use of IP Cores. In this paper, we utilize LUT contents, which represent the functionality of an IP Core, as a signature to determine whether a core might be part of an accused design. For this, we present a technique to reconstruct the LUT contents from modern FPGA configurations, which contain not only 6-input one-output LUTs but also 5-input two-output LUTs. By making use of LUT decomposition together with a fast Boolean matching algorithm, we consolidate the work for commercial architectures. The proposed method is evaluated by searching for 8 IP Cores in 4 different designs on two different architectures. Our findings show a 100% identification rate with no false positives or false negatives in all experiments carried out. In particular, the presence of larger cores can be established with a difference of at least 10% between true and false positives.
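A small sketch may help to picture what LUT decomposition and Boolean matching mean here. It is not the algorithm from the paper: truth tables are modelled as Python integers, the decomposition is a plain split on the highest input, and the matching is a brute-force canonical form over all input permutations, used as a stand-in for the fast matching algorithm the abstract mentions. All names are hypothetical.

```python
from itertools import permutations


def cofactors(table, k):
    """Split a k-input truth table into two (k-1)-input tables.

    Bit i of `table` is the output for input pattern i. The first result
    is the function with the highest input at 0, the second with it at 1.
    """
    half = 1 << (k - 1)
    mask = (1 << half) - 1
    return table & mask, (table >> half) & mask


def permute_inputs(table, k, perm):
    """Truth table obtained by wiring new input j to old input perm[j]."""
    out = 0
    for idx in range(1 << k):
        src = 0
        for j in range(k):
            if (idx >> j) & 1:
                src |= 1 << perm[j]
        if (table >> src) & 1:
            out |= 1 << idx
    return out


def canonical(table, k):
    """Smallest table over all input orders (brute force, k! candidates)."""
    return min(permute_inputs(table, k, p) for p in permutations(range(k)))


def same_function(a, b, k):
    """True if two LUT contents are equal up to a reordering of inputs."""
    return canonical(a, k) == canonical(b, k)


# Hypothetical 3-input examples:
AND_01 = 0x88  # x0 AND x1, x2 unused
AND_12 = 0xC0  # x1 AND x2, x0 unused
XOR_01 = 0x66  # x0 XOR x1, x2 unused
```

With these tables, `same_function(AND_01, AND_12, 3)` holds because the two LUTs differ only in pin assignment, while `same_function(AND_01, XOR_01, 3)` does not. The brute-force search is only practical for the small input counts of LUTs.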
Putting IMT to the test: revisiting and expanding interval matching techniques and their calibration for SCA. - In: ASHES '22, (2022), S. 65-74
Side-Channel Analysis (SCA) requires the detection of the specific time frame within which Cryptographic Operations (COs) take place in the side-channel signal. Under laboratory conditions with full control over the Device under Test (DuT), dedicated trigger signals can be implemented to indicate the start and end of COs. For real-world scenarios, waveform-matching techniques have been established which compare the side-channel signal with a template of the CO's pattern in real time to detect the CO in the side channel. State-of-the-art approaches describe implementations based on Field-Programmable Gate Arrays (FPGAs). However, the maximal length of the template is restricted by the resources available on an FPGA. In particular, for high sampling rates the recording of an entire CO may need more samples than the maximum template length supported by a waveform-matching system. Consequently, the template has to be reduced such that it fits the resources while still containing all features relevant for detecting the COs via waveform matching. In this paper, we introduce a generic interval-matching technique which provides several degrees of freedom for fine-tuning it to the statistical deviations of waveform measurements of COs. Moreover, we introduce a novel calibration method that finds the best parameters automatically based on statistical analysis of training data. Furthermore, we investigate a technique to reduce the number of features used for the interval matching by utilizing machine-learning-based feature extraction to find the most important samples in a template. Finally, we evaluate the state-of-the-art interval matching and our expansions during calibration and during the application on a test set. The results show that, for our example, a reliable reduction to 10% of the original template size is possible with a reduction method from the literature. However, the combination of our proposed methods can reliably work with only 1.5% of the original size and is less volatile than the state-of-the-art approach for reducing the number of features.
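Interval matching, its calibration, and a reduced template can be illustrated with a short sketch. The details are assumptions for the sake of the example and not the method from the paper: intervals are set to mean plus or minus `k` standard deviations per sample, and features are chosen by narrowest interval, which merely stands in for the machine-learning-based feature extraction the abstract describes.

```python
from statistics import mean, pstdev


def calibrate(traces, k=3.0):
    """Derive one (low, high) interval per sample position from training traces."""
    intervals = []
    for column in zip(*traces):
        m, s = mean(column), pstdev(column)
        intervals.append((m - k * s, m + k * s))
    return intervals


def select_features(intervals, keep):
    """Keep the `keep` sample positions with the narrowest intervals.

    Assumption: a simple placeholder for a learned feature ranking.
    """
    order = sorted(range(len(intervals)),
                   key=lambda i: intervals[i][1] - intervals[i][0])
    return sorted(order[:keep])


def matches(window, intervals, indices=None, min_hits=None):
    """True if enough of the selected samples lie inside their intervals."""
    indices = list(range(len(intervals)) if indices is None else indices)
    hits = sum(intervals[i][0] <= window[i] <= intervals[i][1] for i in indices)
    needed = len(indices) if min_hits is None else min_hits
    return hits >= needed


# Hypothetical training traces of the same operation (4 samples each).
traces = [
    [1.0, 5.0, 9.0, 5.0],
    [1.2, 5.0, 8.0, 5.5],
    [0.8, 5.0, 10.0, 4.5],
]
intervals = calibrate(traces)
reduced = select_features(intervals, keep=2)
```

A window is then tested with `matches(window, intervals, reduced)`; samples outside the reduced index set are ignored, which is how a shorter template saves comparator resources.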
Design and error analysis of accuracy-configurable sequential multipliers via segmented carry chains. - In: Information technology, ISSN 2196-7032, Bd. 64 (2022), 3, S. 89-98
We present the design and a closed-form error analysis of accuracy-configurable multipliers via segmented carry chains. To this end, we model the approximate partial-product accumulations as a sequential process. According to a given splitting point of the carry chains, the technique discussed herein allows varying the quality of the accumulations and, consequently, of the overall product. Due to the resulting shorter critical paths, such approximate multipliers can trade off accuracy for increased performance whilst exploiting the inherent area savings of sequential over combinatorial approaches. We implemented multiple architectures targeting FPGAs and ASICs with different bit-widths and accuracy configurations to 1) estimate resources, power consumption, and delay, as well as to 2) evaluate those error metrics that belong to the so-called #P-complete class.
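The effect of a segmented carry chain on a sequential shift-and-add multiplier can be modelled in a few lines. This is a simplified sketch under a stated assumption: the carry that would cross the splitting point is simply discarded. The paper's actual carry handling may differ, and the function names are hypothetical.

```python
def segmented_add(x, y, split, width):
    """Add x and y with the carry chain cut at bit position `split`.

    Assumption: the low segment's carry-out is dropped, so the high
    segment adds without carry-in. With split == 0 the addition is exact.
    """
    low_mask = (1 << split) - 1
    low = ((x & low_mask) + (y & low_mask)) & low_mask  # carry-out discarded
    high = (x >> split) + (y >> split)                  # no carry-in
    return ((high << split) | low) & ((1 << width) - 1)


def approx_multiply(a, b, n, split):
    """Sequential n-bit multiplier: one partial product accumulated per step."""
    acc = 0
    for i in range(n):
        if (b >> i) & 1:
            acc = segmented_add(acc, a << i, split, 2 * n)
    return acc


exact = approx_multiply(13, 11, 4, 0)   # split 0 keeps the full carry chain
approx = approx_multiply(3, 3, 4, 2)    # carry across bit 2 is lost once
```

In this model every dropped carry costs exactly 2**split, so the approximate product never exceeds the exact one, and moving the splitting point trades accuracy against the length of the longest carry path.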
Raw filtering of JSON data on FPGAs. - In: Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE 2022), (2022), S. 250-255
3D INS/UWB based real time sensor fusion indoor position tracking architecture. - In: 2022 IEEE 13th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), (2022), S. 94-101
Accurate indoor position tracking and analysis of the movement dynamics of autonomous driving systems are important challenges when it comes to automating industrial processing, supply chains, or warehouses. In this paper, the authors present an indoor position tracking architecture with a novel sensor fusion approach for autonomous robots in three-dimensional space. For robots to be able to drive autonomously, they need information about their position, speed, and orientation in 3D space. With the presented architecture, position information is provided by the Indoor Positioning System (IPS), while orientation and velocity are determined by the Inertial Navigation System (INS). The proposed tracking architecture combines this information with a sensor fusion approach, thus enabling the autonomous driving system.
The benefits and costs of netlist randomization based side-channel countermeasures: an in-depth evaluation. - In: Journal of Low Power Electronics and Applications, ISSN 2079-9268, Bd. 12 (2022), 3, 42, S. 1-17
Exchanging FPGA-based implementations of cryptographic algorithms during run-time using netlist-randomized versions has been introduced recently as a unique countermeasure against side-channel attacks. Using partial reconfiguration, it is possible to shuffle between structurally different but functionally similar versions of a cryptographic implementation. The resulting varying power profile enhances the resistance against power-based side-channel attacks. While side-channel leakage is reduced, costs in terms of additional resources and/or lowered throughput often increase due to the overheads of the required online partial reconfiguration. In this work, we provide an in-depth evaluation of the leakage-area-throughput trade-off.
Security by Reconfiguration (SecRec) - Physical security through dynamic hardware reconfiguration, subproject: protection against reverse engineering and fault-injection attacks through dynamic hardware reconfiguration : final report : project start: 1 January 2017, duration: 48 months : reporting period: 1 January 2017 to 30 September 2020. - Ilmenau : [Technische Universität Ilmenau]. - 1 online resource (17 pages, 390.27 KB). Funding code: BMBF 16KIS0609
Increasing flexibility of FPGA-based CNN accelerators with dynamic partial reconfiguration. - In: 2021 31st International Conference on Field-Programmable Logic and Applications, (2021), S. 306-311
Convolutional Neural Networks (CNNs) are widely used for image classification and have achieved high accuracy in the last decade. However, they require computationally intensive operations, which is a challenge for embedded applications. In recent years, FPGA-based CNN accelerators have been proposed to improve energy efficiency and throughput. While dynamic partial reconfiguration (DPR) is increasingly used in CNN accelerators, the performance of dynamically reconfigurable accelerators is usually lower than the performance of purely static FPGA designs. This work presents a dynamically reconfigurable CNN accelerator architecture that does not sacrifice throughput or classification accuracy. The proposed accelerator is composed of reconfigurable macroblocks and dynamically utilizes the device resources according to model parameters. Moreover, we devise an approach, novel to the best of our knowledge, that hides the computations of the pooling layers inside the convolutional layers, thereby further improving throughput. Using the proposed architecture and DPR, different CNN architectures can be realized on the same FPGA with optimized throughput and accuracy. The proposed architecture is evaluated by implementing two different LeNet CNN models trained on different datasets and classifying different classes. Experimental results show that the implemented design achieves higher throughput than current LeNet FPGA accelerators.
KOI: an architecture and framework for industrial and academic machine learning applications. - In: Modelling and development of intelligent systems, (2021), S. 113-128