Who am I?
Welcome to my personal webpage!
I’m Thomas, a PhD student working at the Computer Systems Lab, Department of Electronics and Information Systems in the Faculty of Engineering and Architecture of Ghent University.
My research interests include compilers, software security, and high-level abstractions for GPU programming.
The topic of my Master’s dissertation was improving the performance of applications that involve matrix multiplication, thus allowing researchers in the Machine Learning and High Performance Computing fields to perform their research more efficiently. To achieve this, I implemented and evaluated GemmKernels.jl, a software package for the high-level programming language Julia. In the course of my thesis, I also adapted the Julia compiler back-end and the GPU-related packages in the Julia ecosystem to support the features required by GemmKernels.jl.
In my PhD project, I am researching and developing automated compilation techniques that aim to hinder Man-At-The-End attacks on software.
Personal information
Experience
PhD Student
November 2020 – present
Research Foundation Flanders - FWO
Ghent, Belgium
FWO-funded PhD student at Ghent University
PhD Student
July 2020 – present
Faculty of Engineering and Architecture, Ghent University
Ghent, Belgium
Engineering Intern
July 2019 – August 2019
Sigasi
Gentbrugge, Belgium
Integration of Sigasi in the Vim text editor using the Language Server Protocol
Education
Doctor of Computer Science Engineering
In progress
July 2020 – present
Faculty of Engineering and Architecture, Ghent University
Ghent, Belgium
Master of Science in Computer Science Engineering
Obtained summa cum laude
September 2018 – June 2020
Faculty of Engineering and Architecture, Ghent University
Ghent, Belgium
Thesis: Flexible Matrix Multiplication Kernels on GPUs
Bachelor of Science in Computer Science Engineering
Obtained summa cum laude
September 2015 – June 2018
Faculty of Engineering and Architecture, Ghent University
Ghent, Belgium
Thesis: A scriptable guitar pedal
Awards
Awarded to the student who has achieved the highest grade, considered over the entire duration of their education within the Faculty of Engineering and Architecture.
Languages
Projects
Register Allocation Visualisation Tool — Visualisation of the graph colouring-based register allocation algorithm
This web application was initially written for the exercise sessions of the Compilers course at Ghent University when I became TA for that course. It is a playground for experimenting with the algorithm for register allocation based on graph colouring.
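As a rough illustration of the algorithm the tool visualises (this is not the code of the web application itself), the sketch below colours an interference graph with a simplify-and-select scheme in the style of Chaitin's allocator. The function name, the dictionary-based graph representation, and the spill heuristic (highest remaining degree) are assumptions made for this example; coalescing and spill-code insertion are left out.

```python
def colour_interference_graph(graph, num_registers):
    """Colour an interference graph with at most num_registers colours.

    graph maps each variable name to the set of variables it interferes
    with (the relation is assumed to be symmetric).
    Returns (assignment, spilled): assignment maps variable -> register
    index, spilled lists the variables that received no register.
    """
    remaining = {v: set(neigh) for v, neigh in graph.items()}
    stack = []

    # Simplify: repeatedly remove a node and push it on the stack.
    while remaining:
        easy = [v for v in remaining if len(remaining[v]) < num_registers]
        if easy:
            node = min(easy)  # deterministic choice among low-degree nodes
        else:
            # No low-degree node left: pick a potential spill candidate.
            node = max(remaining, key=lambda v: (len(remaining[v]), v))
        stack.append(node)
        for neighbour in remaining.pop(node):
            remaining[neighbour].discard(node)

    # Select: pop nodes and give each the lowest register not used
    # by an already coloured neighbour.
    assignment, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {assignment[n] for n in graph[node] if n in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled


# Hypothetical example: a, b and c are live at the same time, d only overlaps c.
example_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
```

With three registers every variable in `example_graph` can be coloured; with two, the triangle formed by a, b and c forces at least one spill.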
GemmKernels.jl — Flexible and performant GEMM kernels in Julia
Originally written for my master’s thesis, this Julia package contains a framework to instantiate flexible and performant matrix-multiplication kernels and variants thereof on GPUs.
vim-lsp-snippets, vim-lsp-ultisnips, and vim-lsp-neosnippet — Language Server Protocol snippets in vim using vim-lsp and UltiSnips/NeoSnippet
A set of three small Vim plugins that integrate vim-lsp (a popular LSP plugin for Vim) with either UltiSnips or neosnippet.vim (two popular snippet plugins for Vim) to add LSP snippet support to Vim. vim-lsp-ultisnips is responsible for the integration with UltiSnips, whereas vim-lsp-neosnippet handles the integration with neosnippet.vim. vim-lsp-snippets contains the code common to vim-lsp-ultisnips and vim-lsp-neosnippet.
OpenGL Lighting Tool — A playground for the Phong lighting model
A demo application for the course Computer Graphics at Ghent University, based on my previously written application OpenGL EduTool. It allows the user to change the parameters of the Phong lighting model (i.e. material and light properties of ambient, diffuse, and specular components), and its implementation (using the vertex shader or the fragment shader), and see the impact on the rendered image in real time. The core of the application was written in C++ using OpenGL, and the GUI was implemented using the Qt toolkit.
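For readers unfamiliar with the Phong model, the following minimal sketch evaluates its three terms at a single surface point. It is written in Python purely for illustration and does not reflect the C++/OpenGL shaders of the tool; the function and parameter names are assumptions, and light and material colours are reduced to scalar intensities.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def phong_intensity(normal, to_light, to_viewer, ka, kd, ks, shininess,
                    ambient=1.0, light=1.0):
    """Return the ambient + diffuse + specular intensity at one point.

    normal, to_light and to_viewer are 3D vectors pointing away from the
    surface; ka, kd and ks are the material's reflection coefficients.
    """
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)

    n_dot_l = dot(n, l)
    diffuse = max(n_dot_l, 0.0)

    if n_dot_l > 0.0:
        # Reflect the light direction about the normal: r = 2(n.l)n - l
        r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
        specular = max(dot(r, v), 0.0) ** shininess
    else:
        # Light is behind the surface: no specular highlight.
        specular = 0.0

    return ka * ambient + kd * light * diffuse + ks * light * specular
```

Evaluating this function once per vertex and interpolating the result, versus evaluating it once per fragment, corresponds to the two implementation choices mentioned above.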
arm-neon-complex — Performant complex arithmetic using ARM Neon SIMD Extensions
A small project written in ARM assembly that implements a fast multiply-accumulate operation on complex numbers using the SIMD instructions of the ARM Neon extension.
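The arithmetic behind a complex multiply-accumulate can be sketched as follows. This Python version only shows the computation, not the assembly implementation; the function name and the split layout (separate lists for real and imaginary parts, a layout that commonly suits SIMD lanes) are assumptions for illustration.

```python
def complex_mac(acc_re, acc_im, a_re, a_im, b_re, b_im):
    """Compute acc[i] + a[i] * b[i] element-wise for complex numbers.

    Each complex array is passed as two lists of equal length: one with
    the real parts and one with the imaginary parts.
    Returns the new accumulator as a (real parts, imaginary parts) pair.
    """
    out_re, out_im = [], []
    for i in range(len(acc_re)):
        # (a_re + j a_im)(b_re + j b_im)
        #   = (a_re*b_re - a_im*b_im) + j (a_re*b_im + a_im*b_re)
        out_re.append(acc_re[i] + a_re[i] * b_re[i] - a_im[i] * b_im[i])
        out_im.append(acc_im[i] + a_re[i] * b_im[i] + a_im[i] * b_re[i])
    return out_re, out_im
```

Each output element needs four real multiplications and four real additions or subtractions, which is the work a SIMD implementation spreads over its vector lanes.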
Scriptable Guitar Pedal — A digital guitar pedal where you can write your own effects
My bachelor’s thesis project, carried out in a group of four students. We decided to turn a BeagleBone Black into our own digital guitar pedal, and allow everyone to write effects for it in Lua. My responsibility was designing and implementing the architecture that allowed effects to be chained and changed dynamically, interfacing with Lua to load the effects programmed by the user, and implementing several built-in real-time effects. The latter included a multi-threaded implementation of FFT-based convolution, using ARM Neon SIMD extensions.
win-us-intl-altgr — United States (International) keyboard layout for Windows with AltGr dead keys only
A Windows port of the keyboard layout that I use on Linux systems, so that I can use my favourite layout on Windows as well. Implemented using the Microsoft Keyboard Layout Creator (MSKLC).
OpenGL EduTool — An application which aims to demonstrate how linear algebra is involved in 3D CGI
The OpenGL EduTool is an application I wrote to complement my research paper in the last year of secondary school. Its aim was to visually show how matrices (and linear algebra in general) are used to render 3D images on a computer screen. The application was developed in C++ and OpenGL, and included a graphical user interface written in Qt.
Publications
K-Hunt++: Improved Dynamic Cryptographic Key Extraction — Conference Article
We identified several weaknesses in the state-of-the-art cryptographic key extraction algorithm, K-Hunt. It cannot handle code in which key loading and use are spread apart, has problems with modes such as AES CBC that use small data buffers of constant size, and with complex apps in which functionality handles both the key and data. K-Hunt++ overcomes those weaknesses. We demonstrate it on two apps that trigger them and present an ablation study and qualitative analysis of its robustness in the face of obfuscation.
Tools and Models for Software Reverse Engineering Research — Conference Article
Software protection researchers often struggle with the evaluation of MATE software protections and attacks. Evaluations often are incomplete and not representative of the practice. This can in part be explained by a lack of standardized, generally applicable models, tools, and methodologies for evaluating how reverse engineering attack strategies are executed. The framework of related components proposed in this paper is an attempt to provide exactly that. It includes a meta-model and supporting tools for modeling the knowledge that reverse engineers acquire as they execute their strategies, a meta-model to estimate the required effort of those strategies, and tools to capture strategic activities from data streams collected during human reverse engineering experiments. Their use is demonstrated on three example reverse engineering strategies.
Flexible Performant GEMM Kernels on GPUs — Journal Article
General Matrix Multiplication or GEMM kernels take centre place in high performance computing and machine learning. Recent NVIDIA GPUs include GEMM accelerators, such as NVIDIA’s Tensor Cores. Their exploitation is hampered by the two-language problem: it requires either low-level programming which implies low programmer productivity or using libraries that only offer a limited set of components. Because rephrasing algorithms in terms of established components often introduces overhead, the libraries’ lack of flexibility limits the freedom to explore new algorithms. Researchers using GEMMs can hence not enjoy programming productivity, high performance, and research flexibility at once. In this paper we solve this problem. We present three sets of abstractions and interfaces to program GEMMs within the scientific Julia programming language. The interfaces and abstractions are co-designed for researchers’ needs and Julia’s features to achieve sufficient separation of concerns and flexibility to easily extend basic GEMMs in many different ways without paying a performance price. Comparing our GEMMs to state-of-the-art libraries cuBLAS and CUTLASS, we demonstrate that our performance is in the same ballpark of the libraries, and in some cases even exceeds it, without having to write a single line of code in CUDA C++ or assembly, and without facing flexibility limitations.
Flexible Matrix Multiplication Kernels on GPUs — Master's Dissertation
GEMM (General Matrix Multiplication) kernels are at the core of many computations in the fields of HPC (High Performance Computing) and ML (Machine Learning). GEMM is so prevalent that NVIDIA’s latest GPUs (Graphics Processing Units) include Tensor Cores, a special type of processing cores that are designed to accelerate matrix multiplications. As the fields of HPC and ML evolve, we notice two trends. First, the low-level programming language C++, which is traditionally used for high-performance applications, is being replaced by higher-level languages such as Python or Julia. Through the use of the Julia package CUDAnative, one can even program GPUs directly in Julia. The second trend is the increasing need for flexibility in GEMM kernels. State-of-the-art GEMM libraries typically contain a limited set of monolithic kernels, and thus lack flexibility.
In this thesis, we will design, implement, and evaluate a GEMM framework for Julia that allows users to create GEMM kernels that are tailored to their use case. Given that NVIDIA’s Tensor Cores are already extensively used to accelerate ML and HPC workloads, we will mainly target GEMM using Tensor Cores. Our framework consists of three different APIs (Application Programming Interfaces). The first API provides an interface to access Tensor Cores from within Julia. The second API facilitates writing algorithms that use tiling techniques to improve performance, such as GEMM kernels. The final API consists of a set of customisable components that can be combined to implement a GEMM kernel for a specific purpose.
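To give an idea of what tiling means for a GEMM, here is a minimal CPU-only sketch in Python. It does not use Tensor Cores or Julia and is not the API of the framework described above; the function name and the `tile` parameter are assumptions for illustration.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (m x k) and b (k x n), given as lists of rows.

    The three outer loops walk over tiles of the output and of the
    reduction dimension; the three inner loops do the work inside one
    tile. On a GPU, a tile of the output would typically be assigned to
    a thread block so that its inputs can be reused from fast memory.
    """
    m, k, n = len(a), len(a[0]), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, k, tile):
                # Accumulate the contribution of one tile pair into c.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(k0, min(k0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

The result is the same as that of a plain triple loop; only the order in which the partial products are accumulated changes, which is what makes data reuse within a tile possible.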