CV
This is an overview of my professional experience. A more complete, academic version is available as a downloadable PDF.
Basics
| Name | Louis Ledoux |
| Label | Philosophiae Doctor |
| Email | i.f.lledoux[at]gmail.com |
| Phone | +33 [seven] 70 49 11 98 |
| Url | https://bynaryman.github.io/ |
| Summary | A computer architect with a focus on arithmetic and floating-point. |
Work
-
2025.01 - Now Postdoctoral Researcher
INSA/INRIA – Emeraude
Postdoctoral research on arithmetic-aware compilation integrating MLIR and FloPoCo.
- Multi-level arithmetic transformations from real to fixed-point approximations for ASIC/FPGA.
- Custom MLIR dialects for arithmetic abstraction, reasoning, and approximation.
- Automated generation of hardware architectures from DSP-oriented code (e.g., Faust).
-
2018.08 - 2024.12 Researcher
Barcelona Supercomputing Center (BSC) - RoMoL/CAOS/SONAR
Conducted research in computer architecture, arithmetic, and HPC: explored co-designed hardware/software acceleration of posit arithmetic, developed Kulisch/Quire accumulators, and designed systolic array architectures for HPC workloads.
- Thesis: Floating-Point Arithmetic Paradigms for High-Performance Computing: Software Algorithms and Hardware Designs.
- Co-designed hardware/software acceleration of posit arithmetic.
- Developed Kulisch/Quire accumulators for any floating-point representation.
- Designed a systolic array architecture for HPC workloads.
- Explored very slow but very small floating-point division designs for SIMD/vector paradigms.
-
2017.08 - 2018.07 Hardware Engineer
b<>com
Conducted R&D on FPGA acceleration in the cloud: integrated an IP core for real-time SDR-to-HDR video conversion using HDLs and tuned PCIe drivers to maximize bandwidth.
- Deployed a custom IP core for real-time SDR to HDR video conversion on cloud-based FPGAs.
- Optimized PCIe drivers, achieving sustained data transfer rates of up to 15.8 GB/s, maximizing hardware utilization and performance.
- Evaluated nascent FPGA cloud platforms such as AWS F1, with a focus on virtualization and partial reconfiguration.
- Integrated with OpenCL, pipelining NVMe reads/writes and FPGA reads/writes through multithreaded FIFOs.
-
2017.07 - 2017.08 Back End Developer
WaryMe
Developed the entire back end of a personal security application. Ensured secure data transmission and deployed the application on AWS.
- Developed backend services and APIs.
- Deployed and managed the application on AWS.
-
2016.07 - 2016.07 Back End Developer
ASKIA
During this summer internship, I developed an automated CLI tool for publishing surveys on popular platforms.
- Designed a REST API in Node.js to handle event-driven, asynchronous processes efficiently.
- Practiced test-driven development (TDD) with frameworks such as Jasmine, verifying flows in an environment-agnostic way using mocks and stubs.
- Enhanced security by deploying HTTPS with Let's Encrypt for secure data transmission.
-
2014.07 - 2014.07 Electronics Technician
Radio Electronique Rennaise (R.E.R)
Responsible for repairing various electronic devices, with an emphasis on audio equipment.
- Repaired various electronic devices, focusing on audio equipment.
- Soldered and reverse engineered amplifier circuits.
Education
-
2018.08 - 2024.08 Barcelona, Spain
-
2015.09 - 2018.06 Rennes, France
-
2013.09 - 2015.06 Rennes, France
Publications
-
2025 Towards Multi-Level Arithmetic Optimizations
EuroLLVM 2025
Poster on multi-level arithmetic optimization strategies integrating compiler and circuit design.
-
2025 Towards Optimized Arithmetic Circuits with MLIR
28th Euromicro Conference Series on Digital System Design (DSD) 2025
Presents MLIR-based approaches for optimizing arithmetic circuits across multiple abstraction levels.
-
2025 Design-Space Exploration of Serialized Floating-Point Division for DLP Architectures
28th Euromicro Conference Series on Digital System Design (DSD) 2025
Explores serialized floating-point division units tailored for data-level parallel architectures.
-
2024.05 LLMMMM: Large Language Models Matrix-Matrix Multiplications Characterization on Open Silicon
2024 11th BSCSymposium
Characterization of matrix-matrix multiplications for large language models using open silicon platforms.
-
2024.03.25 The Grafted Superset Approach: Bridging Python to Silicon with Asynchronous Compilation and Beyond
2024 4th Workshop on Open-Source Design Automation (OSDA), hosted at DATE
The Grafted Superset Approach aims to bridge Python programming with silicon-level execution through asynchronous compilation.
-
2023.09 An Open-Source Framework for Efficient Numerically-Tailored Computations
2023 33rd International Conference on Field-Programmable Logic and Applications (FPL)
An open-source framework aimed at enhancing the efficiency of numerically-tailored computations.
-
2023.05 Open-Source GEMM Hardware Kernels Generator: Toward Numerically-Tailored Computations
2023 10th BSCSymposium
A generator for open-source GEMM hardware kernels aimed at enabling numerically-tailored computations.
-
2022.05 A Generator of Numerically-Tailored and High-Throughput Accelerators for Batched GEMMs
2022 IEEE 30th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
A generator for creating numerically-tailored and high-throughput accelerators specifically designed for batched GEMMs.
-
2019.10 Accelerating DL inference with (Open)CAPI and posit numbers
OpenPOWER summit 2019, Lyon, France: Linux Foundation
Methods to accelerate deep learning inference using (Open)CAPI and posit numbers.
Projects
-
MPW 5
Taped out a systolic array for matrix multiplication with posit numbers and quire accumulators.
- Posit Numbers
- Quire Accumulators
-
Artistic OpenROAD
Modified the OpenROAD RePlAce tool to visualize electrostatic fields and explore chip sonification; taught concepts in Matt Venn’s Zero to ASIC course.
- OpenROAD
- Electrostatic Visualization
- Chip Sonification
-
MPW 1
My first taped-out chip on the first-ever open shared Multi-Project Wafer in collaboration with Google, SkyWater, and Efabless.
- Open Source Silicon
-
SUF
Developed a Python-to-ASIC compiler with a focus on arithmetic, including novel placement visualization.
- Python-to-ASIC Compiler
- Placement Visualization
-
OSFNTC
Developed an Open-Source Framework for Efficient Numerically-Tailored Computation, systematically improving energy efficiency and accuracy.
- Numerically-Tailored Computation
- Energy Efficiency
-
POF
Designed the Posit Operators Framework, a comprehensive SW/HW co-designed library for arithmetic computations using the Posit numerical format on FPGAs.
- Posit Numerical Format
- FPGA
-
VH2V
Designed a VHDL-to-Verilog translation tool tailored to convert FloPoCo outputs to OpenLane/OpenROAD inputs.
- VHDL-to-Verilog Translation
-
Synthesizers
Crafted analog and digital synthesizers for modular synthesis and Eurorack systems.
- PCB Manufacturing
- Digital Design
Languages
| French | Native speaker |
| Spanish | Native speaker (with an honest French accent) |
| English | Full Proficiency |
Skills
| Programming and Computer Science | C, C++, Java, Scala, Algorithm Complexity, Pipeline Overlapping, Parallel Computing, Hardware Acceleration, Performance Optimization, Numerical Methods, High-Performance Computing (HPC), Low-Level Programming |
| Computer Architecture | Execution Stage, Floating-Point Unit, Kulisch Accumulators, Design-Specific Architecture, Power/Energy Budgeting, Data-Aware Designs, Workload-Accuracy Tailored Circuits, SIMD, Vector, VLIW, Systolic Arrays, Near-/In-Memory Computing, Processor Design, Out-of-Order, RISC-V |
| Scripting | Python, Bash, Shell, Linux, Tcl |
| Dissemination | LaTeX, Matplotlib, Inkscape, Top-tier conference articles |
| FPGA | AMD, Altera, VHDL, Verilog, SystemVerilog, Manual Floorplanning, AmaranthHDL, Automated Pipeline, Automated Circuit Generation, FloPoCo, SDAccel, AWS F1, PCIe |
| GPU | CUDA 8, CUDA 9, OpenCL, Warp, MIMD, SIMT, Branch Divergence, Coalesced Access Patterns, PTX, Tensor Cores |
| Version Control | Git, GitHub, GitLab, SVN, Pull Requests, Branches, Rebases |