Publications

2024


  • SilvanForge: A Schedule-Guided Retargetable Compiler for Decision Tree Inference [PDF]
    Ashwin Prasad, Sampath Rajendra, Kaushik Rajan, R Govindarajan, and Uday Bondhugula.
    ACM Symposium on Operating Systems Principles (SOSP), 2024.

2023


  • HIR: An MLIR-based Intermediate Representation for Hardware Accelerator Description [PDF]
    Kingshuk Majumder and Uday Bondhugula.
    ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023.

2022


  • Treebeard: An Optimizing Compiler for Decision Tree Based ML Inference [pdf, code]
    Ashwin Prasad, Sampath Rajendra, Kaushik Rajan, R Govindarajan, Uday Bondhugula
    55th IEEE/ACM International Symposium on Microarchitecture, October 2022.
  • MLIR-based code generation for GPU tensor cores [pdf, code]
    Navdeep Katel, Vivek Khandelwal, Uday Bondhugula
    CC 2022: ACM SIGPLAN International Conference on Compiler Construction, March 2022.

2021


  • High Performance GPU Code Generation for Matrix-Matrix Multiplication using MLIR: Some Early Results [pdf]
    Navdeep Katel, Vivek Khandelwal, Uday Bondhugula
    arXiv report, Aug 2021.
  • HIR: An MLIR-based Intermediate Representation for Hardware Accelerator Description [pdf]
    Kingshuk Majumder, Uday Bondhugula
    arXiv report, Feb 2021.
  • MLIR: Scaling Compiler Infrastructure for Domain Specific Computation
    Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache, and Oleksandr Zinenko.
    ACM International Symposium on Code Generation and Optimization (CGO), 12 pages, Feb 2021.

2020


  • Effective Loop Fusion in Polyhedral Compilation using Fusion Conflict Graphs [pdf]
    Aravind Acharya, Uday Bondhugula, Albert Cohen
    ACM Transactions on Architecture and Code Optimization (TACO), Volume 17 Issue 4, Article No. 26, September 2020.
  • Optimizing the Linear Fascicle Evaluation Algorithm for Multi-Core and Many-Core Systems
    Karan Aggarwal, Uday Bondhugula
    ACM Transactions on Parallel Computing (TOPC), accepted in Jun 2020.

2019


  • A Flexible FPGA Accelerator for Convolutional Neural Networks [pdf]
    Kingshuk Majumder, Uday Bondhugula
    arXiv report, Dec 2019.
  • Optimizing the Linear Fascicle Evaluation Algorithm for Many-Core Systems [pdf, code]
    Karan Aggarwal, Uday Bondhugula
    ACM International Conference on Supercomputing (ICS), June 2019.

2018


  • Polyhedral Auto-transformation with No Integer Linear Programming [pdf]
    Aravind Acharya, Uday Bondhugula, Albert Cohen
    ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), June 2018.
  • An Effective Fusion and Tile Size Model for Optimizing Image Processing Pipelines
    Abhinav Jangda, Uday Bondhugula
    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Feb 2018.

2017


2016


  • A DSL Compiler for Accelerating Image Processing Pipelines on FPGAs
    Nitin Chugh, Vinay Vasista, Suresh Purini, Uday Bondhugula
    IEEE International Conference on Parallel Architectures and Compilation Techniques (PACT 2016), Sep 2016.
  • Compiling Affine Loop Nests for a Dynamic Scheduling Runtime on Shared and Distributed Memory
    Roshan Dathathri, Ravi Teja Mullapudi, Uday Bondhugula
    ACM Transactions on Parallel Computing (TOPC), vol 3, issue 2, Jul 2016.
  • SMO: An Integrated Approach to Intra-Array and Inter-Array Storage Optimization [pdf, slides, bibtex]
    Somashekaracharya Bhaskaracharya, Uday Bondhugula, Albert Cohen
    ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), Jan 2016.
  • Automatic Storage Optimization for Arrays [pdf, bibtex]
    Somashekaracharya Bhaskaracharya, Uday Bondhugula, Albert Cohen
    ACM Transactions on Programming Languages and Systems (TOPLAS), vol 38, issue 3, April 2016.
    Selected for presentation at ACM SIGPLAN PLDI'16, June 2016.
  • The Pluto+ Algorithm: A Practical Approach for Parallelization and Locality Optimization of Affine Loop Nests [pdf, bibtex]
    Uday Bondhugula, Aravind Acharya, Albert Cohen
    ACM Transactions on Programming Languages and Systems (TOPLAS), vol 38, issue 3, Apr 2016.

2015


  • An Optimizing Code Generator for a Class of Lattice-Boltzmann Computations [PDF]
    Irshad Pananilath, Aravind Acharya, Vinay Vasista, Uday Bondhugula
    ACM Transactions on Architecture and Code Optimization (TACO), Volume 12 Issue 2, Article No. 14, July 2015.
  • PolyMage: Automatic Optimization for Image Processing Pipelines [PDF, project page, preprint, bibtex]
    Ravi Teja Mullapudi, Vinay Vasista, Uday Bondhugula
    International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Mar 2015, Istanbul, Turkey.
  • Pluto+: Near-Complete Modeling of Affine Transformations for Parallelism and Locality [PDF, tool, bibtex, slides, code for experiments]
    Aravind Acharya, Uday Bondhugula
    ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), February 2015, San Francisco, USA.

2014


  • Tiling and Optimizing Time-Iterated Computations over Periodic Domains [PDF, code]
    Uday Bondhugula, Vinayaka Bandishti, Albert Cohen, Guillain Potron, and Nicolas Vasilache
    International Conference on Parallel Architectures and Compilation Techniques (PACT), Aug 2014, Canada.
    Best paper award nomination.
  • Effective Automatic Computation Placement and Data Allocation for Parallelization of Regular Programs [PDF, code]
    Chandan Reddy, Uday Bondhugula
    ACM International Conference on Supercomputing (ICS), Jun 2014, Munich, Germany.

2013


  • Automatic Data Allocation and Buffer Management for Multi-GPU Machines [PDF, slides]
    Thejas Ramashekar, Uday Bondhugula
    ACM Transactions on Architecture and Code Optimization (TACO), Vol. 10, No. 4, Article 60, December 2013.
    Selected for presentation at HiPEAC '14, Jan 2014, Vienna, Austria.
  • Compiling affine loop nests for distributed-memory parallel architectures [PDF, tool, slides]
    Uday Bondhugula
    ACM/IEEE Supercomputing (SC '13), Nov 2013.
  • Generating Efficient Data Movement Code for Heterogeneous Architectures with Distributed-Memory [PDF, slides, bibtex, errata]
    Roshan Dathathri, Chandan Reddy, Thejas Ramashekar, and Uday Bondhugula
    Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT), September 2013.
  • PolyGLoT: A Polyhedral Loop Transformation Framework for a Graphical Dataflow Language [PDF, bibtex]
    Somashekaracharya Bhaskaracharya, Uday Bondhugula
    International Conference on Compiler Construction (CC 2013), Mar 2013, Rome, Italy.

2012


  • Tiling Stencil Computations to Maximize Parallelism [PDF, bibtex, code (tarball), errata]
    Vinayaka Bandishti, Irshad Pananilath, and Uday Bondhugula
    ACM/IEEE Supercomputing (SC'12), Nov 2012, Salt Lake City, Utah, USA.

2011


  • Automatic Distributed Memory Code Generation using the Polyhedral Framework [PDF]
    Uday Bondhugula
    IISc Research Report, IISc-CSA-TR-2011-3, Sep 2011.