SCA/HPCAsia 2026

Papers

Accepted Papers

We are pleased to announce the accepted papers. Stay tuned for the full program with schedule and room details.
NOTE: Paper titles are subject to change before the camera-ready submission.

Toward unprivileged, portable and generic network topology discovery
Pepin, Jaeger, Mercier, Goglin
Integrating Quantum Software Tools with(in) MLIR
Hopf, Stade, Rovara, Burgholzer, Quetschlich, Florea, Lopez, Izaac, Wille
ChatMPI: LLM-Driven MPI Code Generation for HPC Workloads
Valero-Lara, Young, Naughton III, Engelmann, Geist, Vetter, Teranishi, Godoy
Optimization of a GEMM Implementation using Intel AMX
Endo, Ohshima, Nanri
Modeling the Potential of Message-Free Communication via CXL.mem
Vanecek, Turner, Gajbe, Wolf, Schulz
Exploring User Heterogeneity-Aware Differentiated Token Pricing for On-Premises Large Language Models
Peng, He, Lv, Wu, Yu, Shi, Zhai, Sheng, Wang, Wei
Towards Unified Acceleration: Weight-Stationary GEMM on HPC-oriented Elastic CGRAs
Shi, Adhi, Teng, Liu, Miwa, Sano
High-performance in-situ ML Inference with dalotia: A Lightweight Tensor Loader API for Science Codes
Pollinger, Domke
ROIX-Comp: Optimizing X-ray Computed Tomography Imaging Strategy for Data Reduction and Reconstruction
Singh, Sato, Yoshida, Uesugi, Joti, Hatsui, Rubio Proaño
ClinTwin PINN Real Time Patient Specific Cardiopulmonary Digital Twin via Meshless Physics Informed Neural Fields on Heterogeneous HPC
Maulana, Pratiwi
PRISM: Profiling-Free Symbolic Memory-Driven Strategy Planner for Large DNN Model Training
Wang, Fang, Li, Tachon, Appuswamy
TRIOS: Reducing File-System Contention through Predictive Time-Resolved I/O Simulation in Job Scheduling
Tseng, Kawai, Takahashi, Takizawa
The X Quantum Software Stack: Connecting End Users, Integrating Diverse Quantum Technologies, Accelerating HPC
Burgholzer, Echavarria, Hopf, Stade, Rovara, Schmid, Kaya, Mete, Farooqi, Chung, De Pascale, Schulz, Schulz, Wille
Guaranteed DGEMM Accuracy While Using Reduced Precision Tensor Cores Through Extensions of the Ozaki Scheme
Schwarz, Anders, Brower, Bayraktar, Gunnels, Clark, G. Xu, Rodriguez, Cayrols, Tabaszewski, Podlozhnyuk
Performance analysis of Arm-based processors across multiple compilers for HPC workloads
Kobayashi, Ando, Yamaura, Inoue, Murai
Rankmap optimization for large scale HPC applications with simulated annealing based on MPI trace information
Kuroda, Nakamura, Ando, Murai, Kato
Improved Implementation of Number Theoretic Transform on NVIDIA GPU with Tensor Cores
Sugizaki, Takahashi
EmuPlat: A Framework-Agnostic Platform for Quantum Hardware Emulation with Validated Transpiler-to-Pulse Pipeline
Ye, Khoo
Optimizing Intra-Layer Parallel Communication for LLM Training on Systems with Fully-Connected Mesh GPU Topology
Hosoki, Sato, Endo, Bigot, Audit
QPU Micro-Kernels for Stencil Computation
Markidis, Netzer, Pennati, Peng
Scalable QRAM with Superposition-Based Data Loading for Noise-Resilient Quantum Machine Learning on NISQ Devices
Sajadimanesh, Atoofian
Tensor-Core-Optimized Strategies for BLR × Tall-Skinny Matrix Multiplication in BEM
Ida, Goto, Yokota, Hiraishi, Hanawa, Iwashita, Kawai, Ohshima, Hoshino
Enhancing Stability and Optimizing Implementation of Mixed-Precision Block $\epsilon$-Circulant Preconditioned Solvers for Parallelization-in-Time
Yoda, Bolten
GPU Partitioning, Power, and Performance of the AMD MI300A
Abouelmagd, Boehme, Brink, Burmark, McKinsey, Skjellum, Pearce
Mixed-precision Interpolative Decomposition on GPUs [Best Paper Finalist]
Ma, Imamura
Fusing Sequence Motifs and Pan-Genomic Features: Antimicrobial Resistance Prediction using an Explainable Lightweight 1D CNN - XGBoost Ensemble
Siddiqui, Tarannum
Beyond Exascale: Dataflow Domain Translation on a Cerebras Cluster [Best Paper Finalist]
Oppelstrup, Giamblanco, Kalchev, Sharapov, Taylor, Van Essendelft, Rajamanickam, James
Cloud-Hardware Co-Design for Memory Bandwidth-Bound HPC Workloads: Performance and Characteristics of Azure HBv5 Virtual Machines
Rastegari, Kovouri, Cui, Naz, Fleischman, Gupta, Harwani, Loh, Greenseid, Burness, Ram, Ringenburg
A Multi-ROI Camera Motion Exploration Approach for Enhancing Image-based Smart In-Situ Visualization
Matsushima, Adachi, Sakamoto, Nonaka
A Matrix-Free Algebraic hp-Multigrid Method for Computational Fluid Dynamics Applications [Best Paper Finalist]
Ohm, Harper, Jansson
Scalable eVTOL Aerodynamics Simulations on Heterogeneous HPC Platforms with Minimal-Invasive GPU Porting
Ohm, Takii, Ando, Bale, Tsubokura
Deterministic Quantum Search for Index Retrieval: Algorithm Design and Implementation
Mishra, Balasubramanyam, Raghava
GCAMPS: A Scalable Classical Simulator for Qudit Systems
Harper, Nakhl, Quella, Sevior, Usman
Task-decomposed Overlapped Preconditioner for Sustained Strong Scalability on Accelerated Exascale Systems
Jansson, Karp, Páll, Markidis, Schlatter
What Will the Grace Hopper-Powered Jupiter Supercomputer Bring for Sparse Linear Algebra?
Tsai, Bode, Anzt
Revisiting Communication Software Offloading for MPI+Threads: Reducing Contention and Improving Overlap on Many-Core Systems
Breiter, Chung, Fürlinger, Weidendorfer, Kranzlmüller
Deep Learning-Integrated Pairwise-Qubit Subsystems for Highly Efficient Quantum Circuit Simulation
Pradata, Amrizal, Suryanto, Nugraha, Takizawa

  • Host Organization

    SupercomputingAsia 2026

    The International Conference on High Performance Computing in Asia-Pacific Region 2026

  • Secretariat of SCA/HPCAsia 2026
    c/o Convention Linkage, Inc.
    11F PIAS TOWER 3-19-3, Toyosaki, Kita-ku, Osaka-city, Osaka 531-0072, Japan
    Email: sca_hpcasia_2026@c-linkage.co.jp Tel: +81-6-6377-2188