Abstract: The General-Purpose Graphics Processing Unit (GPGPU) has become the most popular platform for accelerating modern applications such as Large Language Models and Generative AI, while the lack ...