Page 1
Compiler Optimizations in the AI Error.
Research Interest: Software Engineering
DouBou ( ByteDance’s gpt)
teXT-TO-SPEECH ENGINE
int max (int a, int b){
if(a > b) reuturn a;
return b;
}Intermediate Representations
We can solve any problem by introducing an extra level of indirection.
Compiler Frontend - parses code into itneremdiate repreented
Compiler Backend Turns IR into targeted assembly code
Optimizations happen here.
ASTs are not made for optimizations,
hard to trap independencies and too like your code.
IR uses specific structure, that you can easily capture data dependencies.
tO oPTIMIZE mACHINE ir/asm
Compiler Middle-End
Platform-Independent Optimizations to OPtimize IR
Now adays, keras, and tensorflow, if translated to IR, we lose a lot of info if we optimize at the end.
Focused on Compiler Middle-End


IR translated from soruce code.
Thus why we store and load the same values repeatedly?
When it compilers, all it cares about is correction and translateion as simple as possible.
There exists a translation path that removes store and redundant redundants.
SSA
Step single Assembly
sgt = step greater than
SimplifyCFG, eliinates Branching.
Works like ternary operators.
InstCombine
Arithimitic optimizations.
Original, line has 20 lines, now we have 1.
that's fucking insane.


we note that
fhas redundant code, how does CLANG handle this?Sometiems tho compielrs are just stupid.

How can we fix this??
InstantaCombine Pass is continuously evolving, around 50,000 lines of code.
Detecting Missing PeeopHole OPtimizations
Manual INspections
Rlies on domain expertise
Labor-intensive and not scalable
Ad-hoc.
Some developers can't generalize.
Differential Testing
Requires Two langauge implementations
Cannot find new optimizations.
Superoptimization
Comptuationally expensive
Mostly supports a limited set of instructions.



there exist feedback loops to further optimize it !
To appear in ASLOS '26



In IR, it may have less lines, but once translated to ASSEMBLY, it could be longer than necessary, thus tehre's stuff and help with that.
Undeifned behaviour
tHIS PROGRAM CAN DO ANYTHING
If source program does implied behaviour
Either they are undefined behaviour or targetted error.

the most important part of this research is evaluation.
We can provide more formal moments for LLM
For now we only edxtact one basic seqeunce from one block/one label ( research more )
Consider larger scopes ( cross basic

Alive2 can timeout because of complexity.
I.e. if a lot of inputs SMT solver may brute force and thus time out.

LLM
Calling APIs in parallel
In future, improve efficiency
Proposing to implementations in CI
Still costly.
Local Models....
Last updated