Advanced AI is coming.
Let’s make sure it goes well.

Monash AI Alignment is a research group at Monash University working on the technical problems of making advanced AI systems safe, honest, and controllable.


About

We are a group of Master's students, PhD candidates, and researchers based at Monash, led by Dr. Trung Le, with contributions from faculty including Prof. Mehrtash Harandi. Our work focuses on the technical foundations of AI alignment: understanding what frontier models are doing internally, and developing the tools to keep them aligned with human intent as their capabilities grow.

We meet regularly to discuss papers, share work-in-progress, and collaborate on research projects. New members are welcome.

Research areas

Feature decomposition

Sparse autoencoders and related methods for decomposing model internals into interpretable features, the foundation for understanding what neural networks have learned.

Safety steering

Activation-level interventions and steering vectors that shape model behaviour without retraining, building on advances in interpretability.

Watermarking & fingerprinting

Provenance techniques for large language models that remain robust under fine-tuning, quantisation, and adversarial removal.

Hallucinations

Understanding and mitigating the conditions under which language models generate confident but ungrounded outputs.

People

The group is led by Dr. Trung Le, with active involvement from Prof. Mehrtash Harandi and a growing cohort of Master's and PhD researchers across the Faculty of Information Technology and the Department of Electrical and Computer Systems Engineering.

If you’re a Monash student or researcher interested in joining, please get in touch by email.

Get involved

Discussion, reading-group sessions, and project coordination happen on our internal forum. The forum is publicly readable, but participation is limited to invited members.

