Publications

Welcome to my site!
I am a PhD student at Westlake University. My research interests include sound event detection and semi-supervised and self-supervised learning for audio processing.
Feel free to contact me: sao_year@126.com

ICASSP 2024 2023-09-20

Fine-tune the Pretrained ATST Model for Sound Event Detection

Fine-tuning ATST-Frame for sound event detection, achieving new SOTA results on the DESED development set.

ICASSP 2024 2023-09-01

Frame-wise Streaming End-to-end Speaker Diarization with Non-autoregressive Self-attention-based Attractors

A low-latency online EEND framework with causal encoding and non-autoregressive attractor decoding.

TASLP 2023-06-30

Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks

ATST-Clip and ATST-Frame for unified clip-level and frame-level tasks with broad SOTA gains.

DCASE 2022 Tech Report 2022-06-01

ATST Self-supervised plus RCT Semi-supervised Sound Event Detection: Submission to DCASE 2022 Challenge Task 4

Integrating ATST-Clip with RCT-CRNN, ranking 4th in the single-model setting of DCASE 2022 Task 4.

Interspeech 2022 2022-04-01

RCT: Random Consistency Training for Sound Event Detection

Random Consistency Training for SED with clear improvements across CRNN-based systems.

Knowledge-Based Systems 2022-01-01

Mixhead: Breaking the low-rank bottleneck in multi-head attention language models

Learnable head-mixing to mitigate low-rank bottlenecks in attention for language modeling and GLUE.