I am a fourth-year PhD student in the UCSB NLP group, co-advised by Prof. William Wang and Prof. Lei Li. I am currently a visiting scholar at the CMU Language Technologies Institute (Email: wendaxu@ucsb.edu, wendaxu@andrew.cmu.edu).
Research Interests
My main research interests lie in text generation evaluation and large language model (LLM) alignment. In one sentence, I want to design methods that enable LLMs to learn to generate actionable feedback (in the form of either a quality score or a natural-language diagnostic report) and to use that feedback to align LLMs with human principles.
Currently, I work actively on text generation evaluation, focusing on both quality and interpretability. I am the first author of SEScore1&2 and InstructScore (Best Unsupervised Text Generation metrics at WMT22 shared task). I am interested in text generation guided by fine-grained feedback and in extending this generic pipeline to multilingual and multimodal content generation.
When giving self-feedback, models display a bias towards their own outputs. We find that this self-bias is prevalent in all examined LLMs across multiple languages and tasks (6 LLMs, 4 languages, 3 tasks). To mitigate such bias, we find that a larger model size and external feedback with accurate assessment can significantly reduce bias in the self-refine pipeline, leading to actual performance improvements in downstream tasks.
Instead of merely criticizing an LLM, can we pinpoint the errors it makes and automatically guide it with fine-grained, actionable feedback? Can we formulate iterative refinement as a local search problem, such as simulated annealing? This work follows my prior work, InstructScore, and explores how to incorporate fine-grained actionable feedback to guide text generation. InstructScore offers not only quality judgements but also actionable feedback to improve LLMs!
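To make the local-search framing concrete, here is a minimal Python sketch of simulated annealing over text edits. It is an illustration under stated assumptions, not the method from the paper: `feedback` is a toy stand-in that compares tokens against a hypothetical reference (a real system would query a learned feedback model), and `propose`, `refine`, the cooling schedule, and all parameter values are made up for this example.

```python
import math
import random


def feedback(candidate, reference):
    """Toy stand-in for a fine-grained feedback model (an assumption).

    Returns (score, error_positions): the score is the negated number of
    mismatching tokens, and error_positions pinpoints where they occur.
    """
    errors = [i for i, (c, r) in enumerate(zip(candidate, reference)) if c != r]
    return -len(errors), errors


def propose(candidate, error_positions, vocabulary, rng):
    """Rewrite one token at or next to a pinpointed error position."""
    revised = list(candidate)
    position = rng.choice(error_positions) + rng.choice([-1, 0, 1])
    position = min(max(position, 0), len(revised) - 1)  # stay in bounds
    revised[position] = rng.choice(vocabulary)
    return revised


def refine(candidate, reference, vocabulary, steps=200, temperature=1.0,
           cooling=0.95, seed=0):
    """Simulated-annealing refinement guided by pinpointed errors."""
    rng = random.Random(seed)
    current = list(candidate)
    current_score, errors = feedback(current, reference)
    best, best_score = current, current_score
    for _ in range(steps):
        if not errors:  # the feedback reports no remaining errors
            break
        proposal = propose(current, errors, vocabulary, rng)
        proposal_score, proposal_errors = feedback(proposal, reference)
        delta = proposal_score - current_score
        # Always accept improvements; accept a worse proposal with
        # probability exp(delta / temperature), which shrinks as we cool.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score, errors = proposal, proposal_score, proposal_errors
            if current_score > best_score:
                best, best_score = current, current_score
        temperature = max(temperature * cooling, 1e-6)
    return best, best_score
```

Calling `refine("a cat sit in the mat".split(), "the cat sat on the mat".split(), vocabulary)` returns the best candidate seen and its score. The point of the acceptance rule is that an occasional worse edit is tolerated early on, so the search is not stuck with the first locally good revision.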
InstructScore is an explainable text generation metric: instead of outputting a scalar score, it outputs the error location, error type, and severity for each error in the candidate text. It achieves high correlation with human ratings on four text generation tasks (translation, table-to-text, captioning, and commonsense generation) and generalizes to an unseen text generation task, keyword-to-text.
SEScore2 is a self-supervised learning (SSL) method for training a metric for general text generation tasks without human ratings. We develop a technique to synthesize candidate sentences with varying levels of mistakes for training. To make these self-constructed samples realistic, we introduce retrieval-augmented synthesis on anchor text.
It outperforms SEScore on four text generation tasks across three languages (the overall Kendall correlation improves by 14.3%).
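For readers unfamiliar with the Kendall correlation mentioned above, the following Python sketch computes the plain tau-a statistic between metric scores and human ratings. This is only an illustration: the exact Kendall variant used in the evaluation is not stated here, and the function name and sample numbers are made up.

```python
from itertools import combinations


def kendall_tau(metric_scores, human_ratings):
    """Kendall tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(metric_scores)), 2))
    for i, j in pairs:
        # A pair is concordant when the metric and the humans order the
        # two outputs the same way, and discordant when they disagree.
        product = ((metric_scores[i] - metric_scores[j])
                   * (human_ratings[i] - human_ratings[j]))
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)
```

A value of 1 means the metric ranks every pair of outputs the same way human raters do, and a value of -1 means it ranks every pair in the opposite order.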
SEScore is a reference-based text generation evaluation metric that requires no human-annotated error data. It introduces a novel stratified error synthesis technique that generates diverse errors with varying severity levels. Its effectiveness over prior methods such as BLEU, BERTScore, COMET, and BLEURT has been demonstrated on various NLG tasks.
We develop a self-supervised approach to expert-layman text style transfer. We propose a novel SSL task, knowledge base assimilation, to inject knowledge during pretraining. The approach performs strongly in human evaluation.
We introduce a novel paradigm that leverages machine-generated images to guide open-ended text generation. This endows machines with the creative visualization ability that human writers often demonstrate.
We propose a patch-based approach, BrainSec, to classify gray matter (GM), white matter (WM), and background regions. We integrate BrainSec with an amyloid-β pathology classification model to identify the distributions of pathologies and quantify them separately in the segmented GM and WM regions.
Fun Fact: What is the meaning of Wenda(闻达)?
I added this because people at Starbucks keep writing down "Wendy" or "Wanda".
The word "Wenda (闻达)" comes from a conversation between Confucius and his student. Here is the English translation:
Zi Zhang asked, "What makes a scholar truly accomplished ('Da' means 'accomplished', 达)?"
Confucius asked, "Define 'accomplished'?"
Zi Zhang replied, "To be renowned in the states of feudal lords, and to be renowned in the lands of ministers."
Confucius said, "This is more about fame ('Wen' means 'fame', 闻) than accomplishment. True accomplishment is about honesty, love for righteousness, understanding others, and modesty. Such a person will succeed anywhere. Those who seek fame may pretend to be virtuous, but their actions betray them, leading to hollow fame regardless of where they are."