Haining

Greetings👋 I'm a Ph.D. Candidate in Information Science at Indiana University Bloomington. I'm now actively seeking a tenure-track assistant professorship for a start date in Fall 2025.

My broad scholarly inquiry concerns the dynamics within the humanities and social sciences. My ongoing projects fall into three areas: natural language processing, computational humanities, and quantitative science studies. I'm also passionate about developing large language models for the public and community good, including defending against authorship identification attacks for whistleblowers and making scientific knowledge more accessible for laypersons and young readers.

News

(Sep 28, 2024) I presented Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning at the ILS Doctoral Research Forum 2024 and won 🥇.
(Aug 7, 2024) Our new preprint Simplifying Scholarly Abstracts for Accessible Digital Libraries is now online. We introduced a novel corpus designed to simplify scholarly abstracts and demonstrated that mainstream LLMs perform just fine with straightforward supervised fine-tuning. Although the improvement doesn't yet make the content fully understandable for a middle-school audience, these models provide a strong baseline for further enhancement.
(May 1, 2024) I joined Dr. Jing Su's team as an intern to work on the GAIPA project (Graph Artificial Intelligence for Precision Identification of Alcohol Use Disorder) at the Department of Biostatistics and Health Data Science, Indiana University School of Medicine. Let's push the precision identification of alcohol use disorder to the next level💪.
(Apr 30, 2024) I passed my proposal defense for my Ph.D. dissertation, titled Defending Against Authorship Attribution Attacks With Large Language Models. I am now actively seeking a tenure-track assistant professorship. Let me know if your department needs an NLP guy who can research and teach🤠.
(Apr 15, 2024) I presented the manuscript titled A Content-Based Novelty Measure for Scholarly Publications: A Proof of Concept at iConference 2024.
(Mar 13, 2024) I delivered a presentation titled AI4Library at Nankai University, covering the fundamentals of LLMs, their capabilities, surrounding hype, and their applications within librarianship.
(Jan 31, 2024) I joined the Center for Antique Book Conservation and Restoration Research at Wuhan University (武汉大学古籍保护暨文献修复研究中心) as a Research Affiliate. I will be working with Dr. Xincai Wang to improve the discoverability of historical collections using Retrieval-Augmented Generation, empowered by state-of-the-art LLMs.
(Jan 29, 2024) I joined Digital Humanities Quarterly as Data Analytics Editor.
(Jan 18, 2024) My invited column Defending Against Authorship Identification Attacks has been published by the Montreal AI Ethics Institute. Check out the article here.
(Jan 8, 2024) The manuscript for NovEval, A Content-Based Novelty Measure for Scholarly Publications: A Proof of Concept, is up. I will be presenting it at iConference 2024.
(Oct 7, 2023) NovEval (pronounced as "Nawv-Ee-val") demo is now online! This GPT-2 based model is designed to evaluate scientific novelty automatically, and its assessments have been proven to align with human evaluation. It's still in beta, and I would love to hear your feedback😉!
(Oct 5, 2023) I successfully completed my qualifying defense🎉. The committee consisted of Dr. Allen Riddell (Chair), Dr. Xiaozhong Liu, and Dr. Staša Milojević (Minor Advisor).
(Oct 4, 2023) Our new papers are available on arXiv. Check them out!
(May 19, 2023) I delivered a lightning talk introducing our jargon-busting AI at LEADING Forum 2022. You can access our poster titled Science Out of the Ivory Tower: Scientific Abstract Simplification for Everyone here.
(May 10, 2023) I uploaded a LaTeX poster template to Overleaf. The template, based on the Gemini theme, is minimal and modern, and features Indiana University's official color palette. You can find the template at this link, or simply search for "iu poster" in the Overleaf gallery.
(Apr 14, 2023) I gave a lightning talk and presented a poster titled The Many Voices of the Detached: Revisiting the Disputed Writings of Lu Xun and Zhou Zuoren at the IDAH HASTAC Symposium 2023.
(Apr 13, 2023) I gave a guest lecture, Authorship Attribution: An Introduction, in Allen's Digital Humanities course.
(Jan 9, 2023) I will be working with Allen on Task Three of IARPA HIATUS (Human Interpretable Attribution of Text Using Underlying Structure).