Talk on MLLM

"Multimodal Large Language Model"

Next Speakers

A visual–language foundation model for pathology image analysis using medical Twitter

Dr. Zhi Huang  | Stanford University 

Talk in English | 9AM (GMT-7) @Stanford | 5PM (GMT+1) @London | 16 Sept. 2023 | Zoom Link


The lack of annotated, publicly available medical images is a major barrier for computational research and education innovations. At the same time, many de-identified images and much knowledge are shared by clinicians on public forums such as medical Twitter. Here we harness these crowd platforms to curate OpenPath, a large dataset of 208,414 pathology images paired with natural language descriptions. We demonstrate the value of this resource by developing PLIP, a multimodal AI with both image and text understanding, which is trained on OpenPath. PLIP achieves state-of-the-art performance in classifying new pathology images across four external datasets: for zero-shot classification, PLIP achieves F1 scores of 0.565 to 0.832, compared to F1 scores of 0.030 to 0.481 for a previous contrastive language-image pre-trained model. Training a simple supervised classifier on top of PLIP embeddings also achieves a 2.5% improvement in F1 scores compared to using other supervised model embeddings. Moreover, PLIP enables users to retrieve similar cases by either image or natural language search, greatly facilitating knowledge sharing. Our approach demonstrates that publicly shared medical information is a tremendous resource that can be harnessed to develop medical AI for enhancing diagnosis, knowledge sharing, and education.

Interactive AI Systems Specialized in Social Influence

Dr. Weiyan Shi  | Stanford University NLP Group & Northeastern University  

Talk in English | 7PM (GMT-7) @Stanford | 4 Oct. 2023 | 3AM (GMT+1) @London | 5 Oct. 2023 | Zoom Link


Dr. Weiyan Shi is an incoming Assistant Professor at Northeastern University starting in 2024. She will spend 2023-2024 as a postdoc at Stanford NLP. Her research interests are in Natural Language Processing (NLP), especially in social influence dialogue systems such as persuasion, negotiation, and recommendation. She has also worked on privacy-preserving NLP applications. She is recognized as a Rising Star in Machine Learning by the University of Maryland. Her work on personalized persuasive dialogue systems was nominated for ACL 2019 best paper. She was also a core team member behind a Science publication on the first negotiation AI agent, Cicero, that achieves a human level in the game of Diplomacy. This work has been featured in The New York Times, The Washington Post, MIT Technology Review, Forbes, and other major media outlets.

Dr. Weiyan Shi is looking for Master/PhD/Internship students to join her lab, CHATS Lab (Conversation, Human-AI Tech, Security). More information is here


AI research has so far focused on modeling common human skills, such as building systems to see, read, or talk. As these systems gradually achieve a human level on standard benchmarks, it is increasingly important to develop next-generation interactive AI systems with more advanced human skills, to function in realistic and critical applications such as providing personalized emotional support. In this talk, I will cover (1) how to build such expert-like AI systems specialized in social influence that can persuade, negotiate, and cooperate with other humans during conversations. (2) I will also discuss how humans perceive such specialized AI systems. This study validates the necessity of Autobot Law and proposes guidance to regulate such systems. (3) As these systems become more powerful, they are also more prone to leaking users' private information, so I will describe our proposed new privacy notion, Selective Differential Privacy, and an algorithm to train privacy-preserving models with high utility. Finally, I will conclude with my long-term vision to build a natural interface between human intelligence and machine intelligence via dialogues, from a multi-angle approach that combines Artificial Intelligence, Human-Computer Interaction, and social sciences, to develop expert AI systems for everyone.

GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding

Dr. Jia-Chen Gu  | University of Science and Technology of China 

 Date (TBD) | Meeting Link (coming soon)

Obstinate robustness for language models

Yimu Wang  | University of Waterloo

 Date (TBD) | Meeting Link (coming soon)

Abstract: We study the problem of generating obstinate (over-stability) adversarial examples by word substitution in NLP, where the input text is meaningfully changed but the model's prediction is not, even though it should be. Previous word substitution approaches have predominantly focused on manually designed antonym-based strategies for generating obstinate adversarial examples, which hinders their application, as these strategies can only find a subset of obstinate adversarial examples and require human effort. To address this issue, in this paper, we introduce a novel word substitution method named GradObstinate, a gradient-based approach that automatically generates obstinate adversarial examples without any constraints on the search space or the need for manual design principles. To empirically evaluate the efficacy of GradObstinate, we conduct comprehensive experiments on five representative models (ELECTRA, ALBERT, RoBERTa, DistilBERT, and CLIP) fine-tuned on four NLP benchmarks (SST-2, MRPC, SNLI, and SQuAD) and a language-grounding benchmark (MSCOCO). Extensive experiments show that our proposed GradObstinate generates more powerful obstinate adversarial examples, exhibiting a higher attack success rate compared to antonym-based methods. Furthermore, to show the transferability of obstinate word substitutions found by GradObstinate, we replace the words in four representative NLP benchmarks with their obstinate substitutions. Notably, obstinate substitutions exhibit a high success rate when transferred to other models in black-box settings, including even GPT-3 and ChatGPT.

We welcome you to discover more about the robustness of NLP and language-grounding tasks.

Join us to be the next speaker