Yuhao Zhang



Hey there, welcome!

I am currently a scientist on the founding team of Samaya AI. We are on a journey to improve knowledge discovery by harnessing the power of large language models.

Before Samaya, I was a scientist at Amazon AWS AI, where I worked on core AWS services related to enterprise search. I obtained my PhD from Stanford University, where I was jointly advised by Prof. Chris Manning in the Stanford NLP Group and Prof. Curtis Langlotz in the Stanford AIMI Center. My PhD work focused on natural language processing and its applications in medicine.

Before that, I obtained an M.S. degree from the Computer Science Department at Stanford University, and a bachelor’s degree from the Department of Electronic Engineering at Tsinghua University, China.

research interests

I care about NLP systems and their impact in real-world applications. My work has covered the following areas:

  • retrieval and retrieval-augmented generation;
  • information extraction;
  • summarization;
  • multimodal learning;
  • syntactic analysis and open-source NLP toolkits (I am a co-author of the widely used Stanza NLP library).


You can reach me now at {first-name} ~at~ cs.stanford.edu. You can also find my various social accounts at the bottom of this page.

selected publications

For a complete list, see the publications page or my Google Scholar page.

(*=equal contribution)

  1. ACL Findings
    RobustQA: Benchmarking the Robustness of Domain Adaptation for Open-domain Question Answering
    Rujun Han, Peng Qi, Yuhao Zhang, and 6 more authors
    In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  2. ACL Findings
    Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
    Jifan Chen, Yuhao Zhang, Lan Liu, and 5 more authors
    In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  3. EMNLP Findings
    Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks
    Kaiser Sun, Peng Qi, Yuhao Zhang, and 3 more authors
    In Findings of EMNLP, 2023
  4. MLHC
    Contrastive Learning of Medical Visual Representations from Paired Images and Text
    Yuhao Zhang, Hang Jiang, Yasuhide Miura, and 2 more authors
    In Proceedings of the 7th Machine Learning for Healthcare Conference, 2022
  5. Thesis
    Deep Understanding and Generation of Medical Text and Beyond
    Yuhao Zhang
    Stanford University PhD Thesis, 2021
  6. JAMIA
    Biomedical and Clinical English Model Packages for the Stanza Python NLP Library
    Yuhao Zhang, Yuhui Zhang, Peng Qi, and 2 more authors
    Journal of the American Medical Informatics Association, 2021
  7. ACL
    Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
    Peng Qi*, Yuhao Zhang*, Yuhui Zhang, and 2 more authors
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL): System Demonstrations, 2020
  8. CoNLL
    Universal Dependency Parsing from Scratch
    Peng Qi*, Timothy Dozat*, Yuhao Zhang*, and 1 more author
    In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2018