Yuhao Zhang

张宇浩


Hey there, welcome!

I am currently a scientist on the founding team of Samaya AI. We are on a journey to improve knowledge discovery by harnessing the power of large language models.

Before Samaya, I was a scientist at Amazon AWS AI, where I worked on core AWS services related to enterprise search. I obtained my PhD from Stanford University, where I was jointly advised by Prof. Chris Manning in the Stanford NLP Group and Prof. Curtis Langlotz in the Stanford AIMI Center. My PhD work focused on natural language processing and its applications in medicine.

Before that, I obtained an M.S. degree from the Computer Science Department at Stanford University, and a bachelor’s degree from the Department of Electronic Engineering at Tsinghua University, China.

research interest

I care about NLP systems and their impact on real-world applications. My work has covered the following areas:

  • retrieval and retrieval-augmented generation;
  • information extraction;
  • summarization;
  • multimodal learning;
  • syntactic analysis and open-source NLP toolkits (I am a co-author of the widely used Stanza NLP library).

contact

You can reach me now at {first-name} ~at~ cs.stanford.edu. You can also find my various social accounts at the bottom of this page.

selected publications

For a complete list, see the publications page or my Google Scholar page.

(*=equal contribution)

  1. arXiv
    Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
    Orion Weller, Benjamin Van Durme, Dawn Lawrie, and 3 more authors
    arXiv preprint arXiv:2409.11136, 2024
  2. EMNLP
    Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models
    Zhengxuan Wu, Yuhao Zhang, Peng Qi, and 6 more authors
    In EMNLP, 2024
  3. EMNLP
    RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering
    Rujun Han, Yuhao Zhang, Peng Qi, and 6 more authors
    In EMNLP, 2024
  4. ACL Findings
    RobustQA: Benchmarking the Robustness of Domain Adaptation for Open-domain Question Answering
    Rujun Han, Peng Qi, Yuhao Zhang, and 6 more authors
    In Findings of the Annual Meeting of the Association for Computational Linguistics (ACL), 2023
  5. MLHC
    Contrastive Learning of Medical Visual Representations from Paired Images and Text
    Yuhao Zhang, Hang Jiang, Yasuhide Miura, and 2 more authors
    In Proceedings of the 7th Machine Learning for Healthcare Conference, 2022
  6. Thesis
    Deep Understanding and Generation of Medical Text and Beyond
    Yuhao Zhang
    Stanford University PhD Thesis, 2021
  7. JAMIA
    Biomedical and Clinical English Model Packages for the Stanza Python NLP Library
    Yuhao Zhang, Yuhui Zhang, Peng Qi, and 2 more authors
    Journal of the American Medical Informatics Association, 2021
  8. ACL
    Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
    Peng Qi*, Yuhao Zhang*, Yuhui Zhang, and 2 more authors
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL): System Demonstrations, 2020
  9. EMNLP-CoNLL
    Universal Dependency Parsing from Scratch
    Peng Qi*, Timothy Dozat*, Yuhao Zhang*, and 1 more author
    In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2018