Hello! 👋
William P. Hogan
Ph.D. candidate studying Natural Language Processing at UC San Diego

About


I'm a Ph.D. candidate studying NLP at UCSD. My advisor is Jingbo Shang. My research focuses on extracting structured knowledge from unstructured data with minimal supervision. Recently, I built an LLM-based pipeline that automatically extracts relationships linking bacteria to diseases from biomedical texts. I'm passionate about using my skills to improve people's lives everywhere.

When I'm not writing papers, you can find me 🏄, riding my 🚲, getting into a 📘, or trying something new like making soap or gin from scratch.

Publications


Experience


July–December 2024

AI Research Intern

GE HealthCare, San Ramon, CA

  • Developed an AI-powered slideshow generator, leveraging a custom agentic LLM pipeline to automate presentation creation
  • Built an internal LLM coding assistant using Retrieval-Augmented Generation (RAG), improving developer efficiency
  • Optimized documentation workflows with an LLM-driven pipeline, significantly reducing manual processing time

January–June 2024

Graduate Student Researcher

Joan & Irwin Jacobs Center for Health Innovation, UC San Diego

  • Led research on patient safety and infection prevention using AI-driven methods to analyze Electronic Health Records (EHR)
  • Developed LLM-based solutions to extract actionable insights from complex EHR data, uncovering previously unknown safety issues

Summers 2022 & 2023

Research Data Scientist Graduate Intern

Dell Technologies, Round Rock, TX

  • Built a state-of-the-art text-to-SQL language model to access data within a database
  • Worked on an interdisciplinary team to prototype methods aimed at making dense data more accessible
  • Applied state-of-the-art NLP and computer vision methods to identify fraudulent purchase orders
  • Designed a novel algorithm that prevented $2.1M in company losses

2019–2022

Graduate Student Researcher

Center for Microbiome Innovation, UC San Diego

  • Researcher within the UCSD-IBM Artificial Intelligence for Healthy Living program using deep learning to extract microbiome knowledge from biomedical literature
  • Co-created and maintained a web-based annotation tool to test NLP models
  • Developed novel models for relation extraction, acronym resolution, and bacteria normalization

2015–2019

Co-owner, Full-stack Developer

Design Action Collective, Oakland, CA

  • Lead developer on over 30 websites and apps while also co-managing a web development company
  • Improved department-wide workflow to create cleaner, more efficient code
  • Improved internal standards for code commenting, git usage, pair programming, and website accessibility

2011–2015

Digital Media Specialist

DataCenter, Oakland, CA

  • Collaboratively designed and developed infographics of research findings
  • Developed an interactive, educational online game about how to conduct a community-led research project

Education


2008

Bachelor of Science in Electrical Engineering

University of California, Santa Cruz

2021

Master of Science in Computer Science

University of California, San Diego

Specialization in machine learning and NLP

March 2025 (expected)

Doctor of Philosophy in Computer Science

University of California, San Diego

Specialization in NLP and weakly supervised methods

Projects


March 2021

Classification of Wine Varieties in a High Dimensional Setting

Designed a model to predict wine varieties from a combination of textual features. Experimented with various feature combinations and three loss functions: negative log-likelihood, KL-divergence, and a custom LASSO loss function.

December 2020

Generating Position-specific Scoring Matrices for Protein Secondary Structure Prediction

Designed and built a transformer to generate position-specific scoring matrices for protein sequences.

December 2020

Expanding News Timeline Summarization

Improved on existing state-of-the-art date-wise and clustering approaches to news timeline summarization (TLS), introduced more representative evaluation metrics, and expanded the datasets available for training news TLS models.

June 2020

8-state Protein Secondary Structure Prediction

Built a convolutional, residual, and recurrent neural network (CRRNN) that uses protein sequences and corresponding position-specific scoring matrices to predict protein secondary structures.

March 2020

Deep Photo Style Transfer

Reproduced results from recent works in image style transfer using convolutional neural networks.


Teaching, Service, and Volunteering


Teaching Assistant, Winter 2024

Teaching Assistant for Introduction to Data Mining at UCSD

Teaching Assistant, Spring 2023

Lead Teaching Assistant for the graduate-level course Advanced Data-driven Text Mining at UCSD

Program Committee Member, EMNLP 2022

Program committee member for the Unsupervised and Weakly-Supervised Methods in NLP workshop

Program Committee Member, BioNLP 2022–Present

Program committee member for the BioNLP workshop, co-located with ACL

GradPal Mentor, UCSD 2021–Present

Mentor for incoming Computer Science and Engineering students

Developer Mentor, Design Action Collective 2016–2019

Mentored junior web developers. Conducted code reviews and developed curricula to address gaps in understanding.