Hello! 👋
William Hogan
Ph.D. candidate studying Natural Language Processing at UC San Diego

About


I'm a Ph.D. candidate studying NLP at UCSD. My advisor is Jingbo Shang. My research interests lie in extracting structured knowledge from unstructured data using minimal supervision. Recently, I constructed an NLP pipeline to automatically extract relationships linking bacteria and diseases from biomedical texts. I'm passionate about using my skills to improve people's lives everywhere.

When I'm not writing papers, you can find me 🏄, riding my 🚲, getting into a 📘, or trying something new like making soap or gin from scratch.

Publications


Experience


January 2024–Present

Graduate Student Researcher

Joan & Irwin Jacobs Center for Health Innovation, UC San Diego

  • Lead researcher on patient safety and infection prevention project
  • Develop LLMs and automated methods to process EHR text to uncover insights on patient safety issues that were previously unknown or difficult to understand

Summer 2023

Research Data Scientist Graduate Intern

Dell Technologies, Round Rock, TX

  • Built a state-of-the-art text-to-SQL language model to access data within a database
  • Worked on interdisciplinary team to prototype methods aimed at making dense data more accessible

Summer 2022

Research Data Scientist Graduate Intern

Dell Technologies, Round Rock, TX

  • Applied state-of-the-art NLP and computer vision methods to identify fraudulent purchase orders
  • Designed novel algorithm that prevented $2.1M in company losses

2019–2022

Graduate Student Researcher

Center for Microbiome Innovation, UC San Diego

  • Researcher within the UCSD-IBM Artificial Intelligence for Healthy Living program using deep learning to extract microbiome knowledge from biomedical literature
  • Co-created and maintained web-based annotator tool to test NLP models
  • Developed novel models for relation extraction, acronym resolution, and bacteria normalization

2015–2019

Co-owner, Full-stack Developer

Design Action Collective, Oakland, CA

  • Lead developer on over 30 websites and apps while also co-managing a web development company
  • Improved department-wide workflow to create cleaner, more efficient code
  • Improved internal standards for code commenting, git usage, pair programming, and website accessibility

2011–2015

Digital Media Specialist

DataCenter, Oakland, CA

  • Collaboratively designed and developed infographics of research findings
  • Developed an interactive and educational online game about how to conduct a community-led research project

Education


2008

Bachelor of Science in Electrical Engineering

University of California, Santa Cruz

2021

Master of Science in Computer Science

University of California, San Diego

Specialization in machine learning and NLP

Dec. 2024 (expected)

Doctor of Philosophy in Computer Science

University of California, San Diego

Specialization in NLP and weakly supervised methods

Projects


March, 2021

Classification of Wine Varieties in a High Dimensional Setting

Designed a model to predict wine varieties from a combination of textual features. Experimented with various feature combinations and three loss functions: negative log-likelihood, KL-divergence, and a custom LASSO loss function.

December, 2020

Generating Position-specific Scoring Matrices for Protein Secondary Structure Prediction

Designed and built a transformer to generate position-specific scoring matrices for protein sequences.

December, 2020

Expanding News Timeline Summarization

Improved on existing state-of-the-art date-wise and clustering news timeline summarization (TLS) approaches, introduced more representative evaluation metrics, and expanded the available datasets to train news TLS models.

June, 2020

8-state Protein Secondary Structure Prediction

Built a convolutional, residual, and recurrent neural network (CRRNN) that uses protein sequences and corresponding position-specific scoring matrices to predict protein secondary structures.

March, 2020

Deep Photo Style Transfer

Reproduced results from recent works in image style transfer using convolutional neural networks.


Teaching, Service, and Volunteering


Teaching Assistant, Winter 2024

Teaching Assistant for Introduction to Data Mining at UCSD.

Teaching Assistant, Spring 2023

Lead Teaching Assistant for the graduate-level course Advanced Data-driven Text Mining at UCSD.

Program Committee Member, EMNLP 2022

Participated as a program committee member for the Unsupervised and Weakly-Supervised Methods in NLP workshop.

Program Committee Member, BioNLP 2022–2023

Participated as a program committee member for the 21st BioNLP workshop, co-located with ACL, 2022.

GradPal Mentor, UCSD 2021–2023

Welcomed incoming students to campus and the Computer Science and Engineering program.

Developer Mentor, Design Action Collective 2016–2019

Mentored junior web developers on coding best practices. Conducted code reviews and developed curricula to address gaps in understanding.