Leveraging LLMs such as ChatGPT to discover hidden gems in Pharma data

The pharmaceutical industry possesses rich sources of unstructured data, including market research interview transcripts, inbound patient calls, sales rep notes, research articles, emails, and public web posts from physicians and patients. Exploring this data and extracting key insights from it is challenging and tedious. Topic modeling techniques such as Latent Dirichlet Allocation (LDA) have been tried in the past, but because the meaning of language depends heavily on context, they have struggled to surface meaningful insights.

Recent advances in Large Language Models (LLMs) such as ChatGPT and Bard have significantly improved machines’ ability to understand context. With LLMs as foundational building blocks, it is now possible to build tools that effectively summarize unstructured text.

As an example, we developed the SetuChat summarization engine and collected ~500 tweets from physicians discussing immunotherapy. Using the engine, we extracted the key discussion topics from the tweets and organized them in a tree structure that supports interactive exploration. Users can drill into the tree to find the essential points and view the original raw comments.
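To make the idea of a topic tree concrete, here is a minimal Python sketch of one way such a structure could be represented. It is an illustration only and not SetuChat's actual implementation: the TopicNode class, its methods, and the build_example_tree helper are hypothetical names, and the topic labels and comments are placeholders rather than real tweet content. Each node holds a topic label, the raw comments attached to it, and its subtopics, so a reader can move from a summary down to the original text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TopicNode:
    """Hypothetical topic-tree node: a label, raw comments, and subtopics."""
    label: str
    comments: List[str] = field(default_factory=list)
    children: List["TopicNode"] = field(default_factory=list)

    def add_child(self, child: "TopicNode") -> "TopicNode":
        """Attach a subtopic and return it so calls can be chained."""
        self.children.append(child)
        return child

    def all_comments(self) -> List[str]:
        """Collect raw comments from this node and every descendant."""
        collected = list(self.comments)
        for child in self.children:
            collected.extend(child.all_comments())
        return collected

    def render(self, depth: int = 0) -> str:
        """Return an indented outline with the comment count under each topic."""
        line = f"{'  ' * depth}{self.label} ({len(self.all_comments())})"
        return "\n".join([line] + [c.render(depth + 1) for c in self.children])


def build_example_tree() -> TopicNode:
    """Build a tiny tree from placeholder data (not real tweets or topics)."""
    root = TopicNode("All topics")
    root.add_child(TopicNode("Topic A", comments=["placeholder comment 1"]))
    topic_b = root.add_child(TopicNode("Topic B", comments=["placeholder comment 2"]))
    topic_b.add_child(
        TopicNode("Subtopic B1", comments=["placeholder comment 3", "placeholder comment 4"])
    )
    return root


if __name__ == "__main__":
    tree = build_example_tree()
    print(tree.render())                  # outline of topics with comment counts
    print(tree.children[1].all_comments())  # raw comments under "Topic B"
```

In this sketch, the topic labels would come from an LLM summarization step that is not shown here; the tree only stores the result and keeps every summary linked to the raw comments it was derived from.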

For a more detailed demonstration, please watch the video below showcasing how SetuChat effectively summarizes the tweets and facilitates interactive exploration of the extracted insights.