Practical Workshop: LLMs in Clinical Applications

📋 Workshop Overview

This hands-on session will guide you through exploring Large Language Models (LLMs) in clinical contexts. We’ll walk through the entire process: from installing a local model to implementing real clinical applications.

  1. Setting up a local Large Language Model (LLM)
  2. Testing clinical prompts with the model
  3. Connecting to the model via API in R
  4. Applying LLMs to clinical data extraction tasks

📥 Part 1: Setting up Local LLM (30 minutes)

Why use local models? Local models offer greater privacy for sensitive patient data, work offline, and have no recurring costs. They’re ideal for clinical applications where confidentiality is critical.

Setting up LM Studio

  1. Download and Install LM Studio

    • Visit https://lmstudio.ai/ and download the appropriate version
    • Follow installation instructions for your operating system

    LM Studio is a desktop application that simplifies running language models locally. It works on Windows, macOS, and Linux, and provides an intuitive graphical interface for interacting with models.

  2. Launch LM Studio and Download a Model

    • Recommended models:
      • Qwen3 0.6B
      • Llama 3 8B Instruct
      • Gemma 7B Instruct
    • These models balance capability and reasonable hardware requirements

    The model size (0.6B, 7B, 8B) refers to the number of parameters in billions. Larger models are generally more capable but require more hardware resources. For medical contexts, models from the Mistral and Llama families tend to perform well even in more compact versions.

  3. Start the Local Server

    • In LM Studio, select your downloaded model
    • Click “Local Server” in the sidebar
    • Click “Start Server”
    • Note the URL (typically http://localhost:1234)

    The local server exposes an OpenAI-compatible API, allowing us to interact with the model programmatically. This is a key feature we’ll leverage in the R portion of the workshop.

  4. Test the model

    • In the Chat interface, ask: "What is hepatitis?"
    • Verify you get a reasonable medical response

    This step verifies that the model is functioning correctly and has basic medical knowledge. Recent models should provide accurate information about types of hepatitis, modes of transmission, and clinical manifestations.

🗣️ Part 2: Testing Chat Mode (40 minutes)

Goal of this section: This part of the workshop is dedicated to understanding the different ways of interacting with an LLM through clinical prompts. We’ll explore how prompt formulation affects the quality and usefulness of responses.

Basic Clinical Prompts (5 minutes)

Use the LM Studio chat interface to test this prompt:

A patient presents with elevated AST and ALT. List possible causes.

This is an example of a basic clinical prompt that asks the model to list possible causes of a common laboratory finding. Evaluate the response based on:

  • Completeness of the list (does it include common causes like viral hepatitis, steatosis, medications?)
  • Logical organization (are causes grouped by categories?)
  • Clinical accuracy (are the listed causes actually associated with the presentation?)

Multi-turn Clinical Conversation (10 minutes)

Create a dialogue:

  1. Initial prompt:
A 45-year-old man presents with chest pain. What initial questions would you ask?
  2. Follow up with:
The pain worsens with exertion, improves at rest.

This is a simulation of a clinical conversation that tests the model’s ability to:

  1. Generate clinically relevant questions for a common symptom
  2. Use additional information to refine diagnostic reasoning

In clinical practice, LLMs could assist in structured history taking or suggest relevant questions based on initial symptoms.

Chain-of-Thought Reasoning (10 minutes)

Test this prompt:

A patient has jaundice, right upper quadrant pain, and fever. Think step-by-step: what is the most likely diagnosis?

The “Think step-by-step” prompt triggers chain-of-thought reasoning, a technique that encourages the model to show its reasoning process. It’s particularly useful in medicine because:

  • It allows you to see how the model arrives at the diagnosis
  • It highlights potential logical errors or gaps
  • It simulates human clinical reasoning

Observe how the model:

  1. Analyzes individual symptoms
  2. Identifies syndromic patterns
  3. Considers differential diagnoses
  4. Reaches a conclusion based on highest probability

Role Play Specialist Mode (10 minutes)

You are an infectious disease specialist. A 35-year-old man has persistent fever despite antibiotic therapy. Provide differential diagnoses considering his travel history to Africa.

Assigning a specialist role to the model (“You are an infectious disease specialist”) is an advanced prompting technique that:

  • Focuses the response on a specific domain
  • Activates specialist knowledge
  • Improves the quality and depth of responses

In this example, the infectious disease specialist role should lead to consideration of tropical infections (malaria, typhoid fever, dengue, etc.) and conditions related to travel in Africa.

Critique Mode (5 minutes)

  1. First prompt:
Summarize the key findings of this lab report: WBC 12,000/μL with left shift, hemoglobin 10.2 g/dL, platelets 450,000/μL, CRP 75 mg/L.
  2. Then ask:
Now review your own summary. Identify possible mistakes or missing information.

This critique mode leverages the self-evaluation capability of recent LLMs. The two-phase process allows for:

  1. Getting an initial interpretation of laboratory data
  2. Prompting the model to critically reexamine its own analysis

It’s a particularly useful technique in clinical contexts where the initial interpretation might overlook important correlations or clinical significance of values.
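The second prompt only works because the chat API resends the whole conversation: the model’s first summary goes back as an assistant turn before the critique request. A minimal Python sketch of the two payloads (the helper name and placeholder strings are illustrative; the endpoint and message format are LM Studio’s OpenAI-compatible defaults):

```python
# Sketch of the two-phase critique flow as chat API payloads.
# Sending each dict is a plain JSON POST to
# http://localhost:1234/v1/chat/completions.

def critique_payloads(first_prompt, first_answer, critique_prompt):
    """Return (initial, critique) payloads; the critique carries the history."""
    initial = {
        "model": "local-model",
        "messages": [{"role": "user", "content": first_prompt}],
    }
    critique = {
        "model": "local-model",
        "messages": [
            {"role": "user", "content": first_prompt},
            {"role": "assistant", "content": first_answer},  # model's own answer
            {"role": "user", "content": critique_prompt},    # ask for self-review
        ],
    }
    return initial, critique

initial, critique = critique_payloads(
    "Summarize the key findings of this lab report: ...",
    "(first summary returned by the model)",
    "Now review your own summary. Identify possible mistakes or missing information.",
)
print(len(critique["messages"]))  # the critique turn carries the full history
```

Because the server is stateless, forgetting to resend the assistant turn silently turns the critique into a one-shot question about nothing.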

Using Complex Clinical Documents in Chat Mode

For this exercise, try using the clinical discharge summary PDF in LM Studio’s chat interface:

  1. Open LM Studio and select the Chat tab with your model
  2. Click on the document upload feature (📎 icon or similar)
  3. Navigate to and select the data/complex_clinical_discharge.pdf file
  4. Wait for the document to be processed by the model

Try the following prompts with the uploaded document:

Create a concise one-paragraph summary of this patient's hospital course.
What medications should be monitored most carefully given this patient's conditions?
Identify three potential medication interactions in the discharge medication list.
Create a follow-up checklist for the primary care physician.

This exercise demonstrates how LLMs can analyze complex clinical documents and extract relevant information through an intuitive chat interface. The PDF upload feature available in most modern LLM interfaces makes it easy to work with clinical documents in their native format. Note how different the experience is compared to programmatic extraction - each approach has different advantages.

Ready-to-use Clinical Prompts Set

  1. “Patient with fever, cough, and dyspnea: suggest differential diagnoses.”
  2. “Elevated D-dimer and chest pain: what would you recommend?”
  3. “Interpret these liver function test results: AST 80, ALT 90, GGT 200.”
  4. “List common causes of microcytic anemia.”
  5. “Summarize the main treatments for community-acquired pneumonia.”

These additional prompts can be used to further explore the model’s capabilities in various clinical contexts. They are examples of typical queries that might be useful in daily practice.
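These prompts can also be run as a batch against the local server instead of pasted one by one. A sketch of building the request payloads in Python (sending each one is the same POST used elsewhere in this workshop; the system prompt and temperature value are illustrative choices):

```python
prompts = [
    "Patient with fever, cough, and dyspnea: suggest differential diagnoses.",
    "Elevated D-dimer and chest pain: what would you recommend?",
    "Interpret these liver function test results: AST 80, ALT 90, GGT 200.",
    "List common causes of microcytic anemia.",
    "Summarize the main treatments for community-acquired pneumonia.",
]

def build_payload(prompt, model="local-model", temperature=0.2):
    # A low temperature keeps factual clinical answers more deterministic.
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": "You are a helpful medical assistant."},
            {"role": "user", "content": prompt},
        ],
    }

payloads = [build_payload(p) for p in prompts]
print(len(payloads))  # one request per prompt
```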

Discussion Points

  • Note the model’s reasoning capabilities
  • Identify any hallucinations or inconsistencies
  • Observe how specifying roles affects responses
  • Evaluate the model’s ability to self-critique

Key Concept: “Hallucinations”

Hallucinations are information generated by the model that seems plausible but is false or made up. It’s crucial to identify them, especially in clinical contexts. Common examples include:

  • Citation of non-existent clinical studies
  • Reference to inaccurate guidelines
  • Creation of false clinical correlations

Hallucinations represent one of the main obstacles to the safe implementation of LLMs in medicine.

🧪 Part 3: Using the Local API in R (30 minutes)

The importance of programmatic integration: While the chat interface is useful for one-off interactions, the API allows integration of LLMs into existing workflows and clinical applications. In this section, we’ll learn how to connect R to our local model.

🖥️ Terminal-Based LLM API Call

# 1. Install dependencies
# macOS (Homebrew): installs curl for HTTP requests
brew install curl

# Ubuntu/Debian: same installation via apt
sudo apt-get update && sudo apt-get install -y curl

# Windows: install Git for Windows (includes Git Bash) from
# https://git-scm.com/downloads/win
# or install from a terminal / PowerShell:
winget install --id Git.Git -e --source winget
winget install --id cURL.cURL -e   # if curl is not already available

# 2. Send request to local model
#    (the optional "tools" field describes a get_definition function the model
#    may choose to call instead of answering in plain text)
curl -X POST "http://localhost:1234/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"local-model\",\"messages\":[{\"role\":\"user\",\"content\":\"What is anemia?\"}],\"tools\":[{\"type\":\"function\",\"function\":{\"name\":\"get_definition\",\"description\":\"Retrieve the definition of a term from a medical dictionary\",\"parameters\":{\"type\":\"object\",\"properties\":{\"term\":{\"type\":\"string\",\"description\":\"Medical term to define\"}},\"required\":[\"term\"],\"additionalProperties\":false}}}]}"
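The "tools" array in the request above advertises a get_definition function the model may call instead of answering directly; when it does, the reply carries a tool_calls entry. A hedged sketch of inspecting such a response in Python (the response dict is a hand-written example of the OpenAI-compatible shape, not captured output; small local models do not always emit tool calls reliably):

```python
import json

# Illustrative response in the OpenAI-compatible shape: the model chose to
# call get_definition rather than reply with text.
response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "type": "function",
                "function": {
                    "name": "get_definition",
                    "arguments": json.dumps({"term": "anemia"}),
                },
            }],
        }
    }]
}

msg = response["choices"][0]["message"]
if msg.get("tool_calls"):                    # model requested a function call
    call = msg["tool_calls"][0]["function"]
    args = json.loads(call["arguments"])     # arguments arrive as a JSON string
    print(call["name"], args["term"])        # → get_definition anemia
else:
    print(msg["content"])                    # ordinary text reply
```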

🐍 Python-Based LLM API Call

# 1. Install dependencies
# macOS / Linux:
pip3 install requests

# Windows:
python -m pip install requests

# 2. Run the following Python script
#!/usr/bin/env python3
import requests  # HTTP client

# Define API endpoint and request payload
url = "http://localhost:1234/v1/chat/completions"     # local model endpoint
payload = {
    "model": "local-model",                           # identifier of the local model
    "messages": [
        {"role": "user", "content": "What is anemia?"}  # user’s query
    ],
    "temperature": 0.7,                               # controls randomness
    "max_tokens": 1500                                # caps response length
}

# Send POST request and ensure success
resp = requests.post(url, json=payload)               # send JSON body
resp.raise_for_status()                               # error if non-200 status

# Parse JSON and print only the model’s reply
data = resp.json()                                    # parse response
print(data["choices"][0]["message"]["content"])       # extract and display content

📊 RStudio-Based LLM API Call

The httr and jsonlite packages are essential for communicating with the API. httr handles HTTP POST requests, while jsonlite converts R data structures to JSON and vice versa.

# Install required packages
install.packages(c("httr", "jsonlite", "tibble", "dplyr", "stringr"))

library(httr)      # For HTTP requests
library(jsonlite)  # For JSON handling

Creating a Helper Function to Call the LM Studio API

# Create a function to communicate with the LM Studio API
call_lm_studio <- function(prompt,
                           system_prompt = NULL,
                           api_url = "http://localhost:1234/v1/chat/completions",
                           model = "local-model",
                           temperature = 0.7,
                           max_tokens = 1000,
                           raw_response = FALSE,
                           remove_thinking = TRUE) {
  
  messages <- list()
  if (!is.null(system_prompt))
    messages <- list(list(role = "system", content = system_prompt))
  messages <- append(messages, list(list(role = "user", content = prompt)))
  
  res <- httr::POST(
    api_url,
    body = jsonlite::toJSON(
      list(model = model, messages = messages,
           temperature = temperature, max_tokens = max_tokens),
      auto_unbox = TRUE
    ),
    httr::add_headers("Content-Type" = "application/json"),
    encode = "json"
  )
  if (httr::status_code(res) != 200)
    return(paste0("Error: ", httr::status_code(res)))
  
  txt <- httr::content(res, "text", encoding = "UTF-8")
  if (raw_response) return(txt)
  
  # keep the nested list structure (avoid simplification to vectors)
  parsed <- jsonlite::fromJSON(txt, simplifyVector = FALSE)
  
  # extract the content; %||% is in base R only from 4.4, so define it here
  `%||%` <- function(a, b) if (is.null(a)) b else a
  first <- parsed$choices[[1]]
  content <- first$message$content %||% first$text
  if (is.null(content))
    return("Error: Unable to extract content")
  
  if (remove_thinking)
    content <- trimws(gsub("(?s)<think>.*?</think>", "", content, perl = TRUE))
  
  content
}

Function Explanation:

This concise wrapper function communicates with the local model’s API and processes its response. Key components:

  1. Message Structure: Formats the system prompt (optional) and user prompt in the OpenAI API format
  2. API Parameters:
    • temperature: Controls response randomness (higher values increase creativity)
    • max_tokens: Sets the maximum response length
  3. Response Handling:
    • Maintains nested JSON structure with simplifyVector = FALSE
    • Uses the null-coalescing operator %||% (in base R only from version 4.4, otherwise provided by rlang) to handle different response formats
  4. Post-Processing:
    • Removes chain-of-thought sections (text between <think>...</think> tags)
    • The (?s) modifier lets . match newlines (perl = TRUE enables this PCRE syntax), so multi-line thinking blocks are removed correctly
  5. Debug Options:
    • raw_response = TRUE: Returns the raw JSON for API debugging
    • remove_thinking = FALSE: Preserves reasoning process in the output

The system prompt can define the model’s behavior (e.g., “You are a medical assistant”) to guide responses in specific domains.
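The thinking-removal step is easy to verify outside R, since the same pattern works in any PCRE-style regex engine. A small Python check (the sample text is made up):

```python
import re

raw = ("<think>\nLow MCV with high RDW points toward iron deficiency...\n</think>\n"
       "Findings are consistent with microcytic anemia, likely iron deficiency.")

# (?s) makes '.' match newlines, so a multi-line <think> block is removed in
# one pass -- the same behavior perl = TRUE gives gsub() in the R helper.
clean = re.sub(r"(?s)<think>.*?</think>", "", raw).strip()
print(clean)  # → Findings are consistent with microcytic anemia, likely iron deficiency.
```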

Testing the API Connection

# Test API connection with a simple clinical prompt
# Example 1: Show the raw response for debugging purposes
raw_result <- call_lm_studio(
  prompt = "What is anemia?",
  system_prompt = "You are a helpful medical assistant.",
  raw_response = TRUE
)

# Print raw response
cat("1. Raw API response (for debugging):\n")
cat(raw_result)
cat("\n\n---\n\n")

# Example 2: Show response with chain-of-thought thinking visible
thinking_result <- call_lm_studio(
  prompt = "What is anemia?",
  system_prompt = "You are a helpful medical assistant.",
  remove_thinking = FALSE
)

# Print content with thinking visible
cat("2. Response with chain-of-thought thinking visible:\n")
cat(thinking_result)
cat("\n\n---\n\n")

# Example 3: Show only the clean content (default behavior)
clean_result <- call_lm_studio(
  prompt = "Summarize these lab results: Hb 10.5, MCV 70, RDW high.",
  system_prompt = "You are a helpful medical assistant."
)

# Print only the clean content
cat("3. Clean content only (default):\n")
cat(clean_result)

This multi-part test demonstrates three different ways to interact with the LLM API:

  1. Raw Response Mode: Returns the complete JSON structure from the API
    • Useful for debugging API integration issues
    • Shows exactly what the server returns, including metadata
  2. Chain-of-Thought Mode: Shows the model’s internal reasoning process
    • Reveals how the model approaches the problem
    • Maintains the <think>...</think> sections that show reasoning steps
    • Helpful for evaluating clinical reasoning quality
  3. Clean Content Mode (Default): Returns only the final, polished answer
    • Removes all reasoning artifacts
    • Ideal for production applications
    • Professional presentation for end users

The lab results example demonstrates how the model interprets values consistent with microcytic anemia (low Hb, low MCV, high RDW) - a practical application in clinical laboratory interpretation.

Making Complex Medical Queries

# Example of a more complex query
clinical_query <- "
A 67-year-old female presents with the following lab values:
- Hemoglobin: 9.8 g/dL (ref: 12.0-15.5)
- MCV: 76 fL (ref: 80-100)
- Ferritin: 15 ng/mL (ref: 30-400)
- TIBC: 450 μg/dL (ref: 250-370)

Provide an interpretation of these results and suggest possible diagnoses.
"

interpretation <- call_lm_studio(
  prompt = clinical_query,
  system_prompt = "You are a hematology specialist. Provide concise and evidence-based interpretations."
)

# Display just the model's interpretation, without any raw API response details
cat(interpretation)

This more complex example shows how to structure a detailed clinical query and use a specialist system prompt. Important elements:

  1. Structured Format: Laboratory data is presented clearly with reference values
  2. Specific Request: Both interpretation and possible diagnoses are requested
  3. Specialist System Prompt: Tells the model to respond as a hematology specialist

The case presented is consistent with iron deficiency anemia (low hemoglobin, low MCV, low ferritin, elevated TIBC), which in an elderly woman might warrant investigation for occult bleeding.
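Before trusting such an interpretation, a simple programmatic cross-check of the stated values against their reference ranges can act as a guardrail: the model’s claims should agree with these flags. A sketch using the numbers from the query above:

```python
# Flag each lab value against its reference range; the flags can then be
# compared with what the model claimed (e.g., "low ferritin, high TIBC").
labs = {
    "Hemoglobin": (9.8, 12.0, 15.5),   # value, ref low, ref high
    "MCV":        (76.0, 80.0, 100.0),
    "Ferritin":   (15.0, 30.0, 400.0),
    "TIBC":       (450.0, 250.0, 370.0),
}

def flag(value, low, high):
    if value < low:
        return "LOW"
    if value > high:
        return "HIGH"
    return "normal"

flags = {name: flag(*v) for name, v in labs.items()}
for name, f in flags.items():
    print(f"{name}: {f}")
# LOW Hb + LOW MCV + LOW ferritin + HIGH TIBC is the iron-deficiency
# constellation described in the text.
```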

📊 Part 4: Analyzing Complex Clinical Documents (30 minutes)

From Theory to Practice: In this final section, we’ll put together what we’ve learned to create practical applications. First, we’ll analyze a complex discharge summary, and then extract structured data from clinical trial descriptions.

Complex Clinical Document Analysis

In this exercise, we’ll work with a detailed discharge summary containing multiple diagnoses, medications, and instructions - similar to what you might encounter in clinical practice.

# Load the complex discharge summary with relative path using here package
# Install if needed: install.packages("here")
library(here)

# Use relative path
discharge_summary <- readLines(here("data", "complex_clinical_discharge.Rmd"), warn = FALSE)
# Skip YAML header (first 14 lines)
discharge_summary <- discharge_summary[15:length(discharge_summary)]
discharge_summary <- paste(discharge_summary, collapse = "\n")

# Display first few lines
cat(substr(discharge_summary, 1, 500), "...\n")

Let’s use the LLM to analyze this document in various ways:

# 1. Extract key diagnoses and create a problem list
diagnoses_prompt <- "Extract the main discharge diagnoses from this clinical document and create a prioritized problem list. For each diagnosis, include its current status (improved, stable, worsened, etc.)."

diagnoses_result <- call_lm_studio(
  prompt = paste0(diagnoses_prompt, "\n\n", discharge_summary),
  system_prompt = "You are a clinical documentation specialist. Extract relevant clinical information accurately and concisely."
)

cat("EXTRACTED PROBLEM LIST:\n\n")
cat(diagnoses_result)
# 2. Create a medication reconciliation with dosages and purpose
medications_prompt <- "Create a complete medication reconciliation list from this document. Format as a table with columns for: Medication Name, Dosage, Frequency, and Purpose/Indication."

medications_result <- call_lm_studio(
  prompt = paste0(medications_prompt, "\n\n", discharge_summary),
  system_prompt = "You are a clinical pharmacist. Extract medication information with precision."
)

cat("\n\nMEDICATION RECONCILIATION:\n\n")
cat(medications_result)
# 3. Generate a concise clinical summary for referring physician
summary_prompt <- "Create a concise summary (maximum 250 words) of this patient case for a referring physician. Include only the most clinically relevant information about presentation, hospital course, interventions, and follow-up plan."

summary_result <- call_lm_studio(
  prompt = paste0(summary_prompt, "\n\n", discharge_summary),
  system_prompt = "You are an attending physician communicating with colleagues. Be precise, professional, and focus on clinically relevant details."
)

cat("\n\nCLINICAL SUMMARY FOR REFERRAL:\n\n")
cat(summary_result)

Clinical Document Analysis Relevance:

This exercise demonstrates how LLMs can help with several time-consuming clinical documentation tasks:

  1. Problem List Generation: Extracting and organizing medical problems from lengthy documents
  2. Medication Reconciliation: Creating structured medication lists with relevant details
  3. Clinical Summarization: Condensing detailed information for efficient communication

These capabilities directly address documentation burden in healthcare, potentially saving clinicians significant time while improving information organization.

Mini-project: Clinical Trial Data Extraction

Next, we’ll extract structured information from unstructured clinical trial descriptions.

library(tibble)
library(dplyr)
library(stringr)

# Create a dataframe of clinical trial descriptions
trials_df <- tibble::tibble(
  trial_id = c("Trial 1", "Trial 2", "Trial 3"),
  text = c(
    "In a recent clinical trial, 200 patients were randomized to receive Drug A or placebo. The Drug A group had a 35% reduction in risk of hospitalization, whereas the placebo group showed no significant change. Adverse events occurred in 10% of the Drug A group and 5% of the placebo group.",
    
    "A double-blind study involving 150 patients compared Drug B against placebo. Patients on Drug B showed a 25% decrease in hospital admissions, while adverse events were reported in 12% of patients treated with Drug B versus 7% in the placebo group.",
    
    "Another randomized trial evaluated Drug C in 180 patients. Drug C recipients experienced a 40% lower hospitalization rate compared to placebo. However, adverse effects were slightly higher at 15% versus 8% in the placebo cohort."
  )
)

Project Context: We’ve created a dataframe containing textual descriptions of three fictional clinical trials. Each study reports:

  • Number of participants
  • Intervention (drug vs placebo)
  • Effect on reducing hospitalizations
  • Adverse event rate

Our goal is to use the LLM to automatically extract this data in a structured format, simulating one of the most time-consuming activities in preparing meta-analyses.

# Function to extract table from a single trial text
extract_table_from_trial <- function(trial_text) {
  # Create a prompt that specifies exactly what to extract
  prompt <- paste0(
    "Extract the information from the following study and format it into a markdown table.\n\n",
    trial_text,
    "\n\n",
    "Create columns: Treatment, Sample Size, Effect on Hospitalization (%), Adverse Events (%). Output only the markdown table without any additional text."
  )
  
  # Call the LLM
  result <- call_lm_studio(
    prompt = prompt,
    system_prompt = "You are a research assistant. Extract data precisely from clinical trials and format it as requested."
  )
  
  return(result)
}

This function transforms the unstructured text of a study into a markdown table. Key elements:

  1. Specific Prompt: Indicates exactly what information to extract and the desired format
  2. Targeted System Prompt: Defines the model’s role as a research assistant
  3. Clear Structure: Specifically requests a markdown table with predefined columns

The ability of LLMs to understand context and structure output makes them particularly suited for this type of data extraction task.

# Initialize list to collect parsed tables
all_rows <- list()

# Iterate over each trial
for (i in seq_len(nrow(trials_df))) {
  cat(paste0("\nProcessing ", trials_df$trial_id[i], "...\n"))
  
  # Get the markdown table
  markdown_table <- extract_table_from_trial(trials_df$text[i])
  cat(markdown_table, "\n")
  
  # Parse markdown table into a dataframe
  # Match each row of the table
  rows <- str_match_all(markdown_table, "\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|")[[1]]
  
  if (nrow(rows) > 0) {
    # Skip header and separator rows (first two rows)
    data_rows <- rows[-(1:2), , drop = FALSE]
    
    if (nrow(data_rows) > 0) {
      # Create a temporary dataframe
      df_temp <- tibble::tibble(
        Trial = trials_df$trial_id[i],
        Treatment = data_rows[, 2],
        Sample_Size = data_rows[, 3],
        Effect_on_Hospitalization = data_rows[, 4],
        Adverse_Events = data_rows[, 5]
      )
      
      # Add to list
      all_rows[[length(all_rows) + 1]] <- df_temp
    }
  }
}

# Combine all tables
final_table <- bind_rows(all_rows)

# View combined final table
final_table

Processing and Analyzing Results:

The code above:

  1. Iterates through each study description
  2. Extracts a markdown table using the LLM
  3. Converts the markdown table to an R dataframe using regular expressions
  4. Combines all results into a final table

This process demonstrates how the LLM can serve as an “intelligent bridge” between unstructured and structured data, an application with enormous potential in clinical and research settings. The regex \\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\| captures the content of each cell in the markdown table.
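The same cell-capturing idea, sketched in Python to make the regex concrete (the table text is a hand-written example of the format the LLM is asked to produce):

```python
import re

markdown_table = """\
| Treatment | Sample Size | Effect on Hospitalization (%) | Adverse Events (%) |
|-----------|-------------|-------------------------------|--------------------|
| Drug A    | 200         | -35                           | 10                 |
| Placebo   | 200         | 0                             | 5                  |
"""

# Four lazy capture groups, one per cell, mirroring the R regex in the text.
row_re = re.compile(r"\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|")
rows = row_re.findall(markdown_table)

data_rows = rows[2:]          # skip the header and |---| separator rows
for treatment, n, effect, ae in data_rows:
    print(treatment, n, effect, ae)
```

The surrounding `\s*` swallows the cell padding, so the groups come back already trimmed, just as in the R version.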

Additional Ideas for Clinical Applications

  1. Label Lab Results

    • Extract abnormal results and tag them accordingly

    Application: Analyze complete laboratory reports to automatically highlight abnormal values, indicating whether they are elevated or reduced and their potential clinical significance.

  2. Summarize Patient History

    • Create concise summaries from longer patient notes

    Application: Condense lengthy medical records into structured summaries highlighting active problems, current therapies, allergies, and critical information.

  3. Generate Differential Diagnoses

    • Based on symptoms, vital signs, and lab values

    Application: Support clinical reasoning by suggesting differential diagnoses based on structured (symptoms, vital signs) and unstructured (history) data.

  4. Extract Medical Entities

    • Identify diagnoses, treatments, medications from text

    Application: Transform clinical notes into structured databases by extracting entities such as diagnoses, medications, dosages, procedures, and results.
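The entity-extraction idea becomes practical when the model is asked for JSON rather than prose, so the output can be parsed directly. A hedged sketch of the prompt-and-parse pattern (the reply string is hand-written to show the expected shape; real model output must be validated, since models sometimes wrap JSON in extra text):

```python
import json

prompt = (
    "Extract all medications with dose and frequency from the note below. "
    'Reply with JSON only, as a list of objects with keys '
    '"name", "dose", "frequency".\n\n'
    "Note: Started lisinopril 10 mg once daily and metformin 500 mg twice daily."
)

# Illustrative model reply (what a well-behaved model should return):
reply = ('[{"name": "lisinopril", "dose": "10 mg", "frequency": "once daily"},'
         ' {"name": "metformin", "dose": "500 mg", "frequency": "twice daily"}]')

entities = json.loads(reply)   # fails loudly if the model added extra prose
for e in entities:
    print(e["name"], e["dose"], e["frequency"])
```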

🔍 Discussion Points

  1. Model Limitations

    • Identify when the model makes errors or hallucinates information
    • Determine what types of queries produce the most accurate responses

    Critical Point: LLMs are not infallible and can generate plausible but incorrect answers. Recognizing the model’s limitations is essential for safe use. Basic factual queries (e.g., “What are the symptoms of pneumonia?”) tend to be more reliable than complex or rare scenarios.

  2. Clinical Safety

    • How would you validate the model’s outputs?
    • What safeguards would you implement in a clinical setting?

    Safety Strategies:

    • Implement a “human-in-the-loop” system with human review
    • Limit the scope of application to specific, well-defined tasks
    • Provide references and justifications for recommendations
    • Integrate automatic verification systems against codified guidelines
  3. Comparison with Commercial Models

    • How do these local models compare to commercial ones like GPT-4?
    • In what scenarios might a local model be preferable?

    Trade-offs: Commercial models (like GPT-4, Claude) are generally more capable, but have disadvantages in terms of costs, internet dependency, and privacy. Local models are preferable when:

    • Patient data should not leave the local system
    • An offline solution is needed
    • Solutions without variable query costs are required
    • Greater control over configuration and operation is desired
  4. Potential Applications

    • Brainstorm useful applications in your specific clinical setting
    • What workflows could benefit most from LLM assistance?

    Promising Applications:

    • Support for ICD coding for billing
    • Automatic generation of structured clinical notes
    • Assistance in searching for relevant literature
    • Pre-screening of imaging reports for prioritization
    • Support for interpretation of genetic results

📚 Resources

Additional Resources:

  • NeurIPS Clinical LLM Challenge - Contains datasets and examples of clinical prompts
  • MedPaLM 2 - Research on medicine-specific LLMs
  • Chatbots for Clinical Informatics - Paper on clinical use of LLMs