This hands-on session will guide you through exploring Large Language Models (LLMs) in clinical contexts. We’ll walk through the entire process: from installing a local model to implementing real clinical applications.
Why use local models? Local models offer greater privacy for sensitive patient data, work offline, and have no recurring costs. They’re ideal for clinical applications where confidentiality is critical.
Download and Install LM Studio
LM Studio is a desktop application that simplifies running language models locally. It works on Windows, macOS, and Linux, and provides an intuitive graphical interface for interacting with models.
Launch LM Studio and Download a Model
The model size (7B, 8B) refers to the number of parameters in billions. Larger models are generally more capable but require more hardware resources. For medical contexts, models from the Mistral and Llama families tend to perform well even in more compact versions.
Start the Local Server
The local server exposes an OpenAI-compatible API, allowing us to interact with the model programmatically. This is a key feature we’ll leverage in the R portion of the workshop.
Test the model
"What is hepatitis?"

This step verifies that the model is functioning correctly and has basic medical knowledge. Recent models should provide accurate information about types of hepatitis, modes of transmission, and clinical manifestations.
Goal of this section: This part of the workshop is dedicated to understanding the different ways of interacting with an LLM through clinical prompts. We’ll explore how prompt formulation affects the quality and usefulness of responses.
Use the LM Studio chat interface to test this prompt:
A patient presents with elevated AST and ALT. List possible causes.
This is an example of a basic clinical prompt that asks the model to list possible causes of a common laboratory finding. Evaluate the response based on:
- Completeness of the list (does it include common causes like viral hepatitis, steatosis, medications?)
- Logical organization (are causes grouped by categories?)
- Clinical accuracy (are the listed causes actually associated with the presentation?)
Create a dialogue:
A 45-year-old man presents with chest pain. What initial questions would you ask?
The pain worsens with exertion, improves at rest.
This is a simulation of a clinical conversation that tests the model’s ability to:
1. Generate clinically relevant questions for a common symptom
2. Use additional information to refine diagnostic reasoning
In clinical practice, LLMs could assist in structured history taking or suggest relevant questions based on initial symptoms.
Test this prompt:
A patient has jaundice, right upper quadrant pain, and fever. Think step-by-step: what is the most likely diagnosis?
The “Think step-by-step” prompt triggers chain-of-thought reasoning, a technique that encourages the model to show its reasoning process. It’s particularly useful in medicine because:
- It allows you to see how the model arrives at the diagnosis
- It highlights potential logical errors or gaps
- It simulates human clinical reasoning
Observe how the model:
1. Analyzes individual symptoms
2. Identifies syndromic patterns
3. Considers differential diagnoses
4. Reaches a conclusion based on highest probability
You are an infectious disease specialist. A 35-year-old man has persistent fever despite antibiotic therapy. Provide differential diagnoses considering his travel history to Africa.
Assigning a specialist role to the model (“You are an infectious disease specialist”) is an advanced prompting technique that:
- Focuses the response on a specific domain
- Activates specialist knowledge
- Improves the quality and depth of responses
In this example, the infectious disease specialist role should lead to consideration of tropical infections (malaria, typhoid fever, dengue, etc.) and conditions related to travel in Africa.
Summarize the key findings of this lab report: WBC 12,000/μL with left shift, hemoglobin 10.2 g/dL, platelets 450,000/μL, CRP 75 mg/L.
Now review your own summary. Identify possible mistakes or missing information.
This critique mode leverages the self-evaluation capability of recent LLMs. The two-phase process allows for:
1. Getting an initial interpretation of laboratory data
2. Prompting the model to critically reexamine its own analysis
It’s a particularly useful technique in clinical contexts where the initial interpretation might overlook important correlations or clinical significance of values.
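The two-phase flow maps directly onto the OpenAI-style message format used later in the workshop: the model's first answer is fed back as an `assistant` message, followed by the critique request. A minimal sketch, with no server call; `build_critique_messages` is a hypothetical helper name:

```python
# Sketch of the two-phase critique pattern as an OpenAI-style message history.
# No request is sent here; this only builds the payload for the second phase.

def build_critique_messages(report, first_pass_answer):
    """Return the message history for the self-critique (second) phase."""
    return [
        {"role": "user", "content": f"Summarize the key findings of this lab report: {report}"},
        {"role": "assistant", "content": first_pass_answer},  # the model's first-pass summary
        {"role": "user", "content": "Now review your own summary. "
                                    "Identify possible mistakes or missing information."},
    ]

messages = build_critique_messages(
    "WBC 12,000/uL with left shift, CRP 75 mg/L",
    "Leukocytosis with left shift and elevated CRP, consistent with acute infection.",
)
print([m["role"] for m in messages])  # ['user', 'assistant', 'user']
```

Sending this message list as the `messages` field of a chat-completions request reproduces the critique mode programmatically.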
For this exercise, try using the clinical discharge summary PDF in LM Studio’s chat interface:
Upload the data/complex_clinical_discharge.pdf file.

Try the following prompts with the uploaded document:
Create a concise one-paragraph summary of this patient's hospital course.
What medications should be monitored most carefully given this patient's conditions?
Identify three potential medication interactions in the discharge medication list.
Create a follow-up checklist for the primary care physician.
This exercise demonstrates how LLMs can analyze complex clinical documents and extract relevant information through an intuitive chat interface. The PDF upload feature available in most modern LLM interfaces makes it easy to work with clinical documents in their native format. Note how different the experience is compared to programmatic extraction - each approach has different advantages.
These additional prompts can be used to further explore the model’s capabilities in various clinical contexts. They are examples of typical queries that might be useful in daily practice.
Key Concept: “Hallucinations”
Hallucinations are information generated by the model that seems plausible but is false or made up. It’s crucial to identify them, especially in clinical contexts. Common examples include:
- Citation of non-existent clinical studies
- Reference to inaccurate guidelines
- Creation of false clinical correlations
Hallucinations represent one of the main obstacles to the safe implementation of LLMs in medicine.
The importance of programmatic integration: While the chat interface is useful for one-off interactions, the API allows integration of LLMs into existing workflows and clinical applications. In this section, we’ll learn how to connect R to our local model.
# 1. Install dependencies
# macOS (Homebrew): installs curl for HTTP requests
brew install curl
# Ubuntu/Debian: same installation via apt
sudo apt-get update && sudo apt-get install -y curl
# Windows: install Git Bash (which bundles curl) and open it
# https://git-scm.com/downloads/win
# alternatively, from terminal or PowerShell:
winget install --id Git.Git -e --source winget
winget install --id=cURL.cURL -e # (if necessary)
# 2. Send request to local model
# (this example also includes a "tools" definition to illustrate function calling)
curl -X POST "http://localhost:1234/v1/chat/completions" \
-H "Content-Type: application/json" \
-d "{\"model\":\"local-model\",\"messages\":[{\"role\":\"user\",\"content\":\"What is anemia?\"}],\"tools\":[{\"type\":\"function\",\"function\":{\"name\":\"get_definition\",\"description\":\"Retrieve the definition of a term from a medical dictionary\",\"parameters\":{\"type\":\"object\",\"properties\":{\"term\":{\"type\":\"string\",\"description\":\"Medical term to define\"}},\"required\":[\"term\"],\"additionalProperties\":false}}}]}"

# 1. Install dependencies
# macOS / Linux:
pip3 install requests
# Windows:
python -m pip install requests

#!/usr/bin/env python3
import requests # HTTP client
# Define API endpoint and request payload
url = "http://localhost:1234/v1/chat/completions" # local model endpoint
payload = {
"model": "local-model", # identifier of the local model
"messages": [
{"role": "user", "content": "What is anemia?"} # user’s query
],
"temperature": 0.7, # controls randomness
"max_tokens": 1500 # caps response length
}
# Send POST request and ensure success
resp = requests.post(url, json=payload) # send JSON body
resp.raise_for_status() # error if non-200 status
# Parse JSON and print only the model’s reply
data = resp.json() # parse response
print(data["choices"][0]["message"]["content"]) # extract and display content

The `httr` and `jsonlite` packages are essential for communicating with the API. `httr` handles HTTP POST requests, while `jsonlite` converts R data structures to JSON and vice versa.
# Create a function to communicate with the LM Studio API
call_lm_studio <- function(prompt,
                           system_prompt = NULL,
                           api_url = "http://localhost:1234/v1/chat/completions",
                           model = "local-model",
                           temperature = 0.7,
                           max_tokens = 1000,
                           raw_response = FALSE,
                           remove_thinking = TRUE) {
  # null-coalescing helper (base R provides %||% only from version 4.4.0)
  `%||%` <- function(a, b) if (is.null(a)) b else a
  messages <- list()
  if (!is.null(system_prompt))
    messages <- list(list(role = "system", content = system_prompt))
  messages <- append(messages, list(list(role = "user", content = prompt)))
  res <- httr::POST(
    api_url,
    body = jsonlite::toJSON(
      list(model = model, messages = messages,
           temperature = temperature, max_tokens = max_tokens),
      auto_unbox = TRUE
    ),
    httr::add_headers("Content-Type" = "application/json"),
    encode = "json"
  )
  if (httr::status_code(res) != 200)
    return(paste0("Error: ", httr::status_code(res)))
  txt <- httr::content(res, "text", encoding = "UTF-8")
  if (raw_response) return(txt)
  # avoid simplification: keep the nested structure
  parsed <- jsonlite::fromJSON(txt, simplifyVector = FALSE)
  # extract the content
  first <- parsed$choices[[1]]
  content <- first$message$content %||% first$text
  if (is.null(content))
    return("Error: Unable to extract content")
  if (remove_thinking)
    content <- trimws(gsub("(?s)<think>.*?</think>", "", content, perl = TRUE))
  content
}

Function Explanation:
This concise wrapper function communicates with the local model’s API and processes its response. Key components:
- Message Structure: Formats the system prompt (optional) and user prompt in the OpenAI API format
- API Parameters:
  - `temperature`: Controls response randomness (higher values increase creativity)
  - `max_tokens`: Sets the maximum response length
- Response Handling:
  - Maintains the nested JSON structure with `simplifyVector = FALSE`
  - Uses the null-coalescing operator `%||%` to handle different response formats
- Post-Processing:
  - Removes chain-of-thought sections (text between `<think>...</think>` tags)
  - The `(?s)` flag in the regex, enabled by `perl = TRUE`, lets `.` match across newlines so multi-line reasoning blocks are removed correctly
- Debug Options:
  - `raw_response = TRUE`: Returns the raw JSON for API debugging
  - `remove_thinking = FALSE`: Preserves the reasoning process in the output

The system prompt can define the model’s behavior (e.g., “You are a medical assistant”) to guide responses in specific domains.
# Test API connection with a simple clinical prompt
# Example 1: Show the raw response for debugging purposes
raw_result <- call_lm_studio(
prompt = "What is anemia?",
system_prompt = "You are a helpful medical assistant.",
raw_response = TRUE
)
# Print raw response
cat("1. Raw API response (for debugging):\n")
cat(raw_result)
cat("\n\n---\n\n")
# Example 2: Show response with chain-of-thought thinking visible
thinking_result <- call_lm_studio(
prompt = "What is anemia?",
system_prompt = "You are a helpful medical assistant.",
remove_thinking = FALSE
)
# Print content with thinking visible
cat("2. Response with chain-of-thought thinking visible:\n")
cat(thinking_result)
cat("\n\n---\n\n")
# Example 3: Show only the clean content (default behavior)
clean_result <- call_lm_studio(
prompt = "Summarize these lab results: Hb 10.5, MCV 70, RDW high.",
system_prompt = "You are a helpful medical assistant."
)
# Print only the clean content
cat("3. Clean content only (default):\n")
cat(clean_result)

This multi-part test demonstrates three different ways to interact with the LLM API:
- Raw Response Mode: Returns the complete JSON structure from the API
  - Useful for debugging API integration issues
  - Shows exactly what the server returns, including metadata
- Chain-of-Thought Mode: Shows the model’s internal reasoning process
  - Reveals how the model approaches the problem
  - Maintains the `<think>...</think>` sections that show reasoning steps
  - Helpful for evaluating clinical reasoning quality
- Clean Content Mode (Default): Returns only the final, polished answer
  - Removes all reasoning artifacts
  - Ideal for production applications
  - Professional presentation for end users
The lab results example demonstrates how the model interprets values consistent with microcytic anemia (low Hb, low MCV, high RDW) - a practical application in clinical laboratory interpretation.
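The morphological classification referenced here can also be expressed as a trivial rule, handy for sanity-checking the model's answer. A minimal sketch; `classify_anemia_by_mcv` is a hypothetical helper, and the reference limits are illustrative and lab-dependent:

```python
# Rule-based sketch of MCV-based anemia morphology classification.
# The 80-100 fL range is a common convention; actual laboratory ranges vary.

def classify_anemia_by_mcv(mcv_fl, low=80, high=100):
    """Classify anemia morphology from the MCV value in femtoliters."""
    if mcv_fl < low:
        return "microcytic"
    if mcv_fl > high:
        return "macrocytic"
    return "normocytic"

print(classify_anemia_by_mcv(70))  # the MCV from the example above -> microcytic
```

Comparing such deterministic checks against the model's free-text interpretation is one simple way to catch inconsistencies.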
# Example of a more complex query
clinical_query <- "
A 67-year-old female presents with the following lab values:
- Hemoglobin: 9.8 g/dL (ref: 12.0-15.5)
- MCV: 76 fL (ref: 80-100)
- Ferritin: 15 ng/mL (ref: 30-400)
- TIBC: 450 μg/dL (ref: 250-370)
Provide an interpretation of these results and suggest possible diagnoses.
"
interpretation <- call_lm_studio(
prompt = clinical_query,
system_prompt = "You are a hematology specialist. Provide concise and evidence-based interpretations."
)
# Display just the model's interpretation, without any raw API response details
cat(interpretation)

This more complex example shows how to structure a detailed clinical query and use a specialist system prompt. Important elements:
- Structured Format: Laboratory data is presented clearly with reference values
- Specific Request: Both interpretation and possible diagnoses are requested
- Specialist System Prompt: Tells the model to respond as a hematology specialist
The case presented is consistent with iron deficiency anemia (low hemoglobin, low MCV, low ferritin, elevated TIBC), which in an elderly woman might warrant investigation for occult bleeding.
From Theory to Practice: In this final section, we’ll put together what we’ve learned to create practical applications. First, we’ll analyze a complex discharge summary, and then extract structured data from clinical trial descriptions.
In this exercise, we’ll work with a detailed discharge summary containing multiple diagnoses, medications, and instructions - similar to what you might encounter in clinical practice.
# Load the complex discharge summary with relative path using here package
# Install if needed: install.packages("here")
library(here)
# Use relative path
discharge_summary <- readLines(here("data", "complex_clinical_discharge.Rmd"), warn = FALSE)
# Skip YAML header (first 14 lines)
discharge_summary <- discharge_summary[15:length(discharge_summary)]
discharge_summary <- paste(discharge_summary, collapse = "\n")
# Display first few lines
cat(substr(discharge_summary, 1, 500), "...\n")

Let’s use the LLM to analyze this document in various ways:
# 1. Extract key diagnoses and create a problem list
diagnoses_prompt <- "Extract the main discharge diagnoses from this clinical document and create a prioritized problem list. For each diagnosis, include its current status (improved, stable, worsened, etc.)."
diagnoses_result <- call_lm_studio(
prompt = paste0(diagnoses_prompt, "\n\n", discharge_summary),
system_prompt = "You are a clinical documentation specialist. Extract relevant clinical information accurately and concisely."
)
cat("EXTRACTED PROBLEM LIST:\n\n")
cat(diagnoses_result)

# 2. Create a medication reconciliation with dosages and purpose
medications_prompt <- "Create a complete medication reconciliation list from this document. Format as a table with columns for: Medication Name, Dosage, Frequency, and Purpose/Indication."
medications_result <- call_lm_studio(
prompt = paste0(medications_prompt, "\n\n", discharge_summary),
system_prompt = "You are a clinical pharmacist. Extract medication information with precision."
)
cat("\n\nMEDICATION RECONCILIATION:\n\n")
cat(medications_result)

# 3. Generate a concise clinical summary for referring physician
summary_prompt <- "Create a concise summary (maximum 250 words) of this patient case for a referring physician. Include only the most clinically relevant information about presentation, hospital course, interventions, and follow-up plan."
summary_result <- call_lm_studio(
prompt = paste0(summary_prompt, "\n\n", discharge_summary),
system_prompt = "You are an attending physician communicating with colleagues. Be precise, professional, and focus on clinically relevant details."
)
cat("\n\nCLINICAL SUMMARY FOR REFERRAL:\n\n")
cat(summary_result)

Clinical Document Analysis Relevance:
This exercise demonstrates how LLMs can help with several time-consuming clinical documentation tasks:
- Problem List Generation: Extracting and organizing medical problems from lengthy documents
- Medication Reconciliation: Creating structured medication lists with relevant details
- Clinical Summarization: Condensing detailed information for efficient communication
These capabilities directly address documentation burden in healthcare, potentially saving clinicians significant time while improving information organization.
Next, we’ll extract structured information from unstructured clinical trial descriptions.
library(tibble)
library(dplyr)
library(stringr)
# Create a dataframe of clinical trial descriptions
trials_df <- tibble::tibble(
trial_id = c("Trial 1", "Trial 2", "Trial 3"),
text = c(
"In a recent clinical trial, 200 patients were randomized to receive Drug A or placebo. The Drug A group had a 35% reduction in risk of hospitalization, whereas the placebo group showed no significant change. Adverse events occurred in 10% of the Drug A group and 5% of the placebo group.",
"A double-blind study involving 150 patients compared Drug B against placebo. Patients on Drug B showed a 25% decrease in hospital admissions, while adverse events were reported in 12% of patients treated with Drug B versus 7% in the placebo group.",
"Another randomized trial evaluated Drug C in 180 patients. Drug C recipients experienced a 40% lower hospitalization rate compared to placebo. However, adverse effects were slightly higher at 15% versus 8% in the placebo cohort."
)
)

Project Context: We’ve created a dataframe containing textual descriptions of three fictional clinical trials. Each study reports:
- Number of participants
- Intervention (drug vs placebo)
- Effect on reducing hospitalizations
- Adverse event rate
Our goal is to use the LLM to automatically extract this data in a structured format, simulating one of the most time-consuming activities in preparing meta-analyses.
# Function to extract table from a single trial text
extract_table_from_trial <- function(trial_text) {
# Create a prompt that specifies exactly what to extract
prompt <- paste0(
"Extract the information from the following study and format it into a markdown table.\n\n",
trial_text,
"\n\n",
"Create columns: Treatment, Sample Size, Effect on Hospitalization (%), Adverse Events (%). Output only the markdown table without any additional text."
)
# Call the LLM
result <- call_lm_studio(
prompt = prompt,
system_prompt = "You are a research assistant. Extract data precisely from clinical trials and format it as requested."
)
return(result)
}

This function transforms the unstructured text of a study into a markdown table. Key elements:
- Specific Prompt: Indicates exactly what information to extract and the desired format
- Targeted System Prompt: Defines the model’s role as a research assistant
- Clear Structure: Specifically requests a markdown table with predefined columns
The ability of LLMs to understand context and structure output makes them particularly suited for this type of data extraction task.
# Initialize list to collect parsed tables
all_rows <- list()
# Iterate over each trial
for (i in seq_len(nrow(trials_df))) {
cat(paste0("\nProcessing ", trials_df$trial_id[i], "...\n"))
# Get the markdown table
markdown_table <- extract_table_from_trial(trials_df$text[i])
cat(markdown_table, "\n")
# Parse markdown table into a dataframe
# Match each row of the table
rows <- str_match_all(markdown_table, "\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|")[[1]]
if (nrow(rows) > 0) {
# Skip header and separator rows (first two rows)
data_rows <- rows[-(1:2), , drop = FALSE]
if (nrow(data_rows) > 0) {
# Create a temporary dataframe
df_temp <- tibble::tibble(
Trial = trials_df$trial_id[i],
Treatment = data_rows[, 2],
Sample_Size = data_rows[, 3],
Effect_on_Hospitalization = data_rows[, 4],
Adverse_Events = data_rows[, 5]
)
# Add to list
all_rows[[length(all_rows) + 1]] <- df_temp
}
}
}
# Combine all tables
final_table <- bind_rows(all_rows)
# View combined final table
final_table

Processing and Analyzing Results:
The code above:
1. Iterates through each study description
2. Extracts a markdown table using the LLM
3. Converts the markdown table to an R dataframe using regular expressions
4. Combines all results into a final table
This process demonstrates how the LLM can serve as an “intelligent bridge” between unstructured and structured data, an application with enormous potential in clinical and research settings. The regex `\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|\\s*(.*?)\\s*\\|` captures the content of each cell in the markdown table.
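The same cell-capturing idea can be cross-checked outside R. A minimal Python sketch, assuming a well-formed 4-column markdown table; `parse_markdown_table` is a hypothetical helper:

```python
import re

def parse_markdown_table(markdown):
    """Return the data rows of a 4-column markdown table as tuples of cell strings."""
    pattern = r"\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|\s*(.*?)\s*\|"
    rows = re.findall(pattern, markdown)
    return rows[2:]  # drop the header and separator rows, as in the R code

table = """| Treatment | Sample Size | Effect on Hospitalization (%) | Adverse Events (%) |
|---|---|---|---|
| Drug A | 200 | -35 | 10 |
| Placebo | 200 | 0 | 5 |"""
print(parse_markdown_table(table))
# [('Drug A', '200', '-35', '10'), ('Placebo', '200', '0', '5')]
```

Note that both versions assume the model returned exactly four columns; a production pipeline should validate the table shape before parsing.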
Label Lab Results
Application: Analyze complete laboratory reports to automatically highlight abnormal values, indicating whether they are elevated or reduced and their potential clinical significance.
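The deterministic part of such an application, comparing values against reference ranges, needs no LLM at all. A minimal sketch in which the ranges and helper names are illustrative only:

```python
# Illustrative reference ranges; real values depend on the laboratory and units.
REFERENCE_RANGES = {  # analyte: (low, high, unit)
    "Hemoglobin": (12.0, 15.5, "g/dL"),
    "WBC": (4.0, 11.0, "x10^3/uL"),
    "CRP": (0.0, 5.0, "mg/L"),
}

def flag_abnormal(results):
    """Label each result as 'low', 'high', or 'normal' against its reference range."""
    flags = {}
    for analyte, value in results.items():
        low, high, _unit = REFERENCE_RANGES[analyte]
        flags[analyte] = "low" if value < low else "high" if value > high else "normal"
    return flags

print(flag_abnormal({"Hemoglobin": 10.2, "WBC": 12.0, "CRP": 75.0}))
# {'Hemoglobin': 'low', 'WBC': 'high', 'CRP': 'high'}
```

A practical design is to compute these flags first, then ask the LLM only for the interpretive step (potential clinical significance), keeping the numeric comparison verifiable.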
Summarize Patient History
Application: Condense lengthy medical records into structured summaries highlighting active problems, current therapies, allergies, and critical information.
Generate Differential Diagnoses
Application: Support clinical reasoning by suggesting differential diagnoses based on structured (symptoms, vital signs) and unstructured (history) data.
Extract Medical Entities
Application: Transform clinical notes into structured databases by extracting entities such as diagnoses, medications, dosages, procedures, and results.
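A common pattern for entity extraction is to request strict JSON from the model and parse it. A hedged sketch in which the prompt builder and the sample reply are hypothetical:

```python
import json

def build_extraction_prompt(note):
    """Build a prompt asking the model to return medications as strict JSON."""
    return (
        "Extract all medications with their dosages from the clinical note below. "
        'Respond with JSON only, in the form {"medications": [{"name": ..., "dose": ...}]}.\n\n'
        + note
    )

prompt = build_extraction_prompt("Patient discharged on metformin 500 mg twice daily.")

# A reply in the requested shape (what a well-behaved model might return):
reply = '{"medications": [{"name": "metformin", "dose": "500 mg"}]}'
entities = json.loads(reply)
print(entities["medications"][0]["name"])  # metformin
```

In practice the `json.loads` step should be wrapped in error handling, since models occasionally wrap JSON in extra text despite the instruction.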
Model Limitations
Critical Point: LLMs are not infallible and can generate plausible but incorrect answers. Recognizing the model’s limitations is essential for safe use. Basic factual queries (e.g., “What are the symptoms of pneumonia?”) tend to be more reliable than complex or rare scenarios.
Clinical Safety
Safety Strategies:
- Implement a “human-in-the-loop” system with human review
- Limit the scope of application to specific, well-defined tasks
- Provide references and justifications for recommendations
- Integrate automatic verification systems against codified guidelines
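The first strategy above can be sketched as a minimal review gate in code; all names and the queue design below are illustrative, not a prescribed implementation:

```python
# Human-in-the-loop sketch: model drafts are queued and only released
# after explicit reviewer approval.

review_queue = []

def submit_for_review(draft):
    """Queue a model-generated draft instead of sending it directly onward."""
    item = {"draft": draft, "approved": False}
    review_queue.append(item)
    return item

def approve(item, reviewer):
    """Record a human reviewer's sign-off before the draft is released."""
    item["approved"] = True
    item["reviewer"] = reviewer
    return item

item = submit_for_review("Suggested problem list: 1) Community-acquired pneumonia, improved")
approve(item, reviewer="Dr. Rossi")
print(item["approved"], item["reviewer"])  # True Dr. Rossi
```

The essential property is that nothing generated by the model reaches the clinical record without a recorded human decision attached to it.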
Comparison with Commercial Models
Trade-offs: Commercial models (like GPT-4, Claude) are generally more capable, but have disadvantages in terms of costs, internet dependency, and privacy. Local models are preferable when:
- Patient data should not leave the local system
- An offline solution is needed
- Solutions without variable query costs are required
- Greater control over configuration and operation is desired
Potential Applications
Promising Applications:
- Support for ICD coding for billing
- Automatic generation of structured clinical notes
- Assistance in searching for relevant literature
- Pre-screening of imaging reports for prioritization
- Support for interpretation of genetic results
Additional Resources: - NeurIPS Clinical LLM Challenge - Contains datasets and examples of clinical prompts - MedPaLM 2 - Research on medicine-specific LLMs - Chatbots for Clinical Informatics - Paper on clinical use of LLMs