top of page

Welcome
to NumpyNinja Blogs

NumpyNinja: Blogs. Demystifying Tech,

One Blog at a Time.
Millions of views. 

Explaining Views and Materialized Views in PostgreSQL through a healthcare dataset

In large healthcare datasets—such as cardiac failure studies that track patient demographics, lab values, medical history, and prescriptions—efficient querying and analysis of data are essential. When working with multiple interrelated tables, it’s common to face performance issues due to complex joins, aggregations, and millions of records. Addressing this, Views and Materialized Views in PostgreSQL become powerful optimization tools.


What Is a View?

A view is a virtual table created from a SQL query. It does not store data physically—it fetches real-time results every time it’s queried.


Example:

CREATE VIEW patient_summary AS

SELECT

    d.inpatient_number,

    d.gender,

    d.agecat,

    l.bmi,

    h.diabetes,

    h.congestive_heart_failure

FROM demography d

JOIN patienthistory h ON d.inpatient_number = h.inpatient_number

JOIN labs l ON d.inpatient_number = l.inpatient_number;


This simplifies repeated analysis queries without writing complex joins every time.However, since it recalculates results on every query, it can be slow for large datasets.


 What Is a Materialized View?


A materialized view stores the physical result of a query, making future access much faster—especially when the same query is run multiple times (e.g., dashboards or predictive models).

Example:

CREATE MATERIALIZED VIEW patient_lab_summary AS

SELECT

    d.inpatient_number,

    d.gender,

    l.mean_corpuscular_volume,

    l.hemoglobin,

    l.creatinine_enzymatic_method,

    h.diabetes

FROM demography d

JOIN labs l ON d.inpatient_number = l.inpatient_number

JOIN patienthistory h ON d.inpatient_number = h.inpatient_number;


Now, this precomputed dataset is physically stored in the database.Whenever analysts run reports, PostgreSQL doesn’t need to recompute joins—it just reads the stored results.


 Why It Matters


1. Improves Query Performance

The ‘labs’ table alone in the dataset can contain thousands of rows per patient (e.g., blood pressure, cholesterol, glucose, etc.).By creating a materialized view that aggregates metrics like average creatinine, hemoglobin, or BNP levels per patient, we can reduce query time from seconds or minutes to milliseconds.

CREATE MATERIALIZED VIEW avg_lab_metrics AS

SELECT

    inpatient_number,

    ROUND(AVG(creatinine_enzymatic_method), 2) AS avg_creatinine,

    ROUND(AVG(hemoglobin), 2) AS avg_hemoglobin,

    ROUND(AVG(brain_natriuretic_peptide), 2) AS avg_bnp

FROM labs

GROUP BY inpatient_number;

This allows rapid reporting of renal or cardiac stress markers in patients—essential for clinical analytics.


2. Reduces Load on Base Tables


In healthcare analytics, multiple dashboards and machine learning pipelines may hit the same tables. If every user query re-runs complex joins between demography, labs, and patient history, the database slows down. A materialized view acts like a cache, serving pre-joined results efficiently.


3. Supports Predictive Modeling


When you later run predictive models (like detecting risk of heart failure or mortality), you can pull data directly from a materialized view of pre-cleaned, aggregated features—reducing both ETL time and compute cost.

 Refreshing Strategies

Since the base tables like ‘labs’ or ‘patient history’ can get updated when new hospital data arrives, materialized views must be refreshed:


REFRESH MATERIALIZED VIEW avg_lab_metrics;


We can also automate refreshes daily or weekly using cron jobs or PostgreSQL’s job scheduler.


Example of Optimization


Before Optimization:

SELECT avg(creatinine_enzymatic_method)

FROM labs l

JOIN demography d ON d.inpatient_number = l.inpatient_number

JOIN patienthistory h ON h.inpatient_number = d.inpatient_number

WHERE h.diabetes = 1;

 Takes longer due to on-demand computation.


After Optimization (using Materialized View):

SELECT avg_creatinine

FROM avg_lab_metrics

JOIN patienthistory h USING (inpatient_number)

WHERE h.diabetes = 1;

Much faster — query reads from pre-stored aggregates.


 Regular vs. Materialized Views

Feature

Regular View

Materialized View

Storage

Virtual (no data stored)

Physically stores query results

Performance

Slower for large joins

Faster for repeated queries

Data Freshness

Always up-to-date

Needs manual or scheduled refresh

Best For

Dynamic dashboards, real-time views

Reporting, pre-aggregated analytics

Conclusion

In large biomedical datasets, using materialized views brings measurable improvements in performance and efficiency. They help in :

  • Speed up analytical queries

  • Reduce computation load

  • Simplify reporting pipelines

By strategically combining views (for simplicity) and materialized views (for performance), we can create a robust, scalable data layer ready for advanced clinical analysis and predictive modeling


 
 

+1 (302) 200-8320

NumPy_Ninja_Logo (1).png

Numpy Ninja Inc. 8 The Grn Ste A Dover, DE 19901

© Copyright 2025 by Numpy Ninja Inc.

  • Twitter
  • LinkedIn
bottom of page