Explaining Views and Materialized Views in PostgreSQL through a healthcare dataset

Abeer Zahid
Oct 17
3 min read

In large healthcare datasets—such as cardiac failure studies that track patient demographics, lab values, medical history, and prescriptions—efficient querying and analysis of data are essential. When working with multiple interrelated tables, it’s common to face performance issues due to complex joins, aggregations, and millions of records. Addressing this, Views and Materialized Views in PostgreSQL become powerful optimization tools.

What Is a View?

A view is a virtual table created from a SQL query. It does not store data physically—it fetches real-time results every time it’s queried.

Example:

CREATE VIEW patient_summary AS

SELECT

d.inpatient_number,

d.gender,

d.agecat,

l.bmi,

h.diabetes,

h.congestive_heart_failure

FROM demography d

JOIN patienthistory h ON d.inpatient_number = h.inpatient_number

JOIN labs l ON d.inpatient_number = l.inpatient_number;

This simplifies repeated analysis queries without writing complex joins every time.However, since it recalculates results on every query, it can be slow for large datasets.

What Is a Materialized View?

A materialized view stores the physical result of a query, making future access much faster—especially when the same query is run multiple times (e.g., dashboards or predictive models).

Example:

CREATE MATERIALIZED VIEW patient_lab_summary AS

SELECT

d.inpatient_number,

d.gender,

l.mean_corpuscular_volume,

l.hemoglobin,

l.creatinine_enzymatic_method,

h.diabetes

FROM demography d

JOIN labs l ON d.inpatient_number = l.inpatient_number

JOIN patienthistory h ON d.inpatient_number = h.inpatient_number;

Now, this precomputed dataset is physically stored in the database.Whenever analysts run reports, PostgreSQL doesn’t need to recompute joins—it just reads the stored results.

Why It Matters

1. Improves Query Performance

The ‘labs’ table alone in the dataset can contain thousands of rows per patient (e.g., blood pressure, cholesterol, glucose, etc.).By creating a materialized view that aggregates metrics like average creatinine, hemoglobin, or BNP levels per patient, we can reduce query time from seconds or minutes to milliseconds.

CREATE MATERIALIZED VIEW avg_lab_metrics AS

SELECT

inpatient_number,

ROUND(AVG(creatinine_enzymatic_method), 2) AS avg_creatinine,

ROUND(AVG(hemoglobin), 2) AS avg_hemoglobin,

ROUND(AVG(brain_natriuretic_peptide), 2) AS avg_bnp

FROM labs

GROUP BY inpatient_number;

This allows rapid reporting of renal or cardiac stress markers in patients—essential for clinical analytics.

2. Reduces Load on Base Tables

In healthcare analytics, multiple dashboards and machine learning pipelines may hit the same tables. If every user query re-runs complex joins between demography, labs, and patient history, the database slows down. A materialized view acts like a cache, serving pre-joined results efficiently.

3. Supports Predictive Modeling

When you later run predictive models (like detecting risk of heart failure or mortality), you can pull data directly from a materialized view of pre-cleaned, aggregated features—reducing both ETL time and compute cost.

Refreshing Strategies

Since the base tables like ‘labs’ or ‘patient history’ can get updated when new hospital data arrives, materialized views must be refreshed:

REFRESH MATERIALIZED VIEW avg_lab_metrics;

We can also automate refreshes daily or weekly using cron jobs or PostgreSQL’s job scheduler.

Example of Optimization

Before Optimization:

SELECT avg(creatinine_enzymatic_method)

FROM labs l

JOIN demography d ON d.inpatient_number = l.inpatient_number

JOIN patienthistory h ON h.inpatient_number = d.inpatient_number

WHERE h.diabetes = 1;

Takes longer due to on-demand computation.

After Optimization (using Materialized View):

SELECT avg_creatinine

FROM avg_lab_metrics

JOIN patienthistory h USING (inpatient_number)

WHERE h.diabetes = 1;

Much faster — query reads from pre-stored aggregates.

Regular vs. Materialized Views

Feature	Regular View	Materialized View
Storage	Virtual (no data stored)	Physically stores query results
Performance	Slower for large joins	Faster for repeated queries
Data Freshness	Always up-to-date	Needs manual or scheduled refresh
Best For	Dynamic dashboards, real-time views	Reporting, pre-aggregated analytics

Conclusion

In large biomedical datasets, using materialized views brings measurable improvements in performance and efficiency. They help in :

Speed up analytical queries
Reduce computation load
Simplify reporting pipelines

By strategically combining views (for simplicity) and materialized views (for performance), we can create a robust, scalable data layer ready for advanced clinical analysis and predictive modeling

Welcome to NumpyNinja Blogs

What Is a View?

Example:

CREATE VIEW patient_summary AS

SELECT

d.inpatient_number,

d.gender,

d.agecat,

l.bmi,

h.diabetes,

h.congestive_heart_failure

FROM demography d

JOIN patienthistory h ON d.inpatient_number = h.inpatient_number

JOIN labs l ON d.inpatient_number = l.inpatient_number;

What Is a Materialized View?

Example:

CREATE MATERIALIZED VIEW patient_lab_summary AS

SELECT

d.inpatient_number,

d.gender,

l.mean_corpuscular_volume,

l.hemoglobin,

l.creatinine_enzymatic_method,

h.diabetes

FROM demography d

JOIN labs l ON d.inpatient_number = l.inpatient_number

JOIN patienthistory h ON d.inpatient_number = h.inpatient_number;

Why It Matters

1. Improves Query Performance

CREATE MATERIALIZED VIEW avg_lab_metrics AS

SELECT

inpatient_number,

ROUND(AVG(creatinine_enzymatic_method), 2) AS avg_creatinine,

ROUND(AVG(hemoglobin), 2) AS avg_hemoglobin,

ROUND(AVG(brain_natriuretic_peptide), 2) AS avg_bnp

FROM labs

GROUP BY inpatient_number;

2. Reduces Load on Base Tables

3. Supports Predictive Modeling

REFRESH MATERIALIZED VIEW avg_lab_metrics;

Example of Optimization

Before Optimization:

SELECT avg(creatinine_enzymatic_method)

FROM labs l

JOIN demography d ON d.inpatient_number = l.inpatient_number

JOIN patienthistory h ON h.inpatient_number = d.inpatient_number

WHERE h.diabetes = 1;

After Optimization (using Materialized View):

SELECT avg_creatinine

FROM avg_lab_metrics

JOIN patienthistory h USING (inpatient_number)

WHERE h.diabetes = 1;

Regular vs. Materialized Views

Conclusion

Welcome
to NumpyNinja Blogs