PostgreSQL Time-Series Data: date_trunc Bucketing, Gap-Fill, and Timescale Comparison
Store and query time-series data in PostgreSQL. Covers partitioned tables, date_trunc bucketing, gap-fill with generate_series, moving averages, percentile aggregates, and when to use TimescaleDB.
Most SaaS products generate time-series data — metrics, events, logs, sensor readings — and need to query it efficiently. PostgreSQL handles time-series surprisingly well when structured correctly: partitioned tables for write performance, date_trunc for bucketing, generate_series for gap-fill, and window functions for rolling calculations.
This guide covers the patterns that make PostgreSQL time-series queries fast and the point at which TimescaleDB becomes worth the operational overhead.
Schema Design for Time-Series
-- Generic metric store — partitioned by month for write and query efficiency
CREATE TABLE metrics (
id BIGSERIAL,
workspace_id UUID NOT NULL,
metric_name TEXT NOT NULL, -- 'page_views', 'api_calls', 'revenue_cents'
value NUMERIC NOT NULL,
dimensions JSONB NOT NULL DEFAULT '{}', -- Additional grouping dimensions
recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
PRIMARY KEY (id, recorded_at)
) PARTITION BY RANGE (recorded_at);
-- Create monthly partitions (automate with pg_partman in production)
CREATE TABLE metrics_2027_01 PARTITION OF metrics
FOR VALUES FROM ('2027-01-01') TO ('2027-02-01');
CREATE TABLE metrics_2027_02 PARTITION OF metrics
FOR VALUES FROM ('2027-02-01') TO ('2027-03-01');
CREATE TABLE metrics_2027_03 PARTITION OF metrics
FOR VALUES FROM ('2027-03-01') TO ('2027-04-01');
-- Indexes created on the partitioned parent are automatically propagated to each partition
CREATE INDEX idx_metrics_workspace_time ON metrics(workspace_id, metric_name, recorded_at DESC);
CREATE INDEX idx_metrics_time ON metrics(recorded_at DESC);
-- Pre-aggregated hourly rollups (for long date ranges)
CREATE TABLE metrics_hourly (
workspace_id UUID NOT NULL,
metric_name TEXT NOT NULL,
hour_bucket TIMESTAMPTZ NOT NULL, -- Truncated to hour
sum_value NUMERIC NOT NULL,
min_value NUMERIC NOT NULL,
max_value NUMERIC NOT NULL,
count_value BIGINT NOT NULL,
avg_value NUMERIC GENERATED ALWAYS AS (sum_value / count_value) STORED,
PRIMARY KEY (workspace_id, metric_name, hour_bucket)
);
CREATE INDEX idx_hourly_lookup ON metrics_hourly(workspace_id, metric_name, hour_bucket DESC);
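If you script partition creation yourself instead of using pg_partman, the month-to-partition mapping is trivial to compute in application code. A minimal sketch — `monthlyPartitionName` and `monthlyPartitionBounds` are hypothetical helpers, not part of any library:

```typescript
// lib/db/partition-name.ts
// Hypothetical helpers matching the naming scheme above ("metrics_2027_01"),
// for generating CREATE TABLE ... PARTITION OF statements from a script.

export function monthlyPartitionName(table: string, date: Date): string {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${table}_${year}_${month}`;
}

export function monthlyPartitionBounds(date: Date): { from: string; to: string } {
  const year = date.getUTCFullYear();
  const month = date.getUTCMonth(); // 0-based
  const fmt = (d: Date) => d.toISOString().slice(0, 10);
  return {
    from: fmt(new Date(Date.UTC(year, month, 1))),
    to: fmt(new Date(Date.UTC(year, month + 1, 1))), // exclusive upper bound
  };
}
```

The exclusive upper bound matches PostgreSQL's `FOR VALUES FROM ... TO ...` semantics, so the two helpers together produce a complete partition DDL statement.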
Core Bucketing Queries
Daily Aggregation with date_trunc
-- Daily page views for a workspace over the last 30 days
SELECT
DATE_TRUNC('day', recorded_at AT TIME ZONE $2) AS day, -- $2: IANA timezone, e.g. 'America/New_York'
SUM(value) AS total_views
FROM metrics
WHERE
workspace_id = $1
AND metric_name = 'page_views'
AND recorded_at >= NOW() - INTERVAL '30 days'
GROUP BY 1
ORDER BY 1;
-- Flexible bucketing: hourly, daily, weekly depending on date range
-- Pass the bucket interval as $5: 'hour', 'day', 'week', 'month'
SELECT
DATE_TRUNC($5, recorded_at) AS bucket,
SUM(value) AS total
FROM metrics
WHERE
workspace_id = $1
AND metric_name = $2
AND recorded_at BETWEEN $3 AND $4
GROUP BY 1
ORDER BY 1;
Choose Bucket Size by Date Range
// lib/analytics/bucket-size.ts
type BucketSize = "hour" | "day" | "week" | "month";
export function getBucketSize(startDate: Date, endDate: Date): BucketSize {
const diffDays = (endDate.getTime() - startDate.getTime()) / (1000 * 60 * 60 * 24);
if (diffDays <= 2) return "hour";
if (diffDays <= 60) return "day";
if (diffDays <= 365) return "week";
return "month";
}
Gap-Fill with generate_series
The hardest part of time-series display: days with zero events don't appear in query results. generate_series fills the gaps:
-- Generate every day in a range, left-join actual data
WITH date_series AS (
SELECT generate_series(
DATE_TRUNC('day', $2::timestamptz),
DATE_TRUNC('day', $3::timestamptz),
'1 day'::interval
) AS day
),
actual_data AS (
SELECT
DATE_TRUNC('day', recorded_at) AS day,
SUM(value) AS total
FROM metrics
WHERE
workspace_id = $1
AND metric_name = 'page_views'
AND recorded_at BETWEEN $2 AND $3
GROUP BY 1
)
SELECT
ds.day,
COALESCE(ad.total, 0) AS total -- Fill gaps with 0
FROM date_series ds
LEFT JOIN actual_data ad USING (day)
ORDER BY ds.day;
// lib/analytics/queries.ts
import { prisma } from "@/lib/prisma";
interface MetricPoint {
bucket: Date;
value: number;
}
export async function getMetricTimeSeries(
workspaceId: string,
metricName: string,
startDate: Date,
endDate: Date,
bucketSize: "hour" | "day" | "week" | "month" = "day"
): Promise<MetricPoint[]> {
const rows = await prisma.$queryRaw<Array<{ bucket: Date; total: bigint }>>`
WITH date_series AS (
SELECT generate_series(
DATE_TRUNC(${bucketSize}, ${startDate}::timestamptz),
DATE_TRUNC(${bucketSize}, ${endDate}::timestamptz),
('1 ' || ${bucketSize})::interval
) AS bucket
),
actual_data AS (
SELECT
DATE_TRUNC(${bucketSize}, recorded_at) AS bucket,
SUM(value)::bigint AS total
FROM metrics
WHERE
workspace_id = ${workspaceId}::uuid
AND metric_name = ${metricName}
AND recorded_at BETWEEN ${startDate} AND ${endDate}
GROUP BY 1
)
SELECT ds.bucket, COALESCE(ad.total, 0) AS total
FROM date_series ds
LEFT JOIN actual_data ad USING (bucket)
ORDER BY ds.bucket
`;
return rows.map((r) => ({
bucket: r.bucket,
value: Number(r.total),
}));
}
Moving Averages and Rolling Windows
-- 7-day rolling average alongside daily values
-- Note: ROWS counts rows, not calendar days — gap-fill first if days can be missing
SELECT
DATE_TRUNC('day', recorded_at) AS day,
SUM(value) AS daily_value,
AVG(SUM(value)) OVER (
ORDER BY DATE_TRUNC('day', recorded_at)
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_7d_avg,
SUM(SUM(value)) OVER (
ORDER BY DATE_TRUNC('day', recorded_at)
ROWS UNBOUNDED PRECEDING
) AS cumulative_total
FROM metrics
WHERE
workspace_id = $1
AND metric_name = 'revenue_cents'
AND recorded_at >= NOW() - INTERVAL '90 days'
GROUP BY DATE_TRUNC('day', recorded_at)
ORDER BY 1;
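The same rolling mean can be computed client-side over an already gap-filled daily series. A sketch mirroring `AVG(...) OVER (ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)` — `rollingAverage` is a hypothetical helper:

```typescript
// Rolling mean over the last `window` points. Like the SQL ROWS window,
// this counts rows, not days — gap-fill first so one row equals one day.
export function rollingAverage(values: number[], window: number): number[] {
  const out: number[] = [];
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
    if (i >= window) sum -= values[i - window]; // drop the value leaving the window
    const count = Math.min(i + 1, window); // partial window at the start, like SQL
    out.push(sum / count);
  }
  return out;
}
```

Like the SQL version, early points average over fewer than `window` values rather than emitting null.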
Period-over-Period Comparison
-- This week vs last week comparison
WITH current_period AS (
SELECT
DATE_TRUNC('day', recorded_at) AS day,
SUM(value) AS value
FROM metrics
WHERE
workspace_id = $1
AND metric_name = 'page_views'
AND recorded_at >= DATE_TRUNC('week', NOW())
AND recorded_at < DATE_TRUNC('week', NOW()) + INTERVAL '1 week'
GROUP BY 1
),
previous_period AS (
SELECT
DATE_TRUNC('day', recorded_at) + INTERVAL '1 week' AS day, -- Align to current week
SUM(value) AS value
FROM metrics
WHERE
workspace_id = $1
AND metric_name = 'page_views'
AND recorded_at >= DATE_TRUNC('week', NOW()) - INTERVAL '1 week'
AND recorded_at < DATE_TRUNC('week', NOW())
GROUP BY 1
)
SELECT
COALESCE(c.day, p.day) AS day,
COALESCE(c.value, 0) AS current_week,
COALESCE(p.value, 0) AS previous_week,
CASE WHEN COALESCE(p.value, 0) > 0
THEN ROUND(100.0 * (COALESCE(c.value, 0) - COALESCE(p.value, 0)) / p.value, 1)
ELSE NULL
END AS pct_change
FROM current_period c
FULL OUTER JOIN previous_period p ON c.day = p.day
ORDER BY 1;
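The pct_change expression above is easy to mirror in application code when the comparison happens client-side. A sketch — `pctChange` is a hypothetical helper; it returns null when the previous period is zero, matching the SQL CASE:

```typescript
// Percent change to one decimal place, matching
// ROUND(100.0 * (current - previous) / previous, 1) in the SQL above.
// Returns null when there is no meaningful baseline.
export function pctChange(current: number, previous: number): number | null {
  if (previous <= 0) return null;
  return Math.round((1000 * (current - previous)) / previous) / 10;
}
```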
Percentile Distribution
-- Response time percentiles by hour (for API monitoring)
SELECT
DATE_TRUNC('hour', recorded_at) AS hour,
COUNT(*) AS requests,
ROUND(AVG(value)) AS avg_ms,
PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY value) AS p50_ms,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY value) AS p95_ms,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY value) AS p99_ms,
MAX(value) AS max_ms
FROM metrics
WHERE
workspace_id = $1
AND metric_name = 'api_response_ms'
AND recorded_at >= NOW() - INTERVAL '24 hours'
GROUP BY 1
ORDER BY 1;
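PERCENTILE_CONT interpolates linearly between adjacent sorted values; when the latencies are already in memory, the same computation is a few lines. A sketch — `percentileCont` is a hypothetical helper and assumes a non-empty input:

```typescript
// PERCENTILE_CONT equivalent: linear interpolation between the two
// sorted values straddling the requested percentile (0 <= p <= 1).
export function percentileCont(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = p * (sorted.length - 1); // fractional index into sorted values
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  const frac = idx - lo;
  return sorted[lo] + (sorted[hi] - sorted[lo]) * frac;
}
```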
Automatic Hourly Rollup
For long date ranges, querying raw metrics is slow. Maintain rollup tables:
-- Rollup function: aggregate raw metrics to hourly buckets
-- Run via pg_cron or AWS Lambda on a schedule
CREATE OR REPLACE FUNCTION rollup_metrics_hourly(p_start TIMESTAMPTZ, p_end TIMESTAMPTZ)
RETURNS void
LANGUAGE plpgsql AS $$
BEGIN
INSERT INTO metrics_hourly (workspace_id, metric_name, hour_bucket, sum_value, min_value, max_value, count_value)
SELECT
workspace_id,
metric_name,
DATE_TRUNC('hour', recorded_at) AS hour_bucket,
SUM(value) AS sum_value,
MIN(value) AS min_value,
MAX(value) AS max_value,
COUNT(*) AS count_value
FROM metrics
WHERE recorded_at >= p_start AND recorded_at < p_end
GROUP BY 1, 2, 3
ON CONFLICT (workspace_id, metric_name, hour_bucket) DO UPDATE SET
sum_value = EXCLUDED.sum_value,
min_value = EXCLUDED.min_value,
max_value = EXCLUDED.max_value,
count_value = EXCLUDED.count_value;
END;
$$;
-- Schedule with pg_cron (runs every hour at :05 to catch stragglers)
SELECT cron.schedule('rollup-metrics', '5 * * * *',
$$SELECT rollup_metrics_hourly(
DATE_TRUNC('hour', NOW()) - INTERVAL '2 hours',
DATE_TRUNC('hour', NOW())
)$$
);
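If the rollup runs from an external scheduler (a Lambda or cron job) instead of pg_cron, the same two-hour lookback window can be computed in application code. A sketch — `getRollupWindow` is a hypothetical helper:

```typescript
// Hypothetical helper: compute the [start, end) window the hourly rollup
// should re-aggregate. Re-rolling the previous two full hours (not just one)
// catches rows that arrived after the previous run; the ON CONFLICT upsert
// in rollup_metrics_hourly makes re-processing idempotent.
export function getRollupWindow(now: Date): { start: Date; end: Date } {
  const end = new Date(now);
  end.setUTCMinutes(0, 0, 0); // truncate to the current hour
  const start = new Date(end.getTime() - 2 * 60 * 60 * 1000);
  return { start, end };
}
```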
Smart Query Router
// lib/analytics/query-router.ts
import { prisma } from "@/lib/prisma";
import { getBucketSize } from "./bucket-size";
// Use rollup table for ranges > 7 days, raw for recent data
export async function getMetricSmart(
workspaceId: string,
metricName: string,
startDate: Date,
endDate: Date
) {
const diffDays = (endDate.getTime() - startDate.getTime()) / (1000 * 60 * 60 * 24);
const bucketSize = getBucketSize(startDate, endDate);
if (diffDays > 7) {
// Use hourly rollups aggregated to the desired bucket
return prisma.$queryRaw`
WITH date_series AS (
SELECT generate_series(
DATE_TRUNC(${bucketSize}, ${startDate}::timestamptz),
DATE_TRUNC(${bucketSize}, ${endDate}::timestamptz),
('1 ' || ${bucketSize})::interval
) AS bucket
),
rollup_data AS (
SELECT
DATE_TRUNC(${bucketSize}, hour_bucket) AS bucket,
SUM(sum_value) AS total
FROM metrics_hourly
WHERE
workspace_id = ${workspaceId}::uuid
AND metric_name = ${metricName}
AND hour_bucket BETWEEN ${startDate} AND ${endDate}
GROUP BY 1
)
SELECT ds.bucket, COALESCE(rd.total, 0) AS value
FROM date_series ds
LEFT JOIN rollup_data rd USING (bucket)
ORDER BY ds.bucket
`;
}
// Use raw metrics for recent data (< 7 days)
return prisma.$queryRaw`
WITH date_series AS (
SELECT generate_series(
DATE_TRUNC(${bucketSize}, ${startDate}::timestamptz),
DATE_TRUNC(${bucketSize}, ${endDate}::timestamptz),
('1 ' || ${bucketSize})::interval
) AS bucket
),
actual AS (
SELECT DATE_TRUNC(${bucketSize}, recorded_at) AS bucket, SUM(value) AS total
FROM metrics
WHERE
workspace_id = ${workspaceId}::uuid
AND metric_name = ${metricName}
AND recorded_at BETWEEN ${startDate} AND ${endDate}
GROUP BY 1
)
SELECT ds.bucket, COALESCE(a.total, 0) AS value
FROM date_series ds
LEFT JOIN actual a USING (bucket)
ORDER BY ds.bucket
`;
}
PostgreSQL vs TimescaleDB
| Feature | PostgreSQL | TimescaleDB |
|---|---|---|
| Setup complexity | Simple | Moderate (extension + hypertable) |
| Write throughput | ~10K rows/s per table | ~500K rows/s via time-chunked hypertables |
| Automatic partitioning | Manual (pg_partman) | Automatic (chunks by time) |
| Data compression | Standard TOAST | 10–20x columnar compression |
| Continuous aggregates | Manual materialized views | CREATE MATERIALIZED VIEW ... WITH (timescaledb.continuous) |
| Gap-fill | Manual generate_series | Built-in time_bucket_gapfill() |
| Retention policies | Manual PARTITION DROP | add_retention_policy() |
| Cost | Free | Free OSS; Timescale Cloud from $95/mo |
Use plain PostgreSQL when:
- Under 100M rows per table
- Write rate under 10K/s
- Team already manages PostgreSQL
Add TimescaleDB when:
- Write rate exceeds 50K rows/s
- Need automatic compression (>10GB time-series data)
- Continuous aggregates are complex to maintain manually
- Retention policies needed (auto-drop old data)
Cost and Timeline Estimates
| Scope | Team | Timeline | Cost Range |
|---|---|---|---|
| Basic time-series queries (no partitioning) | 1 dev | 1–2 days | $300–600 |
| Partitioned table + rollups + gap-fill | 1 dev | 1 week | $1,500–3,000 |
| Full analytics pipeline (rollups + router + dashboard) | 1–2 devs | 2–3 weeks | $5,000–10,000 |
| TimescaleDB migration from PostgreSQL | 1–2 devs | 1–2 weeks | $3,000–6,000 |
See Also
- PostgreSQL Window Functions for Analytics
- PostgreSQL Partitioning for Large Tables
- PostgreSQL Materialized Views
- AWS OpenSearch for Analytics and Search
- SaaS Activity Feed Architecture
Working With Viprasol
Time-series data becomes a performance problem when it grows — queries that ran in 200ms at 1M rows crawl at 100M rows. Our team designs metric storage schemas with partitioning, rollup pipelines, and smart query routing that keeps dashboards fast as data accumulates.
What we deliver:
- Partitioned metrics table schema (by month)
- Hourly rollup function with pg_cron scheduling
- Gap-fill queries with generate_series
- Moving average and period-over-period comparison queries
- Smart query router (raw vs rollup based on date range)
Talk to our team about your analytics data pipeline →
Or explore our cloud and database services.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.