PostgreSQL Extensions in 2026: pgvector, PostGIS, pg_trgm, TimescaleDB, and pg_partman
Master the PostgreSQL extension ecosystem: pgvector for AI embeddings, PostGIS for geospatial, pg_trgm for fuzzy search, TimescaleDB for time-series, and pg_partman for automated partitioning.
PostgreSQL's extension system turns a general-purpose relational database into a specialized engine for vector search, geospatial queries, fuzzy text matching, time-series analytics, and automated partition management, all without leaving your existing stack. For most teams, the right extension is cheaper, faster to implement, and operationally simpler than adding a specialized database.
This post covers the five extensions we reach for most often, with production-ready schemas and queries for each.
Extension Overview: When to Use What
| Extension | Use Case | Alternative | When to Choose Alternative |
|---|---|---|---|
| pgvector | AI embeddings, semantic search | Pinecone, Weaviate | >5M vectors, sub-10ms at scale |
| PostGIS | Geospatial queries, mapping | Google Maps API | Rarely; PostGIS handles nearly everything |
| pg_trgm | Fuzzy text search, LIKE optimization | Elasticsearch | Full-text search at >10M docs |
| TimescaleDB | Time-series data, continuous aggregates | InfluxDB, ClickHouse | >100M rows/day ingestion |
| pg_partman | Automated table partitioning | Manual partition DDL | Rarely; manual DDL only makes sense for a handful of static partitions |
pgvector: AI Embeddings and Semantic Search
pgvector stores and queries high-dimensional embeddings from OpenAI, Cohere, or locally-run models. It enables semantic search, RAG (retrieval-augmented generation), and recommendation systems inside PostgreSQL.
-- Install and create vector column
CREATE EXTENSION IF NOT EXISTS vector;
-- Document embeddings table
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}',
embedding vector(1536), -- OpenAI text-embedding-3-small dimension
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- IVFFlat index for approximate nearest neighbor (ANN) search
-- Create AFTER you have data; index quality depends on row count
-- lists = sqrt(total_rows) is a good starting point
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- HNSW index: better recall than IVFFlat at higher build cost and memory;
-- in practice, pick one index type per vector column
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
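Query-time recall is tunable for both index types. The values below are starting points to benchmark, not recommendations:

```sql
-- IVFFlat: probe more lists for better recall (default 1)
SET ivfflat.probes = 10;

-- HNSW: widen the candidate list for better recall (default 40)
SET hnsw.ef_search = 100;

-- Confirm the index is actually used; a sequential scan here means
-- the planner ignored it (the vector literal is a placeholder --
-- a real query needs a full 1536-dimension literal)
EXPLAIN ANALYZE
SELECT id FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;
```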
// src/lib/embeddings.ts: Generate and store embeddings
import { OpenAI } from 'openai';
import { db } from '@/lib/db';
const openai = new OpenAI();
export async function embedAndStore(
content: string,
metadata: Record<string, unknown> = {},
): Promise<string> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: content,
});
const embedding = response.data[0].embedding;
const { rows } = await db.query<{ id: string }>(
`INSERT INTO documents (content, metadata, embedding)
VALUES ($1, $2, $3)
RETURNING id`,
[content, JSON.stringify(metadata), JSON.stringify(embedding)],
);
return rows[0].id;
}
export async function semanticSearch(
query: string,
topK: number = 10,
filter?: { category?: string },
): Promise<Array<{ id: string; content: string; similarity: number; metadata: Record<string, unknown> }>> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: query,
});
const queryEmbedding = response.data[0].embedding;
  // ivfflat.probes trades speed for recall; it is session-scoped, so with a
  // connection pool prefer SET LOCAL inside a transaction to avoid leaking it
  await db.query('SET ivfflat.probes = 10');
const { rows } = await db.query<{
id: string;
content: string;
similarity: number;
metadata: Record<string, unknown>;
}>(
`SELECT
id,
content,
metadata,
1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE ($2::jsonb IS NULL OR metadata @> $2::jsonb)
ORDER BY embedding <=> $1::vector
LIMIT $3`,
[
JSON.stringify(queryEmbedding),
filter ? JSON.stringify(filter) : null,
topK,
],
);
return rows;
}
Hybrid Search: Vector + Full-Text
-- Combine semantic similarity with keyword relevance (RRF fusion)
WITH vector_results AS (
SELECT id, row_number() OVER (ORDER BY embedding <=> $1::vector) AS rank
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 60
),
fts_results AS (
  SELECT id, row_number() OVER (ORDER BY ts_rank(to_tsvector('english', content), query) DESC) AS rank
  FROM documents, websearch_to_tsquery('english', $2) query
  -- websearch_to_tsquery tolerates raw user input; to_tsquery errors on bare spaces
  WHERE to_tsvector('english', content) @@ query
  ORDER BY rank
  LIMIT 60
),
-- Reciprocal Rank Fusion
rrf AS (
SELECT
COALESCE(v.id, f.id) AS id,
COALESCE(1.0 / (60 + v.rank), 0) + COALESCE(1.0 / (60 + f.rank), 0) AS score
FROM vector_results v
FULL OUTER JOIN fts_results f USING (id)
)
SELECT d.id, d.content, d.metadata, r.score
FROM rrf r
JOIN documents d ON d.id = r.id
ORDER BY r.score DESC
LIMIT 10;
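One operational note: the full-text half of this query recomputes `to_tsvector` per row unless an expression index backs it. A minimal addition:

```sql
-- Expression index so @@ matches use an index scan instead of
-- re-parsing every document's text on each query
CREATE INDEX ON documents USING GIN (to_tsvector('english', content));
```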
Looking for a Dev Team That Actually Delivers?
Most agencies sell you a project manager and assign juniors. Viprasol is different: senior engineers only, direct Slack access, and a 5.0-star Upwork record across 100+ projects.
- React, Next.js, Node.js, TypeScript: production-grade stack
- Fixed-price contracts, no surprise invoices
- Full source code ownership from day one
- 90-day post-launch support included
PostGIS: Geospatial Queries
PostGIS adds spatial types (GEOMETRY, GEOGRAPHY) and hundreds of spatial functions to PostgreSQL: radius searches, polygon containment, distance calculations, and routing.
CREATE EXTENSION IF NOT EXISTS postgis;
-- Locations table with geography column (sphere-based, uses meters)
CREATE TABLE locations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
category TEXT NOT NULL,
coordinates GEOGRAPHY(POINT, 4326), -- WGS84 (lat/lon)
address TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Spatial index (mandatory for performance)
CREATE INDEX ON locations USING GIST (coordinates);
-- Insert a point (longitude first, then latitude: PostGIS convention)
INSERT INTO locations (name, category, coordinates)
VALUES ('Viprasol HQ', 'office', ST_GeogFromText('SRID=4326;POINT(76.3 29.6)'));
-- Find locations within radius (meters)
SELECT
id,
name,
category,
ST_Distance(coordinates, ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)')) AS distance_meters
FROM locations
WHERE ST_DWithin(
coordinates,
ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)'),
5000 -- 5km radius
)
ORDER BY distance_meters
LIMIT 20;
// src/api/nearby.ts: Nearest-location API
import { db } from '@/lib/db';

interface Location {
  id: string;
  name: string;
  category: string;
  address: string | null;
  latitude: number;
  longitude: number;
}

export async function findNearby(
  lat: number,
  lng: number,
  radiusMeters: number,
  category?: string,
): Promise<Location[]> {
const { rows } = await db.query<Location & { distance_meters: number }>(
`SELECT
id, name, category, address,
ST_Y(coordinates::geometry) AS latitude,
ST_X(coordinates::geometry) AS longitude,
ST_Distance(coordinates, ST_MakePoint($2, $1)::geography) AS distance_meters
FROM locations
WHERE ST_DWithin(coordinates, ST_MakePoint($2, $1)::geography, $3)
AND ($4::text IS NULL OR category = $4)
ORDER BY distance_meters
LIMIT 50`,
[lat, lng, radiusMeters, category ?? null],
);
return rows;
}
Polygon Geofencing
-- Service area polygon (e.g., delivery zone)
CREATE TABLE service_areas (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
polygon GEOGRAPHY(POLYGON, 4326)
);
-- Check if a point is within a service area
-- ST_Covers (unlike ST_Contains) works directly on geography;
-- ST_SetSRID tags the raw point as WGS84 before the cast
SELECT sa.name
FROM service_areas sa
WHERE ST_Covers(
  sa.polygon,
  ST_SetSRID(ST_MakePoint(-73.935242, 40.730610), 4326)::geography -- User's lng, lat
);
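Geofence lookups against many service areas also want a spatial index on the polygon column, same as the points table:

```sql
-- GiST index so point-in-polygon checks don't scan every area
CREATE INDEX ON service_areas USING GIST (polygon);
```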
pg_trgm: Fuzzy Search and LIKE Optimization
pg_trgm enables sub-100ms fuzzy text matching across millions of rows, without Elasticsearch.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- GIN index on trigrams (for LIKE and similarity queries)
CREATE INDEX ON products USING GIN (name gin_trgm_ops);
CREATE INDEX ON products USING GIN (description gin_trgm_ops);
-- Fuzzy search: returns results even with typos
SELECT
id,
name,
price,
similarity(name, 'iphome 15') AS name_similarity -- typo: 'iphome'
FROM products
WHERE name % 'iphome 15' -- % = similarity > threshold (default 0.3)
ORDER BY name_similarity DESC
LIMIT 20;
-- LIKE optimization (without pg_trgm, this would be a full table scan)
-- With GIN index, this is fast
SELECT * FROM products
WHERE name ILIKE '%iphone%';
-- Combined search: trigram + full-text
SELECT id, name, description,
similarity(name, $1) * 2 +
ts_rank(to_tsvector('english', description), plainto_tsquery('english', $1)) AS relevance
FROM products
WHERE name % $1
OR to_tsvector('english', description) @@ plainto_tsquery('english', $1)
ORDER BY relevance DESC
LIMIT 20;
-- Tune similarity threshold per query (default 0.3 is often too strict or too loose)
SET pg_trgm.similarity_threshold = 0.2; -- Session-level; more permissive
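The `%` operator and `similarity()` are easier to reason about once you see what pg_trgm counts. This TypeScript sketch approximates its trigram model (lowercase, split on non-alphanumerics, pad each word with two leading spaces and one trailing space, then take Jaccard similarity of the trigram sets); the real C implementation handles Unicode and edge cases this ignores:

```typescript
// Approximate pg_trgm trigram extraction: per word, pad with two
// leading spaces and one trailing space, then emit every 3-gram.
export function trigrams(text: string): Set<string> {
  const out = new Set<string>();
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const w of words) {
    const padded = `  ${w} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      out.add(padded.slice(i, i + 3));
    }
  }
  return out;
}

// similarity(a, b) = |A intersect B| / |A union B| over trigram sets
export function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

`trigramSimilarity('cat', 'cap')` works out to 2/6 (two shared trigrams out of six distinct ones), which is why the default `%` threshold of 0.3 barely admits a one-letter typo in a three-letter word.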
Senior Engineers. No Junior Handoffs. Ever.
You get the senior developer, not a project manager who relays your requirements to someone you never meet. Every Viprasol project has a senior lead from kickoff to launch.
- MVPs in 4-8 weeks, full platforms in 3-5 months
- Lighthouse 90+ performance scores standard
- Works across US, UK, AU timezones
- Free 30-min architecture review, no commitment
TimescaleDB: Time-Series Data
TimescaleDB extends PostgreSQL with hypertables (transparent time-based partitioning), continuous aggregates (pre-computed rollups), and compression: essential for metrics, IoT, and financial tick data.
CREATE EXTENSION IF NOT EXISTS timescaledb;
-- Create a regular table, then convert to hypertable
CREATE TABLE metrics (
time TIMESTAMPTZ NOT NULL,
device_id UUID NOT NULL,
metric_name TEXT NOT NULL,
value DOUBLE PRECISION NOT NULL,
tags JSONB NOT NULL DEFAULT '{}'
);
-- Convert to hypertable (partitioned by time, 1 week per chunk)
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 week');
-- Add compression policy (compress chunks older than 30 days)
ALTER TABLE metrics SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'device_id, metric_name'
);
SELECT add_compression_policy('metrics', INTERVAL '30 days');
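Once chunks start compressing, it is worth checking what the policy actually saves; in TimescaleDB 2.x, `hypertable_compression_stats()` reports before/after sizes:

```sql
-- Chunk counts and byte sizes before/after compression
SELECT * FROM hypertable_compression_stats('metrics');
```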
-- Continuous aggregate: pre-compute hourly rollups
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS bucket,
device_id,
metric_name,
avg(value) AS avg_value,
min(value) AS min_value,
max(value) AS max_value,
count(*) AS sample_count
FROM metrics
GROUP BY 1, 2, 3
WITH NO DATA;
-- Automatically refresh the continuous aggregate
SELECT add_continuous_aggregate_policy(
'metrics_hourly',
start_offset => INTERVAL '3 hours',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour'
);
-- Retention policy: drop data older than 1 year
SELECT add_retention_policy('metrics', INTERVAL '1 year');
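Because the view above is created `WITH NO DATA`, the refresh policy only covers data going forward; a one-time manual backfill materializes history (the window bounds here are illustrative):

```sql
-- Backfill the continuous aggregate over all existing raw data,
-- stopping short of the still-changing most recent hour
CALL refresh_continuous_aggregate('metrics_hourly', NULL, now() - INTERVAL '1 hour');
```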
// src/api/metrics.ts: Query time-series data
import { db } from '@/lib/db';

interface MetricBucket {
  time: Date;
  avg_value: number;
  min_value: number;
  max_value: number;
  sample_count: number;
}

export async function getMetricHistory(
  deviceId: string,
  metricName: string,
  from: Date,
  to: Date,
  bucketInterval: '1m' | '1h' | '1d' = '1h',
): Promise<MetricBucket[]> {
  // '1m' aggregates the raw hypertable on the fly; '1h' reads the
  // pre-computed view; '1d' re-buckets the hourly rollups
  // (note: a daily avg of hourly averages is unweighted, i.e. approximate)
  const sql =
    bucketInterval === '1m'
      ? `SELECT time_bucket('1 minute', time) AS time,
                avg(value) AS avg_value, min(value) AS min_value,
                max(value) AS max_value, count(*) AS sample_count
         FROM metrics
         WHERE device_id = $1 AND metric_name = $2 AND time BETWEEN $3 AND $4
         GROUP BY 1 ORDER BY 1`
      : bucketInterval === '1h'
        ? `SELECT bucket AS time, avg_value, min_value, max_value, sample_count
           FROM metrics_hourly
           WHERE device_id = $1 AND metric_name = $2 AND bucket BETWEEN $3 AND $4
           ORDER BY 1`
        : `SELECT time_bucket('1 day', bucket) AS time,
                  avg(avg_value) AS avg_value, min(min_value) AS min_value,
                  max(max_value) AS max_value, sum(sample_count) AS sample_count
           FROM metrics_hourly
           WHERE device_id = $1 AND metric_name = $2 AND bucket BETWEEN $3 AND $4
           GROUP BY 1 ORDER BY 1`;
  const { rows } = await db.query<MetricBucket>(sql, [deviceId, metricName, from, to]);
  return rows;
}
pg_partman: Automated Table Partitioning
pg_partman creates and maintains time-based or serial partitions automatically: no manual DDL, no maintenance windows.
CREATE EXTENSION IF NOT EXISTS pg_partman;
CREATE EXTENSION IF NOT EXISTS pg_cron; -- Required for automated maintenance
-- Create parent table (partitioned by range)
CREATE TABLE events (
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);
-- Create index template (applied to each partition automatically)
CREATE INDEX ON events (user_id, created_at);
CREATE INDEX ON events (event_type, created_at);
-- Configure pg_partman: monthly partitions, keep last 12 months
SELECT partman.create_parent(
    p_parent_table => 'public.events',
    p_control => 'created_at',
    p_type => 'native',      -- pg_partman 4.x; removed in 5.x, where native is the only mode
    p_interval => 'monthly', -- in 5.x, use the interval form '1 month'
    p_premake => 3           -- Create 3 future partitions in advance
);
-- Retention: automatically drop partitions older than 13 months
UPDATE partman.part_config
SET retention = '13 months',
retention_keep_table = FALSE, -- Actually drop old partitions
retention_keep_index = FALSE
WHERE parent_table = 'public.events';
-- Schedule pg_partman maintenance (creates future partitions, drops old ones)
SELECT cron.schedule(
'partman-maintenance',
'0 * * * *', -- Every hour
'SELECT partman.run_maintenance_proc()'
);
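After the first maintenance run, verify the premade partitions actually exist. `partman.show_partitions()` lists them in pg_partman 4.x+ (names here match the setup above), and the catalog gives a pg_partman-independent view:

```sql
-- List the child partitions pg_partman manages for this parent
SELECT partman.show_partitions('public.events');

-- Or inspect directly via the catalog
SELECT inhrelid::regclass AS partition
FROM pg_inherits
WHERE inhparent = 'public.events'::regclass;
```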
Extension Cost and Compatibility Matrix
| Extension | AWS RDS | AWS Aurora | Supabase | Neon | Self-hosted |
|---|---|---|---|---|---|
| pgvector | Yes | Yes | Yes | Yes | Yes |
| PostGIS | Yes | Yes | Yes | Yes | Yes |
| pg_trgm | Yes | Yes | Yes | Yes | Yes |
| TimescaleDB | No | No | Yes (limited) | Yes | Yes |
| pg_partman | Yes | Yes | Limited (no cron) | No | Yes |
Working With Viprasol
We implement PostgreSQL extension stacks for teams replacing fragmented specialized databases with a unified PostgreSQL-based architecture.
What we deliver:
- pgvector setup for RAG and semantic search pipelines
- PostGIS geospatial schema design and spatial index tuning
- pg_trgm fuzzy search with relevance scoring
- TimescaleDB schema design with continuous aggregates and retention policies
- pg_partman configuration with automated maintenance and retention
→ Discuss your database architecture
→ Backend development services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.
Need a Modern Web Application?
From landing pages to complex SaaS platforms: we build it all with Next.js and React.
Free consultation • No commitment • Response within 24 hours
Need a custom web application built?
We build React and Next.js web applications with Lighthouse 90+ scores, mobile-first design, and full source code ownership. Senior engineers only, from architecture through deployment.