PostgreSQL Extensions in 2026: pgvector, PostGIS, pg_trgm, TimescaleDB, and pg_partman
Master the PostgreSQL extension ecosystem: pgvector for AI embeddings, PostGIS for geospatial, pg_trgm for fuzzy search, TimescaleDB for time-series, and pg_partman for automated partitioning.
PostgreSQL's extension system turns a general-purpose relational database into a specialized engine for vector search, geospatial queries, fuzzy text matching, time-series analytics, and automated partition management, all without leaving your existing stack. For most teams, the right extension is cheaper, faster to implement, and operationally simpler than adding a specialized database.
This post covers the five extensions we reach for most often, with production-ready schemas and queries for each.
Extension Overview: When to Use What
| Extension | Use Case | Alternative | When to Choose Alternative |
|---|---|---|---|
| pgvector | AI embeddings, semantic search | Pinecone, Weaviate | >5M vectors, sub-10ms at scale |
| PostGIS | Geospatial queries, mapping | Google Maps API | Rarely; PostGIS handles nearly everything |
| pg_trgm | Fuzzy text search, LIKE optimization | Elasticsearch | Full-text search at >10M docs |
| TimescaleDB | Time-series data, continuous aggregates | InfluxDB, ClickHouse | >100M rows/day ingestion |
| pg_partman | Automated table partitioning | Manual partition DDL | Never; pg_partman is always better |
pgvector: AI Embeddings and Semantic Search
pgvector stores and queries high-dimensional embeddings from OpenAI, Cohere, or locally-run models. It enables semantic search, RAG (retrieval-augmented generation), and recommendation systems inside PostgreSQL.
-- Install and create vector column
CREATE EXTENSION IF NOT EXISTS vector;
-- Document embeddings table
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}',
embedding vector(1536), -- OpenAI text-embedding-3-small dimension
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- IVFFlat index for approximate nearest neighbor (ANN) search
-- Create AFTER you have data: index quality depends on row count
-- lists = sqrt(total_rows) is a good starting point
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- HNSW index (better recall, higher build cost; prefer for <1M vectors)
-- Note: choose ONE index type per column; both are shown here for comparison
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
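The `lists = sqrt(total_rows)` heuristic above can be wrapped in a small helper. This is a hypothetical utility, not part of pgvector; treat its output as a starting point to benchmark against your own recall and latency targets:

```typescript
// Hypothetical sizing helper for IVFFlat tuning.
// Common starting points: lists ~ sqrt(rows), probes ~ sqrt(lists).
export function ivfflatParams(rowCount: number): { lists: number; probes: number } {
  const lists = Math.max(1, Math.round(Math.sqrt(rowCount)));
  const probes = Math.max(1, Math.round(Math.sqrt(lists)));
  return { lists, probes };
}
```

For 1M rows this suggests `lists = 1000` and `probes = 32`; raise probes for better recall at the cost of latency.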
// src/lib/embeddings.ts - Generate and store embeddings
import { OpenAI } from 'openai';
import { db } from '@/lib/db';
const openai = new OpenAI();
export async function embedAndStore(
content: string,
metadata: Record<string, unknown> = {},
): Promise<string> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: content,
});
const embedding = response.data[0].embedding;
const { rows } = await db.query<{ id: string }>(
`INSERT INTO documents (content, metadata, embedding)
VALUES ($1, $2, $3)
RETURNING id`,
[content, JSON.stringify(metadata), JSON.stringify(embedding)],
);
return rows[0].id;
}
export async function semanticSearch(
query: string,
topK: number = 10,
filter?: { category?: string },
): Promise<Array<{ id: string; content: string; similarity: number; metadata: Record<string, unknown> }>> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: query,
});
const queryEmbedding = response.data[0].embedding;
  // Raise ivfflat probe count for better recall (trades speed for accuracy).
  // Note: SET is session-scoped; behind a connection pool, prefer SET LOCAL
  // inside a transaction so the setting doesn't leak to other requests.
  await db.query('SET ivfflat.probes = 10');
const { rows } = await db.query<{
id: string;
content: string;
similarity: number;
metadata: Record<string, unknown>;
}>(
`SELECT
id,
content,
metadata,
1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE ($2::jsonb IS NULL OR metadata @> $2::jsonb)
ORDER BY embedding <=> $1::vector
LIMIT $3`,
[
JSON.stringify(queryEmbedding),
filter ? JSON.stringify(filter) : null,
topK,
],
);
return rows;
}
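For reference, `<=>` is pgvector's cosine-distance operator, so `1 - (embedding <=> $1)` recovers cosine similarity. A plain TypeScript equivalent, handy for unit-testing ranking logic without a database, might look like:

```typescript
// Cosine distance between two equal-length vectors, mirroring pgvector's
// <=> operator: distance = 1 - dot(a, b) / (|a| * |b|).
export function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```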
Hybrid Search: Vector + Full-Text
-- Combine semantic similarity with keyword relevance (RRF fusion)
WITH vector_results AS (
SELECT id, row_number() OVER (ORDER BY embedding <=> $1::vector) AS rank
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 60
),
fts_results AS (
  SELECT id, row_number() OVER (ORDER BY ts_rank(to_tsvector('english', content), query) DESC) AS rank
  FROM documents, plainto_tsquery('english', $2) query -- plainto_tsquery tolerates raw user input; to_tsquery errors on unquoted syntax
  WHERE to_tsvector('english', content) @@ query
  ORDER BY rank
  LIMIT 60
),
),
-- Reciprocal Rank Fusion
rrf AS (
SELECT
COALESCE(v.id, f.id) AS id,
COALESCE(1.0 / (60 + v.rank), 0) + COALESCE(1.0 / (60 + f.rank), 0) AS score
FROM vector_results v
FULL OUTER JOIN fts_results f USING (id)
)
SELECT d.id, d.content, d.metadata, r.score
FROM rrf r
JOIN documents d ON d.id = r.id
ORDER BY r.score DESC
LIMIT 10;
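The same fusion step can live in application code when the two result lists come from different services. A sketch of generic RRF over ranked ID lists, using the same `1 / (k + rank)` formula with the conventional `k = 60`:

```typescript
// Reciprocal Rank Fusion over any number of ranked result lists.
// score(id) = sum over lists of 1 / (k + rank), rank being 1-based.
export function rrfFuse(
  rankedLists: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((x, y) => y.score - x.score);
}
```

Items appearing in both lists accumulate score from each, which is what pushes agreed-upon results to the top.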
Looking for a Dev Team That Actually Delivers?
Most agencies sell you a project manager and assign juniors. Viprasol is different: senior engineers only, direct Slack access, and a 5.0★ Upwork record across 1000+ projects.
- React, Next.js, Node.js, TypeScript: production-grade stack
- Fixed-price contracts with no surprise invoices
- Full source code ownership from day one
- 90-day post-launch support included
PostGIS: Geospatial Queries
PostGIS adds spatial types (GEOMETRY, GEOGRAPHY) and thousands of spatial functions to PostgreSQL: radius searches, polygon containment, distance calculations, and routing.
CREATE EXTENSION IF NOT EXISTS postgis;
-- Locations table with geography column (sphere-based, uses meters)
CREATE TABLE locations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
category TEXT NOT NULL,
coordinates GEOGRAPHY(POINT, 4326), -- WGS84 (lat/lon)
address TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Spatial index (mandatory for performance)
CREATE INDEX ON locations USING GIST (coordinates);
-- Insert a point (longitude first, then latitude โ PostGIS convention)
INSERT INTO locations (name, category, coordinates)
VALUES ('Viprasol HQ', 'office', ST_GeogFromText('SRID=4326;POINT(76.3 29.6)'));
-- Find locations within radius (meters)
SELECT
id,
name,
category,
ST_Distance(coordinates, ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)')) AS distance_meters
FROM locations
WHERE ST_DWithin(
coordinates,
ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)'),
5000 -- 5km radius
)
ORDER BY distance_meters
LIMIT 20;
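To sanity-check `ST_Distance` results or show a distance client-side before the query returns, a haversine sketch works. Note it is a spherical approximation, while the geography type defaults to more accurate spheroid math, so expect small differences (typically well under 0.5%):

```typescript
// Haversine great-circle distance in meters on a spherical Earth.
// PostGIS geography defaults to spheroid math, so results differ slightly.
const EARTH_RADIUS_M = 6_371_000;

export function haversineMeters(
  lat1: number,
  lng1: number,
  lat2: number,
  lng2: number,
): number {
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}
```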
// src/api/nearby.ts - Nearest-location API
export async function findNearby(
lat: number,
lng: number,
radiusMeters: number,
category?: string,
): Promise<Location[]> {
const { rows } = await db.query<Location & { distance_meters: number }>(
`SELECT
id, name, category, address,
ST_Y(coordinates::geometry) AS latitude,
ST_X(coordinates::geometry) AS longitude,
ST_Distance(coordinates, ST_MakePoint($2, $1)::geography) AS distance_meters
FROM locations
WHERE ST_DWithin(coordinates, ST_MakePoint($2, $1)::geography, $3)
AND ($4::text IS NULL OR category = $4)
ORDER BY distance_meters
LIMIT 50`,
[lat, lng, radiusMeters, category ?? null],
);
return rows;
}
Polygon Geofencing
-- Service area polygon (e.g., delivery zone)
CREATE TABLE service_areas (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
polygon GEOGRAPHY(POLYGON, 4326)
);
-- Check if a point is within a service area
SELECT sa.name
FROM service_areas sa
WHERE ST_Contains(
  sa.polygon::geometry,
  -- ST_MakePoint alone has SRID 0; tag it 4326 to match the polygon,
  -- otherwise PostGIS raises a mixed-SRID error
  ST_SetSRID(ST_MakePoint(-73.935242, 40.730610), 4326) -- User's lng, lat
);
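For cheap client-side prechecks (for example, graying out an address form before hitting the API), a planar ray-casting test can approximate the containment check. This is an illustrative sketch; the database query above stays the source of truth, since `ST_Contains` also handles boundary points and antimeridian-crossing polygons correctly:

```typescript
type LngLat = [number, number];

// Ray-casting point-in-polygon test on planar coordinates.
// Good enough as a precheck for small service areas; not a replacement
// for ST_Contains on the server.
export function pointInPolygon(point: LngLat, ring: LngLat[]): boolean {
  const [px, py] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray from the point cross this edge?
    const crosses =
      yi > py !== yj > py &&
      px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}
```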
pg_trgm: Fuzzy Search and LIKE Optimization
pg_trgm enables sub-100ms fuzzy text matching across millions of rows, without Elasticsearch.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- GIN index on trigrams (for LIKE and similarity queries)
CREATE INDEX ON products USING GIN (name gin_trgm_ops);
CREATE INDEX ON products USING GIN (description gin_trgm_ops);
-- Fuzzy search: returns results even with typos
SELECT
id,
name,
price,
similarity(name, 'iphome 15') AS name_similarity -- typo: 'iphome'
FROM products
WHERE name % 'iphome 15' -- % = similarity > threshold (default 0.3)
ORDER BY name_similarity DESC
LIMIT 20;
-- LIKE optimization (without pg_trgm, this would be a full table scan)
-- With GIN index, this is fast
SELECT * FROM products
WHERE name ILIKE '%iphone%';
-- Combined search: trigram + full-text
SELECT id, name, description,
similarity(name, $1) * 2 +
ts_rank(to_tsvector('english', description), plainto_tsquery('english', $1)) AS relevance
FROM products
WHERE name % $1
OR to_tsvector('english', description) @@ plainto_tsquery('english', $1)
ORDER BY relevance DESC
LIMIT 20;
-- Tune similarity threshold per query (default 0.3 is often too strict or too loose)
SET pg_trgm.similarity_threshold = 0.2; -- Session-level; more permissive
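To build intuition for what `%` and `similarity()` compute, here is an illustrative TypeScript approximation of pg_trgm's trigram model: lowercase the text, pad each word with two leading spaces and one trailing space, extract distinct 3-grams, then score by intersection over union. It is a sketch, not pg_trgm verbatim (pg_trgm also strips non-word characters, which this skips):

```typescript
// Approximate pg_trgm trigram extraction: each word padded "  word ".
function trigrams(text: string): Set<string> {
  const grams = new Set<string>();
  for (const word of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) grams.add(padded.slice(i, i + 3));
  }
  return grams;
}

// similarity = |shared trigrams| / |union of trigrams|, as in pg_trgm.
export function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  for (const g of ta) if (tb.has(g)) shared++;
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

With this model, the 'iphome' typo from the query above still scores well against 'iphone', which is why the `%` operator matches it at the default 0.3 threshold.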

Senior Engineers. No Junior Handoffs. Ever.
You get the senior developer, not a project manager who relays your requirements to someone you never meet. Every Viprasol project has a senior lead from kickoff to launch.
- MVPs in 4–8 weeks, full platforms in 3–5 months
- Lighthouse 90+ performance scores standard
- Works across US, UK, AU timezones
- Free 30-min architecture review, no commitment
TimescaleDB: Time-Series Data
TimescaleDB extends PostgreSQL with hypertables (transparent time-based partitioning), continuous aggregates (pre-computed rollups), and compression, which are essential for metrics, IoT, and financial tick data.
CREATE EXTENSION IF NOT EXISTS timescaledb;
-- Create a regular table, then convert to hypertable
CREATE TABLE metrics (
time TIMESTAMPTZ NOT NULL,
device_id UUID NOT NULL,
metric_name TEXT NOT NULL,
value DOUBLE PRECISION NOT NULL,
tags JSONB NOT NULL DEFAULT '{}'
);
-- Convert to hypertable (partitioned by time, 1 week per chunk)
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 week');
-- Add compression policy (compress chunks older than 30 days)
ALTER TABLE metrics SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'device_id, metric_name'
);
SELECT add_compression_policy('metrics', INTERVAL '30 days');
-- Continuous aggregate: pre-compute hourly rollups
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS bucket,
device_id,
metric_name,
avg(value) AS avg_value,
min(value) AS min_value,
max(value) AS max_value,
count(*) AS sample_count
FROM metrics
GROUP BY 1, 2, 3
WITH NO DATA;
-- Automatically refresh the continuous aggregate
SELECT add_continuous_aggregate_policy(
'metrics_hourly',
start_offset => INTERVAL '3 hours',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour'
);
-- Retention policy: drop data older than 1 year
SELECT add_retention_policy('metrics', INTERVAL '1 year');
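`time_bucket()` floors timestamps to interval boundaries. For sub-day intervals, the default boundaries coincide with flooring against the Unix epoch, so a client-side analogue is straightforward; this is useful for aligning chart axes with server-side buckets. Month and week buckets use different origins and are out of scope for this sketch:

```typescript
// Client-side analogue of time_bucket() for sub-day intervals only.
// Floors the timestamp to the nearest interval boundary below it.
export function timeBucket(intervalMs: number, ts: Date): Date {
  return new Date(Math.floor(ts.getTime() / intervalMs) * intervalMs);
}
```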
// src/api/metrics.ts - Query time-series data
export async function getMetricHistory(
deviceId: string,
metricName: string,
from: Date,
to: Date,
bucketInterval: '1m' | '1h' | '1d' = '1h',
): Promise<MetricBucket[]> {
  // The hourly continuous aggregate already stores rollups; for 1-minute
  // resolution we aggregate the raw hypertable on the fly (the raw table
  // has only `value`, so avg/min/max must be computed here).
  const sql = bucketInterval === '1m'
    ? `SELECT
         time_bucket('1 minute', time) AS time,
         avg(value) AS avg_value,
         min(value) AS min_value,
         max(value) AS max_value,
         count(*) AS sample_count
       FROM metrics
       WHERE device_id = $1
         AND metric_name = $2
         AND time BETWEEN $3 AND $4
       GROUP BY 1
       ORDER BY 1`
    : `SELECT
         bucket AS time,
         avg_value,
         min_value,
         max_value,
         sample_count
       FROM metrics_hourly
       WHERE device_id = $1
         AND metric_name = $2
         AND bucket BETWEEN $3 AND $4
       ORDER BY 1`;
  const { rows } = await db.query<MetricBucket>(sql, [deviceId, metricName, from, to]);
return rows;
}
pg_partman: Automated Table Partitioning
pg_partman creates and maintains time-based or serial partitions automatically, with no manual DDL and no maintenance windows.
CREATE EXTENSION IF NOT EXISTS pg_partman;
CREATE EXTENSION IF NOT EXISTS pg_cron; -- Required for automated maintenance
-- Create parent table (partitioned by range)
CREATE TABLE events (
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);
-- Create index template (applied to each partition automatically)
CREATE INDEX ON events (user_id, created_at);
CREATE INDEX ON events (event_type, created_at);
-- Configure pg_partman: monthly partitions, keep last 12 months
SELECT partman.create_parent(
p_parent_table => 'public.events',
p_control => 'created_at',
p_type => 'native', -- pg_partman 4.x parameter; removed in 5.x, where native partitioning is the only mode
p_interval => 'monthly',
p_premake => 3 -- Create 3 future partitions in advance
);
-- Retention: automatically drop partitions older than 13 months
UPDATE partman.part_config
SET retention = '13 months',
retention_keep_table = FALSE, -- Actually drop old partitions
retention_keep_index = FALSE
WHERE parent_table = 'public.events';
-- Schedule pg_partman maintenance (creates future partitions, drops old ones)
SELECT cron.schedule(
'partman-maintenance',
'0 * * * *', -- Every hour
'SELECT partman.run_maintenance_proc()'
);
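To reason about what `run_maintenance_proc()` keeps open with `p_premake => 3`, here is an illustrative helper that computes the monthly partition windows (current month plus `premake` months ahead). It is a window-math sketch only, not pg_partman behavior verbatim; child-table naming is version-specific and deliberately omitted:

```typescript
// Illustrative only: the monthly partition windows that should exist when
// pg_partman maintenance runs with `premake` future partitions configured.
// Check partman.part_config on your install for the authoritative state.
export function monthlyPartitionBounds(
  now: Date,
  premake: number,
): Array<{ from: string; to: string }> {
  const windows: Array<{ from: string; to: string }> = [];
  for (let offset = 0; offset <= premake; offset++) {
    const start = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + offset, 1));
    const end = new Date(Date.UTC(start.getUTCFullYear(), start.getUTCMonth() + 1, 1));
    windows.push({ from: start.toISOString(), to: end.toISOString() });
  }
  return windows;
}
```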
Extension Compatibility Matrix
| Extension | AWS RDS | AWS Aurora | Supabase | Neon | Self-hosted |
|---|---|---|---|---|---|
| pgvector | ✅ | ✅ | ✅ | ✅ | ✅ |
| PostGIS | ✅ | ✅ | ✅ | ✅ | ✅ |
| pg_trgm | ✅ | ✅ | ✅ | ✅ | ✅ |
| TimescaleDB | ❌ | ❌ | ✅ (limited) | ✅ | ✅ |
| pg_partman | ✅ | ✅ | ⚠️ (no cron) | ✅ | ✅ |
Why Clients Trust Viprasol
We implement PostgreSQL extension stacks for teams replacing fragmented specialized databases with a unified PostgreSQL-based architecture.
What we deliver:
- pgvector setup for RAG and semantic search pipelines
- PostGIS geospatial schema design and spatial index tuning
- pg_trgm fuzzy search with relevance scoring
- TimescaleDB schema design with continuous aggregates and retention policies
- pg_partman configuration with automated maintenance and retention
→ Discuss your database architecture | Backend development services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 1000+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement.
Need a Modern Web Application?
From landing pages to complex SaaS platforms โ we build it all with Next.js and React.
Free consultation โข No commitment โข Response within 24 hours