PostgreSQL Extensions in 2026: pgvector, PostGIS, pg_trgm, TimescaleDB, and pg_partman
Master the PostgreSQL extension ecosystem: pgvector for AI embeddings, PostGIS for geospatial, pg_trgm for fuzzy search, TimescaleDB for time-series, and pg_partman for automated partitioning.
PostgreSQL's extension system turns a general-purpose relational database into a specialized engine for vector search, geospatial queries, fuzzy text matching, time-series analytics, and automated partition management, all without leaving your existing stack. For most teams, the right extension is cheaper, faster to implement, and operationally simpler than adding a specialized database.
This post covers the five extensions we reach for most often, with production-ready schemas and queries for each.
Extension Overview: When to Use What
| Extension | Use Case | Alternative | When to Choose Alternative |
|---|---|---|---|
| pgvector | AI embeddings, semantic search | Pinecone, Weaviate | >5M vectors, sub-10ms at scale |
| PostGIS | Geospatial queries, mapping | Google Maps API | Rarely; PostGIS handles nearly everything |
| pg_trgm | Fuzzy text search, LIKE optimization | Elasticsearch | Full-text search at >10M docs |
| TimescaleDB | Time-series data, continuous aggregates | InfluxDB, ClickHouse | >100M rows/day ingestion |
| pg_partman | Automated table partitioning | Manual partition DDL | Rarely; manual DDL only makes sense for a handful of static partitions |
pgvector: AI Embeddings and Semantic Search
pgvector stores and queries high-dimensional embeddings from OpenAI, Cohere, or locally-run models. It enables semantic search, RAG (retrieval-augmented generation), and recommendation systems inside PostgreSQL.
-- Install and create vector column
CREATE EXTENSION IF NOT EXISTS vector;
-- Document embeddings table
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}',
embedding vector(1536), -- OpenAI text-embedding-3-small dimension
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- IVFFlat index for approximate nearest neighbor (ANN) search
-- Create AFTER you have data; index quality depends on row count
-- lists = sqrt(total_rows) is a good starting point
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- HNSW index: better recall than IVFFlat at higher build cost and memory;
-- in practice, pick one index type per vector column
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
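Query-time recall is tunable for both index types. The values below are starting points to benchmark, not recommendations:

```sql
-- IVFFlat: probe more lists for better recall (default 1)
SET ivfflat.probes = 10;

-- HNSW: widen the candidate list for better recall (default 40)
SET hnsw.ef_search = 100;

-- Confirm the index is actually used; a sequential scan here means
-- the planner ignored it (the vector literal is a placeholder --
-- a real query needs a full 1536-dimension literal)
EXPLAIN ANALYZE
SELECT id FROM documents
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'::vector
LIMIT 10;
```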
// src/lib/embeddings.ts: Generate and store embeddings
import { OpenAI } from 'openai';
import { db } from '@/lib/db';
const openai = new OpenAI();
export async function embedAndStore(
content: string,
metadata: Record<string, unknown> = {},
): Promise<string> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: content,
});
const embedding = response.data[0].embedding;
const { rows } = await db.query<{ id: string }>(
`INSERT INTO documents (content, metadata, embedding)
VALUES ($1, $2, $3)
RETURNING id`,
[content, JSON.stringify(metadata), JSON.stringify(embedding)],
);
return rows[0].id;
}
export async function semanticSearch(
query: string,
topK: number = 10,
filter?: { category?: string },
): Promise<Array<{ id: string; content: string; similarity: number; metadata: Record<string, unknown> }>> {
const response = await openai.embeddings.create({
model: 'text-embedding-3-small',
input: query,
});
const queryEmbedding = response.data[0].embedding;
  // ivfflat.probes trades speed for recall; it is session-scoped, so with a
  // connection pool prefer SET LOCAL inside a transaction to avoid leaking it
  await db.query('SET ivfflat.probes = 10');
const { rows } = await db.query<{
id: string;
content: string;
similarity: number;
metadata: Record<string, unknown>;
}>(
`SELECT
id,
content,
metadata,
1 - (embedding <=> $1::vector) AS similarity
FROM documents
WHERE ($2::jsonb IS NULL OR metadata @> $2::jsonb)
ORDER BY embedding <=> $1::vector
LIMIT $3`,
[
JSON.stringify(queryEmbedding),
filter ? JSON.stringify(filter) : null,
topK,
],
);
return rows;
}
Hybrid Search: Vector + Full-Text
-- Combine semantic similarity with keyword relevance (RRF fusion)
WITH vector_results AS (
SELECT id, row_number() OVER (ORDER BY embedding <=> $1::vector) AS rank
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 60
),
fts_results AS (
  SELECT id, row_number() OVER (ORDER BY ts_rank(to_tsvector('english', content), query) DESC) AS rank
  FROM documents, websearch_to_tsquery('english', $2) query
  -- websearch_to_tsquery tolerates raw user input; to_tsquery errors on bare spaces
  WHERE to_tsvector('english', content) @@ query
  ORDER BY rank
  LIMIT 60
),
-- Reciprocal Rank Fusion
rrf AS (
SELECT
COALESCE(v.id, f.id) AS id,
COALESCE(1.0 / (60 + v.rank), 0) + COALESCE(1.0 / (60 + f.rank), 0) AS score
FROM vector_results v
FULL OUTER JOIN fts_results f USING (id)
)
SELECT d.id, d.content, d.metadata, r.score
FROM rrf r
JOIN documents d ON d.id = r.id
ORDER BY r.score DESC
LIMIT 10;
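One operational note: the full-text half of this query recomputes `to_tsvector` per row unless an expression index backs it. A minimal addition:

```sql
-- Expression index so @@ matches use an index scan instead of
-- re-parsing every document's text on each query
CREATE INDEX ON documents USING GIN (to_tsvector('english', content));
```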
Looking for a Dev Team That Actually Delivers?
Most agencies sell you a project manager and assign juniors. Viprasol is different: senior engineers only, direct Slack access, and a 5.0-star Upwork record across 100+ projects.
- React, Next.js, Node.js, TypeScript: production-grade stack
- Fixed-price contracts, no surprise invoices
- Full source code ownership from day one
- 90-day post-launch support included
PostGIS: Geospatial Queries
PostGIS adds spatial types (GEOMETRY, GEOGRAPHY) and hundreds of spatial functions to PostgreSQL: radius searches, polygon containment, distance calculations, and routing.
CREATE EXTENSION IF NOT EXISTS postgis;
-- Locations table with geography column (sphere-based, uses meters)
CREATE TABLE locations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
category TEXT NOT NULL,
coordinates GEOGRAPHY(POINT, 4326), -- WGS84 (lat/lon)
address TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Spatial index (mandatory for performance)
CREATE INDEX ON locations USING GIST (coordinates);
-- Insert a point (longitude first, then latitude: PostGIS convention)
INSERT INTO locations (name, category, coordinates)
VALUES ('Viprasol HQ', 'office', ST_GeogFromText('SRID=4326;POINT(76.3 29.6)'));
-- Find locations within radius (meters)
SELECT
id,
name,
category,
ST_Distance(coordinates, ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)')) AS distance_meters
FROM locations
WHERE ST_DWithin(
coordinates,
ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)'),
5000 -- 5km radius
)
ORDER BY distance_meters
LIMIT 20;
// src/api/nearby.ts: Nearest-location API
import { db } from '@/lib/db';

interface Location {
  id: string;
  name: string;
  category: string;
  address: string | null;
  latitude: number;
  longitude: number;
}

export async function findNearby(
  lat: number,
  lng: number,
  radiusMeters: number,
  category?: string,
): Promise<Location[]> {
const { rows } = await db.query<Location & { distance_meters: number }>(
`SELECT
id, name, category, address,
ST_Y(coordinates::geometry) AS latitude,
ST_X(coordinates::geometry) AS longitude,
ST_Distance(coordinates, ST_MakePoint($2, $1)::geography) AS distance_meters
FROM locations
WHERE ST_DWithin(coordinates, ST_MakePoint($2, $1)::geography, $3)
AND ($4::text IS NULL OR category = $4)
ORDER BY distance_meters
LIMIT 50`,
[lat, lng, radiusMeters, category ?? null],
);
return rows;
}
Polygon Geofencing
-- Service area polygon (e.g., delivery zone)
CREATE TABLE service_areas (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
polygon GEOGRAPHY(POLYGON, 4326)
);
-- Check if a point is within a service area
-- ST_Covers (unlike ST_Contains) works directly on geography;
-- ST_SetSRID tags the raw point as WGS84 before the cast
SELECT sa.name
FROM service_areas sa
WHERE ST_Covers(
  sa.polygon,
  ST_SetSRID(ST_MakePoint(-73.935242, 40.730610), 4326)::geography -- User's lng, lat
);
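Geofence lookups against many service areas also want a spatial index on the polygon column, same as the points table:

```sql
-- GiST index so point-in-polygon checks don't scan every area
CREATE INDEX ON service_areas USING GIST (polygon);
```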
pg_trgm: Fuzzy Search and LIKE Optimization
pg_trgm enables sub-100ms fuzzy text matching across millions of rows, without Elasticsearch.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- GIN index on trigrams (for LIKE and similarity queries)
CREATE INDEX ON products USING GIN (name gin_trgm_ops);
CREATE INDEX ON products USING GIN (description gin_trgm_ops);
-- Fuzzy search: returns results even with typos
SELECT
id,
name,
price,
similarity(name, 'iphome 15') AS name_similarity -- typo: 'iphome'
FROM products
WHERE name % 'iphome 15' -- % = similarity > threshold (default 0.3)
ORDER BY name_similarity DESC
LIMIT 20;
-- LIKE optimization (without pg_trgm, this would be a full table scan)
-- With GIN index, this is fast
SELECT * FROM products
WHERE name ILIKE '%iphone%';
-- Combined search: trigram + full-text
SELECT id, name, description,
similarity(name, $1) * 2 +
ts_rank(to_tsvector('english', description), plainto_tsquery('english', $1)) AS relevance
FROM products
WHERE name % $1
OR to_tsvector('english', description) @@ plainto_tsquery('english', $1)
ORDER BY relevance DESC
LIMIT 20;
-- Tune similarity threshold per query (default 0.3 is often too strict or too loose)
SET pg_trgm.similarity_threshold = 0.2; -- Session-level; more permissive
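The `%` operator and `similarity()` are easier to reason about once you see what pg_trgm counts. This TypeScript sketch approximates its trigram model (lowercase, split on non-alphanumerics, pad each word with two leading spaces and one trailing space, then take Jaccard similarity of the trigram sets); the real C implementation handles Unicode and edge cases this ignores:

```typescript
// Approximate pg_trgm trigram extraction: per word, pad with two
// leading spaces and one trailing space, then emit every 3-gram.
export function trigrams(text: string): Set<string> {
  const out = new Set<string>();
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  for (const w of words) {
    const padded = `  ${w} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      out.add(padded.slice(i, i + 3));
    }
  }
  return out;
}

// similarity(a, b) = |A intersect B| / |A union B| over trigram sets
export function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

`trigramSimilarity('cat', 'cap')` works out to 2/6 (two shared trigrams out of six distinct ones), which is why the default `%` threshold of 0.3 barely admits a one-letter typo in a three-letter word.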
Senior Engineers. No Junior Handoffs. Ever.
You get the senior developer, not a project manager who relays your requirements to someone you never meet. Every Viprasol project has a senior lead from kickoff to launch.
- MVPs in 4-8 weeks, full platforms in 3-5 months
- Lighthouse 90+ performance scores standard
- Works across US, UK, AU timezones
- Free 30-min architecture review, no commitment
TimescaleDB: Time-Series Data
TimescaleDB extends PostgreSQL with hypertables (transparent time-based partitioning), continuous aggregates (pre-computed rollups), and compression: essential for metrics, IoT, and financial tick data.
CREATE EXTENSION IF NOT EXISTS timescaledb;
-- Create a regular table, then convert to hypertable
CREATE TABLE metrics (
time TIMESTAMPTZ NOT NULL,
device_id UUID NOT NULL,
metric_name TEXT NOT NULL,
value DOUBLE PRECISION NOT NULL,
tags JSONB NOT NULL DEFAULT '{}'
);
-- Convert to hypertable (partitioned by time, 1 week per chunk)
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 week');
-- Add compression policy (compress chunks older than 30 days)
ALTER TABLE metrics SET (
timescaledb.compress,
timescaledb.compress_segmentby = 'device_id, metric_name'
);
SELECT add_compression_policy('metrics', INTERVAL '30 days');
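Once chunks start compressing, it is worth checking what the policy actually saves; in TimescaleDB 2.x, `hypertable_compression_stats()` reports before/after sizes:

```sql
-- Chunk counts and byte sizes before/after compression
SELECT * FROM hypertable_compression_stats('metrics');
```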
-- Continuous aggregate: pre-compute hourly rollups
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 hour', time) AS bucket,
device_id,
metric_name,
avg(value) AS avg_value,
min(value) AS min_value,
max(value) AS max_value,
count(*) AS sample_count
FROM metrics
GROUP BY 1, 2, 3
WITH NO DATA;
-- Automatically refresh the continuous aggregate
SELECT add_continuous_aggregate_policy(
'metrics_hourly',
start_offset => INTERVAL '3 hours',
end_offset => INTERVAL '1 hour',
schedule_interval => INTERVAL '1 hour'
);
-- Retention policy: drop data older than 1 year
SELECT add_retention_policy('metrics', INTERVAL '1 year');
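Because the view above is created `WITH NO DATA`, the refresh policy only covers data going forward; a one-time manual backfill materializes history (the window bounds here are illustrative):

```sql
-- Backfill the continuous aggregate over all existing raw data,
-- stopping short of the still-changing most recent hour
CALL refresh_continuous_aggregate('metrics_hourly', NULL, now() - INTERVAL '1 hour');
```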
// src/api/metrics.ts: Query time-series data
import { db } from '@/lib/db';

interface MetricBucket {
  time: Date;
  avg_value: number;
  min_value: number;
  max_value: number;
  sample_count: number;
}

export async function getMetricHistory(
  deviceId: string,
  metricName: string,
  from: Date,
  to: Date,
  bucketInterval: '1m' | '1h' | '1d' = '1h',
): Promise<MetricBucket[]> {
  // '1m' aggregates the raw hypertable on the fly; '1h' reads the
  // pre-computed view; '1d' re-buckets the hourly rollups
  // (note: a daily avg of hourly averages is unweighted, i.e. approximate)
  const sql =
    bucketInterval === '1m'
      ? `SELECT time_bucket('1 minute', time) AS time,
                avg(value) AS avg_value, min(value) AS min_value,
                max(value) AS max_value, count(*) AS sample_count
         FROM metrics
         WHERE device_id = $1 AND metric_name = $2 AND time BETWEEN $3 AND $4
         GROUP BY 1 ORDER BY 1`
      : bucketInterval === '1h'
        ? `SELECT bucket AS time, avg_value, min_value, max_value, sample_count
           FROM metrics_hourly
           WHERE device_id = $1 AND metric_name = $2 AND bucket BETWEEN $3 AND $4
           ORDER BY 1`
        : `SELECT time_bucket('1 day', bucket) AS time,
                  avg(avg_value) AS avg_value, min(min_value) AS min_value,
                  max(max_value) AS max_value, sum(sample_count) AS sample_count
           FROM metrics_hourly
           WHERE device_id = $1 AND metric_name = $2 AND bucket BETWEEN $3 AND $4
           GROUP BY 1 ORDER BY 1`;
  const { rows } = await db.query<MetricBucket>(sql, [deviceId, metricName, from, to]);
  return rows;
}
pg_partman: Automated Table Partitioning
pg_partman creates and maintains time-based or serial partitions automatically: no manual DDL, no maintenance windows.
CREATE EXTENSION IF NOT EXISTS pg_partman;
CREATE EXTENSION IF NOT EXISTS pg_cron; -- Required for automated maintenance
-- Create parent table (partitioned by range)
CREATE TABLE events (
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);
-- Create index template (applied to each partition automatically)
CREATE INDEX ON events (user_id, created_at);
CREATE INDEX ON events (event_type, created_at);
-- Configure pg_partman: monthly partitions, keep last 12 months
SELECT partman.create_parent(
    p_parent_table => 'public.events',
    p_control => 'created_at',
    p_type => 'native',      -- pg_partman 4.x; removed in 5.x, where native is the only mode
    p_interval => 'monthly', -- in 5.x, use the interval form '1 month'
    p_premake => 3           -- Create 3 future partitions in advance
);
-- Retention: automatically drop partitions older than 13 months
UPDATE partman.part_config
SET retention = '13 months',
retention_keep_table = FALSE, -- Actually drop old partitions
retention_keep_index = FALSE
WHERE parent_table = 'public.events';
-- Schedule pg_partman maintenance (creates future partitions, drops old ones)
SELECT cron.schedule(
'partman-maintenance',
'0 * * * *', -- Every hour
'SELECT partman.run_maintenance_proc()'
);
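After the first maintenance run, verify the premade partitions actually exist. `partman.show_partitions()` lists them in pg_partman 4.x+ (names here match the setup above), and the catalog gives a pg_partman-independent view:

```sql
-- List the child partitions pg_partman manages for this parent
SELECT partman.show_partitions('public.events');

-- Or inspect directly via the catalog
SELECT inhrelid::regclass AS partition
FROM pg_inherits
WHERE inhparent = 'public.events'::regclass;
```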
Extension Cost and Compatibility Matrix
| Extension | AWS RDS | AWS Aurora | Supabase | Neon | Self-hosted |
|---|---|---|---|---|---|
| pgvector | Yes | Yes | Yes | Yes | Yes |
| PostGIS | Yes | Yes | Yes | Yes | Yes |
| pg_trgm | Yes | Yes | Yes | Yes | Yes |
| TimescaleDB | No | No | Yes (limited) | Yes | Yes |
| pg_partman | Yes | Yes | Limited (no cron) | No | Yes |
Working With Viprasol
We implement PostgreSQL extension stacks for teams replacing fragmented specialized databases with a unified PostgreSQL-based architecture.
What we deliver:
- pgvector setup for RAG and semantic search pipelines
- PostGIS geospatial schema design and spatial index tuning
- pg_trgm fuzzy search with relevance scoring
- TimescaleDB schema design with continuous aggregates and retention policies
- pg_partman configuration with automated maintenance and retention
→ Discuss your database architecture
→ Backend development services
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.
Need a Modern Web Application?
From landing pages to complex SaaS platforms: we build it all with Next.js and React.
Free consultation • No commitment • Response within 24 hours
Need a custom web application built?
We build React and Next.js web applications with Lighthouse 90+ scores, mobile-first design, and full source code ownership. Senior engineers only, from architecture through deployment.