PostgreSQL Extensions in 2026: pgvector, PostGIS, pg_trgm, TimescaleDB, and pg_partman

Master the PostgreSQL extension ecosystem: pgvector for AI embeddings, PostGIS for geospatial, pg_trgm for fuzzy search, TimescaleDB for time-series, and pg_partman for automated partitioning.

Viprasol Tech Team
August 5, 2026
14 min read

PostgreSQL's extension system turns a general-purpose relational database into a specialized engine for vector search, geospatial queries, fuzzy text matching, time-series analytics, and automated partition management, all without leaving your existing stack. For most teams, the right extension is cheaper, faster to implement, and operationally simpler than adding a specialized database.

This post covers the five extensions we reach for most often, with production-ready schemas and queries for each.


Extension Overview: When to Use What

| Extension   | Use Case                                | Alternative          | When to Choose the Alternative            |
|-------------|-----------------------------------------|----------------------|-------------------------------------------|
| pgvector    | AI embeddings, semantic search          | Pinecone, Weaviate   | >5M vectors, sub-10ms latency at scale    |
| PostGIS     | Geospatial queries, mapping             | Google Maps API      | Rarely; PostGIS handles nearly everything |
| pg_trgm     | Fuzzy text search, LIKE optimization    | Elasticsearch        | Full-text search at >10M docs             |
| TimescaleDB | Time-series data, continuous aggregates | InfluxDB, ClickHouse | >100M rows/day ingestion                  |
| pg_partman  | Automated table partitioning            | Manual partition DDL | Never; pg_partman is always better        |

pgvector: AI Embeddings and Semantic Search

pgvector stores and queries high-dimensional embeddings from OpenAI, Cohere, or locally-run models. It enables semantic search, RAG (retrieval-augmented generation), and recommendation systems inside PostgreSQL.

-- Install and create vector column
CREATE EXTENSION IF NOT EXISTS vector;

-- Document embeddings table
CREATE TABLE documents (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content     TEXT NOT NULL,
    metadata    JSONB NOT NULL DEFAULT '{}',
    embedding   vector(1536),  -- OpenAI text-embedding-3-small dimension
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- IVFFlat index for approximate nearest neighbor (ANN) search
-- Create AFTER you have data; index quality depends on row count
-- lists = sqrt(total_rows) is a good starting point
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

-- HNSW index (better recall, higher build cost; often preferred under ~1M vectors)
-- In practice, create only one ANN index per column: IVFFlat or HNSW, not both
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
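The sizing comments above (lists near the square root of the row count, probing around 10% of lists) can be captured in a small helper. This is a sketch of common rules of thumb, not pgvector defaults, and the `ivfflatParams` name is ours:

```typescript
// Heuristic starting points for IVFFlat tuning. These are rules of
// thumb, not pgvector requirements; benchmark on your own data.
export function ivfflatParams(totalRows: number): { lists: number; probes: number } {
  // lists ~ sqrt(n) balances cluster size against cluster count
  const lists = Math.max(1, Math.round(Math.sqrt(totalRows)));
  // probing ~10% of lists is a common recall/latency trade-off
  const probes = Math.max(1, Math.round(lists / 10));
  return { lists, probes };
}
```

At 10,000 rows this suggests `lists = 100, probes = 10`, which matches the values used in this post.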

// src/lib/embeddings.ts - Generate and store embeddings
import { OpenAI } from 'openai';
import { db } from '@/lib/db';

const openai = new OpenAI();

export async function embedAndStore(
  content: string,
  metadata: Record<string, unknown> = {},
): Promise<string> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: content,
  });

  const embedding = response.data[0].embedding;

  const { rows } = await db.query<{ id: string }>(
    `INSERT INTO documents (content, metadata, embedding)
     VALUES ($1, $2, $3)
     RETURNING id`,
    [content, JSON.stringify(metadata), JSON.stringify(embedding)],
  );

  return rows[0].id;
}

export async function semanticSearch(
  query: string,
  topK: number = 10,
  filter?: { category?: string },
): Promise<Array<{ id: string; content: string; similarity: number; metadata: Record<string, unknown> }>> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: query,
  });

  const queryEmbedding = response.data[0].embedding;

  // Raise the ivfflat probe count for better recall (trades speed for accuracy).
  // With pooled connections, prefer SET LOCAL inside a transaction so the
  // setting does not leak to other requests sharing the session.
  await db.query('SET ivfflat.probes = 10');

  const { rows } = await db.query<{
    id: string;
    content: string;
    similarity: number;
    metadata: Record<string, unknown>;
  }>(
    `SELECT
       id,
       content,
       metadata,
       1 - (embedding <=> $1::vector) AS similarity
     FROM documents
     WHERE ($2::jsonb IS NULL OR metadata @> $2::jsonb)
     ORDER BY embedding <=> $1::vector
     LIMIT $3`,
    [
      JSON.stringify(queryEmbedding),
      filter ? JSON.stringify(filter) : null,
      topK,
    ],
  );

  return rows;
}
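A note on the math: `<=>` is pgvector's cosine distance operator, so `1 - (embedding <=> $1::vector)` in the query above is cosine similarity. A plain TypeScript sketch of what the operator computes (the `cosineDistance` helper is illustrative, not part of pgvector):

```typescript
// Cosine distance as computed by pgvector's <=> operator:
// 1 - (a . b) / (|a| * |b|). Similarity is then 1 - distance.
export function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical vectors have distance 0 (similarity 1); orthogonal vectors have distance 1 (similarity 0).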

Hybrid Search: Vector + Full-Text

-- Combine semantic similarity with keyword relevance (RRF fusion)
WITH vector_results AS (
  SELECT id, row_number() OVER (ORDER BY embedding <=> $1::vector) AS rank
  FROM documents
  ORDER BY embedding <=> $1::vector
  LIMIT 60
),
fts_results AS (
  -- websearch_to_tsquery() tolerates raw user input, where to_tsquery() errors on bad syntax
  SELECT id, row_number() OVER (ORDER BY ts_rank(to_tsvector('english', content), query) DESC) AS rank
  FROM documents, websearch_to_tsquery('english', $2) query
  WHERE to_tsvector('english', content) @@ query
  ORDER BY rank
  LIMIT 60
),
-- Reciprocal Rank Fusion
rrf AS (
  SELECT
    COALESCE(v.id, f.id) AS id,
    COALESCE(1.0 / (60 + v.rank), 0) + COALESCE(1.0 / (60 + f.rank), 0) AS score
  FROM vector_results v
  FULL OUTER JOIN fts_results f USING (id)
)
SELECT d.id, d.content, d.metadata, r.score
FROM rrf r
JOIN documents d ON d.id = r.id
ORDER BY r.score DESC
LIMIT 10;
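Reciprocal Rank Fusion needs no tuned weights: each document contributes 1/(k + rank) for every result list it appears in, with k = 60 by convention (as in the SQL above). An in-memory sketch of the same fusion (`rrfFuse` is our illustrative name):

```typescript
// Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (k + rank).
// k = 60 matches the SQL above; ranks are 1-based.
export function rrfFuse(
  rankedLists: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A document ranked first in both lists scores 2/61, beating any document that appears in only one list.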

๐ŸŒ Looking for a Dev Team That Actually Delivers?

Most agencies sell you a project manager and assign juniors. Viprasol is different โ€” senior engineers only, direct Slack access, and a 5.0โ˜… Upwork record across 100+ projects.

  • React, Next.js, Node.js, TypeScript โ€” production-grade stack
  • Fixed-price contracts โ€” no surprise invoices
  • Full source code ownership from day one
  • 90-day post-launch support included

PostGIS: Geospatial Queries

PostGIS adds spatial types (GEOMETRY, GEOGRAPHY) and thousands of spatial functions to PostgreSQL: radius searches, polygon containment, distance calculations, and routing.

CREATE EXTENSION IF NOT EXISTS postgis;

-- Locations table with geography column (sphere-based, uses meters)
CREATE TABLE locations (
    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name        TEXT NOT NULL,
    category    TEXT NOT NULL,
    coordinates GEOGRAPHY(POINT, 4326),  -- WGS84 (lat/lon)
    address     TEXT,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Spatial index (mandatory for performance)
CREATE INDEX ON locations USING GIST (coordinates);

-- Insert a point (longitude first, then latitude: the PostGIS convention)
INSERT INTO locations (name, category, coordinates)
VALUES ('Viprasol HQ', 'office', ST_GeogFromText('SRID=4326;POINT(76.3 29.6)'));

-- Find locations within radius (meters)
SELECT
    id,
    name,
    category,
    ST_Distance(coordinates, ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)')) AS distance_meters
FROM locations
WHERE ST_DWithin(
    coordinates,
    ST_GeogFromText('SRID=4326;POINT(-73.935242 40.730610)'),
    5000  -- 5km radius
)
ORDER BY distance_meters
LIMIT 20;

// src/api/nearby.ts - Nearby-locations API
import { db } from '@/lib/db';
import type { Location } from '@/types';  // hypothetical path; use your app's Location type

export async function findNearby(
  lat: number,
  lng: number,
  radiusMeters: number,
  category?: string,
): Promise<Location[]> {
  const { rows } = await db.query<Location & { distance_meters: number }>(
    `SELECT
       id, name, category, address,
       ST_Y(coordinates::geometry) AS latitude,
       ST_X(coordinates::geometry) AS longitude,
       ST_Distance(coordinates, ST_MakePoint($2, $1)::geography) AS distance_meters
     FROM locations
     WHERE ST_DWithin(coordinates, ST_MakePoint($2, $1)::geography, $3)
       AND ($4::text IS NULL OR category = $4)
     ORDER BY distance_meters
     LIMIT 50`,
    [lat, lng, radiusMeters, category ?? null],
  );

  return rows;
}
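ST_Distance on geography returns meters along the earth's surface; PostGIS defaults to an accurate spheroid model. For intuition, here is the spherical haversine approximation, which lands within a fraction of a percent of the spheroid result (`haversineMeters` is our illustrative helper, not a PostGIS function):

```typescript
// Haversine great-circle distance in meters on a sphere of the earth's
// mean radius. PostGIS geography uses a spheroid by default, so expect
// small differences (typically well under 0.5%).
const EARTH_RADIUS_M = 6371008.8;

export function haversineMeters(
  lat1: number, lng1: number,
  lat2: number, lng2: number,
): number {
  const rad = (d: number) => (d * Math.PI) / 180;
  const dLat = rad(lat2 - lat1);
  const dLng = rad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}
```

One degree of latitude comes out to roughly 111 km, a useful sanity check when debugging radius queries.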

Polygon Geofencing

-- Service area polygon (e.g., delivery zone)
CREATE TABLE service_areas (
    id       UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name     TEXT NOT NULL,
    polygon  GEOGRAPHY(POLYGON, 4326)
);

-- Check if a point is within a service area
SELECT sa.name
FROM service_areas sa
WHERE ST_Contains(
    sa.polygon::geometry,
    ST_SetSRID(ST_MakePoint(-73.935242, 40.730610), 4326)  -- User's lng, lat; SRID must match the polygon
);
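On the geometry cast, ST_Contains performs a planar point-in-polygon test. The classic ray-casting version of that test, for intuition (a sketch assuming a simple, non-self-intersecting ring; PostGIS handles holes, multipolygons, and edge cases this does not):

```typescript
// Ray casting: count crossings of a ray from the point toward +x.
// An odd crossing count means the point is inside. Points exactly on
// an edge are implementation-defined in this sketch.
export function pointInPolygon(
  lng: number,
  lat: number,
  ring: Array<[number, number]>,  // [lng, lat] vertices, unclosed
): boolean {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      yi > lat !== yj > lat &&
      lng < ((xj - xi) * (lat - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}
```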

pg_trgm: Fuzzy Search and LIKE Optimization

pg_trgm enables sub-100ms fuzzy text matching across millions of rows, without Elasticsearch.

CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- GIN index on trigrams (for LIKE and similarity queries)
CREATE INDEX ON products USING GIN (name gin_trgm_ops);
CREATE INDEX ON products USING GIN (description gin_trgm_ops);

-- Fuzzy search: returns results even with typos
SELECT
    id,
    name,
    price,
    similarity(name, 'iphome 15') AS name_similarity  -- typo: 'iphome'
FROM products
WHERE name % 'iphome 15'           -- % = similarity > threshold (default 0.3)
ORDER BY name_similarity DESC
LIMIT 20;

-- LIKE optimization (without pg_trgm, this would be a full table scan)
-- With GIN index, this is fast
SELECT * FROM products
WHERE name ILIKE '%iphone%';

-- Combined search: trigram + full-text
SELECT id, name, description,
    similarity(name, $1) * 2 +
    ts_rank(to_tsvector('english', description), plainto_tsquery('english', $1)) AS relevance
FROM products
WHERE name % $1
   OR to_tsvector('english', description) @@ plainto_tsquery('english', $1)
ORDER BY relevance DESC
LIMIT 20;

-- Tune similarity threshold per query (default 0.3 is often too strict or too loose)
SET pg_trgm.similarity_threshold = 0.2;  -- Session-level; more permissive
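The similarity score behind `%` and `similarity()` is the overlap of trigram sets: pg_trgm lowercases each word, pads it with two leading spaces and one trailing space, extracts every 3-character window, and divides shared trigrams by the union. A simplified sketch (single alphanumeric word; the real extension also splits on non-word characters):

```typescript
// pg_trgm-style similarity: shared trigrams / union of trigrams.
function trigrams(word: string): Set<string> {
  const padded = `  ${word.toLowerCase()} `;  // pg_trgm pads 2 spaces front, 1 back
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= padded.length; i++) grams.add(padded.slice(i, i + 3));
  return grams;
}

export function trigramSimilarity(a: string, b: string): number {
  const ta = trigrams(a);
  const tb = trigrams(b);
  let shared = 0;
  for (const g of ta) if (tb.has(g)) shared++;
  return shared / (ta.size + tb.size - shared);
}
```

Identical words score 1, disjoint words score 0, and a one-letter typo like 'iphome' vs 'iphone' still shares enough trigrams to clear a 0.3 threshold check in spirit.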


TimescaleDB: Time-Series Data

TimescaleDB extends PostgreSQL with hypertables (transparent time-based partitioning), continuous aggregates (pre-computed rollups), and compression: essential for metrics, IoT, and financial tick data.

CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Create a regular table, then convert to hypertable
CREATE TABLE metrics (
    time        TIMESTAMPTZ NOT NULL,
    device_id   UUID NOT NULL,
    metric_name TEXT NOT NULL,
    value       DOUBLE PRECISION NOT NULL,
    tags        JSONB NOT NULL DEFAULT '{}'
);

-- Convert to hypertable (partitioned by time, 1 week per chunk)
SELECT create_hypertable('metrics', 'time', chunk_time_interval => INTERVAL '1 week');

-- Add compression policy (compress chunks older than 30 days)
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id, metric_name'
);

SELECT add_compression_policy('metrics', INTERVAL '30 days');

-- Continuous aggregate: pre-compute hourly rollups
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS bucket,
    device_id,
    metric_name,
    avg(value)  AS avg_value,
    min(value)  AS min_value,
    max(value)  AS max_value,
    count(*)    AS sample_count
FROM metrics
GROUP BY 1, 2, 3
WITH NO DATA;

-- Automatically refresh the continuous aggregate
SELECT add_continuous_aggregate_policy(
    'metrics_hourly',
    start_offset  => INTERVAL '3 hours',
    end_offset    => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour'
);

-- Retention policy: drop data older than 1 year
SELECT add_retention_policy('metrics', INTERVAL '1 year');
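time_bucket floors each timestamp to the start of its interval, which is what makes the GROUP BY rollups above line up. For fixed-width intervals it is equivalent to integer division on the epoch; a sketch (TimescaleDB additionally supports month-length buckets, custom origins, and timezone-aware bucketing that this ignores):

```typescript
// time_bucket for fixed-width intervals: floor the epoch to a
// multiple of the bucket width.
export function timeBucket(intervalMs: number, t: Date): Date {
  return new Date(Math.floor(t.getTime() / intervalMs) * intervalMs);
}
```

For example, an hourly bucket maps 14:37:22 to 14:00:00, so every sample in that hour lands in the same group.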

// src/api/metrics.ts - Query time-series data
export async function getMetricHistory(
  deviceId: string,
  metricName: string,
  from: Date,
  to: Date,
  bucketInterval: '1m' | '1h' | '1d' = '1h',
): Promise<MetricBucket[]> {
  const params = [deviceId, metricName, from, to];

  // Minute granularity: aggregate the raw hypertable on the fly.
  // (metrics holds raw `value` rows, not the pre-computed rollup columns)
  if (bucketInterval === '1m') {
    const { rows } = await db.query<MetricBucket>(
      `SELECT
         time_bucket('1 minute', time) AS time,
         avg(value)  AS avg_value,
         min(value)  AS min_value,
         max(value)  AS max_value,
         count(*)    AS sample_count
       FROM metrics
       WHERE device_id = $1 AND metric_name = $2
         AND time BETWEEN $3 AND $4
       GROUP BY 1
       ORDER BY 1`,
      params,
    );
    return rows;
  }

  // Hourly and daily: read from the pre-computed continuous aggregate.
  // Daily buckets re-aggregate the hourly rollups; note that avg-of-avgs
  // is only exact when buckets have equal sample counts.
  const bucket = bucketInterval === '1h' ? 'bucket' : `time_bucket('1 day', bucket)`;

  const { rows } = await db.query<MetricBucket>(
    `SELECT
       ${bucket} AS time,
       avg(avg_value)    AS avg_value,
       min(min_value)    AS min_value,
       max(max_value)    AS max_value,
       sum(sample_count) AS sample_count
     FROM metrics_hourly
     WHERE device_id = $1 AND metric_name = $2
       AND bucket BETWEEN $3 AND $4
     GROUP BY 1
     ORDER BY 1`,
    params,
  );
  return rows;
}

pg_partman: Automated Table Partitioning

pg_partman creates and maintains time-based or serial partitions automatically: no manual DDL, no maintenance windows.

CREATE EXTENSION IF NOT EXISTS pg_partman;
CREATE EXTENSION IF NOT EXISTS pg_cron;  -- Required for automated maintenance

-- Create parent table (partitioned by range)
CREATE TABLE events (
    id         UUID NOT NULL DEFAULT gen_random_uuid(),
    user_id    UUID NOT NULL,
    event_type TEXT NOT NULL,
    payload    JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

-- Create index template (applied to each partition automatically)
CREATE INDEX ON events (user_id, created_at);
CREATE INDEX ON events (event_type, created_at);

-- Configure pg_partman: monthly partitions (retention is configured below)
SELECT partman.create_parent(
    p_parent_table => 'public.events',
    p_control      => 'created_at',
    p_type         => 'range',      -- pg_partman 5.x; use 'native' on 4.x
    p_interval     => '1 month',    -- pg_partman 5.x; use 'monthly' on 4.x
    p_premake      => 3  -- Create 3 future partitions in advance
);

-- Retention: automatically drop partitions older than 13 months
UPDATE partman.part_config
SET retention = '13 months',
    retention_keep_table = FALSE,  -- Actually drop old partitions
    retention_keep_index = FALSE
WHERE parent_table = 'public.events';

-- Schedule pg_partman maintenance (creates future partitions, drops old ones)
SELECT cron.schedule(
    'partman-maintenance',
    '0 * * * *',  -- Every hour
    'CALL partman.run_maintenance_proc()'  -- a procedure, so CALL rather than SELECT
);
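Under the hood, each monthly child table covers a half-open [first-of-month, first-of-next-month) range, and p_premake => 3 keeps that many future ranges pre-created ahead of the current one. Computing those bounds (a sketch; child-table naming and exact premake semantics are pg_partman's, not reproduced here):

```typescript
// Half-open monthly partition bounds, as used by range partitioning.
// Generates bounds for the current month plus `premake` future months,
// mirroring the idea behind pg_partman's p_premake.
export function monthlyPartitionBounds(
  now: Date,
  premake: number,
): Array<{ from: string; to: string }> {
  const bounds: Array<{ from: string; to: string }> = [];
  for (let i = 0; i <= premake; i++) {
    const from = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + i, 1));
    const to = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + i + 1, 1));
    bounds.push({
      from: from.toISOString().slice(0, 10),
      to: to.toISOString().slice(0, 10),
    });
  }
  return bounds;
}
```

Run in August 2026 with premake = 3, this yields four ranges, August through November, each ending at the first day of the following month.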

Extension Compatibility Matrix

| Extension   | AWS RDS | AWS Aurora | Supabase     | Neon | Self-hosted |
|-------------|---------|------------|--------------|------|-------------|
| pgvector    | ✅      | ✅         | ✅           | ✅   | ✅          |
| PostGIS     | ✅      | ✅         | ✅           | ✅   | ✅          |
| pg_trgm     | ✅      | ✅         | ✅           | ✅   | ✅          |
| TimescaleDB | ❌      | ❌         | ✅ (limited) | ❌   | ✅          |
| pg_partman  | ✅      | ✅         | ⚠️ (no cron) | ✅   | ✅          |

Working With Viprasol

We implement PostgreSQL extension stacks for teams replacing fragmented specialized databases with a unified PostgreSQL-based architecture.

What we deliver:

  • pgvector setup for RAG and semantic search pipelines
  • PostGIS geospatial schema design and spatial index tuning
  • pg_trgm fuzzy search with relevance scoring
  • TimescaleDB schema design with continuous aggregates and retention policies
  • pg_partman configuration with automated maintenance and retention

→ Discuss your database architecture → Backend development services



About the Author

Viprasol Tech Team

Custom Software Development Specialists

The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 100+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement. Based in India, serving clients globally.

MT4/MT5 EA Development · AI Agent Systems · SaaS Development · Algorithmic Trading
