OpenTelemetry in Production: Traces, Metrics, and Logs That Actually Help
Set up OpenTelemetry in Node.js and Python services. Auto-instrumentation, custom spans, OTLP export to Jaeger/Grafana Tempo, and correlating traces with logs across services.
OpenTelemetry: Observability for Modern Applications (2026)
Observability is the difference between firefighting in the dark and methodically solving problems. At Viprasol, we've moved from the fragmented world of multiple monitoring tools to OpenTelemetry—a unified approach to collecting, processing, and exporting telemetry data. This shift has transformed how we understand what's happening inside our applications.
The Observability Crisis We Solved
Five years ago, our monitoring setup looked like this: Application Insights for some services, Datadog for others, custom logging in a few places, and manual traces scattered throughout the codebase. Each tool worked fine in isolation, but getting a complete picture of a user request flowing through our system was nearly impossible.
A user reported slow performance. We checked metrics. No spike. We checked logs. Found an error, but couldn't correlate it with anything else. We grabbed a sample trace from one service, but the next service in the chain logged differently. Three hours later, we finally found the culprit: a database connection pool was exhausted in an obscure service.
This experience pushed us to find a better way. We discovered OpenTelemetry—an open standard for observability that was gaining momentum. Instead of replacing one vendor lock-in with another, we could instrument our code once and send data to any backend we chose. That flexibility changed everything.
Understanding the Three Pillars of OpenTelemetry
OpenTelemetry unifies three types of telemetry data:
Traces
A trace represents the entire journey of a single request through your system. It shows:
- Which services processed the request
- How long each operation took
- Where errors occurred
- Dependencies between operations
When a user makes a request to your application, a trace captures every step: frontend JavaScript execution, API call, database query, cache lookup, external API call. All connected in a single timeline.
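Conceptually, a trace is just a tree of spans that share one trace ID, each recording where time went. The sketch below is purely illustrative (the span names, IDs, and timings are made up) but shows why this structure is useful: summing span durations per service immediately reveals where a slow request spent its time.

```javascript
// Illustrative only: a trace is a tree of spans sharing one trace ID.
// All names, IDs, and durations below are invented for the example.
const spans = [
  { id: 'a1', parentId: null, name: 'GET /checkout',  service: 'frontend',  durationMs: 420 },
  { id: 'b2', parentId: 'a1', name: 'POST /api/cart', service: 'api',       durationMs: 380 },
  { id: 'c3', parentId: 'b2', name: 'SELECT cart',    service: 'db',        durationMs: 150 },
  { id: 'd4', parentId: 'b2', name: 'GET inventory',  service: 'inventory', durationMs: 90 }
];

// Sum span durations per service to see where the request spent its time.
function timePerService(spans) {
  const totals = {};
  for (const s of spans) {
    totals[s.service] = (totals[s.service] || 0) + s.durationMs;
  }
  return totals;
}

console.log(timePerService(spans));
// → { frontend: 420, api: 380, db: 150, inventory: 90 }
```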
Metrics
Metrics answer the question: "What's happening in aggregate?" They measure:
- Request rates and latencies
- Error percentages
- CPU and memory usage
- Queue depths and throughput
- Business metrics (signups, purchases, etc.)
Unlike traces which are request-specific, metrics are rolled-up statistics. They tell you that your 99th percentile latency is 2 seconds, not that user Alice's request took 2 seconds.
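To make that distinction concrete, a percentile is computed over many requests, not one. The toy calculation below uses plain JavaScript with no OTel APIs (real metrics pipelines aggregate with histograms rather than storing raw samples); it shows how a single slow outlier dominates the p99 while barely moving the median.

```javascript
// Toy latency samples in milliseconds.
const latencies = [120, 95, 110, 2000, 130, 105, 98, 115, 125, 100];

// Nearest-rank percentile: sort, then take the value at ceil(p * n) - 1.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil(p * sorted.length) - 1;
  return sorted[rank];
}

console.log(percentile(latencies, 0.99)); // the one slow outlier dominates: 2000
console.log(percentile(latencies, 0.5));  // the median is unaffected: 110
```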
Logs
Logs remain important, but in OpenTelemetry they're contextualized. Instead of a log message floating in isolation, it includes trace IDs and span IDs, connecting it to the broader picture.
Code:
2026-03-07T10:15:23Z ERROR [trace_id=abc123] Payment processing failed
// vs
2026-03-07T10:15:23Z ERROR Payment processing failed (the old way)
The first log message can be found instantly by anyone looking at the payment processing trace. The second requires guesswork and hope.
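As a minimal sketch of how the correlated format is produced: the formatter below is hypothetical, not an OTel API (in practice a logging library's OTel integration injects the IDs from the active span automatically), but it shows the shape of the output.

```javascript
// Hypothetical formatter: prefix each line with the active trace/span IDs.
function formatLog(level, message, ctx) {
  const ts = new Date().toISOString();
  const ids = ctx ? `[trace_id=${ctx.traceId} span_id=${ctx.spanId}] ` : '';
  return `${ts} ${level} ${ids}${message}`;
}

const line = formatLog('ERROR', 'Payment processing failed', {
  traceId: 'abc123',
  spanId: 'def456'
});
console.log(line);
// e.g. 2026-03-07T10:15:23.000Z ERROR [trace_id=abc123 span_id=def456] Payment processing failed
```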
Setting Up OpenTelemetry in Node.js Applications
For web development projects, here's how we bootstrap OpenTelemetry:
Code:
import { NodeSDK } from '@opentelemetry/sdk-node';
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http';
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics';

const traceExporter = new OTLPTraceExporter({
  url: 'http://otel-collector:4318/v1/traces'
});
const metricExporter = new OTLPMetricExporter({
  url: 'http://otel-collector:4318/v1/metrics'
});

const sdk = new NodeSDK({
  traceExporter,
  instrumentations: [getNodeAutoInstrumentations()],
  metricReader: new PeriodicExportingMetricReader({
    exporter: metricExporter
  })
});

sdk.start();
console.log('OpenTelemetry started');
This single initialization automatically instruments:
- HTTP requests
- Database calls
- External API calls
- Async context propagation
Auto-instrumentation is powerful, but custom instrumentation is where you gain real insight:
Code:
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('my-app');

async function processPayment(userId, amount) {
  const span = tracer.startSpan('payment.process');
  try {
    span.setAttributes({
      'user.id': userId,
      'payment.amount': amount,
      'payment.currency': 'USD'
    });
    const result = await chargeCard(userId, amount);
    span.setStatus({ code: SpanStatusCode.OK });
    return result;
  } catch (error) {
    span.recordException(error);
    span.setStatus({ code: SpanStatusCode.ERROR });
    throw error;
  } finally {
    span.end();
  }
}
Browser and Frontend Instrumentation
OpenTelemetry isn't just for backend. Modern SaaS development requires frontend observability too:
Code:
import { WebTracerProvider, BatchSpanProcessor } from '@opentelemetry/sdk-trace-web';
import { ZoneContextManager } from '@opentelemetry/context-zone';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { Resource } from '@opentelemetry/resources';
import { trace } from '@opentelemetry/api';

const provider = new WebTracerProvider({
  resource: new Resource({
    'service.name': 'frontend-app'
  })
});
provider.addSpanProcessor(
  new BatchSpanProcessor(new OTLPTraceExporter())
);
provider.register({
  contextManager: new ZoneContextManager()
});

const tracer = trace.getTracer('app');

// Track user interactions
document.addEventListener('click', (event) => {
  const span = tracer.startSpan('user.interaction.click');
  span.setAttributes({
    'element.id': event.target.id,
    'element.class': event.target.className
  });
  span.end();
});

Deployment Architecture
For cloud solutions, OpenTelemetry follows this pattern:
Code:
┌─────────────────────────────────────┐
│         Your Applications           │
│ (Node.js, Python, Go, Java, etc.)   │
└────────────────┬────────────────────┘
                 │ OTLP Protocol (HTTP/gRPC)
                 ▼
┌─────────────────────────────────────┐
│     OpenTelemetry Collector         │
│     - Receives telemetry            │
│     - Batches for efficiency        │
│     - Routes to multiple backends   │
└────────┬───────────────┬────────────┘
         │               │
         ▼               ▼
  Jaeger (Traces)  Prometheus (Metrics)
Each application sends telemetry to a central collector, which acts as a router. This provides:
- Decoupling: Change backends without redeploying applications
- Batching: More efficient network usage
- Filtering: Reduce storage costs by dropping unneeded data
- Transformation: Enrich telemetry with additional context
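A minimal collector configuration implementing this pattern might look like the following sketch. The endpoints and backend addresses are placeholders to adapt to your environment:

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch:            # batch telemetry before export for efficiency
    timeout: 5s

exporters:
  otlp/jaeger:
    endpoint: jaeger:4317   # placeholder backend address
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889  # scraped by Prometheus

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
```

Swapping Jaeger for Tempo, or adding a second exporter, is a collector config change only; the applications keep sending plain OTLP.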
Practical Instrumentation Patterns
Database Observability
Most frameworks auto-instrument databases, but custom context helps:
Code:
async function queryDatabase(query, params) {
  const span = tracer.startSpan('db.query', {
    attributes: {
      'db.system': 'postgres',
      'db.statement': query.substring(0, 100), // Truncate for safety
      'db.params.count': params.length
    }
  });
  try {
    const startTime = Date.now();
    const result = await pool.query(query, params);
    span.setAttributes({
      'db.rows_affected': result.rowCount,
      'db.duration_ms': Date.now() - startTime
    });
    return result;
  } catch (error) {
    span.recordException(error);
    throw error;
  } finally {
    span.end();
  }
}
External API Calls
Track third-party integrations:
Code:
async function callExternalAPI(service, endpoint) {
  const span = tracer.startSpan('http.client', {
    attributes: {
      'http.method': 'GET',
      'http.url': `${service}${endpoint}`,
      'http.target': endpoint
    }
  });
  const startTime = Date.now();
  try {
    const response = await fetch(`${service}${endpoint}`);
    span.setAttributes({
      'http.status_code': response.status,
      'http.response_time_ms': Date.now() - startTime
    });
    return response;
  } catch (error) {
    span.recordException(error);
    throw error;
  } finally {
    span.end();
  }
}
Business Logic Instrumentation
This is where OpenTelemetry really shines:
Code:
import { trace, context } from '@opentelemetry/api';

async function checkoutCart(userId, items) {
  const span = tracer.startSpan('checkout.process');
  span.setAttributes({
    'user.id': userId,
    'cart.item_count': items.length,
    'cart.total': items.reduce((sum, i) => sum + i.price, 0)
  });
  // Make the checkout span the active parent for the child spans below
  const ctx = trace.setSpan(context.active(), span);

  const validationSpan = tracer.startSpan('checkout.validation', undefined, ctx);
  validateItems(items);
  validationSpan.end();

  const paymentSpan = tracer.startSpan('checkout.payment', undefined, ctx);
  const paymentResult = await processPayment(userId, items);
  paymentSpan.setAttributes({
    'payment.status': paymentResult.status,
    'payment.method': paymentResult.method
  });
  paymentSpan.end();

  span.end();
  return paymentResult;
}
Sampling Strategies for Cost Control
Collecting telemetry for every request gets expensive at scale. Sampling reduces costs while maintaining insight:
Code:
import { TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-node';

// Sample 10% of requests; the decision is derived from the trace ID,
// so every span in a given trace gets the same decision
const sdk = new NodeSDK({
  sampler: new TraceIdRatioBasedSampler(0.1)
});
Better: adaptive sampling that samples more when error rates are high:
Code:
import { Sampler, SamplingDecision, SamplingResult } from '@opentelemetry/sdk-trace-base';

class AdaptiveSampler implements Sampler {
  shouldSample(context, traceId, spanName, spanKind, attributes): SamplingResult {
    // Always sample errors
    if (attributes['error'] === true) {
      return { decision: SamplingDecision.RECORD_AND_SAMPLE };
    }
    // Sample 5% of normal requests
    if (Math.random() < 0.05) {
      return { decision: SamplingDecision.RECORD_AND_SAMPLE };
    }
    // Drop everything else
    return { decision: SamplingDecision.NOT_RECORD };
  }
}
Key Features Comparison
| Feature | Jaeger | Tempo | Datadog |
|---|---|---|---|
| Open Source | Yes | Yes | No |
| Trace Storage | Local/ES | S3/GCS | Proprietary |
| Cost | Low | Low | High |
| Ease of Setup | Medium | Easy | Very Easy |
| Query Flexibility | Good | Limited | Excellent |
For detailed implementation guidance, consult the official OpenTelemetry documentation and explore Jaeger's architecture guide to understand how distributed tracing works at scale. Also review Google Cloud's observability documentation for additional best practices.
Common Pitfalls and Solutions
Too Much Data, Too Little Insight
Don't instrument everything. Focus on:
- User-facing operations
- External integrations
- Error paths
- Business-critical workflows
Cardinality Explosion
Avoid creating spans with unbounded attributes:
Code:
// Bad: Creates thousands of unique span names
for (let i = 0; i < items.length; i++) {
tracer.startSpan(**item.process.${items[i].id}**);
}
// Good: Single span with list attribute
const span = tracer.startSpan('items.process');
span.setAttributes({
'items.count': items.length
});
Performance Impact
OpenTelemetry instrumentation has overhead. Minimize it:
Code:
// Batch exports instead of sending individually
const processor = new BatchSpanProcessor(exporter, {
maxQueueSize: 2048,
maxExportBatchSize: 512,
scheduledDelayMillis: 5000
});
Advanced Instrumentation Strategies
Request Context Propagation
Trace requests across services using W3C Trace Context:
Code:
import { context, defaultTextMapGetter } from '@opentelemetry/api';
import { W3CTraceContextPropagator } from '@opentelemetry/core';

const propagator = new W3CTraceContextPropagator();

// Extract trace context from incoming request headers
const extractedContext = propagator.extract(
  context.active(),
  request.headers,
  defaultTextMapGetter
);

// Set as active context for downstream operations
context.with(extractedContext, async () => {
  // All operations here join the same trace
  await processRequest(request);
});
Custom Resource Attributes
Add metadata to identify your services:
Code:
import { Resource } from '@opentelemetry/resources';
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';

const resource = Resource.default().merge(
  new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'payment-service',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.2.3',
    'deployment.environment': process.env.NODE_ENV,
    'git.commit': process.env.GIT_SHA,
    'kubernetes.namespace': process.env.K8S_NAMESPACE
  })
);
Filtering and Processing Telemetry
Reduce storage costs by filtering unneeded data:
Code:
// A span processor can only drop a span by declining to forward it to a
// wrapped (delegate) processor; returning early by itself drops nothing.
class FilteringSpanProcessor {
  constructor(delegate) {
    this.delegate = delegate;
  }
  onStart(span, parentContext) {
    this.delegate.onStart(span, parentContext);
  }
  onEnd(span) {
    // Don't export health checks
    if (span.name.includes('health')) return;
    // Drop sub-millisecond spans in production
    // (span.duration is HrTime: [seconds, nanoseconds])
    const [seconds, nanos] = span.duration;
    if (process.env.NODE_ENV === 'production' && seconds === 0 && nanos < 1e6) return;
    this.delegate.onEnd(span);
  }
  shutdown() { return this.delegate.shutdown(); }
  forceFlush() { return this.delegate.forceFlush(); }
}
Correlation with Business Events
Connect telemetry to business metrics:
Code:
// In payment processing
async function processPayment(userId: string, amount: number) {
const span = tracer.startSpan('payment.process');
span.setAttributes({
'user.id': userId,
'payment.amount': amount,
'user.tier': await getUserTier(userId),
'payment.method': 'credit_card'
});
// Track business event
metrics.recordPayment(amount);
try {
const result = await chargeCard(userId, amount);
span.addEvent('payment.success', {
'transaction.id': result.transactionId
});
return result;
} catch (error) {
span.recordException(error);
metrics.recordPaymentFailure(amount);
throw error;
} finally {
span.end();
}
}
Deployment and Operations
Docker Container Setup
OpenTelemetry in containers:
Code:
FROM node:18-alpine
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
ENV OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
ENV OTEL_EXPORTER_OTLP_INSECURE=true
ENV OTEL_TRACES_EXPORTER=otlp
ENV OTEL_METRICS_EXPORTER=otlp
ENV OTEL_LOGS_EXPORTER=otlp
CMD ["node", "app.js"]
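For local development, the same `OTEL_*` environment variables can point at a collector started alongside the app via Docker Compose. The sketch below assumes a collector config file named `otel-collector-config.yaml` exists next to the compose file; image tags and ports are illustrative:

```yaml
services:
  app:
    build: .
    environment:
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
    depends_on:
      - otel-collector
  otel-collector:
    image: otel/opentelemetry-collector:latest
    volumes:
      - ./otel-collector-config.yaml:/etc/otelcol/config.yaml
    ports:
      - "4318:4318"   # OTLP over HTTP
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports:
      - "16686:16686" # Jaeger UI
```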
Kubernetes Integration
Use sidecar pattern for the collector:
Code:
apiVersion: v1
kind: Pod
metadata:
  name: app-with-collector
spec:
  containers:
    - name: app
      image: myapp:latest
      env:
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://localhost:4318
    - name: otel-collector
      image: otel/opentelemetry-collector:latest
      ports:
        - containerPort: 4318
FAQ
Q: Do I need to use OpenTelemetry? A: If you run multiple services, yes. It's the industry standard. For single monoliths, it's still valuable for understanding performance.
Q: Can I migrate from another tool? A: Yes. OpenTelemetry works alongside existing tools. Gradually migrate by setting up both.
Q: What's the performance overhead? A: Typically 5-15% CPU impact when batched. Auto-instrumentation is more expensive than manual.
Q: How much data should I collect? A: Start with 100% sampling in development, 5-10% in production, 100% for errors.
Q: Can I query OpenTelemetry data? A: Yes, through your backend. Jaeger, Tempo, and others have query UIs.
Q: What about privacy and data retention? A: OpenTelemetry doesn't store data—backends do. Implement retention policies (30-90 days typical).
Q: How do I handle cardinality explosion? A: Avoid using unbounded values (user IDs, order IDs) as attribute keys. Use them as values instead, and limit unique values.
Q: What's the learning curve? A: Basic instrumentation is straightforward. Advanced patterns (sampling, filtering, context propagation) take more time to master.
Moving Forward with OpenTelemetry
Observability is not optional anymore. As systems grow more complex, the ability to see what's happening becomes mission-critical. OpenTelemetry provides the foundation that lets us instrument once and adapt our observability infrastructure as our needs evolve.
Start with auto-instrumentation. It gives you 80% of the value. Then add custom spans for business logic. Ship telemetry to a backend you choose. Move from reactive firefighting to proactive understanding.
The teams we work with—across web development, SaaS, and cloud infrastructure—consistently tell us that OpenTelemetry transformed how they debug production issues. What used to take hours now takes minutes. And more importantly, they catch problems before users notice them.
About the Author
Viprasol Tech Team
Custom Software Development Specialists
The Viprasol Tech team specialises in algorithmic trading software, AI agent systems, and SaaS development. With 1000+ projects delivered across MT4/MT5 EAs, fintech platforms, and production AI systems, the team brings deep technical experience to every engagement.