GUID Database Performance & Indexing Guide

Understanding how GUIDs / UUIDs impact database performance, index behavior, and storage efficiency is crucial for building scalable applications. This guide covers indexing strategies, fragmentation issues, database-specific considerations and practical optimization techniques.

Quick recommendation: For database primary keys, prefer GUID / UUID v7 (time-ordered) over GUID / UUID v4 (random) to minimize index fragmentation and improve insert performance in B-tree indexes.

Overview: Why database performance matters

The primary performance concern with GUIDs / UUIDs in databases is their impact on index behavior, particularly B-tree indexes used by most relational databases. Random identifiers can cause poor insert performance due to index fragmentation, while sequential identifiers maintain better locality.

Random UUIDs (v4) Issues

Index fragmentation (B-tree page splits)
Poor cache locality
Higher write amplification
Degraded insert performance at scale
Increased I/O for maintenance operations

Time-Ordered UUIDs (v7) Benefits

Sequential insertion pattern
Better cache locality
Reduced page splits
Improved insert throughput
Natural chronological ordering

B-tree indexes and GUID / UUID insertion patterns

Most relational databases (PostgreSQL, MySQL, SQL Server, Oracle, SQLite and many others) use B-tree indexes for primary keys and secondary indexes. B-tree structures maintain sorted data in leaf pages, with internal nodes pointing to child pages.

How B-tree indexes work

A B-tree index organizes data in a hierarchical tree structure where:

Leaf pages contain the actual index entries (key + pointer to row)
Internal nodes contain keys and pointers to child pages
Pages have fixed size (typically 8-16 KB depending on database)
Sorted order is maintained for efficient range queries and lookups

Sequential vs random insertion

Sequential insertion (UUID v7, v6, v1):

Inserts at the end: New values append to the rightmost leaf page
Minimal page splits: Only the last page needs updates until full
Cache-friendly: Recent pages stay in buffer pool
Write locality: Reduces random I/O operations

Random insertion (UUID v4):

Inserts anywhere: New values scatter across all leaf pages
Frequent page splits: Pages fill unevenly causing splits across the tree
Cache pollution: Many different pages need to be loaded and modified
Write amplification: More disk I/O due to page management overhead

Example: Inserting 1 million UUIDs into a B-tree index:

UUID v7 (sequential): ~100-200 page splits (appending to end)
UUID v4 (random): ~50,000-100,000 page splits (scattered insertions)

Index fragmentation explained

Index fragmentation occurs when the logical order of index pages does not match their physical order on disk. This happens with random UUIDs because insertions split pages throughout the index tree.

Types of fragmentation

Logical fragmentation (internal): Pages are not fully packed, wasting space within each page
Physical fragmentation (external): Logically consecutive pages are not physically adjacent on disk
Page splits: When a page is full, it splits into two half-full pages, reducing density

Performance impact

Range scans: Slower because consecutive pages may require random disk I/O
Buffer pool efficiency: More pages needed in memory for same data volume
Insert performance: Additional I/O to read, modify, and write split pages
Index size: Fragmented indexes consume more storage space (30-50% larger)

Real-world impact: In high-volume systems, random GUID / UUID (v4) primary keys can reduce insert throughput by 30-50% compared to sequential GUID / UUIDs (v7) due to index fragmentation and cache pressure.

Mitigation strategies

Use GUID / UUID v7: Time-ordered UUIDs reduce fragmentation significantly
Rebuild indexes: Periodic index rebuilds to defragment (requires maintenance window)
Fillfactor tuning: Leave space in pages to accommodate future inserts (database-specific)
Partition tables: Split large tables to limit index size and fragmentation scope

Storage considerations

GUIDs / UUIDs are 128 bits (16 bytes), compared to typical integer primary keys like BIGINT (8 bytes) or INT (4 bytes). This affects storage, memory and I/O.

Storage overhead

Type	Size	Storage for 10M rows	Index size impact
`INT`	4 bytes	~40 MB	Baseline
`BIGINT`	8 bytes	~80 MB	2x INT
`UUID (GUID)`	16 bytes	~160 MB	4x INT, 2x BIGINT
`VARCHAR(36)`	36+ bytes	~360 MB	9x INT (if stored as string)

Memory and cache impact

Buffer pool: Larger keys mean fewer rows fit per page, reducing cache efficiency
Index nodes: Internal B-tree nodes hold fewer keys, increasing tree height and lookup cost
Foreign keys: Every foreign key reference also stores 16 bytes (multiplied across relationships)
Network transfer: Larger identifiers increase query result payload size

Best practice: Always store GUIDs / UUIDs using native UUID or BINARY(16) types, never as VARCHAR(36) or CHAR(36). String storage wastes space and degrades performance.

Database-specific considerations

PostgreSQL

PostgreSQL has native UUID type and excellent UUID support.

Native type: Use UUID type (16 bytes storage)
Extensions: uuid-ossp for v1/v4, pgcrypto for v4
UUID v7: Use custom function or application-generated values
Index type: B-tree by default, consider BRIN for time-ordered UUIDs on large tables

Example: Creating UUID primary key

-- Using UUID v4 (random)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    username VARCHAR(100) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Using UUID v7 (time-ordered, application-generated)
CREATE TABLE events (
    id UUID PRIMARY KEY, -- Generated by application
    event_type VARCHAR(50) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Index performance check
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE schemaname = 'public';

MySQL / MariaDB

MySQL 8.0+ and MariaDB 10.7+ have native UUID support. Earlier versions require BINARY(16).

MySQL 8.0+: UUID type and UUID_TO_BIN() / BIN_TO_UUID() functions
Older versions: Use BINARY(16) for storage
InnoDB: Clustered index on primary key makes sequential inserts critical
UUID v1 reordering: MySQL's UUID_TO_BIN(uuid, 1) reorders v1 for time-sorting

Example: UUID with InnoDB

-- MySQL 8.0+ with native UUID
CREATE TABLE products (
    id BINARY(16) PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_created (created_at)
);

-- Insert with UUID conversion
INSERT INTO products (id, name)
VALUES (UUID_TO_BIN(UUID()), 'Product Name');

-- Query with conversion back
SELECT BIN_TO_UUID(id) as id, name
FROM products
WHERE id = UUID_TO_BIN('550e8400-e29b-41d4-a716-446655440000');

-- Check fragmentation
SELECT TABLE_NAME, DATA_FREE, DATA_LENGTH
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database';

InnoDB warning: InnoDB uses a clustered index where table data is stored in primary key order. Random UUID v4 primary keys cause severe fragmentation and poor insert performance. Always prefer UUID v7 for InnoDB tables.

Microsoft SQL Server

SQL Server has UNIQUEIDENTIFIER type and NEWSEQUENTIALID() for sequential GUIDs.

Native type: UNIQUEIDENTIFIER (16 bytes)
Sequential GUIDs: NEWSEQUENTIALID() generates time-based sequential values
Byte ordering: SQL Server reorders bytes differently than RFC 4122 standard
Clustered indexes: Sequential GUIDs essential for clustered primary keys

Example: Sequential GUID in SQL Server

-- Using NEWSEQUENTIALID() for sequential GUIDs
CREATE TABLE Orders (
    OrderId UNIQUEIDENTIFIER PRIMARY KEY DEFAULT NEWSEQUENTIALID(),
    CustomerName NVARCHAR(200) NOT NULL,
    OrderDate DATETIME2 DEFAULT GETDATE()
);

-- Index fragmentation check
SELECT 
    OBJECT_NAME(ips.object_id) AS TableName,
    i.name AS IndexName,
    ips.avg_fragmentation_in_percent,
    ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') ips
JOIN sys.indexes i ON ips.object_id = i.object_id AND ips.index_id = i.index_id
WHERE ips.avg_fragmentation_in_percent > 10
ORDER BY ips.avg_fragmentation_in_percent DESC;

SQLite

SQLite has no native UUID type but can store as BLOB or TEXT.

Storage: Use BLOB (16 bytes) for efficiency or TEXT for readability
B-tree index: SQLite uses B-tree indexes; sequential UUIDs improve performance
Generation: No built-in UUID functions, use application layer
Page size: Default 4KB pages fill faster with UUIDs vs integers

Example: UUID in SQLite

-- Store as BLOB for efficiency
CREATE TABLE sessions (
    id BLOB PRIMARY KEY,  -- 16 bytes
    user_id INTEGER NOT NULL,
    expires_at INTEGER NOT NULL,
    FOREIGN KEY (user_id) REFERENCES users(id)
);

-- Insert (application generates UUID and converts to bytes)
-- Example with Python: uuid.uuid4().bytes
INSERT INTO sessions (id, user_id, expires_at)
VALUES (X'550e8400e29b41d4a716446655440000', 1, 1735689600);

-- Query
SELECT hex(id) as id, user_id FROM sessions;

MongoDB

MongoDB uses ObjectId by default but supports UUID storage as BinData.

ObjectId: 12-byte time-ordered identifier (similar concept to UUID v7)
UUID support: Store as BinData subtype 4 (UUID)
Indexes: B-tree style indexes benefit from time-ordered identifiers
Sharding: Random UUIDs distribute well across shards (ObjectId also works)

Example: UUID in MongoDB

// Using ObjectId (default, recommended)
db.users.insertOne({
    _id: ObjectId(),  // 12 bytes, time-ordered
    username: "john_doe",
    createdAt: new Date()
});

// Using UUID (if required for interoperability)
db.users.insertOne({
    _id: UUID("550e8400-e29b-41d4-a716-446655440000"),
    username: "jane_doe",
    createdAt: new Date()
});

// Index stats
db.users.stats().indexSizes;

Practical recommendations

Choosing the right GUID / UUID version

Use Case	Recommended	Reason
Database primary keys	GUID / UUID v7	Time-ordered, minimal fragmentation
Distributed systems	GUID / UUID v7	Time-ordered + no coordination
Privacy-sensitive IDs	GUID / UUID v4	No embedded metadata
Public API identifiers	GUID / UUID v4	Unpredictable, no time leakage
Legacy compatibility	GUID / UUID v1	Existing systems using v1

Database optimization checklist

Use native UUID types: UUID (PostgreSQL), BINARY(16) (MySQL), UNIQUEIDENTIFIER (SQL Server)
Never store as VARCHAR: String storage wastes 2-3x space and degrades performance
Prefer UUID v7 for primary keys: Minimizes fragmentation in B-tree indexes
Monitor fragmentation: Check index health regularly and rebuild when needed
Consider partitioning: For very large tables (100M+ rows), partition by time range
Tune fillfactor: Leave space in index pages for future inserts (e.g., 90% fillfactor)
Benchmark your workload: Read-heavy vs write-heavy patterns impact UUID choice

When NOT to use UUIDs

Small tables: Integer IDs are simpler and more efficient for small datasets
Extreme performance requirements: Auto-increment integers provide best performance
Storage-constrained systems: UUIDs consume 2-4x more space than integers
Simple single-database apps: No need for distributed ID generation

Performance benchmarks

Real-world performance varies by database, hardware, and workload. These approximate benchmarks illustrate typical differences:

Insert performance comparison

Key Type	Inserts/sec (10M rows)	Index Size	Fragmentation
`BIGINT AUTO_INCREMENT`	~50,000	~120 MB	<5%
UUID v7 (time-ordered)	~40,000	~220 MB	<10%
UUID v1 (timestamp-based)	~35,000	~230 MB	10-15%
UUID v4 (random)	~25,000	~280 MB	30-50%

Approximate benchmarks on PostgreSQL 15 with 16GB RAM, SSD storage. Actual performance varies by hardware and configuration.

Query performance impact

Point lookups: GUID / UUID indexes ~10-20% slower than integer indexes (larger keys)
Range scans: Sequential GUIDs / UUIDs comparable to integers; random GUIDs / UUIDs 2-3x slower
Join performance: GUID / UUID joins slightly slower due to larger key comparison
Index-only scans: GUIDs / UUIDs reduce number of rows cached per page

Recommendation: Always benchmark with your specific workload. Read-heavy applications may not notice GUID / UUID overhead, while write-heavy systems benefit significantly from sequential GUIDs / UUIDs.

Conclusion

GUIDs / UUIDs offer significant advantages for distributed systems and avoiding ID coordination, but come with performance trade-offs in databases. The choice between random (GUID / UUID v4) and time-ordered (GUID / UUID v7) identifiers has substantial impact on index behavior and insert performance.

For most database use cases, GUID / UUID v7 provides the best balance: standard-compliant, time-ordered for index efficiency, and no coordination overhead. Random GUID / UUID v4 remains appropriate for privacy-sensitive contexts or public API identifiers where time leakage is unacceptable.

Always use native GUID / UUID storage types, monitor index fragmentation and benchmark with representative workloads to optimize for your specific requirements.

Learn more

These articles expand on related concepts, formats and practical considerations.

GUID V7 - Unix Time-Ordered Identifier (Recommended)

GUID v7 is a modern time-ordered identifier combining Unix millisecond timestamps with randomness. Learn about its structure, database performance benefits, advantages over v1 and v4, and when to use v7 for sortable distributed IDs.

Read

GUID V4 - Random-Based Identifier (Recommended)

GUID v4 is the most widely used random-based identifier. Learn about its structure, collision probability, advantages and why it is recommended for general-purpose applications.

Read

GUID Version 1-8 - The Complete Comparison

Comprehensive comparison of all GUID versions (v1-v8). Understand the differences, use cases, security implications and performance characteristics of each version to choose the right one for your project.

Read