Schema Reference¶
This page provides a complete reference to all views, data types, and field definitions available in Micromegas SQL queries.
Views Overview¶
Micromegas organizes telemetry data into several views that can be queried using SQL:
View | Description | Use Cases |
---|---|---|
processes |
Process metadata and system information | System overview, process tracking |
streams |
Data stream information within processes | Stream debugging, data flow analysis |
blocks |
Core telemetry block metadata | Low-level data inspection |
log_entries |
Application log messages with levels | Error tracking, debugging, monitoring |
measures |
Numeric metrics and performance data | Performance monitoring, alerting |
thread_spans |
Synchronous execution spans and timing | Performance profiling, call tracing |
async_events |
Asynchronous event lifecycle tracking | Async operation monitoring |
Core Views¶
processes
¶
Contains metadata about processes that have sent telemetry data.
Field | Type | Description |
---|---|---|
process_id |
Dictionary(Int16, Utf8) |
Unique identifier for the process |
exe |
Dictionary(Int16, Utf8) |
Executable name |
username |
Dictionary(Int16, Utf8) |
User who ran the process |
realname |
Dictionary(Int16, Utf8) |
Real name of the user |
computer |
Dictionary(Int16, Utf8) |
Computer/hostname |
distro |
Dictionary(Int16, Utf8) |
Operating system distribution |
cpu_brand |
Dictionary(Int16, Utf8) |
CPU brand information |
tsc_frequency |
UInt64 |
Time stamp counter frequency |
start_time |
Timestamp(Nanosecond) |
Process start time |
start_ticks |
UInt64 |
Process start time in ticks |
insert_time |
Timestamp(Nanosecond) |
When the process data was first inserted |
parent_process_id |
Dictionary(Int16, Utf8) |
Parent process identifier |
properties |
Map |
Additional process metadata |
last_update_time |
Timestamp(Nanosecond) |
When the process data was last updated |
Example Queries:
-- Get all processes from the last day
SELECT process_id, exe, computer, start_time
FROM processes
WHERE start_time >= NOW() - INTERVAL '1 day'
ORDER BY start_time DESC;
-- Find processes by executable name
SELECT process_id, exe, username, computer
FROM processes
WHERE exe LIKE '%analytics%';
streams
¶
Contains information about data streams within processes.
Field | Type | Description |
---|---|---|
stream_id |
Dictionary(Int16, Utf8) |
Unique identifier for the stream |
process_id |
Dictionary(Int16, Utf8) |
Reference to the parent process |
dependencies_metadata |
Various | Stream dependency metadata |
objects_metadata |
Various | Stream object metadata |
tags |
Various | Stream tags |
properties |
Various | Stream properties |
insert_time |
Timestamp(Nanosecond) |
When the stream data was first inserted |
last_update_time |
Timestamp(Nanosecond) |
When the stream data was last updated |
Example Queries:
-- Get streams for a specific process
SELECT stream_id, tags, properties
FROM streams
WHERE process_id = 'my_process_123';
-- Join streams with process information
SELECT s.stream_id, s.tags, p.exe, p.computer
FROM streams s
JOIN processes p ON s.process_id = p.process_id;
blocks
¶
Core table containing telemetry block metadata with joined process and stream information.
Field | Type | Description |
---|---|---|
block_id |
Utf8 |
Unique identifier for the block |
stream_id |
Utf8 |
Stream identifier |
process_id |
Utf8 |
Process identifier |
begin_time |
Timestamp(Nanosecond) |
Block start time |
begin_ticks |
Int64 |
Block start time in ticks |
end_time |
Timestamp(Nanosecond) |
Block end time |
end_ticks |
Int64 |
Block end time in ticks |
nb_objects |
Int32 |
Number of objects in block |
object_offset |
Int64 |
Offset to objects in storage |
payload_size |
Int64 |
Size of block payload |
insert_time |
Timestamp(Nanosecond) |
When block was inserted |
Joined Process Fields:
Field | Type | Description |
---|---|---|
processes.start_time |
Timestamp(Nanosecond) |
Process start time |
processes.start_ticks |
Int64 |
Process start ticks |
processes.tsc_frequency |
Int64 |
Time stamp counter frequency |
processes.exe |
Utf8 |
Executable name |
processes.username |
Utf8 |
User who ran the process |
processes.realname |
Utf8 |
Real name of the user |
processes.computer |
Utf8 |
Computer/hostname |
processes.distro |
Utf8 |
Operating system distribution |
processes.cpu_brand |
Utf8 |
CPU brand information |
Example Queries:
-- Analyze block sizes and object counts
SELECT
process_id,
AVG(payload_size) as avg_block_size,
AVG(nb_objects) as avg_objects_per_block,
COUNT(*) as total_blocks
FROM blocks
WHERE insert_time >= NOW() - INTERVAL '1 hour'
GROUP BY process_id;
Observability Data Views¶
log_entries
¶
Text-based log entries with levels and structured data.
Field | Type | Description |
---|---|---|
process_id |
Dictionary(Int16, Utf8) |
Process identifier |
stream_id |
Dictionary(Int16, Utf8) |
Stream identifier |
block_id |
Dictionary(Int16, Utf8) |
Block identifier |
insert_time |
Timestamp(Nanosecond) |
Block insertion time |
exe |
Dictionary(Int16, Utf8) |
Executable name |
username |
Dictionary(Int16, Utf8) |
User who ran the process |
computer |
Dictionary(Int16, Utf8) |
Computer/hostname |
time |
Timestamp(Nanosecond) |
Log entry timestamp |
target |
Dictionary(Int16, Utf8) |
Module/target |
level |
Int32 |
Log level (see Log Levels) |
msg |
Utf8 |
Log message |
properties |
List<Struct> |
Log-specific properties |
process_properties |
List<Struct> |
Process-specific properties |
Log Levels¶
Micromegas uses numeric log levels for efficient filtering:
Level | Name | Description |
---|---|---|
1 | Fatal | Critical errors that cause application termination |
2 | Error | Errors that don't stop execution but need attention |
3 | Warn | Warning conditions that might cause problems |
4 | Info | Informational messages about normal operation |
5 | Debug | Detailed information for debugging |
6 | Trace | Very detailed tracing information |
Example Queries:
-- Get recent error and warning logs
SELECT time, process_id, level, target, msg
FROM log_entries
WHERE level <= 3 -- Fatal, Error, Warn
AND time >= NOW() - INTERVAL '1 hour'
ORDER BY time DESC;
-- Count logs by level for a specific process
SELECT level, COUNT(*) as count
FROM view_instance('log_entries', 'my_process_123')
WHERE time >= NOW() - INTERVAL '1 day'
GROUP BY level
ORDER BY level;
measures
¶
Numerical measurements and counters.
Field | Type | Description |
---|---|---|
process_id |
Dictionary(Int16, Utf8) |
Process identifier |
stream_id |
Dictionary(Int16, Utf8) |
Stream identifier |
block_id |
Dictionary(Int16, Utf8) |
Block identifier |
insert_time |
Timestamp(Nanosecond) |
Block insertion time |
exe |
Dictionary(Int16, Utf8) |
Executable name |
username |
Dictionary(Int16, Utf8) |
User who ran the process |
computer |
Dictionary(Int16, Utf8) |
Computer/hostname |
time |
Timestamp(Nanosecond) |
Measurement timestamp |
target |
Dictionary(Int16, Utf8) |
Module/target |
name |
Dictionary(Int16, Utf8) |
Metric name |
unit |
Dictionary(Int16, Utf8) |
Measurement unit |
value |
Float64 |
Metric value |
properties |
List<Struct> |
Metric-specific properties |
process_properties |
List<Struct> |
Process-specific properties |
Example Queries:
-- Get CPU metrics over time
SELECT time, value, unit
FROM measures
WHERE name = 'cpu_usage'
AND time >= NOW() - INTERVAL '1 hour'
ORDER BY time;
-- Aggregate memory usage by process
SELECT
process_id,
AVG(value) as avg_memory,
MAX(value) as peak_memory,
unit
FROM measures
WHERE name LIKE '%memory%'
AND time >= NOW() - INTERVAL '1 hour'
GROUP BY process_id, unit;
thread_spans
¶
Derived view for analyzing span durations and hierarchies. Access via view_instance('thread_spans', stream_id)
.
Field | Type | Description |
---|---|---|
id |
Int64 |
Span identifier |
parent |
Int64 |
Parent span identifier |
depth |
UInt32 |
Nesting depth in call tree |
hash |
UInt32 |
Span hash for deduplication |
begin |
Timestamp(Nanosecond) |
Span start time |
end |
Timestamp(Nanosecond) |
Span end time |
duration |
Int64 |
Span duration in nanoseconds |
name |
Dictionary(Int16, Utf8) |
Span name (function) |
target |
Dictionary(Int16, Utf8) |
Module/target |
filename |
Dictionary(Int16, Utf8) |
Source file |
line |
UInt32 |
Line number |
Example Queries:
-- Get slowest functions in a stream
SELECT name, AVG(duration) as avg_duration_ns, COUNT(*) as call_count
FROM view_instance('thread_spans', 'stream_123')
WHERE duration > 1000000 -- > 1ms
GROUP BY name
ORDER BY avg_duration_ns DESC
LIMIT 10;
-- Analyze call hierarchy
SELECT depth, name, duration
FROM view_instance('thread_spans', 'stream_123')
WHERE parent = 42 -- specific parent span
ORDER BY begin;
async_events
¶
Asynchronous span events for tracking async operations with call hierarchy depth information.
Field | Type | Description |
---|---|---|
stream_id |
Dictionary(Int16, Utf8) |
Thread stream identifier |
block_id |
Dictionary(Int16, Utf8) |
Block identifier |
time |
Timestamp(Nanosecond) |
Event timestamp |
event_type |
Dictionary(Int16, Utf8) |
"begin" or "end" |
span_id |
Int64 |
Async span identifier |
parent_span_id |
Int64 |
Parent span identifier |
depth |
UInt32 |
Nesting depth in async call hierarchy |
name |
Dictionary(Int16, Utf8) |
Span name (function) |
filename |
Dictionary(Int16, Utf8) |
Source file |
target |
Dictionary(Int16, Utf8) |
Module/target |
line |
UInt32 |
Line number |
Example Queries:
-- Find top-level async operations (depth = 0) with performance metrics
SELECT
name,
depth,
AVG(duration_ms) as avg_duration,
COUNT(*) as operation_count
FROM (
SELECT
begin_events.name,
begin_events.depth,
CAST((end_events.time - begin_events.time) AS BIGINT) / 1000000 as duration_ms
FROM
(SELECT * FROM view_instance('async_events', 'my_process_123') WHERE event_type = 'begin') begin_events
LEFT JOIN
(SELECT * FROM view_instance('async_events', 'my_process_123') WHERE event_type = 'end') end_events
ON begin_events.span_id = end_events.span_id
WHERE end_events.span_id IS NOT NULL AND begin_events.depth = 0
)
GROUP BY name, depth
ORDER BY avg_duration DESC;
-- Compare performance by call depth
SELECT
depth,
COUNT(*) as span_count,
AVG(duration_ms) as avg_duration,
MIN(duration_ms) as min_duration,
MAX(duration_ms) as max_duration
FROM (
SELECT
begin_events.depth,
CAST((end_events.time - begin_events.time) AS BIGINT) / 1000000 as duration_ms
FROM
(SELECT * FROM view_instance('async_events', 'my_process_123') WHERE event_type = 'begin') begin_events
LEFT JOIN
(SELECT * FROM view_instance('async_events', 'my_process_123') WHERE event_type = 'end') end_events
ON begin_events.span_id = end_events.span_id
WHERE end_events.span_id IS NOT NULL
)
GROUP BY depth
ORDER BY depth;
-- Find operations that spawn many nested async calls
SELECT
name,
depth,
COUNT(*) as nested_count
FROM view_instance('async_events', 'my_process_123')
WHERE depth > 0 AND event_type = 'begin'
GROUP BY name, depth
HAVING COUNT(*) > 5 -- Functions that create multiple nested async operations
ORDER BY nested_count DESC, depth DESC;
-- Analyze async call hierarchy and parent-child relationships
SELECT
parent.name as parent_operation,
parent.depth as parent_depth,
child.name as child_operation,
child.depth as child_depth,
COUNT(*) as relationship_count
FROM view_instance('async_events', 'my_process_123') parent
JOIN view_instance('async_events', 'my_process_123') child
ON parent.span_id = child.parent_span_id
WHERE parent.event_type = 'begin' AND child.event_type = 'begin'
GROUP BY parent.name, parent.depth, child.name, child.depth
ORDER BY relationship_count DESC;
-- Filter async operations by depth level for focused analysis
-- Shallow operations only (depth <= 2)
SELECT name, event_type, time, depth, span_id
FROM view_instance('async_events', 'my_process_123')
WHERE depth <= 2
ORDER BY time;
-- Deep nested operations only (depth >= 3)
SELECT name, depth, COUNT(*) as deep_operation_count
FROM view_instance('async_events', 'my_process_123')
WHERE depth >= 3 AND event_type = 'begin'
GROUP BY name, depth
ORDER BY depth DESC, deep_operation_count DESC;
-- Track async operation lifecycle with depth context
SELECT time, event_type, name, span_id, parent_span_id, depth
FROM view_instance('async_events', 'my_process_123')
WHERE span_id = 12345
ORDER BY time;
Data Types¶
Properties¶
Key-value pairs stored as List<Struct>
with the following structure:
Common properties fields:
properties
- Event-specific metadata (log properties, metric properties)process_properties
- Process-wide metadata shared across all events from a process
Querying properties:
-- Access property values using property_get function
SELECT property_get(process_properties, 'thread-name') as thread_name
FROM log_entries
WHERE property_get(process_properties, 'thread-name') IS NOT NULL;
Dictionary Compression¶
Most string fields use dictionary compression (Dictionary(Int16, Utf8)
) for storage efficiency:
- Reduces storage space for repeated values
- Improves query performance
- Transparent to SQL queries - use as normal strings
Timestamps¶
All time fields use Timestamp(Nanosecond)
precision:
- Nanosecond resolution for high-precision timing
- UTC timezone assumed
- Compatible with standard SQL time functions
View Relationships¶
Views can be joined to combine information:
-- Join log entries with process information
SELECT l.time, l.level, l.msg, p.exe, p.computer
FROM log_entries l
JOIN processes p ON l.process_id = p.process_id
WHERE l.level <= 2; -- Fatal and Error only
-- Join measures with stream information
SELECT m.time, m.name, m.value, s.tags
FROM measures m
JOIN streams s ON m.stream_id = s.stream_id
WHERE m.name = 'cpu_usage';
Performance Considerations¶
Dictionary Fields¶
Dictionary-compressed fields are optimized for:
- Equality comparisons (
field = 'value'
) - IN clauses (
field IN ('val1', 'val2')
) - LIKE patterns on repeated values
Time-based Queries¶
Always use time ranges for optimal performance:
-- Good - uses time index
WHERE time >= NOW() - INTERVAL '1 hour'
-- Avoid - full table scan
WHERE level <= 3
View Instances¶
Use view_instance()
for process-specific queries:
-- Better performance for single process
SELECT * FROM view_instance('log_entries', 'process_123')
-- Less efficient for single process
SELECT * FROM log_entries WHERE process_id = 'process_123'
Next Steps¶
- Functions Reference - SQL functions available for queries
- Query Patterns - Common observability query patterns
- Performance Guide - Optimize your queries for best performance