Data Discovery and Inspection

This guide demonstrates how to programmatically explore the Mosaico Data Platform to discover ingested sequences and inspect their internal structures.

By following this guide, you will learn how to:

List all sequences available on a remote server.
Access high-level metadata (size, creation time, duration) for a specific recording session.
Drill down into individual topics to identify sensor types and sampling spans.

Prerequisites

This tutorial assumes you have already ingested data into your Mosaico instance, using the example described in the ROS Injection guide.

Experiment Yourself

This guide is fully executable.

Start the Mosaico Infrastructure
Run the example:
```
mosaicolabs.examples data_inspection
```

Full Code

The full code of the example is available here.

In Depth Explanation

Step 1: Connecting to the Catalog

The first step is establishing a control connection. We use the MosaicoClient within a context manager to ensure that network resources and internal connection pools are gracefully released when the script finishes.

from mosaicolabs import MosaicoClient

# MOSAICO_HOST and MOSAICO_PORT are placeholders: define them for your
# deployment before running this snippet.

# Connect to the gateway of the Mosaico Data Platform
with MosaicoClient.connect(host=MOSAICO_HOST, port=MOSAICO_PORT) as client:
    # Retrieve a simple list of all unique sequence identifiers
    seq_list = client.list_sequences()
    print(f"Discovered {len(seq_list)} sequences on the server.")
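
To illustrate why the context manager matters, here is a minimal sketch of the pattern using a hypothetical stand-in class. `FakeClient` and its `is_open` flag are invented for this illustration and say nothing about how `MosaicoClient` is implemented; the point is only that `__exit__` runs on normal exit and on error alike.

```python
class FakeClient:
    """Hypothetical stand-in for a network client (not the Mosaico implementation)."""

    def __init__(self):
        self.is_open = False

    @classmethod
    def connect(cls):
        client = cls()
        client.is_open = True  # pretend a connection pool was opened here
        return client

    def close(self):
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # always runs, even if the block raised
        return False   # do not swallow the exception


def demo_normal_exit():
    with FakeClient.connect() as client:
        open_inside = client.is_open
    return open_inside, client.is_open


def demo_exit_on_error():
    client = FakeClient.connect()
    try:
        with client:
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass
    return client.is_open
```

In both cases the resource is released once the `with` block ends, which is the behavior the snippet above relies on.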

Step 2: Inspecting Sequence Metadata

A Sequence represents a holistic recording session. To inspect its metadata without downloading bulk data, we request a SequenceHandler. This object acts as a "lazy" proxy to the server-side resource.

# Continue inside the `with` block from Step 1 (indent accordingly)
# and iterate through seq_list
for sequence_name in seq_list:
    shandler = client.sequence_handler(sequence_name)

    if shandler:
        # Access physical and logical diagnostics
        size_mb = shandler.total_size_bytes / (1024 * 1024)
        print(f"Sequence: {shandler.name}")
        print(f"• Remote Size: {size_mb:.2f} MB")
        print(f"• Created At:  {shandler.created_datetime}")

        # Determine the global temporal bounds of the entire session
        start, end = shandler.timestamp_ns_min, shandler.timestamp_ns_max
        print(f"• Time Span:   {start} to {end} ns")
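
Raw nanosecond values are hard to read at a glance. The following sketch converts a time span into UTC datetimes plus a duration. It assumes the timestamps are nanoseconds since the Unix epoch, which this guide does not state; `ns_to_utc` and `describe_span` are hypothetical helper names, not part of the Mosaico SDK.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ns_to_utc(timestamp_ns: int) -> datetime:
    """Convert epoch nanoseconds to an aware UTC datetime.

    Integer arithmetic avoids float rounding; the result is truncated to
    microseconds, the finest resolution `datetime` supports.
    """
    return _EPOCH + timedelta(microseconds=timestamp_ns // 1_000)


def describe_span(start_ns: int, end_ns: int) -> str:
    """Format a span as 'start to end (duration s)'."""
    duration_s = (end_ns - start_ns) / 1e9
    start = ns_to_utc(start_ns).isoformat()
    end = ns_to_utc(end_ns).isoformat()
    return f"{start} to {end} ({duration_s:.2f} s)"
```

Under that assumption, `describe_span(shandler.timestamp_ns_min, shandler.timestamp_ns_max)` could replace the raw print in the loop above.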

Step 3: Inspecting Individual Topics

Inside a sequence, data is partitioned into Topics, each corresponding to a specific sensor stream or data channel. We can use the SequenceHandler to spawn a TopicHandler for granular inspection.

# Iterate through all data channels in this sequence
for topic_name in shandler.topics:
    # Obtain a handler for the specific sensor stream
    thandler = shandler.get_topic_handler(topic_name)

    # Identify the semantic type of the data (e.g., 'imu', 'image')
    ontology = thandler.ontology_tag

    # Calculate duration for this specific sensor. Compare against None so
    # that a legitimate timestamp of 0 is not treated as missing.
    duration_sec = 0.0
    if thandler.timestamp_ns_min is not None and thandler.timestamp_ns_max is not None:
        duration_sec = (thandler.timestamp_ns_max - thandler.timestamp_ns_min) / 1e9

    print(f"  - [{ontology}] {topic_name}: {duration_sec:.2f}s of data")
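
Once the per-topic fields have been read, it is often handy to summarize them offline. The sketch below groups topics by ontology tag using a plain record type. `TopicInfo` and `group_by_ontology` are hypothetical names introduced for illustration; the fields simply mirror the handler attributes used above, and treating missing bounds as `None` is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class TopicInfo:
    """Hypothetical record holding values copied from a topic handler."""
    name: str
    ontology_tag: str
    timestamp_ns_min: Optional[int]
    timestamp_ns_max: Optional[int]

    @property
    def duration_sec(self) -> float:
        # Missing bounds (None) mean no data; a timestamp of 0 is still valid.
        if self.timestamp_ns_min is None or self.timestamp_ns_max is None:
            return 0.0
        return (self.timestamp_ns_max - self.timestamp_ns_min) / 1e9


def group_by_ontology(topics: Iterable[TopicInfo]) -> Dict[str, List[str]]:
    """Map each ontology tag to the sorted names of its topics."""
    groups = defaultdict(list)
    for topic in topics:
        groups[topic.ontology_tag].append(topic.name)
    return {tag: sorted(names) for tag, names in sorted(groups.items())}
```

A record could be built inside the loop above as `TopicInfo(topic_name, thandler.ontology_tag, thandler.timestamp_ns_min, thandler.timestamp_ns_max)`.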

Comparisons

Sequence vs. Topic Handlers

| Feature | Sequence Handler | Topic Handler |
| --- | --- | --- |
| Scope | Entire recording session | Single sensor stream |
| Metadata | Mission-wide (e.g., driver, weather) | Sensor-specific (e.g., model, serial) |
| Time Bounds | Global min/max of all topics | Min/max for that specific stream |
| Topics | List of all available streams | N/A |
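
The "Time Bounds" row can be read as a simple reduction over the topics. The sketch below shows that relationship on plain data; `global_bounds` is a hypothetical helper, and whether the platform derives sequence bounds exactly this way (for instance, how it treats empty topics) is an assumption.

```python
from typing import Dict, Optional, Tuple

Bounds = Tuple[Optional[int], Optional[int]]


def global_bounds(topic_bounds: Dict[str, Bounds]) -> Optional[Tuple[int, int]]:
    """Return (earliest start, latest end) across topics, or None if no data.

    `topic_bounds` maps a topic name to its (min_ns, max_ns) pair; a None
    entry stands for a topic without any messages and is skipped.
    """
    starts = [lo for lo, _ in topic_bounds.values() if lo is not None]
    ends = [hi for _, hi in topic_bounds.values() if hi is not None]
    if not starts or not ends:
        return None
    return min(starts), max(ends)
```

For example, topics spanning (100, 900) and (150, 1200) would give sequence-level bounds of (100, 1200).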

Catalog Layer vs. Data Layer (Handlers vs. Streamers)

| Feature | Handlers (Catalog Layer) | Streamers (Data Layer) |
| --- | --- | --- |
| Primary Use | Metadata inspection & discovery | High-volume data retrieval |
| Object Type | `SequenceHandler` / `TopicHandler` | `SequenceDataStreamer` / `TopicDataStreamer` |
| Data Scope | Size, timestamps, ontology tags | Raw sensor messages |