Synthetic Data Pipeline
Build a production-grade pipeline that procedurally generates driving scenes, renders multi-sensor data, auto-labels every frame with pixel-perfect annotations, and exports nuScenes-compliant datasets ready for perception model training.
Track: B -- Synthetic Data & Sensor Sim | Level: Intermediate | Total Time: ~20 hours
Overview
In this project you will build an end-to-end synthetic data generation pipeline for autonomous driving perception. Starting from a programmable scene graph, you will compose realistic driving environments with roads, vehicles, pedestrians, and traffic infrastructure. You will then render those scenes through simulated cameras and lidar sensors, automatically extract ground-truth labels (3D bounding boxes, 2D projections, semantic segmentation, depth maps), and export everything in the nuScenes dataset format so it can be consumed directly by standard perception training frameworks.
Synthetic data is one of the highest-leverage tools available to autonomous driving teams. Real-world data collection costs $5--10 per labeled frame, takes weeks to annotate, and still suffers from class imbalance -- rare actors like scooters, construction workers, or animals appear in less than 2% of drives. Simulation flips every one of those constraints: frames cost fractions of a cent in GPU time, labels are instantaneous and mathematically perfect, and you can spawn any object in any configuration on demand. Companies like Waymo, Cruise, and Applied Intuition generate billions of synthetic frames per year to supplement real data, stress-test perception models on long-tail scenarios, and validate safety-critical behaviors that would be dangerous to stage on public roads.
By the end of this project you will have a working Python package that takes a scene specification (or generates one randomly), renders multi-camera and lidar data, produces a complete set of annotations, and writes a valid nuScenes dataset to disk. You will also build a domain randomization engine that varies lighting, weather, textures, and actor placement so the generated data is diverse enough to improve real-world model generalization. The final deliverable is a CLI tool that can batch-generate hundreds of annotated scenes with a single command.
Learning Objectives
After completing this project, you will be able to:
- Design and manipulate a 3D scene graph -- create hierarchical object trees, attach transforms, and query spatial relationships between actors in a driving environment.
- Implement sensor simulation from first principles -- build a pinhole camera projection pipeline and a raycasting-based lidar simulator, understanding intrinsics, extrinsics, and noise models.
- Generate pixel-perfect auto-labels -- extract 3D bounding boxes, 2D projected boxes, semantic segmentation maps, instance segmentation, and depth maps directly from the scene graph without any manual annotation.
- Write nuScenes-compliant dataset exports -- produce the full relational database structure (scenes, samples, sample_data, ego_pose, calibrated_sensor, annotations) that loads cleanly with the official nuscenes-devkit.
- Apply domain randomization -- systematically vary environmental parameters (lighting, weather, textures, actor density) to maximize dataset diversity and reduce the sim-to-real domain gap.
- Validate synthetic data quality -- build checks for annotation coverage, class distribution balance, calibration consistency, and format compliance.
- Orchestrate batch generation -- combine all components into a configurable pipeline that scales from single-scene debugging to large-scale dataset production.
Prerequisites
Required
- Python proficiency -- comfortable with classes, file I/O, NumPy array operations, and building CLI tools.
- 3D coordinate systems -- understanding of 3D Cartesian coordinates, rotation representations (Euler angles, quaternions, rotation matrices), and rigid-body transforms.
- JSON and data formats -- ability to read/write structured data, work with nested schemas, and validate against expected structures.
Recommended
- Basic rendering concepts -- familiarity with projection, rasterization, and the difference between a mesh and a point cloud.
- nuScenes format familiarity -- having browsed the nuScenes schema documentation or loaded a mini dataset.
- Open3D basics -- prior exposure to Open3D for 3D visualization helps but is not strictly necessary.
Deep Dive Reading
Before starting, read the companion deep dive for theoretical background:
- Synthetic Data for AD Perception Training -- covers domain randomization theory, domain adaptation techniques (FDA, CyCADA), mixed training strategies, and cost/benefit analysis.
Key Concepts
Scene Composition
A driving scene is built from four layers of content:
- Road layout -- lane geometry, road boundaries, intersections, and drivable surfaces. In a minimal pipeline this can be a textured ground plane or a set of parameterized road segments.
- Static infrastructure -- buildings, trees, traffic signs, lane markings, barriers. These objects are placed once and do not move between frames.
- Dynamic actors -- vehicles, pedestrians, cyclists. Each has a pose trajectory over time (position, heading per timestep).
- Traffic signals -- lights with state (green/yellow/red) that change according to a programmed schedule.
Asset placement follows spatial constraints: vehicles stay in lanes, pedestrians on sidewalks, signs at road edges. A good scene composition system enforces these constraints while still allowing randomization within valid bounds.
Scene Graph Structure:
=====================
SceneRoot
+-- RoadNetwork
| +-- Lane_0 (geometry, markings)
| +-- Lane_1
| +-- Intersection_0
+-- StaticObjects
| +-- Building_0 (mesh, pose)
| +-- Tree_0
| +-- TrafficSign_0
+-- DynamicActors
| +-- Vehicle_0 (mesh, trajectory, class="car")
| +-- Vehicle_1 (mesh, trajectory, class="truck")
| +-- Pedestrian_0 (mesh, trajectory, class="pedestrian")
+-- TrafficSignals
| +-- Light_0 (state_schedule, pose)
+-- EgoPose
+-- frame_0: (x, y, z, qw, qx, qy, qz)
+-- frame_1: ...
Coordinate Systems and Transforms
Three coordinate frames matter in the pipeline:
| Frame | Origin | Convention | Use |
|---|---|---|---|
| World | Arbitrary map origin | Right-handed, Z-up | Scene graph, object placement |
| Ego | Center of ego vehicle rear axle | X-forward, Y-left, Z-up | Sensor mounting, annotations |
| Sensor | Sensor optical center | Varies by sensor type | Rendering, raw data |
Converting between frames requires rigid-body transforms (rotation + translation). In homogeneous coordinates:
import numpy as np
def make_transform(rotation_matrix: np.ndarray, translation: np.ndarray) -> np.ndarray:
"""Build a 4x4 homogeneous transform from R (3x3) and t (3,)."""
T = np.eye(4)
T[:3, :3] = rotation_matrix
T[:3, 3] = translation
return T
# Transform a point from sensor frame to world frame:
# p_world = T_world_ego @ T_ego_sensor @ p_sensor
T_world_ego = make_transform(R_ego, t_ego) # ego pose in world
T_ego_sensor = make_transform(R_sensor, t_sensor) # sensor extrinsics
p_sensor_h = np.array([x, y, z, 1.0]) # homogeneous point
p_world_h = T_world_ego @ T_ego_sensor @ p_sensor_h
Camera intrinsics map 3D points in the camera frame to 2D pixel coordinates:
K = [[fx, 0, cx],
[ 0, fy, cy],
[ 0, 0, 1]]
[u, v, 1]^T = (1/z) * K @ [x, y, z]^T
Camera extrinsics define where the camera sits relative to the ego vehicle: the camera's rotation and translation expressed in the ego frame (the camera-to-ego transform, matching the nuScenes calibrated_sensor convention).
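As a quick numeric sanity check of the projection equation (using the focal lengths and image size adopted later in this project; the point coordinates are illustrative):
import numpy as np
# Assumed intrinsics: 1600x900 image, fx = fy = 1266.4, principal point near the center.
K = np.array([[1266.4,    0.0, 816.3],
              [   0.0, 1266.4, 491.5],
              [   0.0,    0.0,   1.0]])
# A point 2 m right, 1 m down, 20 m ahead of the camera (camera frame, z forward).
p_cam = np.array([2.0, 1.0, 20.0])
uvw = K @ p_cam                        # homogeneous pixel coordinates
u, v = uvw[:2] / uvw[2]                # perspective divide
print(f"pixel = ({u:.1f}, {v:.1f})")   # -> approximately (942.9, 554.8)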
Sensor Rendering
Camera rendering follows the classical pinhole model. For each 3D point visible to the camera: (1) transform from world to camera frame, (2) project through the intrinsic matrix, (3) check that the point lies within the image bounds and in front of the camera (z > 0). For image rendering, you can use Open3D's offscreen renderer or a simple Z-buffer rasterizer.
Lidar rendering works via raycasting. For each beam direction in the lidar's scan pattern, cast a ray from the sensor origin into the scene. The first intersection with a mesh surface gives the range measurement. Typical configurations:
| Parameter | Velodyne VLP-32C | Typical Sim Config |
|---|---|---|
| Beams | 32 | 32--64 |
| Horizontal FoV | 360 deg | 360 deg |
| Vertical FoV | -25 to +15 deg | -30 to +10 deg |
| Range | 200 m | 100 m |
| Points/frame | ~70,000 | 50,000--100,000 |
After raycasting, add realistic noise: Gaussian range noise (sigma ~ 0.02 m), random dropouts (1--5% of points), and intensity values derived from surface material properties.
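A minimal sketch of that noise model applied to a vector of ray-cast ranges (the sigma and dropout values match the suggestions above; intensity here is a crude distance-based stand-in):
import numpy as np
rng = np.random.default_rng(0)
ranges = np.array([12.3, 47.8, 95.1])                    # clean ray-cast ranges in meters
noisy = ranges + rng.normal(0.0, 0.02, ranges.shape)     # Gaussian range noise, sigma = 0.02 m
keep = rng.random(ranges.shape) > 0.02                   # ~2% random dropout
noisy = noisy[keep]
intensity = 1.0 - noisy / 100.0                          # placeholder for material-based intensity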
Auto-Labeling
This is where synthetic data shines. Because you control the scene graph, every label is derived analytically:
- 3D bounding boxes: Read directly from each object's pose (center x, y, z), dimensions (length, width, height), and heading. No estimation needed.
- 2D bounding boxes: Project the 8 corners of each 3D box through the camera intrinsics/extrinsics, take the axis-aligned bounding rectangle of the projected points.
- Semantic segmentation: During rendering, assign each pixel the class ID of the object it belongs to. If using a rasterizer, this comes from the material/object ID buffer.
- Instance segmentation: Same as semantic segmentation but with unique per-object IDs instead of per-class IDs.
- Depth maps: Store the Z-buffer value (distance from camera plane) for each pixel during rendering.
def project_3d_box_to_2d(box_corners_3d, T_cam_world, K):
"""Project 8 corners of a 3D box to get a 2D bounding box."""
# box_corners_3d: (8, 3) array of corner positions in world frame
corners_h = np.hstack([box_corners_3d, np.ones((8, 1))]) # (8, 4)
corners_cam = (T_cam_world @ corners_h.T).T # (8, 4) in camera frame
corners_cam = corners_cam[:, :3] # drop homogeneous
# Filter points behind camera
valid = corners_cam[:, 2] > 0
if not np.any(valid):
return None
corners_2d = (K @ corners_cam[valid].T).T # (N, 3)
corners_2d = corners_2d[:, :2] / corners_2d[:, 2:3] # perspective divide
x_min, y_min = corners_2d.min(axis=0)
x_max, y_max = corners_2d.max(axis=0)
return [x_min, y_min, x_max, y_max]
Domain Randomization
Domain randomization is a technique for training models on synthetic data so that they transfer well to the real world. The core insight, formalized by Tobin et al. (2017), is: if a model sees enough variation during training, the real world becomes "just another variation."
Key axes of randomization for driving scenes:
| Axis | Parameters | Range |
|---|---|---|
| Lighting | Sun elevation, azimuth, intensity | 10--80 deg elevation, full azimuth |
| Weather | Rain droplets, fog density, wet road reflections | Clear / light rain / heavy rain / fog |
| Time of day | Ambient light color temperature, shadow length | Dawn / noon / dusk / night |
| Textures | Road surface, building facades, vehicle paint | Texture pool with 10+ variants each |
| Actor density | Number of vehicles, pedestrians per scene | 3--30 vehicles, 0--20 pedestrians |
| Actor placement | Lane position offset, lateral jitter | +/- 0.3 m lateral, +/- 2 m longitudinal |
The randomization should be controlled by a configuration file so experiments are reproducible:
randomization_config = {
"lighting": {
"sun_elevation_range": [10, 80], # degrees
"sun_azimuth_range": [0, 360], # degrees
"intensity_range": [0.6, 1.0], # normalized
},
"weather": {
"options": ["clear", "light_rain", "heavy_rain", "fog"],
"weights": [0.5, 0.2, 0.15, 0.15],
},
"actors": {
"num_vehicles_range": [3, 30],
"num_pedestrians_range": [0, 20],
"lateral_jitter_m": 0.3,
},
"textures": {
"road_variants": 12,
"vehicle_color_pool": ["white", "black", "silver", "red", "blue", "grey"],
},
}
Data Formats (nuScenes)
The nuScenes dataset format uses a relational database stored as JSON files. Understanding its schema is essential for writing compliant exports.
nuScenes Schema (simplified):
==============================
scene 1 --- N sample (a scene contains N keyframes)
sample 1 --- N sample_data (each keyframe has data from each sensor)
sample_data N --- 1 calibrated_sensor (each data record links to a sensor config)
sample_data N --- 1 ego_pose (each data record has an ego pose)
sample 1 --- N sample_annotation (each keyframe has object annotations)
sample_annotation N --- 1 instance (annotations track objects over time)
instance N --- 1 category (each object has a class label)
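Once a dataset in this format is loaded with the nuscenes-devkit, those one-to-many links are traversed with token lookups. A short illustration (the dataroot is a placeholder, and it assumes the first sample has at least one annotation):
from nuscenes.nuscenes import NuScenes
nusc = NuScenes(version="v1.0-synth", dataroot="output/nuscenes_synth", verbose=False)
scene = nusc.scene[0]
sample = nusc.get("sample", scene["first_sample_token"])   # scene -> sample
ann = nusc.get("sample_annotation", sample["anns"][0])     # sample -> annotation
inst = nusc.get("instance", ann["instance_token"])         # annotation -> instance
cat = nusc.get("category", inst["category_token"])         # instance -> category
print(cat["name"])                                          # e.g. "vehicle.car"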
Core JSON tables you need to produce:
| Table | Key Fields | Purpose |
|---|---|---|
| scene.json | name, description, first/last sample token | Top-level scene metadata |
| sample.json | timestamp, scene token, prev/next links | Individual keyframes (2 Hz typical) |
| sample_data.json | sample token, ego_pose token, calibrated_sensor token, filename | Sensor data file reference |
| ego_pose.json | timestamp, translation, rotation (quaternion) | Vehicle pose at each sensor reading |
| calibrated_sensor.json | sensor token, translation, rotation, camera_intrinsic | Sensor mounting and calibration |
| sensor.json | channel name, modality (camera/lidar) | Sensor definition |
| sample_annotation.json | sample token, instance token, category, translation, size, rotation | 3D object annotations |
| instance.json | category token, number of annotations | Object identity across frames |
| category.json | name (e.g., "vehicle.car") | Class taxonomy |
| attribute.json | name (e.g., "vehicle.moving") | Annotation attributes |
A minimal nuScenes-compliant annotation record looks like:
{
"token": "a1b2c3d4e5f6...",
"sample_token": "f6e5d4c3b2a1...",
"instance_token": "1a2b3c4d5e6f...",
"attribute_tokens": ["moving_token_123"],
"visibility_token": "4",
"translation": [100.5, 200.3, 1.2],
"size": [4.5, 1.9, 1.6],
"rotation": [0.707, 0.0, 0.0, 0.707],
"prev": "",
"next": "next_annotation_token",
"num_lidar_pts": 342,
"num_radar_pts": 5,
"category_name": "vehicle.car"
}
Note that translation is in the global frame in meters, size follows the nuScenes [width, length, height] convention, and rotation is a [w, x, y, z] quaternion; category_name is a convenience field the devkit resolves through the instance and category tables.
Step-by-Step Implementation Guide
Step 1: Environment Setup (45 min)
Goal: Set up dependencies, project structure, and verify everything works.
1.1 Create the project
mkdir -p synthetic-data-pipeline/{synth_data,configs,assets,output,notebooks,tests}
cd synthetic-data-pipeline
python -m venv .venv
source .venv/bin/activate
1.2 Install dependencies
pip install numpy scipy open3d Pillow pyquaternion nuscenes-devkit matplotlib tqdm
| Package | Purpose |
|---|---|
| numpy, scipy | Linear algebra, spatial transforms |
| open3d | 3D mesh loading, rendering, point cloud ops |
| Pillow | Image I/O |
| pyquaternion | Quaternion math for rotations |
| nuscenes-devkit | Validate exported datasets |
| matplotlib | Visualization |
| tqdm | Progress bars for batch generation |
1.3 Project structure
synthetic-data-pipeline/
synth_data/
__init__.py
scene_graph.py # Scene graph and asset management
transforms.py # 3D transform utilities
camera.py # Camera model and rendering
lidar.py # Lidar simulation
auto_label.py # Label generation
nuscenes_export.py # nuScenes format writer
randomization.py # Domain randomization engine
pipeline.py # Orchestration
cli.py # Command-line interface
configs/
default.yaml # Default generation parameters
assets/
meshes/ # 3D mesh files (.obj, .ply)
textures/ # Surface textures
output/ # Generated datasets go here
notebooks/ # Jupyter notebooks for exercises
tests/ # Unit tests
1.4 Verify setup
# verify_setup.py
import numpy as np
import open3d as o3d
from pyquaternion import Quaternion
import PIL
from PIL import Image
print(f"NumPy: {np.__version__}")
print(f"Open3D: {o3d.__version__}")
print(f"Pillow: {Image.__version__}")
# Quick transform test
q = Quaternion(axis=[0, 0, 1], angle=np.pi / 4)
print(f"Quaternion: {q}")
print(f"Rotation mat:\n{q.rotation_matrix}")
# Quick mesh test
mesh = o3d.geometry.TriangleMesh.create_box(4.5, 1.9, 1.6)
print(f"Box mesh: {len(mesh.vertices)} vertices, {len(mesh.triangles)} triangles")
print("\nAll checks passed.")
Step 2: Scene Graph and Asset Management (2.5 hours)
Goal: Build the data structures that represent a 3D driving scene.
2.1 Transform3D utility class
# synth_data/transforms.py
import numpy as np
from pyquaternion import Quaternion
from typing import Optional
class Transform3D:
"""Rigid-body transform (rotation + translation) in SE(3)."""
def __init__(
self,
        translation: Optional[np.ndarray] = None,
        rotation: Optional[Quaternion] = None,
    ):
        # Avoid a shared mutable default: fall back to the origin per instance.
        self.translation = (np.zeros(3) if translation is None
                            else np.asarray(translation, dtype=np.float64))
self.rotation = rotation or Quaternion()
@property
def matrix(self) -> np.ndarray:
"""Return 4x4 homogeneous transform matrix."""
T = np.eye(4)
T[:3, :3] = self.rotation.rotation_matrix
T[:3, 3] = self.translation
return T
@property
def inverse(self) -> "Transform3D":
"""Return the inverse transform."""
R_inv = self.rotation.inverse
t_inv = -(R_inv.rotation_matrix @ self.translation)
return Transform3D(translation=t_inv, rotation=R_inv)
def __matmul__(self, other: "Transform3D") -> "Transform3D":
"""Compose two transforms: T_a @ T_b = T_ab."""
new_rotation = self.rotation * other.rotation
new_translation = (
self.rotation.rotation_matrix @ other.translation + self.translation
)
return Transform3D(translation=new_translation, rotation=new_rotation)
def apply(self, points: np.ndarray) -> np.ndarray:
"""Transform an (N, 3) array of points."""
return (self.rotation.rotation_matrix @ points.T).T + self.translation
def __repr__(self):
return f"Transform3D(t={self.translation}, q={self.rotation})"
2.2 Scene graph nodes
Design a SceneNode base class and specialized subclasses:
# synth_data/scene_graph.py
from dataclasses import dataclass, field
from typing import List, Dict, Optional
import numpy as np
import open3d as o3d
from .transforms import Transform3D
@dataclass
class SceneNode:
"""Base node in the scene graph."""
name: str
transform: Transform3D = field(default_factory=Transform3D)
children: List["SceneNode"] = field(default_factory=list)
def world_transform(self, parent_transform: Optional[Transform3D] = None) -> Transform3D:
"""Compute world-frame transform by chaining parent transforms."""
if parent_transform is None:
return self.transform
return parent_transform @ self.transform
@dataclass
class MeshObject(SceneNode):
"""A node with associated 3D geometry."""
mesh: Optional[o3d.geometry.TriangleMesh] = None
class_name: str = "unknown"
instance_id: int = 0
dimensions: np.ndarray = field(default_factory=lambda: np.array([1.0, 1.0, 1.0]))
@dataclass
class DynamicActor(MeshObject):
"""An object with a trajectory over time."""
trajectory: List[Transform3D] = field(default_factory=list)
def pose_at_frame(self, frame_idx: int) -> Transform3D:
"""Get pose at a specific frame, clamping to available range."""
idx = min(frame_idx, len(self.trajectory) - 1)
return self.trajectory[idx]
@dataclass
class SceneGraph:
"""Top-level container for a driving scene."""
name: str
static_objects: List[MeshObject] = field(default_factory=list)
dynamic_actors: List[DynamicActor] = field(default_factory=list)
ego_trajectory: List[Transform3D] = field(default_factory=list)
num_frames: int = 20
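A minimal usage sketch tying these classes together (poses, dimensions, and speeds are arbitrary example values):
from pyquaternion import Quaternion
car = DynamicActor(
    name="Vehicle_0",
    class_name="vehicle.car",
    instance_id=1,
    dimensions=np.array([4.5, 1.9, 1.6]),
    trajectory=[
        Transform3D(translation=np.array([20.0 + 8.0 * 0.5 * i, 3.5, 0.0]),
                    rotation=Quaternion(axis=[0, 0, 1], angle=0.0))
        for i in range(20)                  # constant 8 m/s along +x, 0.5 s per frame
    ],
)
scene = SceneGraph(name="demo_scene", dynamic_actors=[car], num_frames=20)
print(scene.dynamic_actors[0].pose_at_frame(5))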
2.3 Asset loading and management
Write helpers to load OBJ/PLY meshes and create simple procedural assets (box cars, cylinder trees) for testing when you do not have a full asset library:
def create_box_vehicle(length=4.5, width=1.9, height=1.6, color=[0.3, 0.3, 0.8]):
"""Create a simple box mesh representing a vehicle."""
mesh = o3d.geometry.TriangleMesh.create_box(length, width, height)
mesh.translate([-length / 2, -width / 2, 0]) # center at base
mesh.paint_uniform_color(color)
mesh.compute_vertex_normals()
return mesh
def create_cylinder_tree(radius=0.3, trunk_height=3.0, canopy_radius=1.5, canopy_height=2.0):
"""Create a simple tree from cylinder trunk + sphere canopy."""
trunk = o3d.geometry.TriangleMesh.create_cylinder(radius, trunk_height)
trunk.paint_uniform_color([0.4, 0.25, 0.1])
canopy = o3d.geometry.TriangleMesh.create_sphere(canopy_radius)
canopy.translate([0, 0, trunk_height + canopy_radius * 0.5])
canopy.paint_uniform_color([0.1, 0.6, 0.1])
return trunk + canopy
2.4 Randomized scene generation
Build a function that generates a random scene given constraints:
def generate_random_scene(config: dict, rng: np.random.Generator) -> SceneGraph:
"""Generate a randomized driving scene."""
scene = SceneGraph(name=f"scene_{rng.integers(0, 100000):05d}")
scene.num_frames = config.get("num_frames", 20)
# Generate ego trajectory (straight road with slight curvature)
ego_speed = rng.uniform(5.0, 15.0) # m/s
dt = 0.5 # seconds between frames
for i in range(scene.num_frames):
x = ego_speed * dt * i
y = 0.5 * np.sin(0.05 * x) # gentle curve
yaw = np.arctan2(np.cos(0.05 * x) * 0.5 * 0.05, 1.0)
pose = Transform3D(
translation=np.array([x, y, 0.0]),
rotation=Quaternion(axis=[0, 0, 1], angle=yaw),
)
scene.ego_trajectory.append(pose)
# Place random vehicles
num_vehicles = rng.integers(
config["actors"]["num_vehicles_range"][0],
config["actors"]["num_vehicles_range"][1],
)
for v in range(num_vehicles):
# ... (place in nearby lanes with trajectories)
pass # Implementation details in notebook
return scene
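One way to fill in the vehicle-placement placeholder above is a helper that drops each vehicle into a lane roughly parallel to the ego path with a constant-velocity trajectory. A sketch, assuming a 3.5 m lane width and the box-vehicle asset from 2.3 (all ranges are illustrative):
def place_random_vehicle(scene: SceneGraph, rng: np.random.Generator,
                         lateral_jitter_m: float = 0.3) -> DynamicActor:
    """Place one vehicle in a lane roughly parallel to the ego path (sketch)."""
    lane_offset = rng.choice([-3.5, 0.0, 3.5])            # assumed 3.5 m lane width
    lateral = lane_offset + rng.uniform(-lateral_jitter_m, lateral_jitter_m)
    start_x = rng.uniform(5.0, 80.0)                      # longitudinal spawn position
    speed = rng.uniform(0.0, 15.0)                        # m/s; 0 = parked
    dt = 0.5                                              # matches the ego trajectory step
    trajectory = [
        Transform3D(translation=np.array([start_x + speed * dt * i, lateral, 0.0]),
                    rotation=Quaternion(axis=[0, 0, 1], angle=0.0))
        for i in range(scene.num_frames)
    ]
    actor = DynamicActor(
        name=f"Vehicle_{len(scene.dynamic_actors)}",
        mesh=create_box_vehicle(),
        class_name="vehicle.car",
        instance_id=len(scene.dynamic_actors) + 1,
        dimensions=np.array([4.5, 1.9, 1.6]),
        trajectory=trajectory,
    )
    scene.dynamic_actors.append(actor)
    return actor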
Step 3: Camera Rendering Pipeline (3 hours)
Goal: Render RGB images from the 3D scene using a pinhole camera model.
3.1 Camera model
# synth_data/camera.py
import numpy as np
from dataclasses import dataclass
from .transforms import Transform3D
@dataclass
class CameraIntrinsics:
"""Pinhole camera intrinsic parameters."""
fx: float # focal length x (pixels)
fy: float # focal length y (pixels)
cx: float # principal point x (pixels)
cy: float # principal point y (pixels)
width: int # image width
height: int # image height
@property
def matrix(self) -> np.ndarray:
return np.array([
[self.fx, 0.0, self.cx],
[0.0, self.fy, self.cy],
[0.0, 0.0, 1.0],
])
@dataclass
class Camera:
"""A camera with intrinsics and extrinsic pose."""
name: str
intrinsics: CameraIntrinsics
    extrinsics: Transform3D  # Camera pose in the ego frame (camera-to-ego transform)
def project_points(self, points_world: np.ndarray, T_world_ego: Transform3D) -> np.ndarray:
"""Project (N, 3) world points to (N, 2) pixel coordinates.
Returns (N, 3) array where columns are [u, v, depth].
Points behind camera or outside image get depth = -1.
"""
# World -> ego -> camera
T_cam_world = self.extrinsics.inverse @ T_world_ego.inverse
points_cam = T_cam_world.apply(points_world)
# Filter behind camera
mask = points_cam[:, 2] > 0.1
result = np.full((len(points_world), 3), -1.0)
if not np.any(mask):
return result
pts = points_cam[mask]
K = self.intrinsics.matrix
proj = (K @ pts.T).T
uv = proj[:, :2] / proj[:, 2:3]
# Bounds check
in_bounds = (
(uv[:, 0] >= 0) & (uv[:, 0] < self.intrinsics.width) &
(uv[:, 1] >= 0) & (uv[:, 1] < self.intrinsics.height)
)
valid = np.where(mask)[0][in_bounds]
result[valid, :2] = uv[in_bounds]
result[valid, 2] = pts[in_bounds, 2]
return result
3.2 Multi-camera setup
Define a standard six-camera rig (similar to nuScenes):
from typing import List
from pyquaternion import Quaternion

def create_nuscenes_camera_rig() -> List[Camera]:
    """Create a six-camera rig matching the nuScenes sensor layout."""
    intrinsics = CameraIntrinsics(fx=1266.4, fy=1266.4, cx=816.3, cy=491.5,
                                  width=1600, height=900)
    # Fixed rotation from the optical frame (x right, y down, z forward) to the
    # ego-style frame (x forward, y left, z up). Composing it with each camera's
    # yaw ensures that z in the camera frame points along the viewing direction,
    # which is what project_points assumes when it divides by depth.
    R_optical_to_ego = Quaternion(matrix=np.array([
        [0.0, 0.0, 1.0],
        [-1.0, 0.0, 0.0],
        [0.0, -1.0, 0.0],
    ]))
    cameras = []
    configs = [
        ("CAM_FRONT", [1.7, 0.0, 1.5], 0.0),
        ("CAM_FRONT_LEFT", [1.5, 0.5, 1.5], 55.0),
        ("CAM_FRONT_RIGHT", [1.5, -0.5, 1.5], -55.0),
        ("CAM_BACK", [-0.3, 0.0, 1.5], 180.0),
        ("CAM_BACK_LEFT", [-0.3, 0.5, 1.5], 110.0),
        ("CAM_BACK_RIGHT", [-0.3, -0.5, 1.5], -110.0),
    ]
    for name, translation, yaw_deg in configs:
        yaw = np.radians(yaw_deg)
        rotation = Quaternion(axis=[0, 0, 1], angle=yaw) * R_optical_to_ego
        extrinsics = Transform3D(
            translation=np.array(translation),
            rotation=rotation,
        )
        cameras.append(Camera(name=name, intrinsics=intrinsics, extrinsics=extrinsics))
    return cameras
3.3 Rendering with Open3D
Use Open3D's offscreen renderer to produce RGB images:
import open3d as o3d
def render_camera_image(scene_meshes, camera, ego_pose, width=1600, height=900):
"""Render an RGB image from the given camera viewpoint."""
renderer = o3d.visualization.rendering.OffscreenRenderer(width, height)
renderer.scene.set_background([0.6, 0.75, 0.9, 1.0]) # sky blue
# Add all scene meshes
for i, mesh in enumerate(scene_meshes):
mat = o3d.visualization.rendering.MaterialRecord()
mat.shader = "defaultLit"
renderer.scene.add_geometry(f"obj_{i}", mesh, mat)
# Set camera from intrinsics/extrinsics
T_cam_world = camera.extrinsics.inverse @ ego_pose.inverse
# ... configure renderer camera from T_cam_world and intrinsics
img = renderer.render_to_image()
return np.asarray(img)
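One way to fill in the camera-configuration placeholder is Open3D's setup_camera, which can take a 3x3 intrinsic matrix and a 4x4 world-to-camera extrinsic; the exact overloads vary across Open3D versions, so treat this as a sketch:
def configure_renderer_camera(renderer, camera, ego_pose):
    """Point the offscreen renderer at the scene using our intrinsics/extrinsics (sketch)."""
    # World -> camera (optical frame) as a 4x4 matrix.
    T_cam_world = (camera.extrinsics.inverse @ ego_pose.inverse).matrix
    renderer.setup_camera(
        camera.intrinsics.matrix,        # 3x3 K
        T_cam_world,                     # 4x4 extrinsic, world -> camera
        camera.intrinsics.width,
        camera.intrinsics.height,
    )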
Step 4: Lidar Simulation (2.5 hours)
Goal: Generate synthetic lidar point clouds by raycasting into the scene.
4.1 Lidar configuration
# synth_data/lidar.py
from dataclasses import dataclass
from typing import Optional
import numpy as np
@dataclass
class LidarConfig:
"""Configuration for a spinning lidar sensor."""
num_beams: int = 32
horizontal_fov: float = 360.0 # degrees
vertical_fov_up: float = 10.0 # degrees above horizon
vertical_fov_down: float = 30.0 # degrees below horizon
horizontal_resolution: float = 0.2 # degrees between horizontal samples
max_range: float = 100.0 # meters
min_range: float = 0.5 # meters
range_noise_std: float = 0.02 # meters, Gaussian noise
dropout_rate: float = 0.02 # fraction of points to drop
    mount_position: Optional[np.ndarray] = None  # [x, y, z] in ego frame
def __post_init__(self):
if self.mount_position is None:
self.mount_position = np.array([0.0, 0.0, 1.8]) # roof-mounted
4.2 Ray generation
def generate_ray_directions(config: LidarConfig) -> np.ndarray:
"""Generate unit ray direction vectors for all lidar beams.
Returns: (N, 3) array of ray directions in sensor frame.
"""
v_angles = np.linspace(
np.radians(-config.vertical_fov_down),
np.radians(config.vertical_fov_up),
config.num_beams,
)
num_h = int(config.horizontal_fov / config.horizontal_resolution)
h_angles = np.linspace(0, np.radians(config.horizontal_fov), num_h, endpoint=False)
directions = []
for v in v_angles:
for h in h_angles:
dx = np.cos(v) * np.cos(h)
dy = np.cos(v) * np.sin(h)
dz = np.sin(v)
directions.append([dx, dy, dz])
return np.array(directions)
4.3 Raycasting with Open3D
def simulate_lidar(scene_mesh, config: LidarConfig, ego_pose: Transform3D) -> np.ndarray:
"""Cast rays into scene and return point cloud in ego frame.
Returns: (N, 4) array of [x, y, z, intensity].
"""
# Create Open3D raycasting scene
ray_scene = o3d.t.geometry.RaycastingScene()
mesh_t = o3d.t.geometry.TriangleMesh.from_legacy(scene_mesh)
ray_scene.add_triangles(mesh_t)
# Generate rays in world frame
directions = generate_ray_directions(config)
sensor_pos_world = ego_pose.apply(
config.mount_position.reshape(1, 3)
).flatten()
origins = np.tile(sensor_pos_world, (len(directions), 1))
dirs_world = ego_pose.rotation.rotation_matrix @ directions.T
dirs_world = dirs_world.T
rays = np.hstack([origins, dirs_world]).astype(np.float32)
rays_tensor = o3d.core.Tensor(rays)
# Cast rays
result = ray_scene.cast_rays(rays_tensor)
t_hit = result['t_hit'].numpy()
# Filter valid hits
valid = (t_hit > config.min_range) & (t_hit < config.max_range)
# Add noise
rng = np.random.default_rng()
noise = rng.normal(0, config.range_noise_std, t_hit.shape)
t_hit_noisy = t_hit + noise
# Apply dropout
dropout_mask = rng.random(len(t_hit)) > config.dropout_rate
valid = valid & dropout_mask
# Compute hit points in world frame
hit_points = origins[valid] + dirs_world[valid] * t_hit_noisy[valid, np.newaxis]
# Transform to ego frame
hit_ego = ego_pose.inverse.apply(hit_points)
# Intensity (simplified: based on distance)
intensity = 1.0 - (t_hit_noisy[valid] / config.max_range)
return np.column_stack([hit_ego, intensity])
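nuScenes stores each lidar sweep as a flat binary .pcd.bin file containing five float32 values per point (x, y, z, intensity, ring index), so the exporter needs a small writer for the clouds produced above; a sketch:
def save_nuscenes_pointcloud(points: np.ndarray, filepath: str) -> None:
    """Write an (N, 4) [x, y, z, intensity] cloud as a nuScenes-style .pcd.bin file.

    The ring index column is filled with zeros because this simulator does not
    track which beam produced each point.
    """
    ring = np.zeros((points.shape[0], 1), dtype=np.float32)
    data = np.hstack([points.astype(np.float32), ring])
    data.tofile(filepath)  # raw float32 records, readable by LidarPointCloud.from_file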
Step 5: Auto-Labeling System (3 hours)
Goal: Generate ground-truth annotations automatically from the scene graph.
5.1 3D bounding boxes
# synth_data/auto_label.py
from dataclasses import dataclass
from typing import List, Optional
import numpy as np
@dataclass
class BoundingBox3D:
"""3D bounding box annotation."""
center: np.ndarray # (3,) center in world frame
dimensions: np.ndarray # (3,) [length, width, height]
rotation: "Quaternion" # orientation
class_name: str
instance_id: int
num_lidar_pts: int = 0
def extract_3d_boxes(scene: "SceneGraph", frame_idx: int) -> List[BoundingBox3D]:
"""Extract 3D bounding boxes for all actors at a given frame."""
boxes = []
for actor in scene.dynamic_actors:
pose = actor.pose_at_frame(frame_idx)
box = BoundingBox3D(
center=pose.translation + np.array([0, 0, actor.dimensions[2] / 2]),
dimensions=actor.dimensions,
rotation=pose.rotation,
class_name=actor.class_name,
instance_id=actor.instance_id,
)
boxes.append(box)
for obj in scene.static_objects:
if obj.class_name in ("building", "ground"):
continue # skip non-annotatable static objects
box = BoundingBox3D(
center=obj.transform.translation + np.array([0, 0, obj.dimensions[2] / 2]),
dimensions=obj.dimensions,
rotation=obj.transform.rotation,
class_name=obj.class_name,
instance_id=obj.instance_id,
)
boxes.append(box)
return boxes
5.2 2D box projection and visibility
def project_boxes_to_2d(
boxes_3d: List[BoundingBox3D],
camera: "Camera",
ego_pose: "Transform3D",
) -> List[Optional[List[float]]]:
"""Project 3D boxes to 2D axis-aligned bounding boxes in camera image.
Returns list of [x_min, y_min, x_max, y_max] or None if not visible.
"""
results = []
for box in boxes_3d:
corners_3d = get_box_corners(box.center, box.dimensions, box.rotation)
projected = camera.project_points(corners_3d, ego_pose)
        visible = projected[:, 2] > 0  # depth > 0 means in front of the camera and inside the image bounds
if not np.any(visible):
results.append(None)
continue
uv = projected[visible, :2]
x_min, y_min = np.clip(uv.min(axis=0), 0,
[camera.intrinsics.width - 1, camera.intrinsics.height - 1])
x_max, y_max = np.clip(uv.max(axis=0), 0,
[camera.intrinsics.width - 1, camera.intrinsics.height - 1])
# Minimum box size filter
if (x_max - x_min) < 5 or (y_max - y_min) < 5:
results.append(None)
continue
results.append([float(x_min), float(y_min), float(x_max), float(y_max)])
return results
def get_box_corners(center, dimensions, rotation):
"""Compute 8 corners of a 3D bounding box."""
l, w, h = dimensions / 2
corners_local = np.array([
[ l, w, -h], [ l, -w, -h], [-l, -w, -h], [-l, w, -h], # bottom
[ l, w, h], [ l, -w, h], [-l, -w, h], [-l, w, h], # top
])
corners_world = (rotation.rotation_matrix @ corners_local.T).T + center
return corners_world
5.3 Semantic and instance segmentation
During rendering, maintain a parallel "label buffer" alongside the RGB buffer. Each pixel stores the class ID and instance ID of the rendered object:
def generate_segmentation_maps(
scene: "SceneGraph",
camera: "Camera",
ego_pose: "Transform3D",
frame_idx: int,
) -> tuple:
"""Render semantic and instance segmentation maps.
Returns:
semantic_map: (H, W) uint8 array of class IDs
instance_map: (H, W) int32 array of instance IDs
"""
H, W = camera.intrinsics.height, camera.intrinsics.width
semantic_map = np.zeros((H, W), dtype=np.uint8)
instance_map = np.zeros((H, W), dtype=np.int32)
depth_map = np.full((H, W), np.inf, dtype=np.float32)
CLASS_TO_ID = {
"vehicle.car": 1, "vehicle.truck": 2, "vehicle.bus": 3,
"human.pedestrian": 4, "vehicle.bicycle": 5,
"static.traffic_sign": 6, "static.tree": 7,
}
# For each object, project its mesh triangles and fill the label buffer
# (simplified -- production code would use GPU rasterization)
for actor in scene.dynamic_actors:
pose = actor.pose_at_frame(frame_idx)
class_id = CLASS_TO_ID.get(actor.class_name, 0)
# ... rasterize mesh, fill semantic_map and instance_map
# where depth < existing depth_map value
return semantic_map, instance_map
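One practical way to fill in the rasterization placeholder without a GPU pipeline is to reuse Open3D's raycasting: add each labeled object's mesh as its own geometry, cast one ray per pixel (the ray construction mirrors the depth-map code in 5.4), and map the returned geometry IDs back to class and instance IDs. A sketch:
def label_maps_from_raycasting(meshes, class_ids, instance_ids, rays, H, W):
    """Build semantic/instance maps by casting one ray per pixel (sketch).

    meshes:       list of legacy TriangleMesh, one per labeled object
    class_ids:    class ID for each mesh (same order)
    instance_ids: instance ID for each mesh
    rays:         (H*W, 6) float32 array of [origin, direction] rows, one per pixel
    """
    ray_scene = o3d.t.geometry.RaycastingScene()
    geom_to_idx = {}
    for i, mesh in enumerate(meshes):
        geom_id = ray_scene.add_triangles(o3d.t.geometry.TriangleMesh.from_legacy(mesh))
        geom_to_idx[geom_id] = i
    result = ray_scene.cast_rays(o3d.core.Tensor(rays))
    geom_ids = result["geometry_ids"].numpy().reshape(H, W)
    semantic_map = np.zeros((H, W), dtype=np.uint8)
    instance_map = np.zeros((H, W), dtype=np.int32)
    hit = geom_ids != o3d.t.geometry.RaycastingScene.INVALID_ID
    for geom_id, idx in geom_to_idx.items():
        mask = hit & (geom_ids == geom_id)
        semantic_map[mask] = class_ids[idx]
        instance_map[mask] = instance_ids[idx]
    return semantic_map, instance_map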
5.4 Depth map generation
def generate_depth_map(
scene_mesh,
camera: "Camera",
ego_pose: "Transform3D",
) -> np.ndarray:
"""Generate a dense depth map via raycasting from the camera.
Returns: (H, W) float32 array of depths in meters. Inf where no hit.
"""
H, W = camera.intrinsics.height, camera.intrinsics.width
K_inv = np.linalg.inv(camera.intrinsics.matrix)
# Generate pixel ray directions
u, v = np.meshgrid(np.arange(W), np.arange(H))
pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
ray_dirs_cam = (K_inv @ pixels.T).T # (H*W, 3) in camera frame
# Transform to world frame
T_world_cam = ego_pose @ camera.extrinsics
ray_dirs_world = T_world_cam.rotation.rotation_matrix @ ray_dirs_cam.T
ray_dirs_world = ray_dirs_world.T
ray_dirs_world /= np.linalg.norm(ray_dirs_world, axis=1, keepdims=True)
cam_pos_world = T_world_cam.translation
origins = np.tile(cam_pos_world, (len(ray_dirs_world), 1))
# Cast rays using Open3D
ray_scene = o3d.t.geometry.RaycastingScene()
mesh_t = o3d.t.geometry.TriangleMesh.from_legacy(scene_mesh)
ray_scene.add_triangles(mesh_t)
rays = np.hstack([origins, ray_dirs_world]).astype(np.float32)
result = ray_scene.cast_rays(o3d.core.Tensor(rays))
t_hit = result['t_hit'].numpy().reshape(H, W)
return t_hit
Step 6: nuScenes Format Export (3 hours)
Goal: Write a complete nuScenes-compliant dataset to disk.
6.1 Token generation
nuScenes uses 32-character hex tokens as primary keys:
import hashlib
import uuid
def generate_token() -> str:
"""Generate a unique 32-char hex token."""
return uuid.uuid4().hex
def deterministic_token(seed_string: str) -> str:
"""Generate a deterministic token from a string (for reproducibility)."""
return hashlib.md5(seed_string.encode()).hexdigest()
6.2 NuScenesExporter class
# synth_data/nuscenes_export.py
import json
import os
from pathlib import Path
from typing import List, Dict
import numpy as np
class NuScenesExporter:
"""Exports synthetic data to nuScenes format."""
def __init__(self, output_dir: str, version: str = "v1.0-synth"):
self.output_dir = Path(output_dir)
self.version = version
self.tables = {
"scene": [],
"sample": [],
"sample_data": [],
"sample_annotation": [],
"instance": [],
"category": [],
"attribute": [],
"sensor": [],
"calibrated_sensor": [],
"ego_pose": [],
"log": [],
"map": [],
"visibility": [],
}
self._setup_directories()
self._init_static_tables()
def _setup_directories(self):
"""Create nuScenes directory structure."""
dirs = [
self.output_dir / self.version,
self.output_dir / "samples" / "CAM_FRONT",
self.output_dir / "samples" / "CAM_FRONT_LEFT",
self.output_dir / "samples" / "CAM_FRONT_RIGHT",
self.output_dir / "samples" / "CAM_BACK",
self.output_dir / "samples" / "CAM_BACK_LEFT",
self.output_dir / "samples" / "CAM_BACK_RIGHT",
self.output_dir / "samples" / "LIDAR_TOP",
self.output_dir / "sweeps",
]
for d in dirs:
d.mkdir(parents=True, exist_ok=True)
def _init_static_tables(self):
"""Initialize category, attribute, visibility, sensor tables."""
# Categories
categories = [
"vehicle.car", "vehicle.truck", "vehicle.bus",
"human.pedestrian.adult", "human.pedestrian.child",
"vehicle.bicycle", "vehicle.motorcycle",
"movable_object.barrier", "movable_object.trafficcone",
]
for cat in categories:
self.tables["category"].append({
"token": deterministic_token(cat),
"name": cat,
"description": "",
})
# Visibility levels
for level, desc in [(1, "0-40%"), (2, "40-60%"), (3, "60-80%"), (4, "80-100%")]:
self.tables["visibility"].append({
"token": str(level),
"level": desc,
"description": f"Visibility {desc}",
})
# Sensors
sensor_configs = [
("CAM_FRONT", "camera"), ("CAM_FRONT_LEFT", "camera"),
("CAM_FRONT_RIGHT", "camera"), ("CAM_BACK", "camera"),
("CAM_BACK_LEFT", "camera"), ("CAM_BACK_RIGHT", "camera"),
("LIDAR_TOP", "lidar"),
]
for channel, modality in sensor_configs:
self.tables["sensor"].append({
"token": deterministic_token(channel),
"channel": channel,
"modality": modality,
})
def add_scene(self, scene: "SceneGraph", cameras, lidar_config, rendered_data):
"""Export a complete scene with all frames and annotations."""
scene_token = generate_token()
sample_tokens = []
for frame_idx in range(scene.num_frames):
sample_token = generate_token()
timestamp = int(frame_idx * 0.5 * 1e6) # microseconds
# Create ego_pose record
ego = scene.ego_trajectory[frame_idx]
ego_pose_token = generate_token()
self.tables["ego_pose"].append({
"token": ego_pose_token,
"timestamp": timestamp,
"translation": ego.translation.tolist(),
"rotation": [ego.rotation.w, ego.rotation.x,
ego.rotation.y, ego.rotation.z],
})
# Create sample_data for each sensor
# ... (camera images, lidar point clouds)
# Create sample_annotation for each visible object
# ... (3D boxes in ego frame)
sample_tokens.append(sample_token)
# Link samples with prev/next
        for i, token in enumerate(sample_tokens):
            pass  # ... set prev/next pointers on each sample record
# Write scene record
self.tables["scene"].append({
"token": scene_token,
"name": scene.name,
"description": "Synthetically generated scene",
"log_token": generate_token(),
"nbr_samples": scene.num_frames,
"first_sample_token": sample_tokens[0],
"last_sample_token": sample_tokens[-1],
})
def save(self):
"""Write all JSON table files to disk."""
version_dir = self.output_dir / self.version
for table_name, records in self.tables.items():
filepath = version_dir / f"{table_name}.json"
with open(filepath, "w") as f:
json.dump(records, f, indent=2)
print(f"Saved {sum(len(v) for v in self.tables.values())} "
f"records across {len(self.tables)} tables")
6.3 Validation with nuscenes-devkit
After export, verify the dataset loads correctly:
from nuscenes.nuscenes import NuScenes
def validate_export(output_dir: str, version: str = "v1.0-synth"):
"""Load the exported dataset with nuscenes-devkit and run checks."""
nusc = NuScenes(version=version, dataroot=output_dir, verbose=True)
print(f"Scenes: {len(nusc.scene)}")
print(f"Samples: {len(nusc.sample)}")
print(f"Annotations: {len(nusc.sample_annotation)}")
# Verify we can traverse the data
scene = nusc.scene[0]
sample_token = scene["first_sample_token"]
while sample_token:
sample = nusc.get("sample", sample_token)
# Check each sensor has data
for channel in ["CAM_FRONT", "LIDAR_TOP"]:
assert channel in sample["data"], f"Missing {channel} data"
sample_token = sample.get("next", "")
if not sample_token:
break
print("Validation passed.")
Step 7: Domain Randomization Engine (2 hours)
Goal: Make generated scenes diverse enough to improve model generalization.
7.1 Randomization manager
# synth_data/randomization.py
import numpy as np
from dataclasses import dataclass, field
from typing import Dict, Any, List
@dataclass
class RandomizationConfig:
"""Full configuration for domain randomization."""
seed: int = 42
lighting: Dict[str, Any] = field(default_factory=lambda: {
"sun_elevation_range": [10, 80],
"sun_azimuth_range": [0, 360],
"intensity_range": [0.6, 1.0],
})
weather: Dict[str, Any] = field(default_factory=lambda: {
"options": ["clear", "light_rain", "heavy_rain", "fog"],
"weights": [0.5, 0.2, 0.15, 0.15],
})
actors: Dict[str, Any] = field(default_factory=lambda: {
"num_vehicles_range": [3, 30],
"num_pedestrians_range": [0, 20],
"lateral_jitter_m": 0.3,
})
textures: Dict[str, Any] = field(default_factory=lambda: {
"vehicle_colors": [
[0.9, 0.9, 0.9], # white
[0.1, 0.1, 0.1], # black
[0.7, 0.7, 0.7], # silver
[0.8, 0.1, 0.1], # red
[0.1, 0.2, 0.7], # blue
],
})
class DomainRandomizer:
"""Applies domain randomization to scene generation."""
def __init__(self, config: RandomizationConfig):
self.config = config
self.rng = np.random.default_rng(config.seed)
def sample_lighting(self) -> Dict[str, float]:
cfg = self.config.lighting
return {
"sun_elevation": self.rng.uniform(*cfg["sun_elevation_range"]),
"sun_azimuth": self.rng.uniform(*cfg["sun_azimuth_range"]),
"intensity": self.rng.uniform(*cfg["intensity_range"]),
}
def sample_weather(self) -> str:
cfg = self.config.weather
return self.rng.choice(cfg["options"], p=cfg["weights"])
def sample_vehicle_color(self) -> List[float]:
colors = self.config.textures["vehicle_colors"]
idx = self.rng.integers(0, len(colors))
# Add slight per-channel jitter
color = np.array(colors[idx]) + self.rng.normal(0, 0.03, 3)
return np.clip(color, 0, 1).tolist()
def apply_weather_effects(self, image: np.ndarray, weather: str) -> np.ndarray:
"""Apply post-processing weather effects to a rendered image."""
if weather == "clear":
return image
elif weather == "fog":
fog_density = self.rng.uniform(0.3, 0.7)
fog_color = np.array([200, 200, 210], dtype=np.float32)
blended = image.astype(np.float32) * (1 - fog_density) + fog_color * fog_density
return np.clip(blended, 0, 255).astype(np.uint8)
elif weather in ("light_rain", "heavy_rain"):
# Add rain streaks and darken image
factor = 0.85 if weather == "light_rain" else 0.65
darkened = (image.astype(np.float32) * factor).astype(np.uint8)
# Add rain streaks (simplified)
num_streaks = 200 if weather == "light_rain" else 800
for _ in range(num_streaks):
x = self.rng.integers(0, image.shape[1])
y = self.rng.integers(0, image.shape[0] - 20)
length = self.rng.integers(10, 30)
darkened[y:y+length, x] = np.minimum(
darkened[y:y+length, x].astype(int) + 40, 255
).astype(np.uint8)
return darkened
return image
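Wiring the randomizer into generation is then a matter of sampling once per scene and threading the values through rendering; a short usage sketch:
randomizer = DomainRandomizer(RandomizationConfig(seed=7))
lighting = randomizer.sample_lighting()      # e.g. {"sun_elevation": ..., "sun_azimuth": ..., "intensity": ...}
weather = randomizer.sample_weather()        # e.g. "light_rain"
color = randomizer.sample_vehicle_color()    # jittered RGB triple for one vehicle
# After rendering each camera frame:
# image = randomizer.apply_weather_effects(image, weather)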
7.2 Configuration files
Store randomization parameters in YAML for reproducibility:
# configs/default.yaml
pipeline:
num_scenes: 50
frames_per_scene: 20
frame_rate: 2.0 # Hz
randomization:
seed: 42
lighting:
sun_elevation_range: [10, 80]
sun_azimuth_range: [0, 360]
intensity_range: [0.6, 1.0]
weather:
options: [clear, light_rain, heavy_rain, fog]
weights: [0.5, 0.2, 0.15, 0.15]
actors:
num_vehicles_range: [3, 30]
num_pedestrians_range: [0, 20]
lateral_jitter_m: 0.3
textures:
vehicle_color_pool: [white, black, silver, red, blue, grey, green]
road_surface_variants: 8
sensors:
cameras:
width: 1600
height: 900
fx: 1266.4
fy: 1266.4
lidar:
num_beams: 32
max_range: 100.0
horizontal_resolution: 0.2
    range_noise_std: 0.02   # key must match LidarConfig.range_noise_std
dropout_rate: 0.02
Step 8: Pipeline Orchestration and Validation (2 hours)
Goal: Wire everything together into a batch generation tool with quality checks.
8.1 Pipeline orchestrator
# synth_data/pipeline.py
import yaml
import time
from pathlib import Path
from tqdm import tqdm
class SyntheticDataPipeline:
"""End-to-end synthetic data generation pipeline."""
def __init__(self, config_path: str, output_dir: str):
with open(config_path) as f:
self.config = yaml.safe_load(f)
self.output_dir = Path(output_dir)
self.randomizer = DomainRandomizer(
RandomizationConfig(**self.config.get("randomization", {}))
)
self.cameras = create_nuscenes_camera_rig()
self.lidar_config = LidarConfig(**self.config.get("sensors", {}).get("lidar", {}))
self.exporter = NuScenesExporter(str(self.output_dir))
self.stats = GenerationStats()
def generate(self):
"""Run the full generation pipeline."""
num_scenes = self.config["pipeline"]["num_scenes"]
print(f"Generating {num_scenes} scenes...")
for scene_idx in tqdm(range(num_scenes)):
# 1. Compose scene
scene = generate_random_scene(self.config["randomization"], self.randomizer.rng)
weather = self.randomizer.sample_weather()
# 2. Render all frames
rendered = self._render_scene(scene, weather)
# 3. Generate labels
labels = self._label_scene(scene)
# 4. Export to nuScenes format
self.exporter.add_scene(scene, self.cameras, self.lidar_config, rendered)
# 5. Update stats
self.stats.update(scene, labels)
# Finalize
self.exporter.save()
self._run_validation()
self._write_report()
def _render_scene(self, scene, weather):
"""Render all sensors for all frames in a scene."""
rendered_data = {"cameras": {}, "lidar": []}
for frame_idx in range(scene.num_frames):
ego_pose = scene.ego_trajectory[frame_idx]
# Render cameras
for cam in self.cameras:
img = render_camera_image(
self._build_scene_meshes(scene, frame_idx),
cam, ego_pose,
)
img = self.randomizer.apply_weather_effects(img, weather)
rendered_data["cameras"].setdefault(cam.name, []).append(img)
# Render lidar
pc = simulate_lidar(
self._build_combined_mesh(scene, frame_idx),
self.lidar_config, ego_pose,
)
rendered_data["lidar"].append(pc)
return rendered_data
def _label_scene(self, scene):
"""Generate all labels for a scene."""
all_labels = []
for frame_idx in range(scene.num_frames):
boxes_3d = extract_3d_boxes(scene, frame_idx)
boxes_2d = {}
for cam in self.cameras:
boxes_2d[cam.name] = project_boxes_to_2d(
boxes_3d, cam, scene.ego_trajectory[frame_idx]
)
all_labels.append({"boxes_3d": boxes_3d, "boxes_2d": boxes_2d})
return all_labels
8.2 Quality validation checks
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class GenerationStats:
"""Track statistics across the generation run."""
total_scenes: int = 0
total_frames: int = 0
total_annotations: int = 0
class_counts: Dict[str, int] = field(default_factory=dict)
weather_counts: Dict[str, int] = field(default_factory=dict)
def update(self, scene, labels):
self.total_scenes += 1
self.total_frames += scene.num_frames
for frame_labels in labels:
for box in frame_labels["boxes_3d"]:
self.total_annotations += 1
self.class_counts[box.class_name] = (
self.class_counts.get(box.class_name, 0) + 1
)
def run_quality_checks(output_dir: str, stats: GenerationStats) -> Dict[str, bool]:
"""Run quality validation on the generated dataset."""
checks = {}
# 1. Format compliance -- does nuscenes-devkit load it?
try:
nusc = NuScenes(version="v1.0-synth", dataroot=output_dir, verbose=False)
checks["format_valid"] = True
    except Exception:
        checks["format_valid"] = False
        return checks  # the remaining checks require a loadable dataset
# 2. Annotation coverage -- every frame has at least one annotation
empty_frames = 0
for sample in nusc.sample:
        _, anns, _ = nusc.get_sample_data(sample["data"]["LIDAR_TOP"])  # (path, boxes, intrinsic)
if len(anns) == 0:
empty_frames += 1
checks["annotation_coverage"] = empty_frames / len(nusc.sample) < 0.05
# 3. Class distribution -- no single class > 80% of annotations
total = sum(stats.class_counts.values())
max_fraction = max(stats.class_counts.values()) / total if total > 0 else 0
checks["class_balance"] = max_fraction < 0.8
# 4. Calibration consistency -- intrinsics match across samples
checks["calibration_consistent"] = True # verify programmatically
return checks
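The _write_report step called by the pipeline can be as simple as dumping the stats object plus a class-distribution histogram; a sketch (file names and plot styling are arbitrary):
import json
import matplotlib
matplotlib.use("Agg")   # headless-safe backend for batch runs
import matplotlib.pyplot as plt

def write_diversity_report(stats: GenerationStats, output_dir: str) -> None:
    """Write a JSON summary and a class-distribution bar chart (sketch)."""
    report = {
        "total_scenes": stats.total_scenes,
        "total_frames": stats.total_frames,
        "total_annotations": stats.total_annotations,
        "class_counts": stats.class_counts,
        "weather_counts": stats.weather_counts,
    }
    with open(f"{output_dir}/diversity_report.json", "w") as f:
        json.dump(report, f, indent=2)

    classes = list(stats.class_counts.keys())
    counts = [stats.class_counts[c] for c in classes]
    plt.figure(figsize=(8, 4))
    plt.bar(classes, counts)
    plt.ylabel("annotations")
    plt.xticks(rotation=45, ha="right")
    plt.tight_layout()
    plt.savefig(f"{output_dir}/class_distribution.png")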
8.3 CLI interface
# synth_data/cli.py
import argparse

# Module paths below assume the package layout from Step 1.3.
from .pipeline import SyntheticDataPipeline
from .nuscenes_export import validate_export
from .randomization import DomainRandomizer, RandomizationConfig
def main():
parser = argparse.ArgumentParser(description="Synthetic Data Generation Pipeline")
parser.add_argument("--config", type=str, default="configs/default.yaml",
help="Path to generation config YAML")
parser.add_argument("--output", type=str, default="output/nuscenes_synth",
help="Output directory for generated dataset")
parser.add_argument("--num-scenes", type=int, default=None,
help="Override number of scenes to generate")
parser.add_argument("--validate-only", action="store_true",
help="Only run validation on existing dataset")
parser.add_argument("--seed", type=int, default=None,
help="Override random seed")
args = parser.parse_args()
if args.validate_only:
validate_export(args.output)
return
pipeline = SyntheticDataPipeline(args.config, args.output)
    if args.num_scenes is not None:
pipeline.config["pipeline"]["num_scenes"] = args.num_scenes
    if args.seed is not None:
pipeline.randomizer = DomainRandomizer(
RandomizationConfig(seed=args.seed)
)
pipeline.generate()
if __name__ == "__main__":
main()
Usage:
# Generate 50 scenes with default config
python -m synth_data.cli --config configs/default.yaml --output output/nuscenes_synth
# Quick test with 3 scenes
python -m synth_data.cli --num-scenes 3 --seed 123
# Validate an existing export
python -m synth_data.cli --validate-only --output output/nuscenes_synth
Notebook Exercises
| # | Notebook | Focus | Time |
|---|---|---|---|
| 1 | 01_scene_composition.ipynb | Build a scene graph from scratch. Place vehicles and static objects. Visualize the scene in 3D with Open3D. Experiment with randomized placement. | 60 min |
| 2 | 02_sensor_rendering.ipynb | Set up a pinhole camera and render images from the scene. Configure a lidar sensor and generate point clouds. Visualize multi-camera and lidar outputs side by side. | 60 min |
| 3 | 03_auto_labeling.ipynb | Extract 3D bounding boxes from the scene graph. Project to 2D for each camera. Generate semantic segmentation and depth maps. Overlay labels on rendered images for visual validation. | 60 min |
| 4 | 04_nuscenes_export.ipynb | Export a generated scene to nuScenes format. Load and browse the dataset with nuscenes-devkit. Apply domain randomization and export a batch of diverse scenes. Run quality validation checks. | 60 min |
Each notebook includes:
- Step-by-step code cells with detailed comments
- Visualization cells showing intermediate results
- Challenge exercises for deeper exploration
- A "check your work" cell comparing output against reference values
Expected Deliverables
- Python package (synth_data/) -- modular library with scene graph, sensors, labeling, export, and randomization modules. Installable via pip install -e .
- CLI tool -- python -m synth_data.cli for batch dataset generation with configurable parameters.
- nuScenes-compliant dataset -- at least 50 scenes (1000 frames) that load and browse correctly with nuscenes-devkit.
- Visualization toolkit -- scripts/notebooks to render annotated images, point clouds with bounding boxes, and segmentation overlays.
- Diversity report -- generated summary showing class distribution histograms, weather/lighting variation coverage, and per-scene statistics.
- Unit tests -- test suite covering transform math, projection correctness, label consistency, and export format validity.
Evaluation Criteria
| Criteria | Weight | Description |
|---|---|---|
| Pipeline Completeness | 25% | The pipeline runs end-to-end: scene composition through sensor rendering through labeling through nuScenes export. All eight implementation steps are functional. |
| Label Accuracy | 25% | Auto-generated 3D boxes match scene graph object poses exactly. 2D projections are geometrically correct. Segmentation maps assign the correct class and instance IDs. |
| Format Compliance | 20% | Exported datasets load without errors in nuscenes-devkit. All required JSON tables are present and correctly linked. Sensor data files exist at referenced paths. |
| Randomization | 15% | Generated datasets show meaningful variation in weather, lighting, actor count, and object appearance. Randomization is controlled by config and reproducible with a fixed seed. |
| Code Quality | 15% | Code is modular with clear separation of concerns. Functions have docstrings and type hints. Key operations have unit tests. Configuration is externalized to YAML. |
Related Deep Dives
- Synthetic Data for AD Perception Training -- the theoretical foundation for this project: domain randomization, domain adaptation (FDA, CyCADA), mixed training strategies, and cost-benefit analysis of synthetic data.
- Sensor Simulation for Autonomous Driving -- advanced sensor modeling: physically-based camera simulation, lidar beam models, radar cross-section, and sensor fusion considerations.
Next Steps
After completing this project, consider these follow-up tracks:
- Domain Adaptation Benchmark -- Train a perception model (e.g., PointPillars or CenterPoint) on your synthetic dataset, evaluate on real nuScenes data, then implement domain adaptation techniques (feature-level alignment, adversarial training) to close the sim-to-real gap. Measure mAP improvement from each technique.
- Minority Class Augmentation -- Use your pipeline to specifically generate rare actors (motorcycles, construction vehicles, animals, wheelchairs) and inject them into real training datasets. Evaluate whether targeted synthetic augmentation improves per-class recall on the long tail.
- Physics-Based Sensor Simulator -- Upgrade from simplified rendering to physically-based simulation: ray-traced camera images with realistic materials and lighting, physics-based lidar models with beam divergence and material reflectance (BRDF), and radar simulation with doppler and multi-path effects.
- Scenario-Driven Generation -- Instead of random scenes, build a scenario specification language (e.g., "vehicle cuts in from left lane at 20 m ahead") and generate targeted scenes for safety validation and testing. Integrate with the Waymax scenario format for closed-loop evaluation.