@bloom256
Collaborator

Description

This PR improves performance of optimize_lidar_odometry() by reducing redundant
computation and hash-map lookups in the Hessian computation step.

Key changes

  • Unified Hessian loop – indoor and outdoor contributions are computed together
    in a single parallel_for, reducing overhead and improving cache locality.
  • Precomputed covariance inverses – cov_inverse is computed once per bucket
    update and reused during optimization (no per-point matrix inversion).
  • Pointer-based bucket linking – when indoor/outdoor bucket sizes have an
    integer ratio, indoor buckets store a direct pointer to the corresponding
    outdoor bucket, avoiding hash lookups.

Performance

Local measurements show a ~10–12% speedup compared to the previous version
for the optimized code path.

Note

The intent was to preserve the same underlying math while changing how work is
organized (single Hessian computation loop, bucket linking). I tested on a reference dataset
and the trajectory looked consistent, but I would appreciate a careful review of all changes.


Detailed Changes

1. NDT::Bucket struct (core/include/ndt.h)

  • Added Eigen::Matrix3d cov_inverse - precomputed inverse of covariance matrix
  • Added const Bucket* coarser_bucket - pointer to corresponding outdoor bucket (for integer bucket ratios)

2. New helper functions (lidar_odometry_utils.cpp)

  • is_integer_bucket_ratio() - checks if outdoor/indoor bucket ratio is integer (enables pointer linking)
  • link_buckets_to_coarser() - links indoor buckets to outdoor buckets via coarser_bucket pointer
  • update_rgd_hierarchy() - parallel update of indoor and outdoor buckets using tbb::parallel_invoke, followed by linking
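
As a rough illustration of the ratio check described above, here is a minimal sketch. The function name `is_integer_ratio_sketch`, its signature, and the tolerance value are assumptions made for this example; the actual is_integer_bucket_ratio() in the PR may differ. A tolerance is used because a ratio such as 0.3 / 0.1 is not exactly 3 in floating point.

```cpp
#include <cmath>

// Hypothetical sketch, not the PR's implementation.
// Returns true when outdoor_size / indoor_size is within `tol` of a
// whole number >= 1. The default tolerance is an assumed value.
bool is_integer_ratio_sketch(double indoor_size, double outdoor_size,
                             double tol = 1e-6) {
    if (indoor_size <= 0.0 || outdoor_size <= 0.0) {
        return false;  // bucket sizes must be positive
    }
    const double ratio = outdoor_size / indoor_size;
    const double nearest = std::round(ratio);
    return nearest >= 1.0 && std::abs(ratio - nearest) <= tol;
}
```

With this sketch, the pair (0.3, 0.6) is accepted and the pair (0.101, 0.3) is rejected, matching the behavior described for integer and non-integer ratios.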

3. Precomputed cov_inverse (lidar_odometry_utils.cpp)

  • update_rgd() now computes cov_inverse after each cov update
  • update_rgd_spherical_coordinates() also updated for consistency
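
The caching pattern can be sketched as follows. To keep the example self-contained it uses a plain `std::array` matrix and a hand-written 3x3 inverse instead of Eigen; `Mat3`, `inverse3`, `BucketSketch`, and `set_covariance` are hypothetical names for illustration, and the real update_rgd() presumably calls Eigen's `inverse()` on the bucket covariance.

```cpp
#include <array>

using Mat3 = std::array<std::array<double, 3>, 3>;

// 3x3 inverse via the adjugate. Assumes the matrix is non-singular;
// a real implementation would need to guard against det == 0.
Mat3 inverse3(const Mat3& a) {
    const double c00 = a[1][1] * a[2][2] - a[1][2] * a[2][1];
    const double c01 = a[1][2] * a[2][0] - a[1][0] * a[2][2];
    const double c02 = a[1][0] * a[2][1] - a[1][1] * a[2][0];
    const double det = a[0][0] * c00 + a[0][1] * c01 + a[0][2] * c02;
    Mat3 inv{};
    inv[0] = {c00 / det,
              (a[0][2] * a[2][1] - a[0][1] * a[2][2]) / det,
              (a[0][1] * a[1][2] - a[0][2] * a[1][1]) / det};
    inv[1] = {c01 / det,
              (a[0][0] * a[2][2] - a[0][2] * a[2][0]) / det,
              (a[0][2] * a[1][0] - a[0][0] * a[1][2]) / det};
    inv[2] = {c02 / det,
              (a[0][1] * a[2][0] - a[0][0] * a[2][1]) / det,
              (a[0][0] * a[1][1] - a[0][1] * a[1][0]) / det};
    return inv;
}

// Hypothetical bucket holding only the two fields relevant here.
struct BucketSketch {
    Mat3 cov{};
    Mat3 cov_inverse{};
};

// Every covariance update refreshes the cached inverse, so the
// optimizer can read cov_inverse without inverting per point.
void set_covariance(BucketSketch& bucket, const Mat3& cov) {
    bucket.cov = cov;
    bucket.cov_inverse = inverse3(cov);
}
```

The trade-off is one inversion per bucket update in exchange for none per point per iteration, which pays off when each bucket is read many times between updates.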

4. Unified Hessian computation (lidar_odometry_utils_optimizers.cpp)

  • New compute_hessian() function replaces separate indoor/outdoor loops
  • Helper functions: add_indoor_hessian_contribution(), add_outdoor_hessian_contribution()
  • Uses cov_inverse instead of computing cov.inverse() per point
  • Uses squaredNorm() instead of norm() for range checks (avoids sqrt)
  • Indoor and outdoor contributions are independent (indoor miss doesn't skip outdoor)
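
The squaredNorm() change relies on the fact that, for non-negative values, comparing squared lengths gives the same answer as comparing lengths. A minimal sketch with a plain struct in place of Eigen vectors follows; `Vec3`, `in_range_norm`, `in_range_squared`, and the single `max_range` parameter are assumptions for illustration, and the actual range checks in the PR may use different bounds.

```cpp
#include <cmath>

struct Vec3 {
    double x, y, z;
};

double squared_norm(const Vec3& v) {
    return v.x * v.x + v.y * v.y + v.z * v.z;
}

// Range check that takes a square root for every point.
bool in_range_norm(const Vec3& p, double max_range) {
    return std::sqrt(squared_norm(p)) < max_range;
}

// Same decision without the square root: compare against the squared
// threshold instead. Assumes max_range >= 0.
bool in_range_squared(const Vec3& p, double max_range) {
    return squared_norm(p) < max_range * max_range;
}
```

In a hot loop the squared threshold can be computed once outside the loop, leaving only multiplications and one comparison per point.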

5. Lookup statistics (lidar_odometry_utils.h)

  • Added LookupStats struct: indoor_lookups, outdoor_lookups, outdoor_pointer_hits, link_time_seconds
  • Tracks hash lookups vs pointer hits for performance analysis
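
A possible shape for such a statistics struct is sketched below. The field names come from the list above, but the field types, the `LookupStatsSketch` name, and the `pointer_hit_fraction` helper are assumptions for illustration. The sketch also assumes that outdoor_lookups counts only hash lookups and that pointer hits are counted separately, which is consistent with the reported totals but not confirmed by the source.

```cpp
#include <cstdint>

// Hypothetical layout; the real LookupStats may use other types
// (for example atomics, if counters are updated from parallel code).
struct LookupStatsSketch {
    std::uint64_t indoor_lookups = 0;
    std::uint64_t outdoor_lookups = 0;       // hash lookups into the outdoor map
    std::uint64_t outdoor_pointer_hits = 0;  // outdoor accesses served by coarser_bucket
    double link_time_seconds = 0.0;
};

// Fraction of outdoor accesses that avoided a hash lookup.
double pointer_hit_fraction(const LookupStatsSketch& s) {
    const std::uint64_t total = s.outdoor_lookups + s.outdoor_pointer_hits;
    if (total == 0) {
        return 0.0;
    }
    return static_cast<double>(s.outdoor_pointer_hits) /
           static_cast<double>(total);
}
```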

6. NDTBucketMapType (lidar_odometry_utils.h)

  • Uses ankerl::unordered_dense::segmented_map for pointer stability (required for coarser_bucket optimization)

Additional details

Integer bucket ratios explanation:

When outdoor bucket size is an integer multiple of indoor bucket size (e.g., 0.6m / 0.3m = 2), indoor buckets align perfectly within outdoor buckets. This enables:

  • Pointer linking - indoor buckets store direct pointer to their outdoor bucket
  • O(1) outdoor access - pointer dereference instead of hash lookup to outdoor bucket

Behavior Notes

  • Integer bucket ratio (e.g., 0.3m/0.1m = 3): pointer linking enabled, faster outdoor access
  • Non-integer ratio (e.g., 0.3m/0.101m): fallback to hash lookup for outdoor buckets

Important: Pointer Stability Requirement

The coarser_bucket pointer optimization requires that pointers to bucket values remain valid after map insertions. This is why NDTBucketMapType uses ankerl::unordered_dense::segmented_map, which guarantees pointer stability.
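
The property being relied on can be illustrated with a standard container. `std::deque` stores elements in separate chunks, and the C++ standard guarantees that push_back never invalidates pointers or references to existing elements, whereas `std::vector` may relocate everything when it grows. The sketch below uses `std::deque` purely as an analogy; it does not describe the internals of segmented_map, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <deque>

// Returns true if a pointer taken to the first element still refers to
// that element after `extra` further elements have been appended.
bool first_element_stays_put(std::size_t extra) {
    std::deque<int> storage{42};
    const int* first = &storage.front();
    for (std::size_t i = 0; i < extra; ++i) {
        storage.push_back(static_cast<int>(i));
    }
    return first == &storage.front() && *first == 42;
}
```

A stored coarser_bucket pointer is only safe under the same kind of guarantee, so any operation on the outdoor map that can relocate or remove values should be followed by relinking.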

@bloom256
Collaborator Author

Performance Comparison

Test Configuration

  • Convergence criteria: 1e-12
  • Indoor bucket size: 0.3m
  • Outdoor bucket size: 0.6m (integer ratio = 2)
  • Trajectory length: ~163.5m
  • Total iterations: ~246,700

Results (sorted by elapsed time, slowest to fastest)

| Version | Elapsed Time | Optimization Time | Total Iterations | Avg Iter Time | Speedup |
|---|---|---|---|---|---|
| Original HDMapping | 370.17s | - | 246,678 | - | baseline |
| New (no bucket links) | 326.83s | 314.55s | 246,718 | 1.275ms | 1.13x |
| New (with bucket links) | 309.79s | 295.31s | 246,728 | 1.197ms | 1.19x |

Lookup Statistics (new versions only)

| Version | Indoor Lookups | Outdoor Lookups | Outdoor Pointer Hits | Link Time |
|---|---|---|---|---|
| No bucket links | 5,202,737,578 | 4,460,709,990 | 0 | 0.000s |
| With bucket links | 5,202,841,588 | 92,255,649 | 4,368,562,751 | 2.464s |

Key Observations

  • 1.19x speedup with all optimizations vs original (370s → 310s, about 16% less elapsed time)
  • 1.13x speedup with unified loop + precomputed inverse alone, no linking (370s → 327s, about 12% less elapsed time)
  • Bucket linking saves ~4.4 billion hash lookups, replaced by pointer dereferences
  • Link time overhead (2.5s) is negligible compared to savings
  • Trajectory length and iteration count are consistent across all versions (~163.5m, ~246K iterations)
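
For readers who want to reproduce the headline numbers from the elapsed times, the two usual ways of expressing the gain can be computed as in the sketch below. The helper names are hypothetical. A speedup factor of 1.19x and a reduction in elapsed time of about 16% describe the same measurement.

```cpp
// Speedup factor: how many times faster the new version is.
double speedup(double baseline_seconds, double new_seconds) {
    return baseline_seconds / new_seconds;
}

// Fraction of the baseline elapsed time that was saved.
double time_reduction(double baseline_seconds, double new_seconds) {
    return 1.0 - new_seconds / baseline_seconds;
}
```

Applied to the table, `speedup(370.17, 309.79)` is roughly 1.19 and `time_reduction(370.17, 309.79)` is roughly 0.16; for the variant without bucket links the values are roughly 1.13 and 0.12.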

@JanuszBedkowski JanuszBedkowski merged commit 117693b into MapsHD:main Jan 26, 2026
6 checks passed