# FAST-LIO2 from scratch: LiDAR-inertial odometry you can actually reproduce > Satyajit Ghana — Head of Engineering @ Inkers Technology > canonical: https://ai.thesatyajit.com/articles/fast-lio2-lidar-inertial-odometry > date: 2026-06-28 > tags: slam, lidar, state-estimation, point-cloud, explainer FAST-LIO2 is the LiDAR-inertial odometry I keep coming back to: it's accurate, it runs at 100 Hz on a laptop, it survives 1000 deg/s rotations, and the whole thing is one tight loop around a Kalman filter. But the official code is a maze of templates, and the cleanest annotated fork is in Chinese. So this is the article I wanted: what FAST-LIO2 actually *does*, derived from first principles, with real code — and a path to rebuild the core without ROS, because once you can do that, you understand SLAM. If the Kalman filter isn't fresh, read [my Kalman piece](/articles/kalman-filter) first — FAST-LIO2 is exactly the "iterated, error-state, on-manifold" filter that article ends on, fed by a high-rate IMU and corrected by thousands of LiDAR points per scan. ## The whole system in one loop The problem: a LiDAR gives you ~100k 3D points per scan at 10 Hz, but a scan takes ~100 ms during which the sensor moves, so the points are distorted; and LiDAR alone is slow to register and fragile under fast motion. An IMU gives you 200–1000 Hz acceleration and angular velocity — great for short-term motion, but it drifts. Fuse them tightly and each fixes the other's weakness.

Before the math, here's the cycle as a sequence — what each stage does and, just as important, *why it has to be there*. It runs once per LiDAR scan; the IMU drives the prediction in between: Two contributions set FAST-LIO2 apart from its predecessor: 1. **Direct registration.** No edge/plane feature extraction — it registers *raw* points to the map by point-to-plane residuals. Less to tune, and it uses all the geometry. 2. **The ikd-Tree.** An incremental k-d tree that inserts, deletes, downsamples, and re-balances in place, so the map updates in real time instead of being rebuilt. Underneath both is a **tightly-coupled iterated error-state Kalman filter on a manifold**. Let's build it piece by piece. I'll quote C++ from [`zlwang7/S-FAST_LIO`](https://github.com/zlwang7/S-FAST_LIO) — a clean reimplementation that writes the filter out explicitly instead of hiding it in template magic — and give simplified NumPy alongside. ## The state lives on a manifold You can't store an orientation as a 3-vector and add to it — rotations live on the manifold $SO(3)$. FAST-LIO2 tracks a **24-dimensional nominal state** but a **23-dimensional error state** (and covariance), because each $SO(3)$ rotation needs 3 tangent dimensions, not the 4 of a quaternion, and gravity lives on the sphere $S^2$ (2 dimensions). The state is: $$ \mathbf{x} = \big[\ \mathbf{p},\ \mathbf{R},\ \mathbf{R}_{L}^{I},\ \mathbf{t}_{L}^{I},\ \mathbf{v},\ \mathbf{b}_g,\ \mathbf{b}_a,\ \mathbf{g}\ \big] $$ position, attitude, LiDAR→IMU extrinsic rotation and translation, velocity, gyro bias, accel bias, and gravity. In the clean code that's one manifold declaration: ```cpp // include/use-ikfom.hpp — 24-D nominal, 23-D tangent MTK_BUILD_MANIFOLD(state_ikfom, ((vect3, pos)) ((SO3, rot)) ((SO3, offset_R_L_I)) ((vect3, offset_T_L_I)) ((vect3, vel)) ((vect3, bg)) ((vect3, ba)) ((S2, grav))); ``` We update on the manifold with $\boxplus$ (retraction) and measure differences with $\boxminus$ — for the $SO(3)$ part, $\mathbf{R}\boxplus\delta = \mathbf{R}\,\mathrm{Exp}(\delta)$ and $\mathbf{R}_1\boxminus\mathbf{R}_2 = \mathrm{Log}(\mathbf{R}_2^{\!\top}\mathbf{R}_1)$. Everything else is ordinary vector $+/-$. ## Forward propagation: ride the IMU Between LiDAR scans, the IMU drives the state forward. The continuous kinematics are the standard strapdown model — position integrates velocity, attitude integrates de-biased angular velocity, velocity integrates de-biased, gravity-corrected acceleration: $$ \dot{\mathbf{p}} = \mathbf{v}, \qquad \dot{\mathbf{R}} = \mathbf{R}\,(\boldsymbol{\omega}_m - \mathbf{b}_g)^\wedge, \qquad \dot{\mathbf{v}} = \mathbf{R}(\mathbf{a}_m - \mathbf{b}_a) + \mathbf{g}, $$ with the biases doing a slow random walk. That's `get_f` verbatim: ```cpp // f(x,u): continuous-time kinematics vect3 omega = in.gyro - s.bg; // ω = ω_m − b_g vect3 a_inertial = s.rot * (in.acc - s.ba); // R(a_m − b_a) res(i) = s.vel[i]; // ṗ = v res(i + 3) = omega[i]; // Ṙ ← ω res(i + 12) = a_inertial[i] + s.grav[i]; // v̇ = R(a_m−b_a) + g ``` The predict step pushes the *mean* forward by $\mathbf{x} \boxplus (\Delta t\,\mathbf{f})$ and the *covariance* forward with the error-state Jacobians $\mathbf{F_x},\mathbf{F_w}$: $$ \hat{\mathbf{x}} = \mathbf{x}\boxplus(\Delta t\,\mathbf{f}), \qquad \hat{\mathbf{P}} = \mathbf{F_x}\,\mathbf{P}\,\mathbf{F_x}^{\!\top} + (\Delta t\,\mathbf{F_w})\,\mathbf{Q}\,(\Delta t\,\mathbf{F_w})^{\!\top}. $$ ```cpp void predict(double &dt, Matrix &Q, const input_ikfom &i_in) { Matrix f_ = get_f(x_, i_in); // 24×1 Matrix df_dx_ = df_dx(x_, i_in); // ∂f/∂x Matrix df_dw_ = df_dw(x_, i_in); // ∂f/∂w x_ = x_.plus(f_, dt); // x ⊞ (dt·f) // F_x = I + dt·A·df_dx , F_w = dt·df_dw (assembled via the boxplus Jacobian) P_ = F_x1 * P_ * F_x1.transpose() + (dt*F_w1) * Q * (dt*F_w1).transpose(); } ``` In NumPy the shape of it is just: ```python def predict(x, P, imu, dt, Q): f = get_f(x, imu) # kinematics above F_x, F_w = jacobians(x, imu, dt) # error-state transition + noise maps x = boxplus(x, f * dt) # advance the mean on the manifold P = F_x @ P @ F_x.T + F_w @ Q @ F_w.T return x, P ``` This runs once per IMU sample, and the per-sample poses are cached — we need them next. ## Backward propagation: deskew the scan Because the scan sweeps over time while the platform moves, every point was measured from a slightly different pose. Stack them naively and a flat wall comes out sheared: FAST-LIO2 fixes this with **backward propagation**: walk the cached IMU poses from the scan-end time backward, and transform each point from the pose it was *actually* sampled at into the scan-end frame. For a point sampled at time $\rho_j$ with the IMU pose $(\mathbf{R}_j,\mathbf{t}_j)$ relative to the scan-end pose $(\mathbf{R}_e,\mathbf{t}_e)$: $$ \mathbf{p}^{\text{end}}_j = \mathbf{R}_{L}^{I\,\top}\!\Big(\mathbf{R}_e^{\!\top}\big(\mathbf{R}_j(\mathbf{R}_{L}^{I}\mathbf{p}_j + \mathbf{t}_{L}^{I}) + (\mathbf{t}_j - \mathbf{t}_e)\big) - \mathbf{t}_{L}^{I}\Big) $$ which is exactly the compensation in `UndistortPcl`: ```cpp M3D R_i(R_imu * Exp(angvel_avr, dt)); // attitude at this point's sample time V3D T_ei(pos_imu + vel_imu*dt + 0.5*acc_imu*dt*dt - imu_state.pos); V3D P_compensate = imu_state.offset_R_L_I.conjugate() * (imu_state.rot.conjugate() * (R_i * (imu_state.offset_R_L_I * P_i + imu_state.offset_T_L_I) + T_ei) - imu_state.offset_T_L_I); ``` Now every point lives in one consistent frame and it's safe to register against the map. ## The measurement: point-to-plane FAST-LIO2 doesn't extract features. For each deskewed point it transforms it into the world with the current state, finds its 5 nearest map points via the ikd-Tree, fits a plane to them, and the **residual is the point-to-plane distance** — zero when the point sits exactly on the surface:

For a point $\mathbf{p}$ in the body frame, transformed to world $\mathbf{p}^W = \mathbf{R}(\mathbf{R}_L^I\mathbf{p}+\mathbf{t}_L^I)+\mathbf{p}$, a plane with unit normal $\mathbf{u}$ and offset $d$ gives residual $z = \mathbf{u}^{\!\top}\mathbf{p}^W + d$. That's `h_share_model`: ```cpp V3D p_global(s.rot * (s.offset_R_L_I * p_body + s.offset_T_L_I) + s.pos); // to world ikdtree.Nearest_Search(point_world, NUM_MATCH_POINTS, points_near, sqDis); // 5 nearest if (esti_plane(pabcd, points_near, 0.1f)) { // fit plane (a,b,c,d) float pd2 = pabcd(0)*x + pabcd(1)*y + pabcd(2)*z + pabcd(3); // point-to-plane dist ... } // Jacobian row (w.r.t. attitude θ and extrinsic), residual = −distance V3D C(s.rot.conjugate() * norm_vec); // Rᵀu V3D A(point_I_crossmat * C); // (R_L^I p + t_L^I)^∧ Rᵀu ekfom_data.h_x.block<1,12>(i,0) << norm_p.x, norm_p.y, norm_p.z, A, ...; // [ u | A | … ] ekfom_data.h(i) = -norm_p.intensity; // the residual ``` The crucial detail: the Jacobian $\mathbf{H}$ is $m\times 12$ — $m$ thousands of points, but only **12 columns** (6 for pose, 6 for the extrinsic), because a single LiDAR scan can't observe velocity, biases, or gravity directly. Hold that thought; it's why the next step is fast. In NumPy: ```python def build_H_z(points_body, x, ikdtree, map_pts, R_LI, t_LI): H, z = [], [] for p in points_body: pw = x.R @ (R_LI @ p + t_LI) + x.p # body → world nn = ikdtree.nearest(pw, k=5) # 5 nearest map points n, d = fit_plane(map_pts[nn]) # unit normal, offset r = n @ pw + d # point-to-plane distance if abs(r) < 0.1: # keep confident matches pI = R_LI @ p + t_LI A = skew(pI) @ (x.R.T @ n) # ∂r/∂θ (attitude block) H.append(np.concatenate([n, A])) # [ normal | attitude ] z.append(-r) return np.array(H), np.array(z) # H: m×6 (here), z: m ``` ## The iterated update, and the gain that makes it cheap A single EKF update would linearize the very-nonlinear point-to-plane fit once, at a possibly-wrong pose, and be off. So FAST-LIO2 **iterates**: re-associate, rebuild $\mathbf{H}$ at the latest estimate, take one Kalman step, repeat until the correction is tiny. Watch the scan snap onto the map: Each iteration is $$ \delta\mathbf{x} = \mathbf{K}\,\mathbf{z} + (\mathbf{I}-\mathbf{K}\mathbf{H})(\mathbf{x}^\kappa \boxminus \hat{\mathbf{x}}), \qquad \mathbf{x}^{\kappa+1} = \mathbf{x}^\kappa \boxplus \delta\mathbf{x}, $$ iterating $\kappa$ until every component of $\delta\mathbf{x}$ drops below $10^{-3}$. The piece that makes FAST-LIO *fast* is the **reformulated Kalman gain**. The textbook form, $$ \mathbf{K} = \hat{\mathbf{P}}\mathbf{H}^{\!\top}(\mathbf{H}\hat{\mathbf{P}}\mathbf{H}^{\!\top}+\mathbf{R})^{-1}, $$ inverts an $m\times m$ matrix — and $m$ is *thousands* of points. FAST-LIO uses the information-form identity to rewrite it as $$ \mathbf{K} = (\mathbf{H}^{\!\top}\mathbf{R}^{-1}\mathbf{H} + \hat{\mathbf{P}}^{-1})^{-1}\mathbf{H}^{\!\top}\mathbf{R}^{-1}, $$ which inverts a $23\times 23$ matrix — the **state** dimension — no matter how many points there are. That's the whole trick, and in clean code it's one block: ```cpp // R is a scalar (LASER_POINT_COV = 0.001), so R⁻¹ = 1/R auto K_front = (HTH / R + P_.inverse()).inverse(); // (HᵀR⁻¹H + P⁻¹)⁻¹ — 23×23 K = K_front.block<23,12>(0,0) * H.transpose() / R; // … Hᵀ R⁻¹ Matrix dx_ = K * dyn_share.h // K z + (Matrix::Identity() - K*H) * dx_new; // (I−KH)(x ⊟ x̂) x_ = x_.boxplus(dx_); // convergence: every |dx_[j]| < epsi (0.001); then update covariance P_ = (Matrix::Identity() - K*H) * P_; ``` The same loop in NumPy, with the cheap gain spelled out: ```python def update_iterated(x, P, points_body, ikdtree, map_pts, R=1e-3, max_iter=4, eps=1e-3): x_prior = x.copy() n = P.shape[0] # 23 (error-state dim) for _ in range(max_iter): H, z = build_H_z(points_body, x, ikdtree, map_pts, R_LI, t_LI) # relinearize # information-form gain: invert (state × state), independent of len(z) S = H.T @ H / R + np.linalg.inv(P) # n×n K = np.linalg.solve(S, H.T) / R # K = S⁻¹ Hᵀ R⁻¹ dx = K @ z + (np.eye(n) - K @ H) @ (-boxminus(x, x_prior)) x = boxplus(x, dx) if np.max(np.abs(dx)) < eps: break P = (np.eye(n) - K @ H) @ P return x, P ``` That's the engine. The converged $\mathbf{x}$ is your odometry output, published at LiDAR rate. ## The map: an incremental k-d tree The nearest-neighbor search in the measurement step is the hot path, and the map is *growing and moving*. A static k-d tree would be rebuilt every scan — fatal. The **ikd-Tree** instead inserts points in place, downsamples on the tree, deletes whole regions with one box-wise delete as the local map window slides with the sensor, and lazily re-balances only the subtrees that get lopsided:

In code it's a handful of calls: ```cpp ikdtree.Build(feats_down_world->points); // first scan ikdtree.Add_Points(PointToAdd, true); // incremental insert + on-tree downsample ikdtree.Delete_Point_Boxes(cub_needrm); // box-wise delete (window slid) ikdtree.Nearest_Search(point_world, 5, near, d); // kNN, inside the measurement step ``` The payoff is real: on the authors' benchmarks FAST-LIO2 spends *less* time per scan than FAST-LIO while holding a *larger* map, on both Intel and Arm.

## Putting it together — and dropping ROS Here's the entire main loop, which is shorter than you'd expect: ```cpp while (running) { if (sync_packages(Measures)) { // group IMU + one LiDAR scan by time p_imu->Process(Measures, kf, feats_undistort); // forward-propagate + deskew downSizeFilterSurf.filter(*feats_down_body); // voxel-downsample the scan kf.update_iterated_dyn_share_modified( // the iterated point-to-plane EKF LASER_POINT_COV, feats_down_body, ikdtree, Nearest_Points, NUM_MAX_ITERATIONS, extrinsic_est_en); state_point = kf.get_x(); // odometry output map_incremental(); // ikdtree.Add_Points(...) } } ``` Notice what's *not* algorithm here: `sync_packages` is just time-aligning two streams, `publish_odometry`/`publish_frame_world` are ROS topics, and `tf` is bookkeeping. **None of that is the filter.** To reproduce FAST-LIO2 without ROS you only need: | You need | You don't need | |---|---| | read IMU samples (t, ω, a) from a file/array | ROS subscribers / message types | | read LiDAR points (x, y, z, per-point time) | rosbag, nodelets | | forward-propagate + deskew (the IMU code) | tf tree | | a k-d tree over the map (ikd-Tree, or even scipy `cKDTree` rebuilt per scan to start) | rviz, publishers | | the iterated point-to-plane update | the IKFoM template layer | A no-ROS skeleton is just the loop, fed from arrays: ```python x, P = init_state(), init_cov() ikdtree = KDMap(voxel=0.5) # or scipy cKDTree to begin with for scan in lidar_scans: # each: points + per-point timestamps imu_batch = imu_between(prev_t, scan.t_end) for imu in imu_batch: # 1) forward propagation x, P = predict(x, P, imu, imu.dt, Q) pts = deskew(scan.points, imu_poses, x) # 2) backward propagation pts = voxel_downsample(pts, 0.5) # 3) downsample x, P = update_iterated(x, P, pts, ikdtree, ikdtree.points) # 4) iterated EKF ikdtree.add(transform_to_world(pts, x)) # 5) grow the map yield x.p, x.R # pose = odometry ``` Start with a `cKDTree` rebuilt each scan to get the algorithm working end to end, then swap in a true incremental tree once you care about speed. That ordering — correctness first, then the ikd-Tree — is exactly how to learn it. To anchor yourself in the real repo, here's the file → concept map for the clean version: | File | What it owns | |---|---| | `use-ikfom.hpp` | the state manifold, `get_f`, `df_dx`, `df_dw` | | `esekfom.hpp` | the explicit ESEKF: `predict`, `h_share_model`, `update_iterated_dyn_share_modified`, the reformulated gain | | `IMU_Processing.hpp` | IMU init, forward propagation, `UndistortPcl` (deskew) | | `ikd_Tree.cpp` | `Build`, `Add_Points`, `Delete_Point_Boxes`, `Nearest_Search` | | `laserMapping.cpp` | the ROS glue + main loop (the part you replace) | ## A complete, runnable implementation I wrote the whole thing as one dependency-light file — [`fastlio2_mini.py`](/articles/fastlio2/fastlio2_mini.py) (≈250 lines, `numpy` + `scipy` + `rosbags`). It's a faithful teaching implementation: the on-manifold state, forward/backward propagation, the iterated point-to-plane update with the reformulated gain — all the code blocks above, assembled and tested. It takes a real LiDAR→IMU extrinsic and does per-scan voxel downsampling; the remaining simplifications, called out honestly, are gravity fixed after init and a `scipy` cKDTree rebuilt per scan instead of a true ikd-Tree. The driver is the no-ROS loop, fed from plain arrays — read IMU, propagate (caching poses), deskew into the IMU frame, downsample, iterated-update, grow the map: ```python def run_offline(imu_stream, lidar_scans, voxel=0.4, scan_voxel=0.5, T_LI=None, R_LI=None, acc_cov=1e-2, gyr_cov=1e-2, bacc_cov=1e-4, bgyr_cov=1e-4, init_secs=0.5): Q = np.diag([gyr_cov]*3 + [acc_cov]*3 + [bgyr_cov]*3 + [bacc_cov]*3) R_LI = np.eye(3) if R_LI is None else np.asarray(R_LI, float) T_LI = np.zeros(3) if T_LI is None else np.asarray(T_LI, float) to_imu = lambda p: (R_LI @ p.T).T + T_LI # LiDAR points -> IMU frame g, bg = imu_init([s for s in imu_stream if s[0] < imu_stream[0][0] + init_secs]) kf = ESEKF(g); kf.x.bg = bg lmap = LocalMap(voxel); traj = []; imu_i = 0; bootstrapped = False for scan in lidar_scans: poses = [] while imu_i < len(imu_stream) and imu_stream[imu_i][0] <= scan['t_end']: t, acc, gyro = imu_stream[imu_i] dt = t - (imu_stream[imu_i-1][0] if imu_i > 0 else t) if dt > 0: kf.predict(acc, gyro, dt, Q) # 1. forward propagation poses.append((t, kf.x.R.copy(), kf.x.p.copy())) imu_i += 1 body = to_imu(scan['points']) # extrinsic pts = deskew(body, scan['dts'], poses, scan['t_end']) # 2. backward deskew pts = voxel_downsample(pts, scan_voxel) # sparse, even set if not bootstrapped: lmap.add((kf.x.R @ pts.T).T + kf.x.p); bootstrapped = True # seed the map else: kf.update(pts, lmap) # 3. iterated point-to-plane EKF lmap.add((kf.x.R @ pts.T).T + kf.x.p) # 4. grow the map traj.append((scan['t_end'], kf.x.p.copy(), kf.x.R.copy())) return traj, lmap ``` It ships with a synthetic world (a robot looping through a 10×10×3 m room) so you can run it with **no dataset at all** — and that's how I validated it: ``` $ python fastlio2_mini.py in-memory : ATE rmse = 0.037 m final = 0.020 m via .bag : ATE rmse = 0.069 m final = 0.058 m ``` The second line is the important one: the file also writes the simulated data to a real ROS1 `.bag` (`sensor_msgs/Imu` + `PointCloud2`), reads it back through `read_bag()`, and re-runs — exercising the exact bag-parsing path you'd use on real hardware, end to end, to **4–7 cm** of absolute trajectory error. The math is correct. ## Reading a real `.bag` — including Livox `read_bag()` uses the pure-python `rosbags` (no ROS install) and handles both standard `sensor_msgs/PointCloud2` (Velodyne/Ouster) and Livox's custom `CustomMsg`. Livox is the catch with FAST-LIO data — its bags aren't PointCloud2, they're a custom message, so you register the type definition and parse it yourself: ```python from rosbags.typesys import Stores, get_typestore from rosbags.typesys.msg import get_types_from_msg ts = get_typestore(Stores.ROS1_NOETIC) ts.register(get_types_from_msg( # the Livox point struct "uint32 offset_time\nfloat32 x\nfloat32 y\nfloat32 z\n" "uint8 reflectivity\nuint8 tag\nuint8 line\n", 'livox_ros_driver/msg/CustomPoint')) ts.register(get_types_from_msg( # the Livox scan message "std_msgs/Header header\nuint64 timebase\nuint32 point_num\nuint8 lidar_id\n" "uint8[3] rsvd\nlivox_ros_driver/CustomPoint[] points\n", 'livox_ros_driver/msg/CustomMsg')) # then: msg.points -> (x,y,z, offset_time); offset_time is per-point time for the deskew ``` So fetching and running an actual HKU dataset is two steps: ```bash pip install gdown gdown 1YqxHuDKzWUcda80QKBV61lXI86TXsGjP -O avia.bag # a Livox Avia indoor bag python fastlio2_mini.py avia.bag ``` ## What happens on real data — and the bug that taught me the most I ran exactly that on the HKU Avia "quick-shack" bag (49 s, 9953 IMU + 491 Livox scans). It tracks. Over the full bag the filter follows **47.1 m of path** and **47.9 rad of cumulative rotation** (the sensor is waved around a small room — roughly eight full turns in a ~3 m box) and reconstructs a recognizable room:

But it did **not** track on the first try, and the reason is the most useful thing in this whole article — because it wasn't the filter math at all. The first run collapsed exactly the way a broken LIO does: the trajectory froze within a few centimetres of the origin while the IMU clearly showed the sensor swinging through ~1 rad/s of rotation. I almost wrote it off as "the teaching filter isn't robust enough." It wasn't that. **The bug was a sensor-clock mismatch.** I instrumented the scan timestamps and found them landing at `t ≈ -1.6e9` *relative to the IMU* — an impossible 50-year gap. The Livox `CustomMsg` header stamps each scan on the **sensor's own clock** (seconds since the LiDAR booted, ~361 s into this bag), while the `/livox/imu` messages are stamped on the **bag's record clock** (Unix time, ~1.6 billion). My propagate-up-to-scan-end loop compares the two: ```python while imu_stream[imu_i][0] <= scan['t_end']: # IMU time vs LiDAR header time kf.predict(...) # ...never true → never runs ``` Because every IMU timestamp (1.6e9) was vastly larger than every scan's header time (361), that condition was *never* true. **The IMU never propagated.** The filter sat at its initial pose, the update snapped each scan onto the origin-seeded map, and the whole thing looked like a plausible "data-association collapse" — when really it was a unit/epoch bug two layers down. The fix is to ignore the Livox header entirely and timestamp each scan with the **bag record time** (which `rosbags` gives you for every message, on one consistent clock): ```python for conn, t, raw in reader.messages(connections=conns): # t = bag record time (ns) ... lidar_scans.append({'t_end': t * 1e-9, 'points': pts, 'dts': dts}) # not msg.header! ``` That one change is the difference between a frozen origin and a tracked 47 m trajectory. Three smaller changes turned tracking into *clean* tracking, and they're the calibration work any real deployment needs: the **LiDAR→IMU extrinsic** from the repo's `avia.yaml` (`T = [0.04165, 0.02326, -0.0284]`, `R = I`), the **real IMU noise** (`acc_cov = gyr_cov = 0.1`, which my synthetic `Q` had 100× too small — too small and the filter trusts a stale prediction and refuses to move), and **per-scan voxel downsampling** so the iterated update stays real-time on tens of thousands of raw points. All four are wired into `run_offline` and the CLI uses the Avia values by default, so `python fastlio2_mini.py avia.bag` just works. The honest caveats remain: this is still a teaching filter. The reconstructed map smears where fast rotation meets the narrow ~70° Avia FOV (you can see stray points in the wide view), there's no loop closure, and the `cKDTree`-rebuilt-per-scan map is why a full bag takes a few minutes offline rather than running at sensor rate. Real FAST-LIO2's ikd-Tree, in-state gravity, and tighter outlier handling are what close that last gap. But the spine — the five steps, the manifold state, the reformulated gain — is exactly what's running here, and it's enough to track a real Livox bag. The lesson I'd carry out of this: in sensor fusion, **check your clocks first.** A mismatched epoch or a nanosecond-vs-second unit error masquerades perfectly as a modelling failure, and you can waste a day tuning covariances that were never the problem. ## Honest notes - **It needs a decent IMU and initialization.** The filter assumes a high-rate IMU (200 Hz+) and a short static period at start to estimate the gravity direction and biases. Garbage init, garbage trajectory. - **Geometric degeneracy is the real failure mode.** Point-to-plane constraints vanish in a long featureless tunnel or an open field — the update becomes unobservable along the degenerate direction and the IMU drift takes over. This is fundamental to LiDAR odometry, not a bug. - **It's odometry, not loop-closing SLAM.** FAST-LIO2 drifts slowly but has no global loop closure; pair it with a pose-graph backend (e.g. a FAST-LIO-SLAM setup) if you need globally consistent maps. - **The "100 Hz" is real but hardware-shaped.** The headline rates assume the ikd-Tree and a reasonable CPU; the per-scan cost grows with map density and point count, which is exactly what the downsampling and the moving window are there to bound. The thing I'd take away: FAST-LIO2 is not a pile of heuristics — it's *one* iterated error-state Kalman filter, fed a deskewed point cloud, corrected by point-to-plane residuals, over an incremental map, with a gain rewritten so thousands of measurements cost the same as a few. Understand those five steps and you can rebuild it, and you understand the spine of modern LiDAR SLAM. --- *Built on [FAST-LIO2: Fast Direct LiDAR-Inertial Odometry](https://arxiv.org/abs/2107.06829) (Xu, Cai, Bai, Zhang, 2021), the original [FAST-LIO](https://arxiv.org/abs/2010.08196) and [ikd-Tree](https://arxiv.org/abs/2102.10808) papers, and the [HKU-MARS/FAST_LIO](https://github.com/hku-mars/FAST_LIO) source. C++ snippets are from the clean reimplementation [zlwang7/S-FAST_LIO](https://github.com/zlwang7/S-FAST_LIO); Python is simplified for teaching.*