Walid BENMAAROUF

LaneSight

AI-powered lane perception for real-world driving scenes.

Demo coming soon | GitHub

Overview

LaneSight is an end-to-end multi-paradigm lane perception system designed for real-world driving scenes.

It combines deep semantic understanding (Hybrid ViT-UNet) with geometric lane modeling (UFLD), classical vision priors, and object-aware filtering (YOLO) into a unified inference pipeline.

The core model uses a CNN encoder with a Vision Transformer bottleneck to capture both local lane structures and long-range spatial dependencies. It is trained with class-weighted Cross-Entropy plus a Dice loss computed on the lane class only, which counters the extreme foreground/background imbalance of lane pixels.
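
The combined loss described above could look roughly like the following PyTorch sketch. The function name `lane_seg_loss`, the `dice_weight` parameter, and the convention that channel 1 is the lane class are assumptions made for illustration, not the project's actual code.

```python
import torch
import torch.nn.functional as F


def lane_seg_loss(logits, target, class_weights, dice_weight=1.0, eps=1e-6):
    """Class-weighted cross-entropy plus Dice computed on the lane class only.

    logits: (N, 2, H, W) raw scores; channel 1 is assumed to be "lane".
    target: (N, H, W) integer labels, 0 = background, 1 = lane.
    class_weights: (2,) tensor; a larger weight on the rare lane class.
    """
    # Cross-entropy with per-class weights handles pixel-count imbalance.
    ce = F.cross_entropy(logits, target, weight=class_weights)

    # Dice on the lane channel only, computed per image and then averaged.
    lane_prob = logits.softmax(dim=1)[:, 1]
    lane_gt = (target == 1).float()
    inter = (lane_prob * lane_gt).sum(dim=(1, 2))
    denom = lane_prob.sum(dim=(1, 2)) + lane_gt.sum(dim=(1, 2))
    dice = (2 * inter + eps) / (denom + eps)

    return ce + dice_weight * (1 - dice).mean()
```

The Dice term ignores how many background pixels there are, so it keeps a useful gradient even when lanes cover only a small fraction of the image.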

At inference time, lane hypotheses are validated, filtered, and fused using semantic confidence maps and vehicle awareness, producing robust lane masks even under occlusions, clutter, or non-ideal road geometries.

This project demonstrates full ownership of the computer vision stack: architecture design, loss engineering, dataset debugging, GPU-safe training, and multi-model fusion. It bridges deep learning, geometry, and classical vision in a production-oriented setup.

At a glance

Slug: lanesight
Tech: 57 items
Repo: Open

Demo

Preview video. (Muted looping hero + playable demo.)

Tech

Core ML

10 items
PyTorch (training, inference, autograd, AMP)
Custom Hybrid ViT-UNet architecture
CNN encoder with Conv + ResBlocks and vertical inductive bias
Vision Transformer bottleneck operating on feature maps (not raw pixels)
Multi-Head Self-Attention (MHSA)
2D relative position bias
MLP blocks and DropPath (stochastic depth)
Binary semantic segmentation (lane vs background)
Losses: class-weighted Cross-Entropy + lane-only Dice loss
Optimization: AdamW, CosineAnnealingLR, gradient clipping
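
As an illustration of the 2D relative position bias item, here is a minimal NumPy sketch of the index table that maps each pair of feature-map tokens to a learnable bias entry. It assumes a Swin-style formulation; the function name, grid size, and head count are hypothetical, and the project's own implementation may differ.

```python
import numpy as np


def relative_position_index(h, w):
    """Map every (query, key) token pair on an h x w grid to a bias-table row."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()])        # (2, h*w)
    rel = coords[:, :, None] - coords[:, None, :]      # (2, h*w, h*w) offsets
    rel[0] += h - 1                                    # shift dy into [0, 2h-2]
    rel[1] += w - 1                                    # shift dx into [0, 2w-2]
    return rel[0] * (2 * w - 1) + rel[1]               # (h*w, h*w) flat index


h, w, heads = 4, 6, 8                                  # hypothetical sizes
index = relative_position_index(h, w)

# In a real model this table is a learnable parameter; zeros are a placeholder.
bias_table = np.zeros(((2 * h - 1) * (2 * w - 1), heads), dtype=np.float32)

# Per-head bias that would be added to the attention logits before softmax.
bias = bias_table[index].transpose(2, 0, 1)            # (heads, h*w, h*w)
```

Because the index depends only on the offset between two tokens, every pair with the same relative displacement shares one bias value per head.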

Geometry

5 items
UFLD (Ultra-Fast Lane Detection)
Pretrained geometry-based lane detection model
Lane polyline extraction
Polyline resampling and normalization
Strong geometric lane priors
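
Polyline resampling and normalization can be sketched as arc-length interpolation followed by division by the image size. The function name and the `(width, height)` convention below are assumptions for illustration.

```python
import numpy as np


def resample_polyline(points, n_samples, image_size=None):
    """Resample a polyline to n_samples points equally spaced along its length.

    points: sequence of (x, y) pairs, assumed to have no repeated
            consecutive points.
    image_size: optional (width, height); if given, coordinates are
                normalized to [0, 1].
    """
    pts = np.asarray(points, dtype=np.float64)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # segment lengths
    arc = np.concatenate([[0.0], np.cumsum(seg)])        # cumulative arc length
    targets = np.linspace(0.0, arc[-1], n_samples)       # equally spaced stations

    x = np.interp(targets, arc, pts[:, 0])
    y = np.interp(targets, arc, pts[:, 1])
    out = np.stack([x, y], axis=1)

    if image_size is not None:
        width, height = image_size
        out = out / np.array([width, height], dtype=np.float64)
    return out
```

A fixed number of equally spaced points per lane makes lanes from different sources directly comparable, whatever resolution they were detected at.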

Perception Fusion

5 items
Multi-model fusion of ViT semantics, UFLD geometry, and YOLO context
Sampling ViT probability along lane polylines
Confidence-based lane filtering
Unified lane representation (Lane, LaneSet)
Fusion of geometry + semantics + object awareness
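
The idea of sampling the ViT probability along lane polylines and filtering by confidence might be sketched as follows. The `Lane` and `LaneSet` names come from the list above, but their fields, the helper names, and the 0.5 threshold are assumptions for illustration.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class Lane:
    points: np.ndarray          # (K, 2) pixel coordinates as (x, y)
    confidence: float = 0.0     # mean semantic probability along the lane


@dataclass
class LaneSet:
    lanes: list = field(default_factory=list)


def score_lane(lane, prob_map):
    """Average the lane-probability map at the lane's (rounded) points."""
    h, w = prob_map.shape
    xs = np.clip(np.rint(lane.points[:, 0]).astype(int), 0, w - 1)
    ys = np.clip(np.rint(lane.points[:, 1]).astype(int), 0, h - 1)
    return float(prob_map[ys, xs].mean())


def filter_lanes(lane_set, prob_map, min_confidence=0.5):
    """Keep only lanes whose geometry is supported by the semantic map."""
    kept = []
    for lane in lane_set.lanes:
        lane.confidence = score_lane(lane, prob_map)
        if lane.confidence >= min_confidence:
            kept.append(lane)
    return LaneSet(kept)
```

In this sketch a geometric hypothesis survives only if the segmentation model assigns high lane probability along its path.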

Vision

11 items
OpenCV-based classical vision pipeline
Grayscale conversion and Gaussian smoothing
Canny edge detection
Region-of-Interest (ROI)
Hough Transform for line detection
Lane polygon fitting
Temporal smoothing using EMA on lane slopes
Color-based priors using CIE Lab color space
White / yellow lane color anchors
Road color estimation
Soft edge weighting via color confidence maps
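
Temporal smoothing with an exponential moving average (EMA) on lane slopes can be sketched in a few lines of plain Python. The class name, the `alpha` default, and the choice to hold the last estimate on frames with no detection are assumptions, not the project's confirmed behavior.

```python
class SlopeEMA:
    """Exponential moving average over per-frame lane slope estimates."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha      # weight given to the newest measurement
        self.value = None       # current smoothed slope, None until first update

    def update(self, slope):
        if slope is None:
            # No line detected in this frame: keep the previous estimate.
            return self.value
        if self.value is None:
            self.value = slope
        else:
            self.value = self.alpha * slope + (1 - self.alpha) * self.value
        return self.value
```

A smaller `alpha` gives steadier overlays at the cost of slower reaction to real changes in road geometry.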

Detection

3 items
YOLOv8 (Ultralytics)
Vehicle detection for contextual awareness
Object-aware lane rejection (lanes crossing vehicles)
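
One plausible way to reject lanes that cross vehicles is to measure how much of each lane falls inside detected bounding boxes. The sketch below assumes boxes in `(x1, y1, x2, y2)` pixel corner format; the function names and the 0.3 threshold are hypothetical.

```python
import numpy as np


def fraction_inside_boxes(points, boxes):
    """Fraction of lane points that fall inside at least one box."""
    pts = np.asarray(points, dtype=float)
    if len(pts) == 0:
        return 0.0
    inside = np.zeros(len(pts), dtype=bool)
    for x1, y1, x2, y2 in boxes:
        inside |= (
            (pts[:, 0] >= x1) & (pts[:, 0] <= x2)
            & (pts[:, 1] >= y1) & (pts[:, 1] <= y2)
        )
    return float(inside.mean())


def reject_lanes_crossing_vehicles(lanes, boxes, max_fraction=0.3):
    """Drop lanes with more than max_fraction of their points on vehicles."""
    return [
        lane for lane in lanes
        if fraction_inside_boxes(lane, boxes) <= max_fraction
    ]
```

Using a fraction rather than any-overlap tolerates lanes that are only briefly occluded by a vehicle.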

Inference

4 items
Frame-wise inference on images and video streams
Resolution handling and dynamic resizing
Binary mask generation and overlay
Multi-source visualization (ViT / UFLD / YOLO / traditional)
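
Binary mask generation and overlay can be illustrated with plain NumPy. The sketch assumes the probability map has already been resized to the frame resolution; the function name, threshold, color, and blend factor are hypothetical defaults.

```python
import numpy as np


def overlay_mask(frame, lane_prob, threshold=0.5, color=(0, 255, 0), alpha=0.4):
    """Threshold a lane-probability map and blend a color over lane pixels.

    frame: (H, W, 3) uint8 image.
    lane_prob: (H, W) float map with the same height and width as frame.
    Returns the boolean mask and the blended uint8 image.
    """
    mask = lane_prob >= threshold
    out = frame.astype(np.float32)
    tint = np.array(color, dtype=np.float32)
    # Alpha-blend only where the mask is set; other pixels stay unchanged.
    out[mask] = (1 - alpha) * out[mask] + alpha * tint
    return mask, out.astype(np.uint8)
```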

Experimentation

8 items
TensorBoard logging
Loss and learning-rate tracking
Gradient norm monitoring
Attention entropy analysis
Activation statistics inspection
Attention map capture (optional)
NaN and numerical anomaly guards
Deterministic seeding for reproducibility
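
Attention entropy analysis usually means measuring how spread out each head's attention distribution is. The sketch below is one common formulation, assumed here rather than taken from the project; the function name and tensor layout are hypothetical.

```python
import numpy as np


def attention_entropy(attn, eps=1e-12):
    """Mean Shannon entropy (in nats) of attention rows, per head.

    attn: (heads, queries, keys) array whose rows sum to 1 over keys.
    Low values mean sharply focused heads; values near log(keys)
    mean almost uniform attention.
    """
    row_entropy = -(attn * np.log(attn + eps)).sum(axis=-1)  # (heads, queries)
    return row_entropy.mean(axis=-1)                          # (heads,)
```

Logging this value per head over training makes it easier to spot heads that collapse onto a single token or never move away from uniform attention.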

Scientific Stack

5 items
NumPy
SciPy (softmax for UFLD decoding)
Python dataclasses
Typing
pathlib
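
To show where softmax enters UFLD decoding, here is a sketch of turning row-anchor classification logits into expected lane positions. It assumes the usual UFLD output layout of `(num_cells + 1, num_rows, num_lanes)` with the last class meaning "no lane"; the function names are hypothetical, and the small NumPy `softmax` stands in for `scipy.special.softmax` so the example needs only NumPy.

```python
import numpy as np


def softmax(x, axis=0):
    # Numerically stable softmax; scipy.special.softmax(x, axis=axis)
    # computes the same quantity.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def decode_ufld(logits):
    """Decode row-anchor logits into expected grid-cell positions.

    logits: (num_cells + 1, num_rows, num_lanes); last class = "no lane".
    Returns (num_rows, num_lanes) positions in cell units, NaN where absent.
    """
    num_cells = logits.shape[0] - 1
    prob = softmax(logits[:-1], axis=0)                  # exclude "no lane"
    cells = np.arange(num_cells, dtype=np.float64).reshape(-1, 1, 1)
    loc = (prob * cells).sum(axis=0)                     # expectation per row
    absent = logits.argmax(axis=0) == num_cells          # "no lane" wins
    loc[absent] = np.nan
    return loc
```

Taking the expectation rather than the argmax yields sub-cell positions, which are then scaled to pixel coordinates using the cell width.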

Infra

6 items
Python 3
CUDA
cuDNN
GPU-first execution with safe CPU fallback
Windows-compatible training and inference
Modular repository architecture (training / inference / fusion / UI)
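
GPU-first execution with a CPU fallback is commonly written as below in PyTorch. The `pick_device` helper and the tiny stand-in model are hypothetical; mixed precision is enabled only when CUDA is actually available.

```python
import torch


def pick_device():
    """Prefer CUDA when present, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


device = pick_device()
use_amp = device.type == "cuda"          # AMP only on GPU in this sketch

# Stand-in for the real segmentation model (2 output classes).
model = torch.nn.Conv2d(3, 2, kernel_size=3, padding=1).to(device).eval()
frame = torch.rand(1, 3, 64, 64, device=device)

with torch.no_grad(), torch.autocast(device_type=device.type, enabled=use_amp):
    logits = model(frame)
```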

Details

Next steps

  • Improve processing speed for real-time use
  • Deploy on an embedded system