LaneSight

AI-powered lane perception for real-world driving scenes.

Demo coming soon.

Overview

LaneSight is an end-to-end multi-paradigm lane perception system designed for real-world driving scenes.

It combines deep semantic understanding (Hybrid ViT-UNet) with geometric lane modeling (UFLD), classical vision priors, and object-aware filtering (YOLO) into a unified inference pipeline.

The core model leverages a CNN encoder with a Vision Transformer bottleneck to capture both local lane structures and long-range spatial dependencies. It is trained using class-balanced Cross-Entropy and Dice loss to handle extreme foreground/background imbalance.
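
As a rough illustration of the loss described above, here is a dependency-free Python sketch of class-weighted Cross-Entropy combined with a lane-only Dice term. The function name, the weight values, and the equal weighting of the two terms are assumptions made for this example, not settings taken from the project. The weighted average mirrors how PyTorch's `CrossEntropyLoss` normalises when given class weights with mean reduction.

```python
import math


def weighted_ce_dice_loss(lane_probs, targets, lane_weight=10.0, bg_weight=1.0, eps=1e-6):
    """Hypothetical helper: loss over flattened pixels.

    lane_probs: predicted lane probability per pixel, each in [0, 1].
    targets:    ground-truth labels per pixel, 1 = lane, 0 = background.
    """
    # Class-weighted cross-entropy: rare lane pixels get a larger weight,
    # and the sum is normalised by the total weight.
    ce_sum = 0.0
    weight_sum = 0.0
    for p, t in zip(lane_probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        w = lane_weight if t == 1 else bg_weight
        ce_sum += -w * math.log(p if t == 1 else 1.0 - p)
        weight_sum += w
    ce = ce_sum / weight_sum

    # Dice computed on the lane class only, so background pixels
    # cannot dominate the overlap score.
    intersection = sum(p * t for p, t in zip(lane_probs, targets))
    dice = (2.0 * intersection + eps) / (sum(lane_probs) + sum(targets) + eps)

    return ce + (1.0 - dice)
```

A confident, correct prediction should score far lower than a confident, wrong one, which is a quick sanity check worth running before training.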

At inference time, lane hypotheses are validated, filtered, and fused using semantic confidence maps and vehicle awareness, producing robust lane masks even under occlusions, clutter, or non-ideal road geometries.

This project demonstrates full ownership of the computer vision stack: architecture design, loss engineering, dataset debugging, GPU-safe training, and multi-model fusion. It bridges deep learning, geometry, and classical vision in a production-oriented setup.

At a glance

Slug: lanesight

Tech: 57 items

Repo: https://github.com/walid7shind/LaneSight

Demo

Preview video. (Muted looping hero + playable demo.)

Gallery

Screenshots, flows, and key moments.

Tech

Core ML

10 items

PyTorch (training, inference, autograd, AMP)

Custom Hybrid ViT-UNet architecture

CNN encoder with Conv + ResBlocks and vertical inductive bias

Vision Transformer bottleneck operating on feature maps (not raw pixels)

Multi-Head Self-Attention (MHSA)

2D relative position bias

MLP blocks and DropPath (stochastic depth)

Binary semantic segmentation (lane vs background)

Losses: class-weighted Cross-Entropy + lane-only Dice loss

Optimization: AdamW, CosineAnnealingLR, gradient clipping
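
To make the optimization entry more concrete, the sketch below writes out the closed-form cosine annealing schedule and gradient clipping by global norm in plain Python. In a PyTorch training loop these would normally come from `torch.optim.lr_scheduler.CosineAnnealingLR` and `torch.nn.utils.clip_grad_norm_`. The function names and the sample values here are assumptions for illustration only.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at `step`, decaying from base_lr to min_lr along a half cosine."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * step / total_steps))


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the clipped gradients and the norm measured before clipping,
    which is also the quantity one would log for gradient norm monitoring.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    return [g * scale for g in grads], total_norm
```

For example, `cosine_annealing_lr(0, 100, 1e-3)` gives the full base rate, and the value falls to `min_lr` at step 100.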

Geometry

5 items

UFLD (Ultra-Fast Lane Detection)

Pretrained geometry-based lane detection model

Lane polyline extraction

Polyline resampling and normalization

Strong geometric lane priors
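
The polyline resampling and normalization step listed above could look roughly like the following. This is a minimal sketch, assuming lanes arrive as lists of `(x, y)` pixel coordinates with at least two points. Both function names are hypothetical and the project may resample differently (for example at fixed image rows).

```python
import math


def resample_polyline(points, num_points):
    """Resample a polyline to num_points (>= 2) spaced evenly along its arc length."""
    # Cumulative arc length at each original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    resampled = []
    seg = 0  # index of the segment currently being interpolated
    for i in range(num_points):
        target = total * i / (num_points - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(points) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0 else (target - cum[seg]) / span
        (x0, y0), (x1, y1) = points[seg], points[seg + 1]
        resampled.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return resampled


def normalize_polyline(points, width, height):
    """Map pixel coordinates to the [0, 1] range so lanes are resolution independent."""
    return [(x / width, y / height) for x, y in points]
```

Resampling to a fixed point count gives every lane the same length representation, which simplifies the later comparison against semantic maps.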

Perception Fusion

5 items

Multi-model fusion of ViT semantics, UFLD geometry, and YOLO context

Sampling ViT probability along lane polylines

Confidence-based lane filtering

Unified lane representation (Lane, LaneSet)

Fusion of geometry + semantics + object awareness
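
The fusion items above (sampling ViT probability along lane polylines, then confidence-based filtering) can be sketched as follows. The names `Lane` and `LaneSet` come from this page, but their fields, the nearest-pixel sampling, the mean aggregation, and the 0.5 threshold are all assumptions for this example rather than the project's actual definitions.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Lane:
    points: List[Point]
    confidence: float = 0.0  # assumed field: mean semantic probability along the lane


@dataclass
class LaneSet:
    lanes: List[Lane] = field(default_factory=list)


def mean_probability(prob_map: Sequence[Sequence[float]], points: Sequence[Point]) -> float:
    """Average the lane probability at the nearest pixel under each polyline point."""
    height, width = len(prob_map), len(prob_map[0])
    samples = []
    for x, y in points:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < height and 0 <= col < width:  # skip points outside the image
            samples.append(prob_map[row][col])
    return sum(samples) / len(samples) if samples else 0.0


def filter_lanes(prob_map, polylines, min_confidence=0.5) -> LaneSet:
    """Keep only the geometric hypotheses that the semantic map supports."""
    kept = LaneSet()
    for points in polylines:
        score = mean_probability(prob_map, points)
        if score >= min_confidence:
            kept.lanes.append(Lane(points=list(points), confidence=score))
    return kept
```

Here `prob_map` stands in for the ViT-UNet lane probability map and `polylines` for the UFLD output. A polyline that runs over low-probability pixels is dropped.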

Vision

11 items

OpenCV-based classical vision pipeline

Grayscale conversion and Gaussian smoothing

Canny edge detection

Region-of-Interest (ROI)

Hough Transform for line detection

Lane polygon fitting

Temporal smoothing using EMA on lane slopes

Color-based priors using CIE Lab color space

White / yellow lane color anchors

Road color estimation

Soft edge weighting via color confidence maps
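
Temporal smoothing with an exponential moving average (EMA) on lane slopes, as listed above, can be illustrated with a small stateful helper. The class name, the default `alpha`, and the choice to hold the last value when a frame has no detection are assumptions for this sketch.

```python
class SlopeEMA:
    """Exponential moving average over per-frame lane slope estimates."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # higher alpha reacts faster, lower alpha smooths more
        self.value = None   # no estimate until the first detection

    def update(self, slope):
        """Feed the slope measured in the current frame, or None if no line was found."""
        if slope is None:
            return self.value  # keep the previous estimate on a missed detection
        if self.value is None:
            self.value = slope  # initialise with the first measurement
        else:
            self.value = self.alpha * slope + (1.0 - self.alpha) * self.value
        return self.value
```

One instance per lane side (left and right) would be updated once per frame, for example with the slope of the line fitted from the Hough segments.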

Detection

3 items

YOLOv8 (Ultralytics)

Vehicle detection for contextual awareness

Object-aware lane rejection (lanes crossing vehicles)
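
One plausible way to implement object-aware lane rejection is to measure how much of each lane polyline falls inside detected vehicle boxes. The sketch below assumes boxes in `(x1, y1, x2, y2)` corner format, the layout Ultralytics exposes as `boxes.xyxy`. The function names and the 30% threshold are assumptions, not values from the project.

```python
def fraction_inside_boxes(points, boxes):
    """Fraction of polyline points that lie inside at least one vehicle box."""
    if not points:
        return 0.0
    inside = sum(
        1
        for x, y in points
        if any(x1 <= x <= x2 and y1 <= y <= y2 for x1, y1, x2, y2 in boxes)
    )
    return inside / len(points)


def reject_lanes_crossing_vehicles(polylines, boxes, max_fraction=0.3):
    """Drop lane hypotheses that overlap detected vehicles too heavily."""
    return [p for p in polylines if fraction_inside_boxes(p, boxes) <= max_fraction]
```

A threshold above zero tolerates lanes that are only partly occluded by a vehicle, while lanes drawn mostly across a car body are discarded.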

Inference

4 items

Frame-wise inference on images and video streams

Resolution handling and dynamic resizing

Binary mask generation and overlay

Multi-source visualization (ViT / UFLD / YOLO / traditional)

Experimentation

8 items

TensorBoard logging

Loss and learning-rate tracking

Gradient norm monitoring

Attention entropy analysis

Activation statistics inspection

Attention map capture (optional)

NaN and numerical anomaly guards

Deterministic seeding for reproducibility
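
Two of the diagnostics above, attention entropy analysis and NaN guards, are simple enough to sketch in plain Python. The function names are hypothetical. In practice these checks would run on tensors inside the training loop.

```python
import math


def attention_entropy(attn_row, eps=1e-12):
    """Shannon entropy (in nats) of one attention distribution.

    Low entropy means the query attends to a few tokens;
    entropy near log(N) means attention is spread almost uniformly over N tokens.
    """
    return -sum(p * math.log(p + eps) for p in attn_row)


def is_finite_loss(value):
    """Guard against NaN or infinite losses before calling backward or stepping."""
    return math.isfinite(value)
```

Tracking the average entropy per head over training can reveal heads that collapse onto a single token or never specialise.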

Scientific Stack

5 items

NumPy

SciPy (softmax for UFLD decoding)

Python dataclasses

typing

pathlib
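
The SciPy entry above mentions softmax for UFLD decoding. A sketch of that idea for a single row anchor is shown below, with softmax written by hand so the example needs no dependencies. It assumes the common UFLD output layout in which each row anchor has one score per horizontal grid cell plus a final "no lane" class. The function names and 0-based cell indexing are assumptions for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_row(logits):
    """Decode one row anchor into a fractional grid-cell position.

    logits: num_cells + 1 scores, where the last entry means "no lane in this row".
    Returns the expected cell index (0-based, fractional), or None if no lane is present.
    """
    num_cells = len(logits) - 1
    best = max(range(len(logits)), key=logits.__getitem__)
    if best == num_cells:
        return None  # the "no lane" class wins for this row
    # Expectation over the location cells gives sub-cell precision,
    # unlike a plain argmax.
    probs = softmax(logits[:num_cells])
    return sum(i * p for i, p in enumerate(probs))
```

The fractional cell index would then be scaled by the cell width to recover an x coordinate in image space, yielding one polyline point per row anchor.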

Infra

6 items

Python 3

CUDA

cuDNN

GPU-first execution with safe CPU fallback

Windows-compatible training and inference

Modular repository architecture (training / inference / fusion / UI)

Details

Project info

Repository: https://github.com/walid7shind/LaneSight

Next steps

• Improve processing speed for real-time use
• Deploy on an embedded electronic system
