RepV

Safety-Separable Latent Spaces for Scalable Neurosymbolic Plan Verification


Abstract

As AI systems migrate to safety-critical domains, verifying that their actions comply with well-defined rules remains a challenge. Formal methods provide provable guarantees but demand hand-crafted temporal-logic specifications, offering limited expressiveness and accessibility. Deep learning approaches enable evaluation of plans against natural-language constraints, yet their opaque decision process invites misclassifications with potentially severe consequences.

We introduce RepV, a neurosymbolic verifier that unifies both views by learning a latent space where safe and unsafe plans are linearly separable. Starting from a modest seed set of plans labeled by an off-the-shelf model checker, RepV trains a lightweight projector that embeds each plan, together with a language model-generated rationale, into a low-dimensional space; a frozen linear boundary then verifies compliance for unseen natural-language rules in a single forward pass.

Beyond binary classification, RepV provides a probabilistic guarantee on the likelihood of correct verification based on the plan's position in the latent space. This enables guarantee-driven refinement of the planner, improving rule compliance without human annotations. Empirical evaluations show that RepV improves compliance prediction accuracy by up to 15% over baseline methods while adding fewer than 0.2M parameters. Furthermore, our refinement framework outperforms standard fine-tuning baselines across diverse planning domains. These results show that safety-separable latent spaces offer a scalable, plug-and-play primitive for reliable neurosymbolic plan verification.


Framework Overview

We introduce RepV, a neurosymbolic verifier that bridges formal methods and deep representation learning by learning a latent space where safe and unsafe plans become linearly separable. Starting from a small seed set of plans labeled by an off-the-shelf model checker, RepV trains a lightweight projector that embeds each plan—along with its LLM-generated rationale—into a low-dimensional, safety-aware space. A frozen linear boundary then certifies compliance for unseen natural-language rules in a single forward pass, providing a probabilistic guarantee of rule satisfaction rather than a binary decision.
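The sketch below illustrates one way the projector and frozen linear boundary could be realized. The MLP projector architecture, the latent dimensionality, and the sigmoid mapping from signed boundary distance to a compliance probability are illustrative assumptions, not RepV's exact design.

```python
# Minimal sketch of RepV-style verification (assumed: MLP projector,
# sigmoid over signed distance as the probabilistic-guarantee proxy).
import torch
import torch.nn as nn

class Projector(nn.Module):
    """Lightweight projector: maps a plan+rationale embedding into a
    low-dimensional, safety-aware latent space (well under 0.2M parameters)."""
    def __init__(self, in_dim=768, hidden=128, latent_dim=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class LinearBoundary(nn.Module):
    """Frozen linear decision boundary separating safe from unsafe plans."""
    def __init__(self, latent_dim=8):
        super().__init__()
        self.w = nn.Parameter(torch.randn(latent_dim), requires_grad=False)
        self.b = nn.Parameter(torch.zeros(1), requires_grad=False)

    def signed_distance(self, z):
        # Signed distance of the latent point from the boundary hyperplane.
        return (z @ self.w + self.b) / self.w.norm()

def verify(projector, boundary, plan_embedding):
    """Single forward pass: project the plan+rationale embedding and turn its
    position relative to the frozen boundary into a compliance probability."""
    z = projector(plan_embedding)
    d = boundary.signed_distance(z)
    p_safe = torch.sigmoid(d)  # assumed distance-to-probability mapping
    return p_safe, d
```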

Beyond verification, RepV enables guarantee-guided refinement of foundation models: the probabilistic guarantee scores serve as reward signals to fine-tune or rank generated plans, improving the likelihood of producing specification-compliant actions without human intervention. This makes RepV an efficient and scalable framework for safety-critical reasoning and plan generation across diverse robotic and simulation environments.
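As a concrete illustration, guarantee-guided refinement can be realized as best-of-n selection over sampled plans ranked by the verifier's probability. The helpers `generate_plans` and `embed`, the candidate count, and the acceptance threshold below are hypothetical stand-ins for the planner and embedding interfaces, not RepV's actual API.

```python
# Sketch of guarantee-guided refinement as best-of-n selection
# (reuses `verify`, plus a trained `projector` and frozen `boundary`
# from the sketch above).
def refine_by_guarantee(generate_plans, embed, projector, boundary,
                        task_prompt, rule, n_candidates=8, threshold=0.9):
    """Sample candidate plans, score each with the verifier's compliance
    probability, and return the best one; the same score can also serve as a
    reward signal when fine-tuning the planner."""
    scored = []
    for plan, rationale in generate_plans(task_prompt, rule, n=n_candidates):
        emb = embed(plan, rationale)  # single plan + LLM rationale embedding
        p_safe, _ = verify(projector, boundary, emb)
        scored.append((p_safe.item(), plan))
    best_score, best_plan = max(scored, key=lambda s: s[0])
    return best_plan if best_score >= threshold else None
```

In this setup, the planner would be queried with a task prompt and rule such as those listed under Navigation Examples below, and only plans whose guarantee score clears the threshold would be executed.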

Slideshow Video Coming Soon

Check back later for our presentation video

Navigation Examples

Task prompt: Fly in a square-shaped trajectory

Rule: Keep altitude below 10 meters and landing speed within 1 m/s

Task prompt: Go to main entrance and sit down

Rule: Bypass the obstacles and do not stop

Task prompt: Go straight 8 meters and turn left

Rule: Bypass the obstacles and do not stop