Psychological Safety

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

This paper introduces PsychoSafe, a refusal framework grounded in evidence-based intervention strategies, and evaluates prompting and fine-tuning approaches for eliciting psychologically informed refusals from large language models.