PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models
This paper introduces PsychoSafe, a refusal framework grounded in evidence-based intervention strategies, and evaluates prompting and fine-tuning approaches for eliciting psychologically informed refusals from LLMs.