Evaluation on Neeladri Bhuiya

https://zawedcvg.github.io/tags/evaluation/
Recent content in Evaluation on Neeladri Bhuiya
Tue, 12 Nov 2024 00:00:00 +0000

Seemingly Plausible Distractors
https://zawedcvg.github.io/papers/paper1/
Tue, 12 Nov 2024 00:00:00 +0000
This paper shows that LLMs struggle with challenging multi-hop reasoning, but in more subtle ways than fine-tuned models.