date: Oct 10, 2024 05:52 AM
Information Original
Benchmarks
LongBench
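6 different task types: single-document QA, multi-document QA, summarization, few-shot learning, synthetic tasks, and code completion.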


RULER
- Retrieval: needle-in-a-haystack (from gkamradt's LLMTest_NeedleInAHaystack GitHub repository)
- Multi-hop tracing: variable tracking, i.e. following a chain of assignments to find every variable that holds a given value
- Aggregation (summarization)
- QA (with distracting information)
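To make the retrieval task concrete, here is a minimal sketch of how a needle-in-a-haystack prompt could be assembled. The function names, the prompt template, and the substring-based scoring are illustrative assumptions, not the code used by RULER or by the LLMTest_NeedleInAHaystack repository.

```python
def build_haystack_prompt(filler_sentences, needle, depth, question):
    """Insert `needle` at a relative `depth` of the filler text.

    depth = 0.0 puts the needle at the start, 1.0 at the end.
    (Hypothetical helper; the prompt template is an assumption.)
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be in [0, 1]")
    position = round(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    context = " ".join(sentences)
    return f"{context}\n\nQuestion: {question}\nAnswer:"


def is_retrieved(model_answer, expected):
    """Score a response by case-insensitive substring match (a simplifying assumption)."""
    return expected.lower() in model_answer.lower()


filler = ["The sky is blue."] * 10
needle = "The magic number is 7481."
prompt = build_haystack_prompt(filler, needle, 0.5, "What is the magic number?")
```

Sweeping `depth` and the number of filler sentences gives the usual grid of context length versus needle position.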

Infrastructure
Position Interpolation
Using RoPE:
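As a rough sketch of the idea: Position Interpolation rescales the position index before applying RoPE, so a position m in an extended window of length L' is treated as m · L / L', which stays inside the trained range [0, L). The code below assumes the standard RoPE formulation with base 10000 and rotation of consecutive dimension pairs; the function names and the example lengths (2048 and 8192) are hypothetical.

```python
import math


def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Rotation angles for one position; scale < 1 applies Position Interpolation."""
    return [(position * scale) / (base ** (2 * i / head_dim))
            for i in range(head_dim // 2)]


def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate each pair (x[2i], x[2i+1]) by its position-dependent angle."""
    angles = rope_angles(position, len(vec), base, scale)
    out = []
    for i, theta in enumerate(angles):
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * math.cos(theta) - y * math.sin(theta),
                    x * math.sin(theta) + y * math.cos(theta)])
    return out


train_len, target_len = 2048, 8192   # assumed example lengths
scale = train_len / target_len       # L / L' = 0.25

vec = [1.0, 0.0, 0.5, -0.5]
interpolated = apply_rope(vec, 8000, scale=scale)  # position beyond the trained range
reference = apply_rope(vec, 2000)                  # the in-range position it maps to
```

With this scaling, position 8000 in the extended window produces the same rotation as position 2000 in the original window, so the model only sees angles it encountered during training, at a finer spacing between neighbouring tokens.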


