This skill replicates the dataset construction procedure from PaperOrchestra (arXiv:2604.05018) so you can build your own benchmark cases for testing paper-writing pipelines. Feed it a research paper PDF and it reverse-engineers three files: a sparse idea (high-level, no math), a dense idea (full LaTeX equations), and an experimental log (raw numbers, stripped of all narrative). The prompts are taken verbatim from Appendix F.2 of the paper and enforce strict rules: no citation leakage, no figure references in the log, and complete anonymization. Use it to test whether your pipeline can reconstruct a real paper from the raw materials alone, then score the output against the original with paper-autoraters.
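The rules above (no citation leakage, no figure references in the log) lend themselves to a quick sanity check on the generated files. The following is a minimal sketch, not part of the skill itself: the function name `find_leaks` and the regular expressions are hypothetical, and the Appendix F.2 prompts may define leakage differently.

```python
import re

# Hypothetical heuristics for illustration only; the skill's prompts
# may define "leakage" more strictly or more loosely than this.
LEAK_PATTERNS = {
    # LaTeX \cite commands, numeric markers like [12] or [3, 4], and "et al."
    "citation": re.compile(
        r"\\cite[tp]?\{[^}]*\}|\[\d+(?:\s*[,-]\s*\d+)*\]|\bet al\."
    ),
    # "Figure 3", "Fig. 2", and case variants
    "figure_reference": re.compile(r"\b(?:Fig\.|Figure)\s*\d+", re.IGNORECASE),
}


def find_leaks(text):
    """Return (rule, line_number, matched_text) for each suspected leak."""
    leaks = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in LEAK_PATTERNS.items():
            for match in pattern.finditer(line):
                leaks.append((rule, lineno, match.group(0)))
    return leaks


if __name__ == "__main__":
    sample_log = "Accuracy: 91.2\nSee Figure 3 for details [12]"
    for rule, lineno, snippet in find_leaks(sample_log):
        print(f"line {lineno}: {rule}: {snippet}")
```

An empty result does not prove the file is clean; these patterns only catch the most common surface forms, so a manual read of each generated file is still worthwhile.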
npx -y skills add ar9av/paperorchestra --skill paper-writing-bench --agent claude-code

Installs into .claude/skills of the current project.