phd-skills

725 · Community Registry · General · by Fatih Cagatay Akyon · MIT

Research skills for Claude Code — paper auditing, citation verification, experiment analysis, and methodology-first skills for academic workflows.

First seen 4/17/2026

Summary

This skill equips Claude Code with PhD-level research workflows for paper auditing, citation verification, experiment analysis, and methodology-first debugging. It helps researchers and engineers catch costly AI mistakes before they waste compute, reproduce papers from arxiv, and run evidence-first experiments with discipline.

Overview

Catch AI mistakes before they cost weeks of compute. Reproduce papers from arxiv. Debug runs evidence-first. Compare experiments at the right epoch. Launch with discipline.

Built by Fatih Cagatay Akyon (1500+ citations, 7 patents) after 300+ Claude Code sessions, tens of critical AI mistakes caught the hard way, and thousands of hours of PhD research. Every guardrail in this plugin traces to a real mistake.

Claude Code Plugin · MIT License · Zero Dependencies · No MCP Required


Why This Plugin Exists

Claude Code is powerful, but it makes research-specific mistakes that cost weeks of compute:

  • It typed "done?" as "dont?" and launched an unwanted upload of thousands of files
  • It analyzed my full dataset when I asked for a specific 4k/2k/2k split
  • It claimed a test covered a bug it had never actually verified
  • It never once looked at a figure it generated, just trusted the numbers
  • It restarted a 50-hour training job without diffing the config against the reference run, lost three days
  • It claimed an experiment was diverging based on a non-converged proxy metric, killed it before downstream eval would have shown the truth
  • It ran rm -rf on a path it had hallucinated from memory, lost local checkpoints

Other plugins give you more commands. This plugin gives you guardrails.
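
One of the mistakes listed above, restarting a long job without diffing its config against the reference run, can be illustrated with a small check. The following is a minimal sketch, not the plugin's implementation: the function names `flatten` and `diff_configs` and the example config keys are hypothetical, and it assumes both configs have already been loaded into plain nested dictionaries.

```python
from __future__ import annotations

from typing import Any


def flatten(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dotted keys: {"optim": {"lr": 1}} -> {"optim.lr": 1}."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def diff_configs(
    reference: dict[str, Any], candidate: dict[str, Any]
) -> dict[str, tuple[Any, Any]]:
    """Return {key: (reference_value, candidate_value)} for every key that differs."""
    missing = object()  # sentinel, so an explicit None is not confused with "absent"
    ref, cand = flatten(reference), flatten(candidate)
    changes: dict[str, tuple[Any, Any]] = {}
    for key in sorted(ref.keys() | cand.keys()):
        a, b = ref.get(key, missing), cand.get(key, missing)
        if a != b:
            changes[key] = (
                "<absent>" if a is missing else a,
                "<absent>" if b is missing else b,
            )
    return changes


# Hypothetical configs, for illustration only.
reference = {"optim": {"lr": 3e-4, "warmup": 500}, "batch_size": 64}
candidate = {"optim": {"lr": 3e-3, "warmup": 500}, "batch_size": 64, "seed": 1}

for key, (old, new) in diff_configs(reference, candidate).items():
    print(f"{key}: {old} -> {new}")
```

Reviewing this list before a relaunch makes an unintended change (here, a tenfold learning-rate difference and an extra `seed` key) visible before any compute is spent.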


Install

claude plugin marketplace add fcakyon/phd-skills
claude plugin install phd-skills@phd-skills

The plugin works correctly the moment it is installed. Optional: run /phd-skills:setup for a 30-second tour of what was auto-detected and to opt into extras (notifications, allowlist, LaTeX).


Usage

Open Claude Code in your project directory, then:

  • `/phd-skills:reproduce arxiv 2508.12345`: reproduce a paper, from arxiv URL through replication runs
  • "why is my loss diverging?": the debug skill auto-triggers and runs evidence-first probes
  • "compare run alpha to baseline": the compare skill auto-triggers and aligns the runs at the same epoch
  • "launch the new training run": the launch skill auto-triggers and runs the pre-flight checklist
  • `/loop 30m check experiment logs, notify me if metrics beat the baseline or if loss starts to diverge`

Notifications (task completion, background agents) forward to ntfy / Slack / email after /phd-skills:setup.
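
The compare skill is described above as aligning runs at the same epoch. The idea can be sketched as follows. This is an illustration under stated assumptions, not the skill's actual logic: `compare_at_common_epoch` is a hypothetical helper, each run is assumed to be a mapping from epoch to a single metric value, and the numbers are made up.

```python
from __future__ import annotations


def compare_at_common_epoch(
    run_a: dict[int, float],
    run_b: dict[int, float],
) -> tuple[int, float, float]:
    """Compare two runs at the latest epoch that BOTH have logged.

    Comparing each run's last logged value would pit, say, epoch 20 of
    one run against epoch 10 of another, which is the mistake to avoid.
    """
    common = run_a.keys() & run_b.keys()
    if not common:
        raise ValueError("runs share no logged epoch; nothing to compare")
    epoch = max(common)
    return epoch, run_a[epoch], run_b[epoch]


# Made-up metric histories (epoch -> validation score), for illustration only.
baseline = {1: 0.41, 5: 0.58, 10: 0.66, 20: 0.71}
alpha = {1: 0.39, 5: 0.60, 10: 0.69}  # still training, has only reached epoch 10

epoch, base_val, alpha_val = compare_at_common_epoch(baseline, alpha)
print(f"epoch {epoch}: baseline={base_val:.2f} alpha={alpha_val:.2f}")
```

A naive comparison of final values would report the baseline ahead (0.71 vs 0.69), even though the newer run leads at every epoch the two share from epoch 5 onward.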


What You Get

Commands

| Command | What it does |
| --- | --- |
| `/phd-skills:xray` | Audit paper against code and data (5 parallel dimensions) |
| `/phd-skills:factcheck` | Verify BibTeX entries and cited claims against DBLP |
| `/phd-skills:gaps <topic>` | Literature gap analysis with web confirmation |
| `/phd-skills:fortify [venue]` | Select strongest ablations + anticipate reviewer questions |
| `/phd-skills:setup` | Auto-detection tour + optional extras |
| `/phd-skills:help` | Show all features at a glance |

Skills (auto-trigger, just describe what you need)

| When you say... | Skill activates |
| --- | --- |
| "reproduce this arxiv paper" | Reproduce |
| "why is X failing / diverging / OOMing" | Debug |
| "compare run A to baseline" | Compare |
| "launch a new training run" / "kick off training" | Launch |
| "design an ablation study" | Experiment Design |
| "find related papers on X" | Literature Research |
| "check if my numbers match the code" | Paper Verification |
| "review my methods section for consistency" | Paper Writing |
| "analyze dataset bias" | Dataset Curation |
| "prepare code for open-source release" | Research Publishing |
| "what will reviewers ask about this?" | Reviewer Defense |
| "setup latex for CVPR" | LaTeX Setup |

Agents (Claude delegates automatically)

| Agent | What it does | Special |
| --- | --- | --- |
| `paper-auditor` | Cross-checks paper claims vs code and data | Runs in isolated worktree, remembers patterns across sessions |
| `experiment-analyzer` | Analyzes results from wandb / neptune / tensorboard / mlflow / local | Hands off to compare and debug skills for discipline |

Research Guardrails (run silently, you never invoke these)


How It Compares

| | phd-skills | flonat/claude-research | Others |
| --- | --- | --- | --- |
| Commands to learn | 6 | 39 | 13-20 |
| Research integrity hooks | 11 (agent + 10 auto-detect) | 1 | 0 |
| Paper reproduction (arxiv to runs) | Yes (7-stage skill) | No | No |
| Paper-code consistency audit | 5-dimension parallel | Read-only, no code cross-ref | None |
| Experiment monitoring + SSH notifications | Yes (ntfy / slack / email) | No | No |
| External dependencies | None | npm + pip + MCP servers | MCP required |
| Install time | 30 seconds | 10+ minutes | Varies |

Design Principles

  1. Methodology over scripts. Skills teach the approach; Claude generates code for your specific setup (wandb, neptune, local files, whatever).
  2. Human oversight first. Claude makes premature claims and jumps to conclusions, so every skill builds in verification checkpoints.
  3. Actionable output. Ranked suggestions with specific fixes, never just a list of findings.

License

MIT. Use it, fork it, adapt it to your research.

Install & Usage

  1. Create the skills directory: `mkdir -p .claude/skills`
  2. Download the skill file: `mkdir -p .claude/skills && curl -o .claude/skills/phd-skills.md https://raw.githubusercontent.com/fcakyon/phd-skills/main/SKILL.md`
  3. Invoke in Claude Code: `/phd-skills`

Use Cases

Audit a research paper for methodological flaws and citation accuracy before implementing it.
Reproduce experiments from an arxiv paper by comparing configs and logs against the reference run.
Debug a failed training run by analyzing metrics, logs, and code changes evidence-first.
Verify that a claimed test or fix actually covers the intended bug by checking code and outputs.
Compare two experiment runs at the correct epoch to avoid misleading proxy metrics.
Prevent accidental file operations by double-checking paths and commands before execution.
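
The last use case, double-checking paths before a destructive command, can be sketched as a pre-flight check. This is a minimal illustration, not the plugin's guardrail: `check_delete_target` and the `allowed_root` parameter are hypothetical, and the function only validates a path, it never deletes anything.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def check_delete_target(target: str, allowed_root: str) -> Path:
    """Resolve `target` and confirm it exists strictly inside `allowed_root`.

    Returns the resolved path when the check passes and raises ValueError
    otherwise. Nothing is deleted here; the caller decides what to do next.
    """
    root = Path(allowed_root).resolve()
    path = Path(target).resolve()
    if not path.exists():
        # A path recalled from memory rather than listed from disk often fails here.
        raise ValueError(f"refusing: {path} does not exist")
    if path == root or root not in path.parents:
        raise ValueError(f"refusing: {path} is not inside {root}")
    return path


# Demonstration inside a throwaway temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    checkpoints = Path(workdir) / "checkpoints"
    checkpoints.mkdir()
    print(check_delete_target(str(checkpoints), workdir))  # passes the check
    try:
        # Misspelled directory that was never created.
        check_delete_target(str(Path(workdir) / "chekpoints"), workdir)
    except ValueError as err:
        print(err)
```

Resolving the path first means `..` segments and symlinks are expanded before the containment check, so a target cannot escape the allowed root by indirection.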

Usage Examples

  1. `/phd-skills audit paper https://arxiv.org/abs/2303.08774 for methodology and citation errors`
  2. `/phd-skills reproduce experiment --paper-id 2303.08774 --config ./configs/exp1.yaml --logs ./logs/exp1`
  3. `/phd-skills debug run --logs ./logs/failed_run --code ./src --metrics accuracy,loss`

Tags: research, academic, phd, paper-writing, citations, experiments, latex

Security Audits

License: Pass · Source: Warn · Repository: Pass
