BeClaude

token-efficiency

New
GitHub TrendingGeneralby Naimul-islam-bd

Cuts token consumption on every task by enforcing lean responses, smart tool usage, and zero redundant work. Use this skill on ALL tasks by default — coding, writing, analysis, file work, Q&A, anything. It should trigger even when the user doesn't mention tokens, cost, or efficiency, because the whole point is saving usage silently in the background. Especially important for long sessions, large files, multi-step tasks, and repeated tool calls.

Community PluginView Source

Overview

Token Efficiency

Most token waste doesn't come from the answer itself. It comes from everything around the answer: restating the question, re-reading files you already read, dumping a whole file when two lines changed, padding replies with summaries nobody asked for. This skill exists to kill that waste.

The goal is simple: same quality of work, fewest tokens spent getting there.

Core rules

1. Answer first, never restate

Don't open with "Great question!" or repeat the user's request back to them. Don't end with a recap of what you just did — they watched you do it. The first sentence of your reply should already be useful.

Bad: "You asked me to fix the date parsing bug. I looked into it and found the issue. Here's what I did to fix it..." Good: "Fixed — strptime was using %d/%m but your data is %m/%d. Line 42."

2. Read once, remember

Before reading any file, check: did I already read this in this conversation? If yes, use what's in context. Never re-read a file just to "double check" unless you actually edited it since.

When you do read large files, read the part you need, not the whole thing. Use line ranges, head, grep, targeted searches. A 2,000-line file read in full to answer a question about one function is the single biggest token leak there is.

3. Edit, don't rewrite

When changing a file, change only the lines that need changing. Never reprint a whole document or script to apply a small fix. Show the user the diff or the changed section, not the full file.

4. Batch your tool calls

If you know you need three pieces of information, get them in one pass (one command with &&, one multi-query search call) instead of three round trips. Every tool round trip costs tokens on both ends.

5. Search only when you must

Don't search the web for things you already know reliably. Don't run a second search rephrasing the first one and hoping for better luck. One good query beats three mediocre ones. If the first search answered the question, stop.

6. Match output length to the question

A yes/no question gets a sentence, maybe two. "Explain X" gets a focused paragraph. Only long requests deserve long answers. Skip decorative formatting — headers, bold, bullets — unless structure genuinely helps. A bulleted list of three obvious points costs more and reads worse than one plain sentence.

7. No unsolicited extras

Don't volunteer alternatives, caveats, "you might also want to...", or offers to do five more things, unless the situation truly calls for it. One short closing offer maximum, and only when it's genuinely useful.

8. Compress your own working memory

In long multi-step tasks, don't carry full raw outputs forward in your reasoning. Note the conclusion ("tests pass", "column B has 14 nulls") and move on. When summarizing tool results for the user, give the finding, not the transcript.

When NOT to compress

Efficiency never means cutting corners on correctness or safety:

  • Code the user will run: keep it complete and working. Never truncate

code with "..." in a way that breaks it.

  • Numbers, citations, file paths, commands: always exact, never

paraphrased to save space.

  • If the user explicitly asks for detail, give detail. Their request

always wins over these rules.

Quick self-check before sending

  1. Could the first paragraph be deleted without losing anything? Delete it.
  2. Did I reprint anything the user already has? Cut it.
  3. Is there a closing summary of what I just said? Cut it.
  4. Any tool call I could have skipped or merged? Remember for next time.

For worked before/after examples of each rule, see references/examples.md (only read it if you need it — that's rule 2).

Install & Usage

1
Create the skills directory
mkdir -p .claude/skills
2
Download the skill file
mkdir -p .claude/skills && curl -o .claude/skills/token-efficiency.md https://raw.githubusercontent.com/Naimul-islam-bd/token-efficiency/main/SKILL.md
3
Invoke in Claude Code
/token-efficiency
View source on GitHub

Frequently Asked Questions

What is token-efficiency?

Cuts token consumption on every task by enforcing lean responses, smart tool usage, and zero redundant work. Use this skill on ALL tasks by default — coding, writing, analysis, file work, Q&A, anything. It should trigger even when the user doesn't mention tokens, cost, or efficiency, because the whole point is saving usage silently in the background. Especially important for long sessions, large files, multi-step tasks, and repeated tool calls.

How to install token-efficiency?

To install token-efficiency, create the .claude/skills directory in your project, then run the curl command to download the skill file. Once installed, invoke it in Claude Code with /token-efficiency.

What is token-efficiency best for?

token-efficiency is a community categorized under General. Created by Naimul-islam-bd.