BeClaude

training-llms-megatron

Smithery · General · by davila7

Trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use it when you are training models larger than 1B parameters, need maximum GPU efficiency (47% MFU on H100), or require tensor, pipeline, sequence, context, or expert parallelism. A production-ready framework used for Nemotron, LLaMA, and DeepSeek.

First seen 5/22/2026

Install & Usage

1. Open your MCP config: ~/.claude.json

2. Add the server config. Add the configuration to "mcpServers": { "training-llms-megatron": { "command": "...", "args": [] } }

3. Restart Claude Code and run /mcp
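
The config step above can be sketched in Python as a small merge into the "mcpServers" object. This is a minimal illustration, not an official installer: the helper name `add_mcp_server` and the sample `existing` config are hypothetical, and the command stays as the "..." placeholder from the listing because the real launch command is not given here.

```python
import json
from copy import deepcopy
from typing import List, Optional


def add_mcp_server(config: dict, name: str, command: str,
                   args: Optional[List[str]] = None) -> dict:
    """Return a copy of `config` with `name` registered under "mcpServers".

    Hypothetical helper for illustration; it does not touch any file.
    """
    updated = deepcopy(config)
    # Create the "mcpServers" object if the config does not have one yet.
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args or [])}
    return updated


# Stand-in for the parsed contents of ~/.claude.json (assumed shape).
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}

# "..." is the unfilled placeholder from the listing, not a real command.
merged = add_mcp_server(existing, "training-llms-megatron", "...")
print(json.dumps(merged, indent=2))
```

Working on a copy keeps any servers already listed in the config intact and leaves the original dictionary unchanged, so the result can be inspected before anything is written back to disk.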
Tags: ai-&-ml, research

Security Audits

License: Unknown
Source: Warn
Repository: Pass

Frequently Asked Questions

What is training-llms-megatron?

training-llms-megatron trains large language models (2B-462B parameters) using NVIDIA Megatron-Core with advanced parallelism strategies. Use it when you are training models larger than 1B parameters, need maximum GPU efficiency (47% MFU on H100), or require tensor, pipeline, sequence, context, or expert parallelism. It is a production-ready framework used for Nemotron, LLaMA, and DeepSeek.

How to install training-llms-megatron?

To install training-llms-megatron, open your MCP config (~/.claude.json), then add the server config to "mcpServers": { "training-llms-megatron": { "command": "...", "args": [] } }. Finally, restart Claude Code and run /mcp.

What is training-llms-megatron best for?

training-llms-megatron is an MCP server categorized under General. It is tagged ai-&-ml and research, and was created by davila7.