BeClaude
Policy 2018-03-20

Variance reduction for policy gradient with action-dependent factorized baselines

Source: OpenAI
