rllib

When a user starts a Serve cluster with one set of options (HTTP options, checkpoint path) and then connects to it with a different set of options, we should either update the cluster or raise an error.
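The "error out" branch could look roughly like the following. This is a minimal sketch in plain Python, not Ray Serve's actual implementation: `OptionsMismatchError`, `check_options`, and the option keys are hypothetical names chosen for illustration.

```python
class OptionsMismatchError(RuntimeError):
    """Hypothetical error raised when requested options differ from the running ones."""


def check_options(running, requested):
    """Raise if any requested option differs from the running cluster's value."""
    # Map each differing key to a (running value, requested value) pair.
    mismatched = {
        key: (running.get(key), value)
        for key, value in requested.items()
        if running.get(key) != value
    }
    if mismatched:
        raise OptionsMismatchError(f"options differ from running cluster: {mismatched}")


# Assumed example options; the real option names may differ.
running = {"http_port": 8000, "checkpoint_path": "/tmp/serve_ckpt"}
check_options(running, {"http_port": 8000})  # same value, so no error
```

Connecting with `{"http_port": 9000}` would raise `OptionsMismatchError` in this sketch. The alternative from the issue, updating the cluster instead, would replace the `raise` with whatever reconfiguration logic applies.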

Currently we use a very ad-hoc procedure for scaling the quadratic component of NAF when used for exploration:
https://github.com/angelolovatto/raylab/blob/9820275b17ee085e1955a6d845c0bdf61333f8da/raylab/algorithms/naf/naf_policy.py#L150-L155

A possibly better alternative would be to scale it based on the desired average action stddev. Something like:

scale_tril * (1.0 / average_st
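One possible reading of this idea, as a minimal sketch rather than the project's code: if `scale_tril` is the lower-triangular factor L of the action covariance (covariance = L Lᵀ), the stddev of each action dimension is the norm of the corresponding row of L, and L can be multiplied by the ratio of the desired to the current average stddev. The helper names `average_stddev` and `rescale_tril`, and the use of plain Python lists instead of tensors, are assumptions made for illustration.

```python
import math


def average_stddev(scale_tril):
    """Mean per-dimension stddev implied by a lower-triangular factor L."""
    # With covariance = L @ L.T, the variance of dimension i is the
    # squared norm of row i of L.
    stds = [math.sqrt(sum(v * v for v in row)) for row in scale_tril]
    return sum(stds) / len(stds)


def rescale_tril(scale_tril, desired_avg_stddev):
    """Hypothetical helper: scale L so its average stddev matches the target."""
    current = average_stddev(scale_tril)
    if current <= 0.0:
        raise ValueError("scale_tril must have at least one nonzero row")
    factor = desired_avg_stddev / current
    return [[v * factor for v in row] for row in scale_tril]


scale_tril = [[2.0, 0.0], [1.0, 1.0]]
rescaled = rescale_tril(scale_tril, 0.3)
```

Because every entry is multiplied by the same positive factor, the matrix stays lower triangular and the correlation structure between action dimensions is unchanged; only the overall exploration magnitude moves.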



Here are 39 public repositories matching this topic...

ray-project / ray

[Serve] Calling serve.start(detached=True) with different options should error

[Serve] Warn if deployments have gcs uris and is checkpointed to external kv.

[RLlib] Deprecate Internally Maintained Probability Distributions In Favor Of Native TFP And torch.distributions Solutions

utiasDSL / gym-pybullet-drones

Draichi / T-1000

ChuaCheowHuan / gym-continuousDoubleAuction

druce / rl

DerwenAI / ray_tutorial

JacopoPan / a-minimalist-guide

DerwenAI / rllib_tutorials

angelolovatto / raylab

Scale tril by desired average action stddev

goshaQ / adaptive-tls

AhmetFurkanDEMIR / SuperMarioBrosRL

DerwenAI / gym_example

CN-UPB / DeepCoMP

dcos-labs / dcos-jupyterlab-service

akirasosa / aie-train

nicofirst1 / rl_werewolf

HumanCompatibleAI / better-adversarial-defenses

ChuaCheowHuan / PBT_MARL_watered_down

rlew631 / AutonomousVehicleSimulation

toanngosy / robustprosthetics

Senmumu / ray_project_doc

wullli / flatlander

mynkpl1998 / upgraded-octo-lamp

xdralex / pioneer

hybug / RL_Lab

eescriba / smart-cities-drl

stefanbschneider / mobile-env

thiagopbueno / model-aware-policy-optimization

Add Gym env for Navigation domain with bimodal dynamics distribution

Create Value function class

jthelin / HelloRayActors

ChuaCheowHuan / sagemaker_Ray_RLlib_custom_env
