Reinforce
PPO agent for RL environments. This code is an adaptation of FOR.ai’s RL repository.
Environments
- Gym environments (Cartpole, MountainCar, Pong, etc.)
- Gibson 3D photorealistic environment
Training
Environment
- Cartpole:
python train.py --hparams ppo_cartpole --sys local
- Pong:
python train.py --hparams ppo_pong --sys local
- MountainCar:
python train.py --hparams ppo_mountaincar --sys local
- Gibson:
python train.py --hparams ppo_gibson --sys local
To train asynchronously, set the --num_workers flag to the number of worker threads.
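As a rough sketch, the training commands above can be assembled from Python, for example to script several runs. Only the flags `--hparams`, `--sys`, and `--num_workers` come from this README; the helper name `build_train_command` and the worker count of 4 are illustrative assumptions, not part of the repository.

```python
def build_train_command(hparams, system="local", num_workers=None):
    """Build the argument list for train.py (hypothetical helper)."""
    cmd = ["python", "train.py", "--hparams", hparams, "--sys", system]
    if num_workers is not None:
        if num_workers < 1:
            raise ValueError("num_workers must be a positive integer")
        # Asynchronous training: one worker thread per unit of num_workers.
        cmd += ["--num_workers", str(num_workers)]
    return cmd


if __name__ == "__main__":
    # The value 4 is an arbitrary example, not a recommended setting.
    print(" ".join(build_train_command("ppo_cartpole", num_workers=4)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.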
A complete list of the hyperparameters used for each environment can be found here for Gym environments and here for Gibson.
Installation
pip install -r requirements.txt
To train on the Gibson environment, you also need to install the following packages and data:
- Install habitat_sim from here
- Install habitat_api from here
- Download the data from here and follow the directory structure described in the repository
- Place the data in a folder named data under Habitat-TF/
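Before starting a Gibson run, it may help to check that the setup steps above took effect. The sketch below is one way to do that. It assumes that habitat_sim is importable as `habitat_sim` and that habitat_api is importable as `habitat`; the function name `missing_gibson_requirements` is hypothetical and not part of this repository.

```python
import importlib.util
from pathlib import Path


def missing_gibson_requirements(repo_root, modules=("habitat_sim", "habitat")):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python module '{name}' is not importable")
    # The README asks for a folder named data under the repository root.
    data_dir = Path(repo_root) / "data"
    if not data_dir.is_dir():
        problems.append(f"data folder not found at {data_dir}")
    return problems


if __name__ == "__main__":
    for problem in missing_gibson_requirements("."):
        print(problem)
```

Run from Habitat-TF/, the script prints one line per missing requirement and nothing when both modules and the data folder are found. It does not validate the contents of the data folder.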
Structure
rl/
├── agents
│ ├── agent.py
│ ├── algos
│ │ ├── action_function
│ │ │ ├── basic.py
│ │ │ ├── __init__.py
│ │ │ └── registry.py
│ │ ├── compute_gradient
│ │ │ ├── basic.py
│ │ │ ├── __init__.py
│ │ │ ├── registry.py
│ │ │ └── utils.py
│ │ ├── gibson_ppo.py
│ │ ├── __init__.py
│ │ ├── ppo.py
│ │ ├── registry.py
│ │ └── utils.py
│ ├── __init__.py
│ └── registry.py
├── envs
│ ├── configs
│ │ ├── baselines
│ │ ├── datasets
│ │ │ └── pointnav
│ │ ├── tasks
│ │ └── test
│ ├── env.py
│ ├── gibson.py
│ ├── gym_env.py
│ ├── habitat_env
│ │ └── gibson.py
│ ├── __init__.py
│ ├── registry.py
│ ├── reward_augmentation.py
│ └── utils.py
├── hparams
│ ├── defaults.py
│ ├── gibson_ppo.py
│ ├── __init__.py
│ ├── ppo.py
│ ├── registry.py
│ └── utils.py
├── __init__.py
├── memory
│ ├── __init__.py
│ ├── memory.py
│ ├── registry.py
│ └── simple.py
├── models
│ ├── basic
│ │ ├── basic.py
│ │ └── __init__.py
│ ├── gibson_model
│ │ ├── gibson_model.py
│ │ └── __init__.py
│ ├── __init__.py
│ ├── model.py
│ └── registry.py
└── utils
  ├── checkpoint.py
  ├── flags.py
  ├── __init__.py
  ├── logger.py
  ├── lr_schemes.py
  ├── rand.py
  ├── sys.py
  └── utils.py
Acknowledgments
- for.ai
- Piotr Kozakowski
- Adapted from this repo
References
- Habitat paper
- FAIR habitat repositories
- Proximal Policy Optimization