Robot Learning

Reinforcement Learning Overview

Where RL fits within machine learning

graph TD;
    A((Machine Learning));
    B[predictive, supervised];
    C[descriptive, unsupervised];
    D[active, reinforcement learning];
    A-->B;
    A-->C;
    A-->D;

Algorithms

graph TD;
    A((RL algorithms));
    B[Action Value Function];
    C[Policy Gradient];
    B1[<a href=https://en.wikipedia.org/wiki/Q-learning>Q-learning</a>];
    B2[SARSA];
    C1[Actor critic];
    C11[A3C];
    C12[ACKTR];
    C2[TRPO];
    C3[PPO];
    A-->B;
    A-->C;
    B-->B1;
    B-->B2;
    C-->C1;
    C-->C2;
    C-->C3;
    C1-->C11;
    C1-->C12;
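
To make the action-value branch of the graph concrete, here is a minimal tabular sketch of the Q-learning and SARSA updates. The function names, the dictionary-based table `Q`, and the default values for `alpha` (learning rate) and `gamma` (discount factor) are assumptions chosen for illustration, not taken from a particular library.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Q-learning: bootstrap from the best action available in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """SARSA: bootstrap from the action the agent actually takes in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Hypothetical two-action example; unseen (state, action) pairs default to 0.0.
Q = defaultdict(float)
q_learning_update(Q, "s0", "left", 1.0, "s1", ["left", "right"])
sarsa_update(Q, "s0", "right", 1.0, "s1", "left")
```

The only difference between the two updates is the bootstrap target: the maximum over next actions for Q-learning, the value of the sampled next action for SARSA.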

How to handle your data?

Back-propagation
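
As a small illustration of back-propagation, the sketch below applies the chain rule by hand to a single sigmoid neuron with a squared-error loss. The function names, learning rate, and step count are assumptions for this example; a real network repeats the same idea layer by layer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, x):
    """Prediction of one sigmoid neuron for a scalar input x."""
    return sigmoid(w * x + b)


def loss(w, b, x, y):
    """Squared error between the prediction and the target y."""
    return 0.5 * (forward(w, b, x) - y) ** 2


def backward(w, b, x, y):
    """Chain rule: dL/dz = (p - y) * p * (1 - p), then dL/dw = dL/dz * x."""
    p = forward(w, b, x)
    dz = (p - y) * p * (1.0 - p)
    return dz * x, dz


def train(x, y, w=0.0, b=0.0, lr=0.5, steps=200):
    """Plain gradient descent using the hand-derived gradients."""
    for _ in range(steps):
        dw, db = backward(w, b, x, y)
        w -= lr * dw
        b -= lr * db
    return w, b
```

Comparing `backward` against a finite-difference estimate of `loss` is a quick way to check that the derived gradients are right.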

Deep Learning

-> MLPs (multilayer perceptrons)

DNN

(Deep Neural Network)
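
A multilayer perceptron is the simplest deep neural network: a stack of dense layers with a nonlinearity in between. The following dependency-free sketch shows only the forward pass. The layer representation (a list of `(weights, biases)` pairs), the choice of ReLU, and all names are assumptions for illustration.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: each row of weights produces one output."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def mlp_forward(x, layers):
    """Apply dense layers in order, with ReLU after every layer but the last."""
    h = x
    for i, (weights, biases) in enumerate(layers):
        h = dense(h, weights, biases)
        if i < len(layers) - 1:
            h = relu(h)
    return h


# Hypothetical 2-2-1 network with hand-picked weights.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
]
print(mlp_forward([2.0, 0.5], layers))  # → [2.0]
```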

CNN

RNN

Regression

Ordinary Least Squares
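
For a single input feature, ordinary least squares has a closed-form solution. The sketch below fits a line by minimizing the sum of squared residuals. The function name and the restriction to one feature are assumptions made to keep the example free of dependencies.

```python
def ols_fit(xs, ys):
    """Fit y ~ slope * x + intercept by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


print(ols_fit([0, 1, 2, 3], [1, 3, 5, 7]))  # → (2.0, 1.0)
```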

Weighted Regression
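
Weighted regression gives every sample its own weight in the squared-error sum, so unreliable points can be down-weighted. A minimal one-feature sketch, with the function name and argument layout chosen for illustration:

```python
def weighted_fit(xs, ys, weights):
    """Fit y ~ slope * x + intercept, minimizing sum(w_i * residual_i ** 2)."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# The last sample is an outlier; a weight of 0 removes its influence.
print(weighted_fit([0, 1, 2, 3], [1, 3, 5, 100], [1, 1, 1, 0]))  # → (2.0, 1.0)
```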

Ridge Regression
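
Ridge regression adds a penalty `lam * slope ** 2` to the squared-error sum, which shrinks the slope toward zero. The sketch below covers the one-feature case and leaves the intercept unpenalized. Both choices, and the name `lam` for the regularization strength, are assumptions for this example.

```python
def ridge_fit(xs, ys, lam):
    """Minimize sum((y - slope * x - intercept) ** 2) + lam * slope ** 2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)  # lam = 0 recovers ordinary least squares
    return slope, mean_y - slope * mean_x


print(ridge_fit([0, 1, 2, 3], [1, 3, 5, 7], lam=5.0))  # → (1.0, 2.5)
```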

Local Ridge Regression
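
One common reading of local ridge regression is a ridge fit that is recomputed for every query point, with samples weighted by their distance to that point. The sketch below follows that reading. The Gaussian kernel, the `bandwidth` and `lam` defaults, and the function name are all assumptions, and the notes may intend a different variant.

```python
import math


def local_ridge_predict(x0, xs, ys, bandwidth=0.5, lam=1e-3):
    """Predict y at x0 from a kernel-weighted ridge line fitted around x0."""
    # Gaussian kernel: samples close to x0 dominate the local fit.
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / (sxx + lam)  # ridge penalty on the local slope only
    return mean_y + slope * (x0 - mean_x)


# Hypothetical data: a parabola, which a single global line cannot follow.
xs = [i / 10 for i in range(-30, 31)]
ys = [x ** 2 for x in xs]
prediction = local_ridge_predict(2.0, xs, ys, bandwidth=0.3, lam=1e-6)
```

A smaller `bandwidth` follows the data more closely but uses fewer effective samples per fit, so the ridge term matters more there.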

Material

YouTube: RL-with-Pytorch by Sung Kim

PyTorch

Install it on Linux within an Anaconda environment:

    conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

Test if it worked:

    python3
    Python 3.6.3 |Anaconda, Inc.| (default, Oct 13 2017, 12:02:49)
    [GCC 7.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import torch
    >>> print(torch.__version__)
    1.0.1.post2

For a walkthrough, see the videos by Sung Kim listed under Material.

Abbreviations

Symbols

Symbol	Name	Description
--	Policy	German: Strategie
--	Rollout	German: ?
--	Reward	German: Belohnung
--	State	German: Zustand
--	Action	German: Aktion, Handlung
--	Advantage Function	
--	Value Function	
--	Finite Horizon	
--	Q-Value	
--	On-policy	The agent learns about the same policy it uses to pick actions
--	Off-policy	The agent learns about a target policy from data generated by a different policy, e.g. an exploratory policy or expert demonstrations
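
Several terms in the table (rollout, reward, finite horizon, value function, advantage function) can be tied together in a short sketch: compute the discounted return for each step of one finite rollout, then subtract a value estimate to get a Monte Carlo advantage. The function names, the `gamma` default, and the sample numbers are assumptions for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """G_t = r_t + gamma * r_{t+1} + ... for every step of a finite rollout."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Monte Carlo advantage estimate: A_t = G_t - V(s_t)."""
    return [g - v for g, v in zip(returns, values)]


# Hypothetical three-step rollout with a constant value estimate of 1.0.
returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
print(returns)                                # → [1.5, 1.0, 2.0]
print(advantages(returns, [1.0, 1.0, 1.0]))   # → [0.5, 0.0, 1.0]
```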