A trading and backtesting environment for training reinforcement learning agents or simple rule-based algorithms.

TradingGym

TradingGym is a toolkit for training and backtesting reinforcement learning algorithms. It was inspired by OpenAI Gym and imitates its framework. It provides not only a training env but also a backtesting env, and a realtime trading env using the Interactive Brokers API (among others) is planned.

The training env was originally designed for tick data, but it also supports the OHLC data format. WIP.

Installation

git clone https://github.com/Yvictor/TradingGym.git
cd TradingGym
python setup.py install

Getting Started

import random
import numpy as np
import pandas as pd
import trading_env

df = pd.read_hdf('dataset/SGXTW.h5', 'STW')

env = trading_env.make(env_id='training_v1', obs_data_len=256, step_len=128,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price', 
                       feature_names=['Price', 'Volume', 
                                      'Ask_price','Bid_price', 
                                      'Ask_deal_vol','Bid_deal_vol',
                                      'Bid/Ask_deal', 'Updown'])

env.reset()
env.render()

state, reward, done, info = env.step(random.randrange(3))

### choose actions at random and show the transaction details
for i in range(500):
    print(i)
    state, reward, done, info = env.step(random.randrange(3))
    print(state, reward)
    env.render()
    if done:
        break
env.transaction_details
  • obs_data_len: the observation data length, i.e. the number of rows in each observation.
  • step_len: the number of rows the rolling window advances each time step is called.
  • df: the DataFrame that contains the data to trade on. Example:

index datetime            bid    ask    price  volume serial_number dealin
0     2010-05-25 08:45:00 7188.0 7188.0 7188.0 527.0  0.0           0.0
1     2010-05-25 08:45:00 7188.0 7189.0 7189.0 1.0    1.0           1.0
2     2010-05-25 08:45:00 7188.0 7189.0 7188.0 1.0    2.0           -1.0
3     2010-05-25 08:45:00 7188.0 7189.0 7188.0 4.0    3.0           -1.0
4     2010-05-25 08:45:00 7188.0 7189.0 7188.0 2.0    4.0           -1.0

    serial_number is the serial number of each deal, counted afresh each day.

  • fee: the fee paid on each deal; set it according to the product you trade.
  • max_position: the maximum market position (number of shares) you can hold.
  • deal_col_name: the name of the column used to calculate the reward.
  • feature_names: a list of the feature columns to use in the trading state.
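
As a rough illustration of two of the ideas above, the sketch below shows one way a per-day serial_number column could be derived from a datetime column, and how a window of obs_data_len rows moves forward by step_len rows. The helper names add_serial_number and window_bounds are hypothetical and are not part of trading_env. Starting the first window at row 0 is also an assumption; the real env may pick its starting row differently.

```python
import pandas as pd


def add_serial_number(df, time_col="datetime"):
    """Number the deals within each calendar day, starting again at 0 every day."""
    out = df.copy()
    day = pd.to_datetime(out[time_col]).dt.normalize()
    out["serial_number"] = out.groupby(day).cumcount()
    return out


def window_bounds(n_rows, obs_data_len, step_len):
    """Yield the (start, end) row bounds of each observation window."""
    start = 0  # assumption: the first window starts at row 0
    while start + obs_data_len <= n_rows:
        yield start, start + obs_data_len
        start += step_len


# Toy data for illustration only, spanning two trading days.
ticks = pd.DataFrame({
    "datetime": ["2010-05-25 08:45:00", "2010-05-25 08:45:00",
                 "2010-05-25 08:45:01", "2010-05-26 08:45:00",
                 "2010-05-26 08:45:02"],
    "price": [7188.0, 7189.0, 7188.0, 7190.0, 7191.0],
})
ticks = add_serial_number(ticks)
print(ticks["serial_number"].tolist())
print(list(window_bounds(n_rows=10, obs_data_len=4, step_len=2)))
```

With a step_len smaller than obs_data_len, consecutive windows overlap, as in the Getting Started call above (obs_data_len=256, step_len=128).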

Training

simple dqn

  • WIP

policy gradient

  • WIP

actor-critic

  • WIP

A3C with RNN

  • WIP

Backtesting

  • load the env just as for training
env = trading_env.make(env_id='backtest_v1', obs_data_len=1024, step_len=512,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price', 
                        feature_names=['Price', 'Volume', 
                                       'Ask_price','Bid_price', 
                                       'Ask_deal_vol','Bid_deal_vol',
                                       'Bid/Ask_deal', 'Updown'])
  • load your own agent
class YourAgent:
    def __init__(self):
        # build your network and so on
        pass
    def choice_action(self, state):
        # Put your rule-based condition, max-Q-value action, or policy-gradient action here.
        #   action=0 -> do nothing
        #   action=1 -> buy 1 share
        #   action=2 -> sell 1 share
        # In this test case we just use a simple random policy.
        return np.random.randint(3)
  • start to backtest
agent = YourAgent()

transactions = []
while not env.backtest_done:
    state = env.backtest()
    done = False
    while not done:
        state, reward, done, info = env.step(agent.choice_action(state))
        #print(state, reward)
        #env.render()
        if done:
            transactions.append(info)
            break
transaction = pd.concat(transactions)
transaction
       step             datetime  transact  transact_type  price  share  price_mean  position    reward_fluc         reward  reward_sum  color  rotation
  2    1537  2013-04-09 10:58:45       Buy            new  277.1    1.0  277.100000       1.0   0.000000e+00   0.000000e+00    0.000000      1         1
  5    3073  2013-04-09 11:47:26      Sell          cover  276.8   -1.0  277.100000       0.0  -4.000000e-01  -4.000000e-01   -0.400000      2         2
 10    5633  2013-04-09 13:23:40      Sell            new  276.9   -1.0  276.900000      -1.0   0.000000e+00   0.000000e+00   -0.400000      2         1
 11    6145  2013-04-09 13:30:36      Sell            new  276.7   -1.0  276.800000      -2.0   1.000000e-01   0.000000e+00   -0.400000      2         1
...     ...                  ...       ...            ...    ...    ...         ...       ...            ...            ...         ...    ...       ...
211  108545  2013-04-19 13:18:32      Sell            new  286.7   -1.0  286.525000      -2.0  -4.500000e-01   0.000000e+00   30.650000      2         1
216  111105  2013-04-19 16:02:01      Sell            new  289.2   -1.0  287.416667      -3.0  -5.550000e+00   0.000000e+00   30.650000      2         1
217  111617  2013-04-19 17:54:29      Sell            new  289.2   -1.0  287.862500      -4.0  -5.650000e+00   0.000000e+00   30.650000      2         1
218  112129  2013-04-19 21:36:21      Sell            new  288.0   -1.0  287.890000      -5.0  -9.500000e-01   0.000000e+00   30.650000      2         1
219  112129  2013-04-19 21:36:21       Buy          cover  288.0    5.0  287.890000       0.0   0.000000e+00  -1.050000e+00   29.600000      1         2

128 rows × 13 columns
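
Once the per-episode frames are concatenated, the result can be summarized with ordinary pandas operations. The sketch below is a minimal example that assumes the column names shown in the table above. It rebuilds the first four displayed rows by hand, and the helper name summarize is hypothetical.

```python
import pandas as pd

# The first four rows of the table above, limited to the columns used here.
transaction = pd.DataFrame({
    "transact": ["Buy", "Sell", "Sell", "Sell"],
    "transact_type": ["new", "cover", "new", "new"],
    "price": [277.1, 276.8, 276.9, 276.7],
    "reward": [0.0, -0.4, 0.0, 0.0],
    "reward_sum": [0.0, -0.4, -0.4, -0.4],
}, index=[2, 5, 10, 11])


def summarize(transaction):
    """Return the total reward and the number of deals per transact_type."""
    total_reward = transaction["reward"].sum()
    counts = transaction["transact_type"].value_counts().to_dict()
    return total_reward, counts


total_reward, counts = summarize(transaction)
# In these rows reward_sum is the running total of reward.
running = transaction["reward"].cumsum()
print(total_reward, counts)
```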

Example of rule-based usage

  • MA crossover and crossunder
env = trading_env.make(env_id='backtest_v1', obs_data_len=10, step_len=1,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price', 
                       feature_names=['Price', 'MA'])
class MaAgent:
    def __init__(self):
        pass
        
    def choice_action(self, state):
        if state[-1][0] > state[-1][1] and state[-2][0] <= state[-2][1]:
            return 1
        elif state[-1][0] < state[-1][1] and state[-2][0] >= state[-2][1]:
            return 2
        else:
            return 0
# then run the same backtest loop as above, with agent = MaAgent()
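
The env above is given feature_names=['Price', 'MA'], so df needs an 'MA' column; the README does not show how it is built. The sketch below is one possible way, using a pandas rolling mean with a hypothetical window length, and it checks the crossover rule on a few toy prices. The names add_moving_average and crossover_action are illustrative, and the action codes follow the comments in the Backtesting agent (0 do nothing, 1 buy, 2 sell).

```python
import pandas as pd

HOLD, BUY, SELL = 0, 1, 2  # action codes from the agent comments above


def add_moving_average(df, price_col="Price", window=20):
    """Add an 'MA' column and drop the warm-up rows where it is undefined."""
    out = df.copy()
    out["MA"] = out[price_col].rolling(window).mean()
    return out.dropna(subset=["MA"]).reset_index(drop=True)


def crossover_action(state):
    """Same rule as MaAgent.choice_action: column 0 is Price, column 1 is MA."""
    prev_price, prev_ma = state[-2][0], state[-2][1]
    last_price, last_ma = state[-1][0], state[-1][1]
    if last_price > last_ma and prev_price <= prev_ma:
        return BUY   # price crossed above the moving average
    if last_price < last_ma and prev_price >= prev_ma:
        return SELL  # price crossed below the moving average
    return HOLD


# Toy prices for illustration only.
prices = pd.DataFrame({"Price": [10.0, 9.0, 8.0, 9.0, 12.0, 11.0, 7.0]})
with_ma = add_moving_average(prices, window=3)
state = with_ma[["Price", "MA"]].to_numpy()
print([crossover_action(state[i:i + 2]) for i in range(len(state) - 1)])
```

Dropping the warm-up rows avoids feeding NaN values into the comparisons, which would otherwise always evaluate to False and silently return 0.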
Comments
  • step_len is added twice to check it exceeds len(self.price)

    https://github.com/Yvictor/TradingGym/blob/8af2bb02250797c580b37c7ef43aab21b2d968c7/trading_env/envs/training_v1.py#L183-L207

    Maybe line 207 should be changed from

     if self.step_st+self.obs_len+self.step_len >= len(self.price):
    

    to

     if self.step_st+self.obs_len >= len(self.price):
    
  • Devs/restruc

    • separate the training and backtesting modes; the realtime mode is a work in progress.
    • keep the original env as v0 and add a new env, v1.
    • v1 has tick-level transaction details and supports returning the transactions and the fluctuating reward in the observation state.
  • Update training_v1.py

    enter_price = self.chg_price[0] -> enter_price = self.chg_price[-1]

    Since actual trading occurs at the last time step, the enter_price should point to the last entry of the price array

  • AttributeError: 'trading_env' object has no attribute 'backtest_done'

    Hey! Great project, really excited about it but having a small problem.

    I can't run the agent; env.backtest is causing a problem. I added the parameter backtest=1 to the trading_env.make call, and now backtest_done is reported as not being an attribute.

    This seems to be a problem with threading, from what I can read online.

    Has anyone had success beyond the point of this error? Is it a package version I can downgrade?

    Attached my notebook. trading-gym-andy.zip

    no matter what I add to trading_env.make, including backtest_done=0 or 1, it fails to see it as an attribute... wtf!

  • Error while rendering environment?

    Thank you for this; I was looking for something like it. When I try it out, I get an error like this:

    self.ax.lines.remove(self.price_plot[0]) AttributeError: 'trading_env' object has no attribute 'ax'

    data frame sample

    >>> df.head()
       serial_number        Date   Open   High    Low  Close      Volume
    0              0  1998-01-02  13.63  16.25  13.50  16.25   6411700.0
    1              1  1998-01-05  16.50  16.56  15.19  15.88   5820300.0
    2              2  1998-01-06  15.94  20.00  14.75  18.94  16182800.0
    3              3  1998-01-07  18.81  19.00  17.31  17.50   9300200.0
    4              4  1998-01-08  17.44  18.62  16.94  18.19   6910900.0
    
    

    I want to use the Volume column alone as my observation state. This is the whole code:

    import random
    import numpy as np
    import pandas as pd
    import trading_env
    
    df = pd.read_csv('./dataset/AAPL.csv')
    
    df.rename(columns={'Unnamed: 0':'serial_number'},inplace=True)
    
    env = trading_env.make(env_id='training_v1', obs_data_len=50, step_len=14,
                           df=df, fee=0.1, max_position=5, deal_col_name='Close', 
                           feature_names=['Volume'])
    
    
    env.reset()
    env.render()
    
    state, reward, done, info = env.step(random.randrange(3))
    
    ### choose actions at random and show the transaction details
    for i in range(500):
        print(i)
        state, reward, done, info = env.step(random.randrange(3))
        print(state, reward)
        env.render()
        if done:
            break
    
    env.transaction_details
    

    IMPORTANT: Can you please tell me what the action space indices are for buy, sell and hold? I can see the action space consists of three integers, but I can't find where they are mapped.

  • Devs/restruc

    • abandon v0
    • v1 supports customizing the transaction information returned in the state
    • render supports a show (bool) option and can directly save or return the fig for other uses, such as showing realtime updates with ipywidgets in a Jupyter notebook

    todo

    • [ ] support OHLC data, with the action (buy or sell) executed at the next bar's open
  • RL example strategy

    Hi, thanks for another new great gym environment!

    It's not an actual issue, more of a question: I am curious about examples of actual RL-trained strategies (not necessarily profitable), because I didn't find one and OpenAI didn't pay much attention to financial gyms. It would be much easier to learn from existing examples how to tune the architecture or parameters. I would be very grateful if you could point me to useful links, if you know of any. Thanks
