In the next two weeks, IB’s Collegiate Olympiad starts. The following describes the system I’m entering. I previously mentioned that it uses TWS’s ActiveX API to connect to IB through Matlab and listed some other info not covered below. Links to all the Matlab files are at the bottom of this post.
System Process Flow
1. Load Data – modules for Yahoo and IB
2. Preprocess
3. Prediction Engine
4. Position Sizing
5. Execution
+ Backtest
Process Flow Description
1. Historical data, including the most recent period, is downloaded from Interactive Brokers.
2. OHLC numbers are converted into periodic returns and then put in the proper order, newest to oldest.
3. Support vector regression with a Gaussian kernel, using parameters (C, γ) chosen by sliding window validation, is used to predict the next period’s return for each security/contract. These predictions are normalized and weighted by a confidence value. (code outline below)
4. Position sizing is manual for now.
5. The basket of orders is sent to IB through the TWS ActiveX API.
+ Backtesting is relatively efficient because most validation folds are redundant
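To make step 2 concrete, here is a minimal sketch of turning OHLC data into periodic returns ordered newest to oldest. It is not the actual preprocess.m: the price matrix is made up, the variable names are hypothetical, and it assumes simple close-to-close returns, which may differ from what the real code computes.

```matlab
% Hypothetical OHLC matrix, one row per period, oldest first:
% columns are [open high low close]
ohlc = [10.0 10.5  9.8 10.2;
        10.2 10.8 10.1 10.6;
        10.6 10.7 10.0 10.1;
        10.1 10.4  9.9 10.3];

closePx = ohlc(:, 4);                          % closing prices only

% Simple close-to-close return for every period after the first
% (an assumption; open-to-close or log returns are other options)
ret = closePx(2:end) ./ closePx(1:end-1) - 1;

% Reorder so that the newest period is in the first row
retNewestFirst = flipud(ret);
```

Note that N prices give N-1 returns, so the oldest period is lost when converting.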
Prediction Engine Code Outline
Initialize parameters
Pre-allocate array memory
Data error checking and preprocessing
For each contract (i.e. security)
    For each parameter permutation (C, γ)
        For each validation fold
            Train the SVM on the training sample
            Make test prediction and compare to known test sample
            Save all processing-time and prediction data
        End
        Calculate validation performance of (C, γ)
    End
    Chart results for human inspection
    Predict the next period’s returns
End
Confidence based on validation accuracy: correlation
Choose out- & under-performers based on prediction z-scores*confidence
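The outline above can be sketched roughly as follows. This is an illustration under loud assumptions, not predictionengine.m: the data is synthetic, all names and window sizes are hypothetical, rows run oldest to newest for simplicity, and Gaussian-kernel ridge regression stands in for LIBSVM's support vector regression so the script runs on base Matlab. In the stand-in, 1/C acts as the ridge penalty and γ as the kernel width. Charting and timing are omitted.

```matlab
rng(1);                                   % reproducible synthetic data
T = 120; N = 3; p = 3;                    % periods, contracts, lags (made up)
R = 0.01 * randn(T, N);                   % stand-in returns, rows oldest -> newest
Cs = [1 10 100];  gammas = [0.1 1 10];    % hypothetical (C, gamma) grid
trainLen = 60;  nFolds = 20;              % training window length, number of folds

% Gaussian kernel between the rows of A and the rows of B
gaussK = @(A, B, g) exp(-g * max(bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') ...
                                 - 2 * (A * B'), 0));

nextPred   = zeros(1, N);
confidence = zeros(1, N);
for k = 1:N                               % for each contract
    r = R(:, k);
    % Lagged design matrix: row i holds the p returns preceding target y(i)
    X = zeros(T - p, p);
    for j = 1:p
        X(:, j) = r(p - j + 1 : T - j);
    end
    X = X / std(r);                       % rescale so gamma has a sensible range
    y = r(p + 1 : T);
    n = numel(y);

    score = zeros(numel(Cs), numel(gammas));
    for a = 1:numel(Cs)                   % for each parameter permutation
        for b = 1:numel(gammas)
            pred = zeros(nFolds, 1);  actual = zeros(nFolds, 1);
            for f = 1:nFolds              % for each validation fold
                te = n - nFolds + f;              % test row of this fold
                tr = (te - trainLen):(te - 1);    % window sliding just behind it
                K  = gaussK(X(tr, :), X(tr, :), gammas(b));
                alpha = (K + eye(trainLen) / Cs(a)) \ y(tr);   % stand-in for SVR training
                pred(f)   = gaussK(X(te, :), X(tr, :), gammas(b)) * alpha;
                actual(f) = y(te);
            end
            c = corrcoef(pred, actual);   % validation performance of (C, gamma)
            score(a, b) = c(1, 2);
        end
    end

    [best, idx] = max(score(:));          % best-performing permutation
    [ia, ib] = ind2sub(size(score), idx);
    confidence(k) = max(best, 0);         % negative correlation -> zero confidence (a choice made here)

    % Refit on the most recent window and predict the next period's return
    tr = (n - trainLen + 1):n;
    xNext = r(T:-1:T - p + 1)' / std(r);
    K = gaussK(X(tr, :), X(tr, :), gammas(ib));
    alpha = (K + eye(trainLen) / Cs(ia)) \ y(tr);
    nextPred(k) = gaussK(xNext, X(tr, :), gammas(ib)) * alpha;
end

% Rank contracts by prediction z-score times confidence
z = (nextPred - mean(nextPred)) / std(nextPred);
signal = z .* confidence;
[~, order] = sort(signal, 'descend');     % first = out-performer, last = under-performer
```

Because the input here is random noise, the confidences should come out small; the point is only the loop structure. Using the best validation correlation as the confidence is also optimistic, since it is measured on the same folds that selected (C, γ).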
Final Words
The schematic sketch turned out to be too wordy, so I used this format instead. All of the information above is mirrored in the system’s code and is intended as a reader’s guide; without it, I doubt the code would be comprehensible. Some parts are simply very complex and may be hard to conceptualize if you did not write them, e.g. the 6-D array ‘storTestPred’. If you are especially interested in a certain part, such as the sliding window validation or the confidence values, please leave a comment or send me an email and I will explain in more detail, maybe posting a video if that would be clearer.
If you want to do a test run with Yahoo data, download all the files to Matlab’s current directory and then execute sys2.m and predictionengine.m. Make sure you have no variables lying around by restarting Matlab or typing >> clear. Both are scripts, so you can run them at the Matlab prompt like this: >> sys2 [press enter, then wait a few seconds], >> predictionengine [press enter, then wait a minute or two]. Some numbers and charts will pop up at the end; you need to understand the code in order to understand these results.
I’m not worried about the system’s strategy being “arbed away” by sharing, because of its generality and flexibility. Also, the code is probably challenging to understand if you didn’t spend many hours writing it. And generally, I don’t subscribe to the secretiveness of the trading subculture. I hope bits of it are useful. Much can be pared away to create a general system framework, and the validation and data pulling components should be especially useful for that. Later, I may gut some of the internals and make a template that’s a little more fun and easy to play with. Feel free to comment on anything.
Files: loaddata.m, makeportfolio.m, predictionengine.m, preprocess.m, svmpredict.mexw32, svmtrain.mexw32, sys2.m, and libsvm-mat-2.87-1.zip (or get LIBSVM from the authors’ website, using the first one on the list of “Interfaces to LIBSVM”). I slightly modified my version of LIBSVM to suppress some useless output that was slowing down validation, so I don’t know if the authors’ version will work exactly the same, but it should. Here’s everything in one zip file if you are ok with potentially virus-infested zip files (personally I don’t trust them).