The Artificial Neural Network
DOCTORAL SCHOOL OF FINANCE AND BANKING DOFIN
ACADEMY OF ECONOMIC STUDIES
Forecasting the BET-C Stock Index with
Artificial Neural Networks
MSc Student: Stoica Ioan-Andrei
Supervisor: Professor Moisa Altar
July 2006
Stock Markets and Prediction
predicting stock prices - the goal of every investor trying to achieve a profit on the
stock market
the predictability of the market - an issue discussed by many
researchers and academics
Efficient Market Hypothesis - Eugene Fama
three forms:
Weak: future stock prices can’t be predicted using past stock prices
Semi-strong: even published information can’t be used to predict future
prices
Strong: market can’t be predicted no matter what information is available
Stock Markets and Prediction
Technical Analysis
‘castles in the air’
investors’ behavior and reactions according to these anticipations
Fundamental Analysis
‘firm foundations’
stocks have an intrinsic value determined by present conditions and
future prospects of the company
Traditional Time Series Analysis
uses historic data attempting to approximate future values of a time series
as a linear combination
Machine Learning - Artificial Neural Networks
The Artificial Neural Network
computational technique inspired by
the information-processing mechanisms
of the human brain
1943 - W.S. McCulloch and W. Pitts
attempted to mimic the ability of the
human brain to process data and
information and comprehend patterns
and dependencies
The human brain - a complex,
nonlinear and parallel computer
The neurons:
elementary information
processing units
building blocks of a
neural network
The Artificial Neural Network
semi-parametric approximation method
Advantages:
ability to detect nonlinear dependencies
parsimonious compared to polynomial expansions
generalization ability and robustness
no assumptions of the model have to be made
flexibility
Disadvantages:
has the ‘black box’ property
training requires an experienced user
training takes a lot of time, fast computer needed
overtraining → overfitting
undertraining → underfitting
The Artificial Neural Network
y = f(x) = \sin(x) \cdot \ln(x)
The Artificial Neural Network
y = \sin(x)
The Artificial Neural Network
Overtraining/Overfitting
The Artificial Neural Network
Undertraining/Underfitting
Architecture of the Neural Network
Types of layers:
input layer: number of neurons = number of inputs
output layer: number of neurons = number of outputs
hidden layer(s): number of neurons = trial and error
Connections between neurons:
fully connected
partially connected
The activation function:
threshold function
piecewise linear function
sigmoid functions
The feed forward network
n_{k,t} = \omega_{k,0} + \sum_{i=1}^{n} \omega_{k,i} x_{i,t}

N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

\hat{y}_t = \gamma_0 + \sum_{k=1}^{m} \gamma_k N_{k,t}
m = number of hidden layer neurons
n = number of inputs
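The forward pass defined by these equations can be sketched in NumPy as follows (a minimal illustration; the function and parameter names are mine, not from the thesis):

```python
import numpy as np

def logistic(z):
    # L(n) = 1 / (1 + e^(-n)), the sigmoid activation from the slides
    return 1.0 / (1.0 + np.exp(-z))

def ffn_forward(x, W, w0, gamma, gamma0):
    """Single-hidden-layer feed-forward pass.

    x      : (n,)  input vector x_{i,t}
    W      : (m,n) hidden-layer weights omega_{k,i}
    w0     : (m,)  hidden-layer biases  omega_{k,0}
    gamma  : (m,)  output weights       gamma_k
    gamma0 : float output bias          gamma_0
    """
    n_hidden = w0 + W @ x          # n_{k,t}
    N = logistic(n_hidden)         # N_{k,t}
    return gamma0 + gamma @ N      # estimated y_t
```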
The Feed forward Network with Jump Connections
n_{k,t} = \omega_{k,0} + \sum_{i=1}^{n} \omega_{k,i} x_{i,t}

N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

\hat{y}_t = \gamma_0 + \sum_{k=1}^{m} \gamma_k N_{k,t} + \sum_{i=1}^{n} \beta_i x_{i,t}
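The jump-connection variant only adds a direct linear term from the inputs to the output; a sketch under the same naming assumptions as before:

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def ffn_jump_forward(x, W, w0, gamma, gamma0, beta):
    # Same as the plain feed-forward pass, plus a direct linear
    # "jump" term beta_i * x_i from each input to the output.
    N = logistic(w0 + W @ x)
    return gamma0 + gamma @ N + beta @ x
```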
The Recurrent Neural Network - Elman
n_{k,t} = \omega_{k,0} + \sum_{i=1}^{n} \omega_{k,i} x_{i,t} + \varphi_k n_{k,t-1}

N_{k,t} = L(n_{k,t}) = \frac{1}{1 + e^{-n_{k,t}}}

\hat{y}_t = \gamma_0 + \sum_{k=1}^{m} \gamma_k N_{k,t}
allows the neurons to depend on their own lagged values building ‘memory’ in their evolution
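A sketch of this recurrence, assuming the simplified form in which each neuron feeds back only its own lagged pre-activation (my reading of the slide), with a zero start-up state:

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def elman_forward(X, W, w0, phi, gamma, gamma0):
    """Run the Elman network over a sequence.

    X   : (T,n) inputs over time
    phi : (m,)  feedback weights on each neuron's own lagged
          pre-activation n_{k,t-1} (simplified, hypothetical form)
    """
    m = w0.shape[0]
    n_prev = np.zeros(m)                     # n_{k,0} = 0 start-up assumption
    outputs = []
    for x_t in X:
        n_t = w0 + W @ x_t + phi * n_prev    # n_{k,t} with 'memory' term
        outputs.append(gamma0 + gamma @ logistic(n_t))
        n_prev = n_t
    return np.array(outputs)
```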
Training the Neural Network
Objective: minimizing the discrepancy between real data and the output of the network
\min_{\Omega} \Psi(\Omega) = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2

\hat{y}_t = f(x_t; \Omega)
Ω - the set of parameters
Ψ – loss function
Ψ nonlinear → nonlinear optimization problem
- backpropagation
- genetic algorithm
The Backpropagation Algorithm
gradient-descent alternative to quasi-Newton methods
Ω0 – randomly generated
\Omega_1 = \Omega_0 - \rho \nabla\Psi(\Omega_0)
ρ – learning parameter, in [0.05, 0.5]
after n iterations, a momentum term with μ = 0.9 is added:
\Omega_n = \Omega_{n-1} - \rho \nabla\Psi(\Omega_{n-1}) + \mu (\Omega_{n-1} - \Omega_{n-2})
problem: local minimum points
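The update rule above is ordinary gradient descent with a momentum term; a minimal sketch (names and the toy loss are mine):

```python
import numpy as np

def gd_momentum_step(omega, omega_prev, grad, rho=0.1, mu=0.9):
    # Omega_n = Omega_{n-1} - rho * grad Psi(Omega_{n-1})
    #           + mu * (Omega_{n-1} - Omega_{n-2})
    return omega - rho * grad + mu * (omega - omega_prev)

# usage: minimize the toy loss Psi(w) = w^2, whose gradient is 2w
w_prev, w = 1.0, 1.0
for _ in range(200):
    w, w_prev = gd_momentum_step(w, w_prev, 2.0 * w), w
```

On a convex toy loss this converges; on the network's nonlinear loss surface it can stall in the local minima mentioned above, which motivates the genetic algorithm.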
The Genetic Algorithm
based on Darwinian laws
Population Creation: N random vectors of weights
Selection: (Ωi, Ωj) → parent vectors
Crossover & Mutation → C1, C2 children vectors
Election Tournament: the fittest 2 of the 4 vectors are passed to the next
generation
Convergence: G* generations
G* - large enough so there are no significant changes in the
fitness of the best individual for several generations
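The steps above can be sketched as follows. This is a toy version under stated assumptions: the fitness here is the negative SSE of a simple linear model standing in for the network's loss, and the population size, mutation scale, and generation count are illustrative, not the thesis's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(w, X, y):
    # negative SSE; higher is fitter
    resid = y - X @ w
    return -np.sum(resid ** 2)

def genetic_train(X, y, pop_size=30, n_gen=200, sigma=0.1):
    dim = X.shape[1]
    pop = rng.normal(size=(pop_size, dim))       # Population Creation
    for _ in range(n_gen):                       # Convergence: G* generations
        new_pop = []
        for _ in range(pop_size // 2):
            i, j = rng.choice(pop_size, size=2, replace=False)  # Selection
            p1, p2 = pop[i], pop[j]
            mask = rng.random(dim) < 0.5                        # Crossover
            c1 = np.where(mask, p1, p2) + rng.normal(scale=sigma, size=dim)  # Mutation
            c2 = np.where(mask, p2, p1) + rng.normal(scale=sigma, size=dim)
            family = [p1, p2, c1, c2]                           # Election Tournament
            family.sort(key=lambda w: fitness(w, X, y), reverse=True)
            new_pop += family[:2]                # fittest 2 pass to next generation
        pop = np.array(new_pop)
    return max(pop, key=lambda w: fitness(w, X, y))
```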
Experiments and Results
Data
BET-C stock index – daily closing prices, 16 April 1998 until 18 May 2006
daily returns: R_t = \ln P_t - \ln P_{t-1} = \ln \frac{P_t}{P_{t-1}}
conditional volatility - rolling 20-day standard deviation:
V_t = \sqrt{\frac{1}{19} \sum_{i=1}^{20} (R_{t-i} - \bar{R}_t)^2}
BDS-Test for nonlinear dependencies:
H0: i.i.d. data
BDS_{m,ε} ~ N(0,1)

Series   m=2, ε=1   m=2, ε=1.5   m=3, ε=1   m=3, ε=1.5   m=4, ε=1   m=4, ε=1.5
OD       16.6526    17.6970      18.5436    18.7202      19.7849    19.0588
ARF      16.2626    17.2148      18.3803    18.4839      19.7618    18.9595
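The return and rolling-volatility transformations can be sketched as follows (a minimal NumPy version; function names are mine):

```python
import numpy as np

def log_returns(prices):
    # R_t = ln(P_t) - ln(P_{t-1}) = ln(P_t / P_{t-1})
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

def rolling_vol(returns, window=20):
    # V_t: standard deviation of the last `window` returns,
    # with divisor window - 1 = 19 as in the sample-variance formula
    r = np.asarray(returns, dtype=float)
    return np.array([r[t - window:t].std(ddof=1)
                     for t in range(window, len(r) + 1)])
```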
Experiments and Results
3 types of ANNs:
feed-forward network
feed-forward network with jump connections
recurrent network
Input: [R_{t-1}, R_{t-2}, R_{t-3}, R_{t-4}, R_{t-5}] and V_t
Output: next-day return R_t
Training: genetic algorithm & backpropagation
Data divided in:
training set – 90%
test set – 10%
one-day-ahead forecasts - static forecasting
Network:
trained 100 times
best 10 – selected by SSE
best 1 – selected by RMSE
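My reading of this selection protocol (keep the 10 restarts with the lowest in-sample SSE, then pick the one with the lowest test-set RMSE) can be sketched as:

```python
import numpy as np

def select_network(candidates, sse_train, rmse_test):
    # candidates : list of trained networks (one per restart)
    # sse_train  : in-sample SSE of each restart
    # rmse_test  : test-set RMSE of each restart
    order = np.argsort(sse_train)[:10]                      # best 10 by SSE
    best = order[np.argmin(np.asarray(rmse_test)[order])]   # best 1 by RMSE
    return candidates[best]
```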
Experiments and Results
Evaluation Criteria
In-sample Criteria

R^2 = 1 - \frac{\sum_{t=1}^{T} (y_t - \hat{y}_t)^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}

SSE = \sum_{t=1}^{T} (y_t - \hat{y}_t)^2 = \sum_{t=1}^{T} \varepsilon_t^2

Out-of-sample Criteria

HR = \frac{1}{T} \sum_{t=1}^{T} I_t, \quad I_t = \begin{cases} 1, & y_t \hat{y}_t > 0 \\ 0, & \text{otherwise} \end{cases}

RMSE = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (\hat{y}_t - y_t)^2}

MAE = \frac{1}{T} \sum_{t=1}^{T} |\hat{y}_t - y_t|

ROI = \sum_{t=1}^{T} y_t \cdot \mathrm{sign}(\hat{y}_t)

RP = \frac{\sum_{t=1}^{T} y_t \cdot \mathrm{sign}(\hat{y}_t)}{\sum_{t=1}^{T} |y_t|}
Pesaran-Timmermann Test for Directional Accuracy:
H0: the signs of the forecasts and those of the real data are independent
DA ~ N(0,1)
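The out-of-sample criteria can be computed as follows (a minimal sketch; the function name is mine):

```python
import numpy as np

def forecast_metrics(y, y_hat):
    """Out-of-sample criteria from the slides.
    HR : hit rate of the sign forecasts
    ROI: return of the sign-trading rule
    RP : ROI relative to the perfect-foresight sum of |y_t|"""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    rmse = np.sqrt(np.mean((y_hat - y) ** 2))
    mae = np.mean(np.abs(y_hat - y))
    hr = np.mean(y * y_hat > 0)          # I_t = 1 when signs agree
    roi = np.sum(y * np.sign(y_hat))
    rp = roi / np.sum(np.abs(y))
    return {"RMSE": rmse, "MAE": mae, "HR": hr, "ROI": roi, "RP": rp}
```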
Experiments and Results
ROI - trading strategy based on the sign forecasts:
positive forecast → buy signal
negative forecast → sell signal
Finite differences:
\frac{\partial y}{\partial x_i} \approx \frac{f(x_1, \ldots, x_i + h_i, \ldots, x_n) - f(x_1, \ldots, x_i, \ldots, x_n)}{h_i}
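This forward-difference sensitivity can be sketched as follows (a generic helper; names are mine, and `f` stands in for the trained network):

```python
import numpy as np

def sensitivity(f, x, i, h=1e-6):
    # dy/dx_i ≈ [f(..., x_i + h, ...) - f(..., x_i, ...)] / h
    x = np.asarray(x, dtype=float)
    x_bumped = x.copy()
    x_bumped[i] += h            # bump only the i-th input
    return (f(x_bumped) - f(x)) / h
```

Applied with the volatility input as x_i, this yields the sensitivity of the return forecast to volatility reported in the results.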
Benchmarks
Naïve model: Rt+1=Rt
buy-and-hold strategy
AR(1) model – estimated by least squares – overfitting check via RMSE and MAE
finite-difference step: h_i = 10^{-6}
Experiments and Results
          Naïve         AR(1)         FFN – no vol  FFN           FFN-jump      RN
R²        –             0.079257      0.083252      0.083755      0.084827      0.091762
SSE       –             0.332702      0.331258      0.331077      0.330689      0.328183
RMSE      0.015100      0.011344      0.011325      0.011304      0.011332      0.011319
MAE       0.011948      0.008932      0.008929      0.008873      0.008867      0.008892
HR        55.77% (111)  56.78% (113)  57.79% (115)  59.79% (119)  59.79% (119)  59.79% (119)
ROI       0.265271      0.255605      0.318374      0.351890      0.331464      0.412183
RP        15.02%        14.47%        18.02%        19.92%        18.77%        23.34%
PT-Test   –             –             14.79         15.01         15.01         14.49
B&H       0.2753        0.2753        0.2753        0.2753        0.2753        0.2753
            FFN       FFN-jump   RN
Volatility  -0.1123   -0.1358    -0.1841
Experiments and Results
Actual and fitted values (training sample) [figure]
Experiments and Results
Actual and fitted values (test sample) [figure]
Conclusions
RMSE and MAE < AR(1) → no signs of overfitting
R² < 0.1 → forecasting the magnitude of returns is a failure
sign forecasting ~60% success
Volatility:
improves sign forecast
finite differences → negative correlation
perceived as measure of risk
trading strategy: outperforms naïve model and buy-and-hold
quality of the sign forecast – confirmed by the Pesaran-Timmermann
test
Further development
Volatility: other estimates
neural classifier: specialized in sign forecasting
using data outside the Bucharest Stock Exchange:
T-Bond yields
exchange rates
indices from foreign capital markets