Index tracking and enhanced indexation using a parametric approach
Seguimiento de índices e indexación mejorada utilizando un enfoque paramétrico
Luis Chavez-Bedoya^{a,*}, John R. Birge^{b}
^{a} Universidad Esan, Lima, Perú.
^{b} University of Chicago Booth School of Business, Chicago, United States of America.
ABSTRACT
Based on the work of Brandt et al. (2009), we formulate an index tracking and enhanced indexation model using a parametric approach. The portfolio weights are modeled as functions of asset characteristics and of similarity measures between the assets and the index to be tracked. This approach permits handling nonlinear and nonconvex objective functions that are difficult to incorporate in existing index tracking and enhanced indexation models. Additionally, this approach gives the investor more information about the portfolio holdings since the optimization is performed over portfolio strategies. Finally, an empirical implementation and an analysis of selected characteristics are presented for the S&P500 index.
Keywords: Index tracking, Enhanced indexation, Parametric.
RESUMEN
Basándonos en el trabajo de Brandt et al. (2009), formulamos un modelo de seguimiento de índices e indexación mejorada utilizando un enfoque paramétrico. Los pesos de cartera se modelan como funciones de características de activos y medidas de similitud de los activos con el índice objeto de seguimiento. Este enfoque permite tratar funciones de objetivos no lineales y no convexos, difíciles de incorporar en modelos de indexación mejorada y seguimiento de índices existentes. Además, proporciona al inversor más información sobre los valores en cartera porque la optimización se lleva a cabo en torno a estrategias de portafolio. Por último, se presenta una implementación empírica y un análisis de características seleccionadas del índice S&P500.
Palabras clave: Seguimiento de índices, Indexación mejorada, Paramétrico.
1. Introduction
Index tracking is a passive management strategy that consists of designing a portfolio (a tracking portfolio or index fund) to replicate the behavior of a broad market index. The popularity of index funds, as mentioned in (Cornuejols and Tütüncü, 2007), rests on both theoretical (market efficiency) and empirical (performance and costs) reasons. If the market is efficient, it is not possible to obtain superior risk-adjusted returns through active portfolio management strategies. Because the market portfolio captures the efficiency of the market through diversification, investing in an index fund is a theoretically reasonable strategy. Moreover, many empirical studies show that, on average, active portfolio managers do not outperform the major indices. Also, active management generally incurs costly research activities and compensation to the fund managers. These costs can be avoided by an index tracking strategy.
Index tracking has two substrategies: full replication and partial replication. In a full replication strategy, all the names in the index are bought (and held) in the exact proportions in which they appear in the index. On the other hand, the partial replication strategy holds fewer assets than the total number of assets in the index; however, the assets to include and their weights need to be determined. For example, (Cornuejols and Tütüncü, 2007) and (Canakgoz and Beasley, 2008) mention that the main disadvantages of the full replication strategy are the high transaction costs of rebalancing all the positions in the index, the difficulty of holding very small proportions of some stocks, and the illiquidity of certain stocks (especially in small-cap indices). For (Rudd, 1980), the main advantage of partial replication is the decrease in administrative overhead and administration costs (custodial and accounting).
Enhanced indexation or enhanced index tracking, in the words of (Canakgoz and Beasley, 2008), "aims to reproduce the performance of a stock market index, but to generate excess return (return over and above the return achieved by the index)". Usually, the objective in the enhanced indexation problem is to maximize alpha while the beta of the portfolio remains close to one, or to maximize excess return (over the index) while keeping the tracking error bounded by some quantity. The main distinction from the tracking problem is that enhanced indexation can be seen as an active strategy to beat a benchmark under the same risk.
Unlike the index tracking approach, enhanced indexation (using fewer assets than the index) lacks the theoretical foundation of market efficiency, since holding fewer assets cannot ensure full diversification of the portfolio. Additionally, many of the approaches use only information contained in the return (price) series. Without additional information or differences in expected return forecasts, it is hard to identify the "mispriced" securities to include in the portfolio. With lower diversification than the index, it is logical to expect the generation of positive excess returns over the index to come as a tradeoff with the portfolio's beta, the overall risk, or the correlation coefficient with the index. Therefore, it is crucial to examine carefully the in-sample and out-of-sample performance of the enhanced tracking portfolios in order to assess the corresponding risk-return tradeoff and the consistency of the tracking strategy.
In this paper, we propose a parametric approach for index tracking and enhanced indexation based on the work of (Brandt et al., 2009). We formulate a multi-objective nonlinear optimization problem with a small number of decision variables. Our objective function decomposes (approximately) the variance of the tracking error or the portfolio beta into the correlation coefficient of the portfolio and the index and the ratio of their standard deviations. Notice that when the correlation coefficient is close to one and the standard deviations are close to each other, the variance of the tracking error goes to zero and the beta of the portfolio goes to one. Consequently, this specification of the tracking component is more flexible than the ones existing in the literature. For the enhanced component, although other objectives can be proposed, we usually maximize the average excess returns over the index.
As in the portfolio optimization approach of (Brandt et al., 2009), we use information inside and outside the return series to build our portfolios. However, due to the time horizon (daily or weekly) of typical index tracking problems, we exclude some of the factors used in (Brandt et al., 2009) to explain the cross-section of asset returns. For example, we omit the book-to-market ratio because daily book values are not observable. We consider it of interest to include information on the similarity of the stock's returns with the index returns, under the premise that stocks behaving similarly to the index are reasonable candidates for the tracking portfolio. Also, we include some measures of the "momentum" of the stocks in beating the index, assuming that this "momentum" will continue in the near future.
In the parametric approach, the portfolio weights are functions of selected characteristics of the stocks. We want to determine how to assign importance to these characteristics depending on the particular tradeoff defined in the objective function. By assigning importance levels, it is straightforward to find portfolio strategies and not just portfolio weights. We can then analyze how these strategies change according to the importance given to the tracking or enhanced components of the objective. These features give the investor more information about the portfolio holdings than typical approaches in the literature. Also, the parametric approach is flexible enough to handle cardinality constraints, transaction costs (in various ways), and lower and upper bounds on the portfolio weights.
We implement the parametric approach to build a tracking or enhanced tracking portfolio^{1} for the S&P500 index, using as characteristics market capitalization, alpha and beta of the individual stocks. The empirical results show that holding stocks with high market capitalization results in tracking portfolios with a high correlation coefficient with the index. Stocks with beta close to one are useful to keep the ratio of standard deviations close to one, while including stocks with high alpha increases the excess returns over the index. The in-sample performance was similar to other models in the literature, and the out-of-sample performance was very robust, especially for the tracking component of the objective. Additionally, the level of turnover was acceptable, and, from examining the maximum and minimum portfolio weights, the tracking portfolios were well-diversified.
The paper is organized as follows. Section 2 presents a brief literature review. In Section 3, we describe the typical enhanced tracking model. In Section 4, we develop in detail the parametric approach. In Section 5, we add some refinements to the "plain" parametric model including transaction costs and lower/upper bounds on the portfolio weights. Section 6 describes some characteristics that can be used in the parametric approach. In Section 7, we present the empirical application for the S&P500 index. We conclude in Section 8.
2. Literature review
Since the late 1970s the problem of index tracking has drawn attention from the financial and operations research literature. Now we have many different approaches, in particular to design index tracking funds using partial replication. Next, we mention some of the most common approaches for index tracking. The following is by no means a complete literature survey of index tracking. For further information, the reader can look at the references mentioned in each of the cited documents.
Commonly, the index tracking problem with partial replication is formulated as a mixed-integer programming model, which is challenging to solve to optimality using "traditional" integer programming and optimization techniques. This structure leaves room for the use of a variety of heuristics, metaheuristics and other solution approaches. It is typical in this approach to minimize some function that measures the distance between the index returns (or normalized prices) and those of the tracking portfolio in a specific calibration period. Another common objective in these formulations is to try to construct a tracking portfolio with beta (relative to the benchmark) close to one. Those models contain the assumption that the returns (or prices) will have the same statistical behavior in the next period(s). Complete formulations in this framework include transaction costs, rebalancing policies and other features and constraints. In this line, we refer to (Beasley et al., 2003), (Gaivoronski et al., 2005), (Canakgoz and Beasley, 2008), and the references therein.
Markowitz-type formulations are commonly used in index tracking, where the tracking error variance (the variance of the portfolio that is long in the tracking portfolio and short in the index) is minimized under other appropriate constraints. For this formulation, it is necessary to estimate the covariance matrix of all assets in the index, for which available data may not be sufficient. Instead, the returns-based style analysis of (Sharpe, 1988, Sharpe, 1992), based on the minimization of the tracking error relative to a weighted combination of indices, can be used. For example, related to this kind of formulation, (Derigs and Nickel, 2003) estimated the covariance matrix and the expected returns using a linear factor model based on macroeconomic variables. They solve the tracking problem using a metaheuristic based on simulated annealing. While such factor models are tractable, the general problem with tracking error variance minimization is the difficulty in estimating a covariance matrix for the returns of all assets in the index.
Another type of model considers the inclusion of other variables (most commonly economic ones) in addition to the returns (or price series). For example, in (Oh et al., 2005), the stocks forming the index fund are computed in two steps. In the first step, stocks are ranked by a priority function that includes "fundamental" variables (standard error of the asset's beta, average trading amount and average market capitalization). Initial weights of the index tracking portfolio are then selected using a heuristic procedure. In the second step, a genetic algorithm is used to optimize the relative weights of the selected stocks, with the constraint that the portfolio beta should be close to one.
(Corielli and Marcellino, 2006) introduced the idea of reproducing a linear factor model structure for index tracking. They assume that stock prices evolve according to a linear factor model, and form an objective to build a tracking portfolio with the same factor structure as the index. However, to implement the procedure, the factor loading matrix needs to be estimated in order to find the optimal portfolio weights. Additionally, the cardinality of the tracking portfolio is satisfied by a heuristic procedure which works by ordering the factors according to their correlation with the index, and then includes those stocks that replicate the factors with a decided accuracy.
Clustering techniques are also used to construct tracking portfolios. For example, (Focardi and Fabozzi, 2004) describe an index-tracking methodology based on time-series clustering. They argue that because the estimation of all the covariances between assets of a broad market index (needed for a next period optimization) is computationally burdensome and produces noisy results, a hierarchical clustering of the assets' time series is a more robust way to reveal the correlation structure. In their application, they used the Euclidean distance between stock prices as the basis of their clustering. In a similar direction, (Cornuejols and Tütüncü, 2007) present an index-tracking problem based on clustering stocks with similar correlation coefficients of returns. To find the stocks in the tracking portfolio, they solved a large-scale integer programming model, for which Lagrangian relaxation and subgradient methods can produce good upper bounds. After selecting the stocks, the weights are determined proportionally to their market capitalization.
Other interesting techniques for index tracking are the cointegration approach of (Alexander, 1999) and (Alexander and Dimitriu, 2005) and the stochastic programming approach of (Stoyan and Kwon, 2007). As an additional example in continuous time, (Yao et al., 2006) formulate the index tracking problem as a stochastic optimal control problem and solve it using semidefinite programming.
In the case of enhanced indexation, since the cardinality constraint (to hold fewer assets than the index) is usually imposed, almost all of the approaches and techniques for partial replication can be used after some modifications. The reader interested in the enhanced indexation literature can consult Section 2.2 of (Canakgoz and Beasley, 2008).
3. The enhanced indexation problem
Suppose that from time t to t + 1 the index has a return v_{t+1}, and that at time t, N_{t} stocks^{2} form the index. Each stock i has a return r_{i,t+1} from date t to t + 1 and an associated vector of firm characteristics y_{i,t} observed at date t. These characteristics can be related to the explanation of returns, e.g., the market capitalization of the stock, the book-to-market ratio, lagged returns, etc., and to similarity measures with the index, e.g., correlation, mean-absolute deviation with respect to the index, maximum deviation with respect to the index, etc.
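Originally, the investor, who tries to track an index or to form an enhanced index tracking portfolio^{3} at time t, wants to solve the following problem (P) by selecting the appropriate portfolio weights x_{i,t} for i = 1,…, N_{t}:

(P)   maximize over x_{t}:   λ_{1} ρ_{t}(v_{t+1}, p_{t+1}) − λ_{2} σ_{t}(p_{t+1}) / σ_{t}(v_{t+1}) + λ_{3} O_{t}
      subject to
      p_{t+1} = Σ_{i=1}^{N_{t}} x_{i,t} r_{i,t+1},
      Σ_{i=1}^{N_{t}} x_{i,t} = 1,
      |supp(x_{t})| ≤ K_{t},
      l_{i,t} ≤ x_{i,t} ≤ u_{i,t},   i = 1,…, N_{t},

where p_{t+1} is the return of the tracking portfolio, σ_{t}(·) is the standard deviation conditional on the information up to time t, and ρ_{t}(v_{t+1}, p_{t+1}) is the correlation coefficient (also conditional on the information up to time t) of the returns of the index and the tracking portfolio, more specifically:

ρ_{t}(v_{t+1}, p_{t+1}) = Cov_{t}(v_{t+1}, p_{t+1}) / (σ_{t}(v_{t+1}) σ_{t}(p_{t+1})).   (1)

Additionally, O_{t} is a measure of outperformance of the tracking portfolio with respect to the index. Some common choices of O_{t} are given by

O_{t} = P_{t}(p_{t+1} > v_{t+1}),   (2)
O_{t} = E_{t}[p_{t+1} − v_{t+1}].   (3)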
In the objective function, λ_{1} ≥ 0 measures the weight given to the correlation coefficient between the tracking portfolio and the index. In the same way, λ_{2} ≥ 0 is the weight given to the "risk" component, i.e., the part of the objective function used to avoid constructing tracking portfolios with higher standard deviation than the index. Since the tracking portfolio is less diversified than the index, the standard deviation of the tracking portfolio tends to be higher than that of the index. This objective then tends to match both standard deviations, i.e., to make the ratio close to one. For the enhanced component (with λ_{3} ≥ 0), expression (2) represents the probability (also conditional on the information up to time t) that the tracking portfolio has a greater return than the index. In a similar way, (3) represents the excess return of the tracking portfolio with respect to the index.
To summarize the objective of problem (P), we maximize a multi-objective function using linear scalarization, in which the tracking component is given by the correlation coefficient minus the ratio of the standard deviations of the returns of the tracking portfolio and the index, while the enhanced component is measured by either the excess return of the tracking portfolio with respect to the index or the probability of beating the returns of the index. Additionally, by setting the parameters λ_{1}, λ_{2} and λ_{3}, we determine the implicit tradeoffs between the different components of the objective. For example, λ_{1} > 0, λ_{2} > 0 and small values of λ_{3} correspond mostly to a tracking-only strategy. Next, we discuss how our objective is related to common objectives in the literature, as well as the advantages of using both the correlation coefficient and the ratio of standard deviations to measure the tracking performance.
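The scalarized objective summarized above can be evaluated on a sample of returns. The following is a minimal sketch, assuming the enhanced component is the average excess return and that sample moments stand in for the conditional ones; the function name `tracking_objective` and the synthetic data are hypothetical and not taken from the paper.

```python
import numpy as np

def tracking_objective(p, v, lam1=1.0, lam2=1.0, lam3=0.0):
    """Sample value of lam1*corr - lam2*(std ratio) + lam3*(mean excess return)."""
    p = np.asarray(p, dtype=float)   # tracking portfolio returns
    v = np.asarray(v, dtype=float)   # index returns
    corr = np.corrcoef(p, v)[0, 1]
    std_ratio = p.std(ddof=1) / v.std(ddof=1)
    mean_excess = np.mean(p - v)
    return lam1 * corr - lam2 * std_ratio + lam3 * mean_excess

# Hypothetical daily returns, for illustration only.
rng = np.random.default_rng(0)
v = rng.normal(0.0005, 0.01, size=124)
p_same = v.copy()                                  # perfect replication
p_noisy = v + rng.normal(0.0, 0.004, size=124)     # imperfect tracker

score_same = tracking_objective(p_same, v)
score_noisy_corr_only = tracking_objective(p_noisy, v, lam1=1.0, lam2=0.0, lam3=0.0)
```

With perfect replication the correlation and the standard deviation ratio are both one, so the tracking part of the score with λ_{1} = λ_{2} = 1 is zero; any imperfect tracker has correlation below one.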
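The objective function of the returns-based style analysis (RBSA) of (Sharpe, 1988, Sharpe, 1992) is to minimize the conditional variance of the tracking error, given by Var_{t}(p_{t+1} − v_{t+1}). Writing ρ_{t} for the conditional correlation coefficient of v_{t+1} and p_{t+1}, and σ_{t}(·) for the conditional standard deviation, we can expand the variance of the tracking error to have

Var_{t}(p_{t+1} − v_{t+1}) = σ_{t}^{2}(v_{t+1}) [ (σ_{t}(p_{t+1}) / σ_{t}(v_{t+1}))^{2} − 2 ρ_{t} σ_{t}(p_{t+1}) / σ_{t}(v_{t+1}) + 1 ].

If ρ_{t} ≈ 1 and σ_{t}(p_{t+1}) / σ_{t}(v_{t+1}) ≈ 1, then the variance of the tracking error will be close to zero. Therefore, the enhanced component of our objective function correctly measures the ability of the enhanced index fund manager to contribute to the portfolio performance (in the sense that the performance is separated from the tracking error). Additionally, the separation of the conditional correlation coefficient and the conditional standard deviations ratio gives more freedom to the design of the tracking portfolio.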
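The link between the tracking error variance, the correlation coefficient and the ratio of standard deviations can be checked numerically. The sketch below uses the standard identity Var(p − v) = σ_v² (k² − 2ρk + 1) with k = σ_p/σ_v; the return series are synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
v = rng.normal(0.0, 0.01, size=500)               # hypothetical index returns
p = 0.9 * v + rng.normal(0.0, 0.003, size=500)    # hypothetical portfolio returns

sd_v = v.std(ddof=1)
sd_p = p.std(ddof=1)
rho = np.corrcoef(p, v)[0, 1]
ratio = sd_p / sd_v                               # k in the identity above

# Direct sample variance of the tracking error.
te_var_direct = np.var(p - v, ddof=1)
# Same quantity rebuilt from the correlation and the standard deviation ratio.
te_var_decomposed = sd_v**2 * (ratio**2 - 2.0 * rho * ratio + 1.0)
```

When `rho` and `ratio` both approach one, the bracketed term approaches 1 − 2 + 1 = 0, which is the property the text relies on.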
In the enhanced index tracking literature, two common minimization objectives^{4} (with ) are given by:
where in (5), β and α come from the following linear regression model: p_{t+1} = α + β v_{t+1} + ε_{t+1}.
Notice that in (4), which is a similar objective to the one used in (Beasley et al., 2003), one can show that minimizing E_{t}[(p_{t+1} − v_{t+1})^{2}] is equivalent to minimizing Var_{t}(p_{t+1} − v_{t+1}) + (E_{t}[p_{t+1} − v_{t+1}])^{2}. Therefore, we are indirectly trying to match the first moments of v_{t+1} and p_{t+1}. This fact will directly affect the weight given to the enhanced component.
Objective (5), which is used in (Canakgoz and Beasley, 2008), clearly separates the tracking component from the enhanced component; but, even in the case of β = 1, we could have that σ_{t}(p_{t+1}) > σ_{t}(v_{t+1}). We could find α-positive strategies (with β close to 1) but most likely with higher overall risk. With ρ_{t} denoting the conditional correlation coefficient of v_{t+1} and p_{t+1}, we have

β = ρ_{t} σ_{t}(p_{t+1}) / σ_{t}(v_{t+1}).

If ρ_{t} ≈ 1 and σ_{t}(p_{t+1}) / σ_{t}(v_{t+1}) ≈ 1, then β will be close to one. Therefore, our objective function (in the tracking component) indirectly minimizes the variance of the portfolio and tries to achieve values of β close to one.
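The relation between β, the correlation coefficient and the ratio of standard deviations follows from the usual OLS slope formula β = Cov(p, v)/Var(v) = ρ σ_p/σ_v. A short numerical check, on synthetic returns that are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
v = rng.normal(0.0, 0.01, size=250)                       # hypothetical index returns
p = 0.0002 + 1.1 * v + rng.normal(0.0, 0.002, size=250)   # hypothetical portfolio returns

# OLS fit of p on v: np.polyfit returns [slope, intercept] for degree 1.
beta_ols, alpha_ols = np.polyfit(v, p, 1)

rho = np.corrcoef(p, v)[0, 1]
std_ratio = p.std(ddof=1) / v.std(ddof=1)
beta_from_identity = rho * std_ratio
```

Because ρ ≤ 1, a portfolio with β = 1 must have σ_p ≥ σ_v, which is why matching β alone does not control the overall risk.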
In the constraints of (P), |supp(·)| denotes the cardinality of the support of the weight vector (its number of nonzero entries), and K_{t} is a positive integer (smaller than N_{t}) representing the maximum number of stocks with positive weight in the tracking portfolio at time t. The last constraint of the formulation imposes lower and upper bounds on the weights of the tracking portfolio.
Under the assumptions that the index return for each t is an i.i.d. random variable, the vector of returns of the stocks for each t is a multivariate i.i.d. random vector (the index returns and the stock returns are not independent). Considering and we can reformulate problem (P) using its sample counterpart that avoids the time dependence of the portfolio weights, i.e., ensures for all t. We can then find the optimal portfolio weights x by solving the following problem (PI) with a calibration period [1,T]
4. Parametric approach for enhanced indexation
Instead of solving (PI) using mixed-integer programming techniques, we use the parametric approach in (Brandt et al., 2009) to formulate an alternative (but not equivalent) problem where the portfolio weights are specified as a function of the stocks' characteristics by

x_{i,t} = f(y_{i,t}; θ).
The function f should take into account the last three constraints of the original formulation (P) to produce feasible portfolio allocations for the enhanced index model. If we consider l_{i,t} = 0 and u_{i,t} = 1 for all i, a possible function f (with no closed form) can be obtained using the following steps:
An important aspect of the parameterization is that the coefficients θ are invariant across assets and time. Constant coefficients across time mean that the coefficients that maximize the objective are the same for all dates; therefore, they also maximize the investor's objective unconditionally. This fact implies that we can formulate the following unconditional optimization problem (PP) with respect to θ:
and generates weights x_{i,t} that satisfy the cardinality constraint, upper and lower bounds, and sum to one. It is then possible to estimate the coefficients θ by maximizing the corresponding sample analog (SPP):
where the definitions of the terms are the same as in (PI). The only variables to be computed are the coefficients θ that are embedded in the weight function f. The portfolio weights x_{i,t} have been parameterized by a function of the stocks' characteristics. Now, we only need to find the vector θ, which usually contains only a few elements. Therefore, the dimensionality of the problem has been dramatically reduced, but the new difficulty with this parameterization is that it generates a nonconcave, nondifferentiable and nonlinear unconstrained problem which has to be solved using appropriate optimization techniques. Again, notice that problems (PI) and (SPP) are not equivalent, i.e., the optimal x_{i,t} can be different in the two models.
The elements of the vector θ can be directly compared (due to the normalization of the characteristics). This comparison gives intuition about the class of stocks that are going to be included in the tracking portfolio. Notice that, by finding θ, we are basically finding a trading strategy. Additionally, K_{t} is not fixed to a constant K during the calibration period, so, we can control the cardinality of the tracking portfolio in general ways. However, in our numerical examples we fix K_{t} = K for all t. In the next section, we present a series of refinements of the basic model to allow the inclusion of portfolio weight constraints and transaction costs.
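To make the idea of a characteristic-driven weight function concrete, here is a minimal sketch of one possible f. It assumes the linear tilt of Brandt et al. (2009), a benchmark weight plus θ'ŷ/N with cross-sectionally standardized characteristics ŷ, followed by keeping the K largest weights, truncating at zero and renormalizing. These steps are an illustration under the case l = 0 and u = 1, not necessarily the exact procedure used in the paper, and the name `parametric_weights` is hypothetical.

```python
import numpy as np

def standardize(y):
    """Cross-sectional z-score of each characteristic (one column per characteristic)."""
    return (y - y.mean(axis=0)) / y.std(axis=0)

def parametric_weights(y, theta, k, x_bar=None):
    """Hypothetical f: tilt a benchmark by theta'y, keep the k largest, renormalize."""
    n = y.shape[0]
    if x_bar is None:
        x_bar = np.full(n, 1.0 / n)             # equally weighted starting portfolio
    raw = x_bar + standardize(y) @ theta / n    # linear tilt; raw weights still sum to one
    keep = np.argsort(raw)[-k:]                 # indices of the k largest raw weights
    w = np.zeros(n)
    w[keep] = np.maximum(raw[keep], 0.0)        # long-only truncation
    return w / w.sum()                          # the largest raw weight is positive, so sum > 0

rng = np.random.default_rng(0)
y = rng.normal(size=(20, 3))                    # 20 hypothetical stocks, 3 characteristics
theta = np.array([2.0, 0.0, -1.0])              # illustrative coefficients
w = parametric_weights(y, theta, k=5)
```

The sorting and truncation steps are what make the resulting objective nondifferentiable in θ, which is consistent with the need for derivative-free optimization noted above.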
5. Refinements and extensions
5.1. Upper and lower bounds on portfolio weights
The portfolio x_{t}, resulting from the simple policy in the last section, is not likely to satisfy the lower and upper bounds l_{t} and u_{t}. However, we can address this deficiency by solving an LP problem to find new optimal weights for the set of assets selected in the "initial" tracking portfolio x_{t}. We have the following LP model, called (UB1):
From the LP above, we obtain the optimal portfolio weights for all the selected assets, where we assume the feasibility of (UB1). An alternative LP is given in (Cornuejols and Tütüncü, 2007), which we call (UB2):
As in the case of (UB1), this also assumes the feasibility of (UB2). Consequently, we can construct a portfolio weight function (that satisfies the lower and upper limits constraints) by following the steps in Section 4 and adding one more step (Step 5), which consists of solving either (UB1) or (UB2) (if possible).
5.2. Transaction costs and turnover
Transaction costs and turnover are two very important variables in any portfolio optimization problem since they tell us how costly it is to implement the optimal strategy and how much the portfolio changes over time. Even though there are multiple approaches for rebalancing a tracking portfolio, we only describe two general cases which can be handled by the parametric approach.
Recall that at time t our optimal tracking portfolio (from the optimization problem) is x_{t}. The current tracking portfolio is no longer x_{t−1} (the tracking portfolio chosen at time t − 1) because of the returns realized between t − 1 and t. We refer to these drifted holdings as the current (unbalanced) tracking portfolio. For every asset i, we have
transaction cost assigned at time t to stock i. Hence, we can proceed with the optimization problem described in Section 4.
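As an illustration of the quantities involved, the sketch below computes the drifted (unbalanced) weights after one period of returns and a proportional cost of trading back to a target portfolio. The proportional cost form and the function names are assumptions made for this example; the paper's own expressions are not reproduced here.

```python
import numpy as np

def drifted_weights(x_prev, r):
    """Weights after one period of returns r, before any rebalancing."""
    grown = x_prev * (1.0 + r)
    return grown / grown.sum()

def proportional_cost(x_target, x_current, c):
    """Assumed cost model: per-asset rate c_i times the absolute change in weight."""
    return float(np.sum(c * np.abs(x_target - x_current)))

x_prev = np.array([0.5, 0.5])          # portfolio chosen at t - 1
r = np.array([0.10, -0.10])            # hypothetical realized returns from t - 1 to t
x_current = drifted_weights(x_prev, r)

x_target = np.array([0.5, 0.5])        # portfolio proposed by the optimization at t
c = np.array([0.001, 0.001])           # hypothetical cost rates (10 basis points)
cost = proportional_cost(x_target, x_current, c)
turnover = float(np.sum(np.abs(x_target - x_current)))
```

In this toy case the weights drift to 0.55 and 0.45, so rebalancing back to equal weights implies a turnover of 0.10.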
However, it may not be optimal to rebalance the tracking portfolio completely from the current portfolio to x_{t}. Now, we apply a boundary-type policy for transaction costs inspired by the work of (Leland, 2000) and also considered in (Brandt et al., 2009). First, define with (notice that is not necessarily less than or equal to K_{t}) and introduce a threshold ε_{t} > 0 that "limits" the amount of rebalancing under a certain norm ‖·‖_{t}. The parameter α_{t} is such that , and we let be
The transaction cost policy is not to rebalance if x_{t} and the current portfolio are close enough under the norm ‖·‖_{t} and the parameter ε_{t} (so that both the norm and the threshold define the no-trade region) and, depending on the magnitude of the transaction costs, to rebalance to some "intermediate" allocation between x_{t} and the current portfolio (given by the value of α_{t}) or to rebalance to the "optimal" allocation x_{t}. If we denote as the tracking portfolio chosen for time t in the presence of the transaction costs policy, we have
where the function g is such that it produces feasible allocations for the enhanced index tracking problem and ‖·‖_{t} is usually the Euclidean distance scaled by the cardinality of the portfolio K_{t}. Finally, notice that it is also possible to establish a set frequency of rebalancing, i.e., daily, weekly, etc.
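The boundary-type policy described in this subsection can be sketched as follows. This is a simplified illustration: the scaling of the Euclidean distance (here, division by k), the convex-combination form of the intermediate allocation, and the function name are all assumptions, and the feasibility map g is omitted.

```python
import numpy as np

def rebalance_with_no_trade_region(x_target, x_current, eps, alpha, k):
    """Hold if inside the no-trade region; otherwise move a fraction alpha toward the target."""
    distance = np.linalg.norm(x_target - x_current) / k    # assumed scaling by cardinality
    if distance <= eps:
        return x_current.copy()                            # inside the no-trade region
    # alpha = 1 rebalances fully to x_target; 0 < alpha < 1 gives an intermediate allocation.
    return alpha * x_target + (1.0 - alpha) * x_current

x_target = np.array([0.5, 0.5, 0.0])     # hypothetical optimal allocation at time t
x_current = np.array([0.4, 0.4, 0.2])    # hypothetical drifted holdings

held = rebalance_with_no_trade_region(x_target, x_current, eps=0.2, alpha=0.5, k=2)
partial = rebalance_with_no_trade_region(x_target, x_current, eps=0.01, alpha=0.5, k=2)
full = rebalance_with_no_trade_region(x_target, x_current, eps=0.01, alpha=1.0, k=2)
```

Note that an intermediate allocation blends two portfolios and can therefore hold more than k names, as `partial` does here with three nonzero weights.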
6. Selection of appropriate characteristics
In this section, we consider stock characteristics that can be used to construct the weights of the tracking portfolio, i.e., the vector y_{i} in the portfolio weights function f. In (Brandt et al., 2009), the characteristics were selected based on their capacity to explain the cross-section of expected returns. Consequently, market capitalization, book-to-market ratio and lagged return were included in their corresponding empirical application.
Those characteristics will be identified with y^{mkt}, y^{btm} and y^{ret}, and their coefficients in θ will be denoted by θ_{mkt},θ_{btm} and θ_{ret}. However, for tracking purposes it would be beneficial to include some characteristics that can reflect the ability of a stock to track or beat the index. Based on the tracking objective functions of (Gaivoronski et al., 2005), and (Oh et al., 2005), and the enhanced tracking objectives of (Canakgoz and Beasley, 2008), the following characteristics can be included:
7. Numerical example: enhanced tracking of the S&P500
In this section, we present an empirical application of the parametric index tracking and enhanced indexation model. We consider the S&P500 as the index with the objective as defined in Section 3. We report the optimization results, more specifically, the in-sample and out-of-sample performance of the tracking portfolios. First, we present some relevant information about the data used in this particular application.
7.1. Data
To implement the model, the returns of the index, the returns of the stocks and the time series of characteristics are needed. The time frequency chosen was daily and the calibration period was 124 days to be consistent with common applications in the literature that use 75–250 days for calibration. For example, (Gaivoronski et al., 2005) used different lengths of the calibration period: 75, 150 and 250 trading days. Our calibration period corresponds to the period 2011/10/03 to 2012/03/30. The outofsample performance is evaluated using the next 42 days (two months of trading) and corresponds to the period 2012/04/02–2012/05/31. The daily index returns and stock returns were obtained using the CRSP database^{5} and information of index constituents at the end of each month was obtained through the Compustat database. The final number of stocks was 475 with the following criteria to include a stock in the sample:

presence in the S&P500 at the end of each month of the calibration period;

presence of a complete history of returns during the calibration period;

maintenance of symbol during the calibration period (i.e., no change of stock name).
For characteristics, we used market capitalization, alpha and beta deviation. For market capitalization, for each date t we multiply the price of the stock by the number of shares outstanding. For the other characteristics, i.e., alpha and beta deviation, we used trading days to compute the appropriate value of the definitions in Section 6. Finally, based on some empirical studies of (DeMiguel et al., 2009), the initial tracking portfolio in Step 1 of Section 4 was taken to be the equally weighted portfolio.
7.2. Working with three characteristics: market capitalization, beta deviation and alpha
As noted earlier, we work with three characteristics: market capitalization, beta deviation and alpha. Their weights are given by θ_{mkt}, θ_{β} and θ_{α}, respectively. Market capitalization was selected due to its capacity to explain returns as well as the fact that the S&P500 is a market-capitalization index. Alpha and beta deviation contain important information about a particular stock relative to the index. In particular, α is used to assess the ability of a stock to outperform the index and can also be considered as a measure of momentum. Beta is used to assess the risk of the stock compared to the index. Additionally, both are computed from the same OLS regression, and have been used recently in the enhanced indexation problem (for example, in (Canakgoz and Beasley, 2008)). Note that y^{β} in Section 6 was defined as |β − 1|. Consequently, low values of this characteristic indicate that the particular beta is close to 1, and after the cross-section normalization it will take negative values.
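As a rough illustration of how the alpha and beta deviation characteristics could be computed and normalized, the sketch below regresses each stock on the index by OLS, takes |β − 1| as the beta deviation, and standardizes it across stocks. The data are synthetic and noise-free, and the function names are hypothetical; the estimation window used in the paper is not assumed here.

```python
import numpy as np

def alpha_beta_characteristics(stock_returns, index_returns):
    """Per-stock OLS alpha and |beta - 1| from regressing each stock on the index."""
    v_centered = index_returns - index_returns.mean()
    r_centered = stock_returns - stock_returns.mean(axis=0)
    beta = (v_centered @ r_centered) / (v_centered @ v_centered)   # OLS slopes, one per stock
    alpha = stock_returns.mean(axis=0) - beta * index_returns.mean()
    return alpha, np.abs(beta - 1.0)

def cross_section_normalize(y):
    """Zero mean and unit standard deviation across stocks at a given date."""
    return (y - y.mean()) / y.std()

index_returns = np.array([0.01, -0.02, 0.015, 0.0, -0.005])   # hypothetical index returns
true_betas = np.array([1.0, 1.5, 0.5])
true_alphas = np.array([0.0, 0.001, -0.001])
stock_returns = true_alphas + np.outer(index_returns, true_betas)   # shape (5 days, 3 stocks)

alpha_hat, beta_dev = alpha_beta_characteristics(stock_returns, index_returns)
y_beta = cross_section_normalize(beta_dev)
```

The stock whose beta equals one has the smallest deviation, so its normalized characteristic is negative, in line with the remark above.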
In this part of the paper, we consider various cases of the objective function in (PP). In the first case, we maximize the correlation coefficient of the returns of the index and the tracking portfolio, i.e., λ_{1} = 1,λ_{2} = 0, and λ_{3} = 0. The second case includes in the objective the ratio of the standard deviations, i.e.,λ_{1} = 1, λ_{2} = 1, and λ_{3} = 0. While we do not include the enhanced component in these two initial cases, we observe different behavior of the objective due to different effects of the characteristics. These cases are important since they correspond to versions of a classical index tracking problem that considers the correlation coefficient and the variance of the tracking portfolio. In particular, we wish to track the S&P500 using 75 stocks (i.e., 15% of its components). To construct the function f, we follow the steps given in Section 4 using as initial tracking portfolio the equally weighted portfolio. Also, we do not include transaction costs, or lower and upper bounds on the portfolio weights. The objective is computed using the calibration period of 124 days corresponding to the trading days between 2011/10/03 and 2012/03/30 (as mentioned in the Data section). Since we are using a calibration period, it is clear that we are using problem (SPP), which is the sample analog of problem (PP).
Figs. 1 and 2 correspond to the case of λ_{1} = 1, λ_{2} = 0, and λ_{3} = 0 with K = 75. In Fig. 1, we show two surfaces: the upper one corresponds to fixing θ_{mkt} = 6 and moving θ_{α} and θ_{β} between −6 and 6 in 0.5 steps, and the lower one corresponds to θ_{mkt} = −6 using the same range for the other two coefficients. Notice that higher values of θ_{mkt} bring stocks with greater presence in the index into the tracking portfolio. As expected, this increases the correlation coefficient between the tracking portfolio and the index. With θ_{mkt} = 6 we obtain correlation coefficients higher than 0.99, while with θ_{mkt} = −6 the maximum values are approximately 0.89.
In Fig. 2, we show a color map of the surface corresponding to θ_{mkt} = 6. From this figure, we can observe the influence of θ_{β} and θ_{α} on the maximization of the correlation coefficient. Notice that the values of θ_{α} that maximize the objective are centered on zero; therefore, alpha appears to have little relevance for maximizing the correlation coefficient. The case of θ_{β} is similar: the maximum values of the objective also occur for values centered at zero.
Next, we include in the objective the ratio of the standard deviations; recall that by giving weight to this part of the objective (λ_{2} > 0) we aim to reduce the variance of the tracking portfolio with respect to the index. For motivational purposes, we now use λ_{1} = 1, λ_{2} = 1, and λ_{3} = 0 with K = 75. The new objective is a more complete tracking objective since we both maximize the correlation coefficient and keep the ratio of the variances close to 1. Under the same conditions as in the previous case, we display Figs. 3 and 4.
In Fig. 3, we again observe that, for tracking purposes, giving more weight to stocks with high market capitalization results in better tracking performance (in both correlation and standard deviations ratio). Additionally, as we can observe in Fig. 4, higher values of the objective function correspond to small values of θ_{β}, i.e., stocks with beta close to 1 are more likely to be considered. Therefore, by including stocks with a beta close to 1, we can expect to approximately match the variances of the index and the tracking portfolio. The effect of θ_{α} for tracking purposes is not very significant since the higher values of the objective occur around θ_{α} = 0. Finally, as all the previous results show, the weights given to the characteristics depend on the nature of the objective and are in line with empirical facts in the financial data.
We also include a case with the enhanced component in the objective function. The outperformance measure selected was the average daily excess return of the portfolio over the index (in percentage). In Figs. 5 and 6 (color map), we show the results for the case λ_{1} = 1, λ_{2} = 1, and λ_{3} = 3 (always keeping K = 75), fixing θ_{mkt} = 6. We can observe that the higher values of the objective occur in the upper left corner of the plot, i.e., low θ_{β} and high θ_{α}. This indicates that including high-alpha stocks increases the possibility of higher returns of the tracking portfolio, while high values of θ_{mkt} and low values of θ_{β} contribute to the strict tracking component (correlation and variance).
7.3. Optimizing with three characteristics
In this part, we continue with the objective given in (PP) with coefficients given by θ = (θ_{mkt}, θ_{α}, θ_{β}). Again, we do not include transaction costs or lower and upper bounds on the portfolio weights. We study the effects of changing K, λ_{1} and λ_{3} in the objective function. We impose a constraint on the standard deviation ratio in the optimization problem to avoid exploring the effect of λ_{2}. In-sample and out-of-sample results are presented.
Our focus is less on finding the optimal value of problem (PP) than on the effects and behavior of the selected characteristics in the enhanced indexation problem. Again, since we are optimizing over a fixed period of history, we are in effect solving problem (SPP), which is the sample counterpart of problem (PP). We therefore sketch our "optimization" procedure as follows:

Step 1 Construct a grid of values of the coefficients.

Step 2 Evaluate the (particular) objective function for each of the vectors of the grid.
Step 3 Impose a constraint on the standard deviations ratio to filter the combinations of coefficients to be considered.

Step 4 Sort the (filtered) vectors of coefficients by their objective value. Group the sorted data into n subgroups.

Step 5 For the subgroup with the greatest values of the objective, report the average value of the coefficients. Those values are considered estimates of the "optimal" values.

Step 6 Evaluate the average coefficients found in Step 5 in the corresponding objective function. Report the objective function value as well as other relevant variables.
The results correspond to θ = (θ_{mkt}, θ_{α}, θ_{β}) with −6 ≤ θ_{mkt}, θ_{α}, θ_{β} ≤ 6 in 0.5 intervals. This gives 25 values per coefficient and hence 25^3 = 15,625 vectors of coefficients to evaluate for each selected K. Also, we impose the constraint that the ratio of standard deviations should be less than or equal to 1.05. After filtering the vectors that satisfy the aforementioned constraint in Step 3, we took n = 150 to be the number of subgroups in Step 4. To compute the alpha and beta characteristics, we used a time window of 42 trading days. The measure of outperformance O_{t} is the excess return of the tracking portfolio over the index. The cardinality values considered were K = 25, 30, 40 and 50. These are usual values for K; for example, (Canakgoz and Beasley, 2008) in their empirical application used 40 stocks from a universe of 457 to track the S&P500. While no transaction costs are considered in this section, they are considered in the next section.
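Steps 1 to 6, with the grid, the 1.05 cap and n = 150 just described, can be sketched as follows. This is only an outline under stated assumptions: `objective` and `sd_ratio` stand for user-supplied functions of θ (the paper's versions depend on the weight function f of Section 4 and on the data), and the toy functions used in the example call are hypothetical and chosen only so that the code runs.

```python
import itertools
import numpy as np

def grid_procedure(objective, sd_ratio, grid_values, max_sd_ratio=1.05, n_groups=150):
    """Grid-based 'optimization' over theta = (theta_mkt, theta_alpha, theta_beta)."""
    # Step 1: construct the grid of coefficient vectors.
    grid = np.array(list(itertools.product(grid_values, repeat=3)))
    # Step 2: evaluate the objective at every vector of the grid.
    values = np.array([objective(theta) for theta in grid])
    # Step 3: keep only the vectors that satisfy the standard deviation ratio cap.
    feasible = np.array([sd_ratio(theta) <= max_sd_ratio for theta in grid])
    grid_f, values_f = grid[feasible], values[feasible]
    # Step 4: sort by objective value (best first) and split into n subgroups.
    order = np.argsort(-values_f)
    groups = np.array_split(grid_f[order], n_groups)
    # Step 5: average the coefficients of the best subgroup.
    theta_hat = groups[0].mean(axis=0)
    # Step 6: evaluate the objective at the averaged coefficients.
    return theta_hat, objective(theta_hat), len(grid)

grid_values = np.arange(-6.0, 6.5, 0.5)   # 25 values from -6 to 6
target = np.array([4.0, 0.0, 0.0])        # hypothetical peak of the toy objective

def toy_objective(theta):
    return -np.sum((np.asarray(theta) - target) ** 2)

def toy_sd_ratio(theta):
    return 1.0 + 0.01 * abs(theta[1])

theta_hat, value, n_points = grid_procedure(toy_objective, toy_sd_ratio, grid_values)
```

Averaging the best subgroup rather than taking the single best grid point is what makes the reported coefficients estimates of the "optimal" values rather than exact maximizers.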
We also used a nonlinear optimization package, KNITRO 6.0, to solve the optimization problem to verify the accuracy of the answers given by our procedure. The results were similar on average; however, the optimal solution found with the solver was sensitive to the starting point, which suggests performing a preliminary exploration analysis to select appropriate starting points. Also, note that the lack of structure in the optimization problem makes it difficult to claim optimality.
Tables 1 to 4 present detailed information about the in-sample and out-of-sample performance of the "optimal" tracking portfolios of this section. Each table corresponds to a particular choice of K (K = 25, K = 30, K = 40, K = 50) and shows the following information: λ_{3} (with λ_{1} always fixed at 1); the "optimal" coefficients (obtained with the optimization procedure explained before); the objective function values (recall that the correlation coefficient and the standard deviation ratio are dimensionless but the average daily return is taken in percentage); the correlation coefficient (ρ) between the returns of the index and the "optimal" tracking portfolio during the calibration/testing period; the standard deviation ratio (SD.R.) of the returns of the tracking portfolio and the index during the calibration/testing period; the average annualized excess return of the tracking portfolio over the index during the calibration/testing period (Ret.%/y); the average probability of beating the index over the calibration/testing period (Prob %); alpha in percentage per year (α%/y) and beta (β) of the regression of the portfolio returns against the index returns during the calibration/testing period; the average daily turnover as defined in Section 5.2 (T%/d); and the average maximum (w_{max}) and minimum (w_{min}) weights of the tracking portfolios (given in percentage). Notice that some of the variables are computed as averages over the calibration/testing period since the tracking portfolios change every day due to different values of the (normalized) characteristics.
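For readers who want to reproduce columns of this kind from two daily return series, a minimal sketch follows. The annualization factor of 252 trading days and the reading of the "probability of beating the index" as the fraction of days with a positive excess return are assumptions, as are the function name and the synthetic data; the paper does not state these conventions in this section.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualization factor

def tracking_report(portfolio_returns, index_returns):
    """Summary statistics of a tracking portfolio against its index (daily returns)."""
    p = np.asarray(portfolio_returns, dtype=float)
    m = np.asarray(index_returns, dtype=float)
    excess = p - m
    # OLS regression of portfolio returns on index returns.
    beta = np.cov(p, m, ddof=1)[0, 1] / np.var(m, ddof=1)
    alpha_daily = p.mean() - beta * m.mean()
    return {
        "rho": np.corrcoef(p, m)[0, 1],
        "sd_ratio": p.std(ddof=1) / m.std(ddof=1),
        "excess_ret_pct_per_year": 100.0 * TRADING_DAYS * excess.mean(),
        "prob_beat_pct": 100.0 * np.mean(excess > 0),  # assumed definition
        "alpha_pct_per_year": 100.0 * TRADING_DAYS * alpha_daily,
        "beta": beta,
    }

# Hypothetical example: the portfolio earns 1 basis point per day over the index.
rng = np.random.default_rng(0)
index_returns = rng.normal(0.0, 0.01, size=124)
portfolio_returns = 0.0001 + index_returns
report = tracking_report(portfolio_returns, index_returns)
```

In this constructed case the correlation, the ratio and beta all equal one, and the whole excess return shows up as alpha.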
7.3.1. In-sample performance for θ = (θ_{mkt}, θ_{α}, θ_{β})
First, we study the behavior of the "optimal" coefficients as a function of the cardinality parameter K and the coefficient λ_{3}. Additionally, we assess the effect of λ_{3} and K on the correlation coefficient, the portfolio return, the probability of outperformance, turnover and the average maximum and minimum weight given to an asset in the tracking portfolio.
Fig. 7 shows the optimal value of the vector of coefficients. All plots show similar behavior, so the "optimal" policies appear to be independent of the cardinality of the tracking portfolio. For low values of λ_{3}, the most important coefficient is θ_{mkt}, which takes positive values (around 4); on the other hand, θ_{α} and θ_{β} do not appear to be very significant. For tracking purposes, this suggests giving more weight to stocks with high market capitalization (relative to other stocks in the index). As we increase λ_{3}, and give more importance to the excess return of the tracking portfolio, we tend to increase the participation of stocks with high alpha and low beta deviation. By increasing θ_{α}, we give preference to stocks with a recent history of relatively high alpha; but, to maintain the variance ratio constant, it is necessary to include stocks with low beta deviation (β−1), i.e., stocks with beta close to one. Also, notice that, beyond certain values of λ_{3}, the optimal coefficients are practically constant (λ_{3} ≈ 3 for K = 25, λ_{3} ≈ 1.5 for K = 30, 40 and 50). In conclusion, the tracking performance is mostly given by stocks with high market capitalization, while the enhancement performance is given by stocks with high alpha; however, it is necessary to include stocks with low beta deviation to control the standard deviation ratio and to maintain relatively high correlation with the index.
The behavior of the portfolio weights is shown in Fig. 8. We report the average maximum and minimum weights in the "optimal" tracking portfolios as a function of λ_{3}. In general, we can observe that the tracking portfolios tend to be well diversified, avoiding excessive concentration in certain assets. They also avoid very small positions in the assets, especially for low values of K. For example, when K = 25, the average maximum weight is close to 7% while the average minimum weight is around 3%. For K = 50, the average maximum weight is close to 4% while the minimum weight is around 1.3%. Finally, as expected, the maximum and minimum weights are decreasing functions of K.
Fig. 9 shows the behavior of the correlation coefficient (ρ) between the tracking portfolio and the index for different values of λ_{3} and K. Notice that in general (for a fixed K), ρ is a decreasing function of λ_{3}, i.e., under "constant" variance the enhancement component tends to reduce the correlation in the search for more profitable stocks. When λ_{3} = 0, we have ρ ≈ 0.98 with K = 50 and ρ ≈ 0.97 with K = 25. However, for λ_{3} = 5 we have ρ ≈ 0.95 with K = 50 and ρ ≈ 0.93 with K = 25. Additionally, as expected, the correlation coefficient is generally higher for higher values of K (especially for low values of λ_{3}). This observation indicates that diversification improves the performance relative to ρ.
The behavior of the standard deviation ratio is shown in Fig. 10. By construction, the ratio should be at most 1.05 (recall that the reported values correspond to the vector of average coefficients). Notice that for low values of λ_{3}, the variance ratio tends to be smaller due to the presence of high market capitalization stocks that generally are less volatile than the index. But, as λ_{3} increases, so do positions in riskier stocks, increasing the ratio. Figs. 11 and 12 show, respectively, the behavior of the expected annualized return of the tracking portfolio (in excess of the index), and the probability that the tracking portfolio beats the index. Both variables present a similar pattern, i.e., as λ_{3} increases, the corresponding variable tends to increase; but, for high values of λ_{3}, both variables tend to stabilize. Also, in general, the higher the K, the higher the return and the probability of outperformance. This means that more α-aggressive and diversified portfolios show a higher probability of beating the index under a similar variance.
Fig. 13 displays the in-sample beta (β) for the different tracking portfolios. For K = 25 and K = 30, and for low and moderate values of λ_{3}, the tracking portfolios tend to produce values of β lower than one. This can be explained by the presence of stocks with high market capitalization, which usually have betas smaller than one. However, for λ_{3} greater than approximately 2.5, beta remains almost constant, fluctuating between 0.98 and 0.99. In the case of K = 40 and K = 50, because of the greater level of diversification, beta fluctuates between 0.98 and 1.02 for all values of λ_{3}, i.e., the loss in ρ is basically compensated by an increase in the ratio of standard deviations.
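The compensation between ρ and the ratio of standard deviations mentioned here follows from the standard identity for the OLS slope, β = ρ · (σ_portfolio / σ_index). The short check below verifies the identity numerically on synthetic returns; the data are hypothetical, and the pair (ρ = 0.95, ratio = 1.04) in the last line is only an illustrative combination, not a value reported in the tables.

```python
import numpy as np

# Synthetic daily returns (hypothetical): a portfolio loosely following the index.
rng = np.random.default_rng(1)
index_returns = rng.normal(0.0, 0.01, size=124)
portfolio_returns = 0.9 * index_returns + rng.normal(0.0, 0.003, size=124)

rho = np.corrcoef(portfolio_returns, index_returns)[0, 1]
sd_ratio = portfolio_returns.std(ddof=1) / index_returns.std(ddof=1)

# Slope of the regression of portfolio returns on index returns.
beta_ols = np.polyfit(index_returns, portfolio_returns, 1)[0]
beta_identity = rho * sd_ratio

# Illustrative combination: a lower rho offset by a ratio above one.
implied_beta = 0.95 * 1.04
```

With a correlation of 0.95, a ratio slightly above one is enough to bring beta back to roughly 0.99, which is the mechanism the paragraph describes.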
Notice that an investor can construct a tracking portfolio that matches the returns of the index (excess return close to zero) by choosing the appropriate trade-off parameter λ_{3}. For example, in the case of K = 50, the λ_{3} that generates a tracking portfolio with approximately zero excess return over the index lies between 0.75 (annualized excess return of −1.20%) and 1 (annualized excess return of 0.91%). To find a λ_{3} that generates closer returns, the investor can perform a line search. Additionally, note that tracking policies for low values of λ_{3} depend heavily on the inclusion of stocks with relatively high market capitalization. Consequently, this generates portfolio strategies that are heavily concentrated in one class of assets and might produce excessive dependence on the performance of that group. This observation suggests choosing tracking policies that offer greater levels of diversification even though they may lose some tracking performance.
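One possible form of the line search on λ_{3} is a simple bisection on the bracket [0.75, 1] reported for K = 50. In practice each evaluation would require re-running the optimization procedure for the trial λ_{3}; here a straight line through the two reported points stands in for that map, which is purely an assumption for illustration, as are the function names.

```python
def bisect_lambda(excess_return, lo, hi, tol=1e-6, max_iter=100):
    """Find lambda_3 in [lo, hi] where the excess return changes sign."""
    f_lo, f_hi = excess_return(lo), excess_return(hi)
    if f_lo * f_hi > 0:
        raise ValueError("the bracket does not contain a sign change")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = excess_return(mid)
        if abs(f_mid) < tol or (hi - lo) < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi, f_hi = mid, f_mid   # the root lies in the lower half
        else:
            lo, f_lo = mid, f_mid   # the root lies in the upper half
    return 0.5 * (lo + hi)

def interpolated_excess(lam):
    """Stand-in map: line through (0.75, -1.20%) and (1.00, 0.91%)."""
    return -1.20 + (lam - 0.75) * (0.91 + 1.20) / 0.25

lam_star = bisect_lambda(interpolated_excess, 0.75, 1.0)  # about 0.89 for this stand-in
```

Bisection is used because the map from λ_{3} to excess return is only available through evaluations and need not be smooth; any bracketing method would serve equally well.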
7.3.2. Out-of-sample performance for θ = (θ_{mkt}, θ_{α}, θ_{β})
The out-of-sample period consists of the 42 trading days (2 months) immediately following the calibration period. The main purpose of the out-of-sample analysis is to check whether the optimal in-sample portfolio policy (vector of coefficients) is robust, i.e., whether the main properties of the in-sample period are preserved. To perform a "hard" out-of-sample test, we keep the vector of coefficients fixed at its optimal in-sample value during the testing period, i.e., we do not update the vector as new information arrives. We present the comparison between the in-sample and out-of-sample performance for all the variables considered in the previous section and additionally include the tracking portfolio turnover.
First, we compare the in-sample and out-of-sample objectives. The information is summarized in Fig. 14, which includes four plots, each corresponding to a particular value of K (25, 30, 40 or 50). In general, the in-sample and out-of-sample objectives are relatively close. Recall that the objective is the sum of the sample correlation coefficient of returns and λ_{3} times the average daily excess return of the tracking portfolio over the index (in percentage). Notice that for values of λ_{3} less than approximately 1.5, both performances are similar. This suggests that the tracking performance is more robust than the enhancement performance, i.e., high market capitalization and moderate values of alpha and beta deviation generate robust portfolio policies. On the other hand, for higher values of λ_{3}, the in-sample performance is generally better than the out-of-sample performance. These facts confirm the intuition that finding a profitable portfolio policy is harder than finding a good tracking-only portfolio policy.
Figs. 15 and 16 contain the in-sample and out-of-sample behavior of the correlation coefficient (ρ) and the standard deviation ratio. In the case of ρ, we can observe that out-of-sample performance improves as K increases, especially for low values of λ_{3}. Also, as λ_{3} increases, the in-sample and out-of-sample results present some significant but reasonable differences. Aggressive portfolio policies tend to lose out-of-sample performance in ρ. With respect to the standard deviation ratios, we observe that, in the out-of-sample period, the constraint on the standard deviations ratio (less than or equal to 1.05) is maintained with the exception of two cases corresponding to very aggressive allocations.
Figs. 17 and 18 exhibit the in-sample and out-of-sample performance of the annualized average return and the probability of beating the index. In general, both plots show the same pattern. We can clearly observe that the robustness in these variables is relatively low compared with the robustness of the tracking variables (correlation and standard deviation ratio). However, an important aspect is that the in-sample trend is captured in the out-of-sample period for high values of λ_{3} (more than 2). Fig. 19 is related to the beta of the tracking portfolios. We generally observe similar in-sample and out-of-sample performance of β for low values of λ_{3}. For other values of λ_{3}, we observe a reduction of β to a level of approximately 0.95. This reduction is expected since the values of ρ have been reduced while maintaining a very stable ratio of standard deviations.
In general, in the out-of-sample period, stocks with high market capitalization performed well. This had a positive effect on policies that relied mostly on large values of y^{mkt} and low values of the other characteristics; in general, this occurs for low values of λ_{3}. This excessive dependence can be avoided by selecting λ_{3} to produce portfolios that are more diversified in terms of characteristics. Moreover, notice that the strong performance of high market capitalization stocks mostly affected the excess returns, the probability of beating the index, and alpha, i.e., the outperformance variables. The correlation coefficient, ratio of standard deviations, and beta were robust out of sample.
Recall that in this application, we are implicitly considering daily rebalancing to the "new" optimal portfolio weights. The in-sample and out-of-sample average turnover is shown in Fig. 20. Both turnovers are similar, but the turnover of the in-sample period tends to be smaller than that of the out-of-sample period, especially for high values of λ_{3}. The turnover increases with λ_{3} since more aggressive portfolio policies generally involve high turnover. For small values of λ_{3}, our tracking portfolio is mostly composed of large-cap stocks. Since market capitalization is a stable characteristic, its behavior in the cross-section is generally maintained through time. Therefore, the turnover of such portfolio policies tends to be low. In contrast, alpha and beta deviation are less stable in time and consequently in the cross-section, producing high levels of turnover. In general, low levels of turnover correspond to including stocks with stable characteristics over time.
7.4. Transaction costs
In this section, we implement the parametric approach including transaction costs. We only consider the full rebalancing strategy described in Section 5.2. We assume equal transaction costs for all assets and dates, i.e., δ_{i,t} = δ for all i and t. Considering the parameters for transaction costs in (Brandt et al., 2009), (Canakgoz and Beasley, 2008) and (Gaivoronski et al., 2005), we consider two cases: δ = 0.1% and δ = 0.2% (notice that the case without transaction costs corresponds to δ = 0.0%). First, we study the in-sample behavior of the optimal characteristics and the other variables as a function of δ; then, we analyze and compare the in-sample and out-of-sample performance of the tracking portfolios. Since the results obtained are similar for the values of K considered in the previous section, we restrict the results to the case of K = 50. The numerical results are contained in Tables 5 and 6.
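To give a sense of how a uniform cost δ interacts with turnover, the sketch below charges δ on the sum of absolute daily changes in portfolio weights and subtracts it from the gross portfolio return. The turnover definition and the full rebalancing strategy used in the paper are those of Section 5.2, which is not reproduced here, so this proportional-cost form, the function names and the numbers are assumptions for illustration.

```python
import numpy as np

def daily_turnover(weights):
    """Sum of absolute weight changes between consecutive days.

    weights: array of shape (T, N) with the target weights of each day.
    Returns an array of shape (T - 1,).
    """
    return np.abs(np.diff(weights, axis=0)).sum(axis=1)

def net_returns(weights, asset_returns, delta):
    """Gross portfolio return minus a proportional cost delta on turnover (assumed form)."""
    gross = (weights[1:] * asset_returns[1:]).sum(axis=1)
    return gross - delta * daily_turnover(weights)

# Hypothetical two-asset example over three days.
weights = np.array([[0.5, 0.5],
                    [0.6, 0.4],
                    [0.6, 0.4]])
asset_returns = np.array([[0.00, 0.00],
                          [0.01, 0.02],
                          [0.00, 0.01]])
turnover = daily_turnover(weights)                     # 0.2 on day 1, 0.0 on day 2
net = net_returns(weights, asset_returns, delta=0.002) # delta = 0.2%
```

Under this form a policy whose weights barely move pays almost nothing, which is consistent with the advantage of stable characteristics such as market capitalization when δ is positive.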
7.4.1. In-sample performance with transaction costs
Fig. 21 is analogous to Fig. 7 but for different values of δ (in particular δ = 0.1% and δ = 0.2%). We observe a similar pattern in all the plots, with the difference that the λ_{3} for which θ_{α} and θ_{β} start to become more significant (values generally greater than 1) is an increasing function of δ. Therefore, the presence of transaction costs makes the strategy of holding high market capitalization stocks more dominant since that policy generates low turnover. To clarify this point further, Fig. 22 shows the optimal values of θ_{mkt}, θ_{α} and θ_{β} obtained with different values of δ as a function of λ_{3}. We can observe from the left plot that, in general, as δ increases, θ_{mkt} also increases. From the central plot, we can observe a similar behavior for θ_{α} (increasing as λ_{3} increases), but the aggressiveness of the positions in θ_{α} decreases in δ. In the case of θ_{β}, we again observe similar patterns, with the difference that for moderate values of λ_{3} the strategies become more dependent on stocks with positive θ_{β} (selecting stocks with beta moderately different from 1) to achieve a certain level of return due to the lower turnover generated by θ_{β} (compared with θ_{α}). For high values of λ_{3} and as δ increases, the strategy becomes less dependent on including stocks with beta close to 1 (small θ_{β}) since the position in θ_{α} is not as aggressive as in the case of zero transaction costs.
Fig. 23 shows the in-sample behavior of the objective, ρ, and the average annualized return as a function of λ_{3} and δ. The main observation is that for δ = 0.2% the maximization of the objective is achieved mostly by the maximization of ρ, which is considerably larger (for λ_{3} greater than 1) than in the cases of δ = 0.0% and δ = 0.1%. Fig. 24 presents the same information but for the ratio of standard deviations, beta, and average turnover. We observe that the ratio is consistently greater than one, but the beta of the portfolio moves approximately between 0.985 and 1.02. Notice that for low values of λ_{3}, the portfolio beta for δ = 0.2% fluctuates around one due to the stability of the ratio of standard deviations. In general, the riskiness of the portfolios is compensated by the decrease in ρ to obtain portfolios with beta close to one. We can also observe that, because of transaction costs, the portfolios incur less turnover. The average turnover for δ = 0.2% (especially for medium to large values of λ_{3}) is significantly lower than that of δ = 0.0% and δ = 0.1%.
Finally, notice that the portfolios with close to zero excess return over the index were obtained for λ_{3} between 1.25 and 1.375 for δ = 0.1% and between 1.375 and 1.5 for δ = 0.2%. For the case of zero transaction costs, the range of λ_{3} was between 0.75 and 1. Higher transaction costs thus require giving more weight to outperformance in the objective to find a tracking portfolio with excess return close to zero.
7.4.2. Out-of-sample performance with transaction costs
In-sample and out-of-sample profiles of the optimal objective function value for δ = 0.1% and δ = 0.2% are relatively different, as shown in Fig. 25. This difference can be explained by the performance of the correlation coefficient and the excess return over the index, which can be seen in Figs. 26 and 27, respectively. Recall that in the case of no transaction costs the tracking component was more robust than the enhanced component. In the presence of transaction costs, we observe the same behavior; therefore, the good out-of-sample performance for δ = 0.2% is justified mostly by its reliance on the correlation coefficient (having greater values of ρ compared with the other cases). In the case of the excess returns, good performance for low values of λ_{3} can be explained by the excellent returns obtained by the stocks with higher market capitalization during the testing period, which cannot be generalized to other periods.
From Tables 5 and 6, we can observe that out-of-sample values for the ratio of standard deviations are well below the 1.05 limit. Consequently, the variance of the tracking portfolios (relative to that of the index) is preserved during the testing period even in the case of aggressive allocations in θ_{α}. Moreover, Fig. 28 shows that the in-sample and out-of-sample betas remain in the range 0.96–1.02. Notice that the out-of-sample tendency is to produce values of beta slightly smaller than one due to the presence of stocks with higher y^{mkt}. The turnover of the portfolios shows almost no difference between the in-sample and out-of-sample periods, as shown in Fig. 29; this is important since turnover is closely related to the stability of the characteristics over time. Finally, the average maximum and minimum portfolio weights remain stable, avoiding increasingly aggressive allocations as λ_{3} increases.
8. Conclusions and final remarks
We have developed a parametric approach for index tracking and enhanced indexation that has advantages over existing models. First, the parametric model optimizes over stock characteristics (which can be seen as strategies) and not over portfolio weights. The portfolio weights are the result of the chosen strategy, giving a qualitative idea of the portfolio composition. Second, this approach reduces the dimensionality of the optimization problem compared with mixed-integer programming methods. Nonetheless, a low-dimensional, unconstrained nonlinear optimization problem needs to be solved. Third, the proposed objective function summarizes typical objective functions in the index tracking and enhanced indexation literature.
By maximizing an objective including the correlation coefficient (between the portfolio and the index) minus the ratio of standard deviations (portfolio with respect to the index) plus an outperformance measure (typically the excess return of the portfolio over the index), we can control the importance given to the tracking objective (correlation plus standard deviation) and the enhanced objective. By design, the tracking portfolios under the parametric approach try to achieve a beta of one with respect to the index and to minimize the variance of the tracking error simultaneously. Other models consider only one of these variables, leaving the other free. Also, our objective correctly separates the tracking objective from the enhanced component, giving more freedom to the modeler to decide on particular weights given to each part of the objective.
The selection of appropriate stock characteristics in the parametric model is a delicate issue. We have proposed the use of similarity measures of the stocks with the index as characteristics, plus others that explain the cross section of asset returns. Although we did not evaluate all possible characteristics, market capitalization, alpha and beta produced in-sample and out-of-sample results that are in line with economic intuition. Holding stocks with high market capitalization resulted in portfolios with a high correlation coefficient with an index that is cap-weighted, as is the S&P500. Alpha, on the other hand, was used to obtain higher levels of excess returns over the index, while stocks with beta close to one were useful to keep the ratio of standard deviations close to one.
The out-of-sample performance showed that the correlation coefficient and the ratio of standard deviations have a robust behavior, i.e., the in-sample and out-of-sample performances were very close to each other. However, the outperformance part was less robust. This fact is expected since many empirical studies have shown that forecasting the correlation and standard deviations is more accurate than forecasting expected returns. However, we were able to capture some out-of-sample momentum with the inclusion of alpha as a characteristic. Additionally, our results showed that the only way to extract more excess returns while keeping the standard deviations close to each other is by a reduction in the correlation coefficient with the index. Also, more aggressive strategies (in the sense of trying to achieve greater excess returns over the index) involve higher levels of turnover since the cross-sectional behavior of the characteristics that contribute to the outperformance component is more volatile than that of the characteristics involved in tracking only.
The main problem of our approach (with the set of characteristics and the particular example we have chosen) is that the tracking-only strategies rely very strongly on positions in stocks with high market capitalization (relative to the others in the index). Therefore, the portfolio is exposed to the performance of this particular group of stocks. In some cases, that strategy can be profitable (as in our case) and in other cases not. It is then recommended to choose strategies that mix different types of assets to avoid excessive exposure to a particular sector or group of stocks. This can be controlled by the trade-off parameters in the objective function or by including more characteristics.
Finally, we mention additional research that can be done with respect to the parametric model for index tracking and enhanced indexation. Examples include the implementation of the more sophisticated transaction cost policies described in the paper; the analysis of the performance of the parametric model under other choices of characteristics, trading frequencies, time horizons and indices (e.g., it could be of interest to see whether market capitalization continues to be important for tracking other types of indices that are not market-cap weighted); the use of multiple disjoint in-sample (calibration) and out-of-sample periods; and an empirical comparison of the parametric approach with other widely recognized models in the literature.
Acknowledgment
We thank Jeremy Staum for his valuable comments and suggestions.
REFERENCES
Alexander, C. (1999). Optimal hedging using cointegration. Philosophical Transactions of The Royal Society of London Series A – Mathematical Physical and Engineering Sciences, 357, 2039–2058.
Alexander, C., & Dimitriu, A. (2005). Index and statistical arbitrage: Tracking error or cointegration? Journal of Portfolio Management, 31, 50–63.
Beasley, J. E., Meade, N., & Chang, T. J. (2003). An evolutionary heuristic for the index tracking problem. European Journal of Operational Research, 148(3), 621–643.
Bienstock, D. (1996). Computational study of a family of mixed-integer quadratic programming models. Mathematical Programming, 74, 121–140.
Brandt, M. W., Santa-Clara, P., & Valkanov, R. (2009). Parametric portfolio choices: Exploiting characteristics in the cross section of equity returns. Review of Financial Studies, 22(9), 3411–3447.
Canakgoz, N., & Beasley, J. E. (2008). Mixed-integer programming approaches for index tracking and enhanced indexation. European Journal of Operational Research, 196, 384–399.
Corielli, F., & Marcellino, M. (2006). Factor based index tracking. Journal of Banking and Finance, 30, 2215–2233.
Cornuejols, G., & Tütüncü, R. (2007). Optimization methods in finance. Cambridge University Press.
DeMiguel, V., Garlappi, L., & Uppal, R. (2009). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy? Review of Financial Studies, 22, 1915–1953.
Derigs, U., & Nickel, N. (2003). Metaheuristic based decision support for portfolio optimization with a case study on tracking error minimization in passive portfolio management. OR Spectrum, 25, 345–378.
Focardi, S., & Fabozzi, F. (2004). A methodology for index tracking based on time-series clustering. Quantitative Finance, 4, 417–425.
Gaivoronski, A., Krylov, S., & van der Wijst, N. (2005). Optimal portfolio selection and dynamic benchmark tracking. European Journal of Operational Research, 163, 115–131.
Leland, H. (2000). Optimal portfolio implementation with transaction costs and capital gain taxes. In Working paper. Berkeley: University of California.
Oh, K., Kim, T., & Min, S. (2005). Using genetic algorithm to support portfolio optimization for index fund management. Expert Systems with Applications, 28, 371–379.
Rudd, A. (1980). Optimal selection of passive portfolios. Financial Management, 57–66.
Sharpe, W. (1988). Determining a fund’s effective asset mix. Investment Management Review, 59–69.
Sharpe, W. (1992). Asset allocation: Management style and performance measurement. Journal of Portfolio Management, 7–19.
Stoyan, A., & Kwon, B. (2007). A two-stage stochastic mixed-integer programming approach to the index tracking problem. In Working Paper. University of Toronto.
Yao, D., Zhang, S., & Zhou, X. (2006). Tracking a financial benchmark using a few assets. Operations Research, 54, 232–246.

During our discussion, we frequently use the term tracking portfolio to refer, in general, to the solutions of the optimization problems that will be described in Section 3.

In a more general case, an index can be tracked not only by taking positions in its components, but also by using other stocks outside the index and other assets such as commodities. However, the latter case complicates the definition of the relevant characteristics.

From now on, we will use the generic name "tracking portfolio" to identify the investor's portfolio.

We will formulate the objectives in terms of conditional expectations; however, under independence and stationarity assumptions, we can think of them as unconditional expectations and then as sample counterparts over a calibration period. More detail will be given later in this section.

In the case of the stocks, they correspond to the holding period return on CRSP (including cash and price adjustments).
Received 29 November 2013; Accepted 26 March 2014
Corresponding author. l.chavezbedoya@gmail.com