Matlab Documentation for Very Simple Markov-Perfect Industry Dynamics: Empirics

December 2018

This software package lists and describes the programs used in Abbring et al. (2017).

 ✶ We thank Emanuel Marcu for excellent research assistance and our students for comments. The research of Jaap Abbring is financially supported by the Netherlands Organisation for Scientific Research (NWO) through Vici grant 453-11-002. § CentER, Department of Econometrics & OR, Tilburg University. E-mail: jaap@abbring.org. † Economic Research, Federal Reserve Bank of Chicago, and CentER, Tilburg University. E-mail: jcampbell@frbchi.org. ‡ QuantCo, Inc. E-mail: jan.tilly@quantco.com. ◊ Business School, National University of Singapore. E-mail: yangnan@nus.edu.sg.

In this software package, we present Matlab programs that implement the estimation algorithm introduced in Abbring et al. (2017) on simulated data. This software package is intended to serve as a platform for experimentation and teaching. In the paper, we use C++ code to obtain estimation results for our empirical application on movie theaters and to perform a large number of simulations.

This Matlab package can be downloaded from here. The code can be executed in Matlab by running the script example.m.

This documentation is structured as follows. First, we briefly review the model presented in Abbring et al. (2017), which is a special case of the model presented in Abbring et al. (2018). Second, we introduce the algorithm that computes the equilibrium and show how to compute the model's likelihood function. Third, we discuss all the necessary ingredients to generate data from the model. Fourth, we put all of the above together in the script example.m, where we create a synthetic sample and estimate the underlying primitives.

### 1 Model

Consider a local market. Time is discrete and indexed by $t\in\mathbb{N}\equiv\{1,2,\ldots\}$. In period $t$, firms that have entered in the past and not yet exited serve the market. Each firm has a name $f\in{\cal F}\equiv{\cal F}_0 \cup\left(\mathbb{N}\times\{1,2,\ldots,\check{\jmath}\}\right).$ Initial incumbents have distinct names in ${\cal F}_0$, while potential entrants' names are from $\mathbb{N}\times\{1,2,\ldots,\check{\jmath}\}$. The first component of a potential entrant's name gives the period in which it has its only opportunity to enter the market, and the second component gives its position in that period's queue of $\check{\jmath} \lt \infty$ firms. Aside from the timing of their entry opportunities, the firms are identical.

We divide each period into two subperiods, the entry and survival stages. Period $t$ begins with the entry stage. If $t=1$, nature sets the number $N_1$ of firms serving the market in period $1$ and the initial demand state $C_1$. If $t \gt 1$, these are inherited from the previous period. We assume that $C_t$ follows a first-order Markov process and denote its support with $\cal C$. Throughout the paper, we refer to $C_t$ as “demand,” but it can encompass any observed, relevant, and time-varying characteristics of the market, depending on the empirical context. In our empirical application, $C_t$ is the local market's residential population.

Each incumbent firm serves the market and earns a surplus $\pi(N_t,C_t)$. We assume that

• $\exists \check{\pi} \lt \infty$ such that $\forall (n,c)\in\mathbb{N}\times\cal C$, $\mathbb{E}[\pi(n,C^\prime)|C=c]\leq\check{\pi}$;
• $\exists \check{n}\in\mathbb{N}$ such that $\forall n \gt \check{n}$ and $\forall c\in\cal C$, $\pi(n,c) = 0$; and
• $\forall (n,c)\in\mathbb{N}\times\cal C$, $\pi(n,c) \geq \pi(n+1,c)$.

Here and throughout, we denote the next period's value of a generic variable $Z$ with $Z^\prime$, random variables with capital Roman letters, and their realizations with the corresponding small Roman letters. The first assumption is technical and allows us to restrict equilibrium values to the space of bounded functions. The second assumption allows us to restrict equilibrium analysis to markets with at most $\check{n}$ firms. It is not restrictive in empirical applications to oligopolies. The third assumption requires that the addition of a competitor weakly reduces each incumbent's surplus.

After incumbents earn their surpluses, nature draws the current period's shock to continuation and entry costs, $W_t$, from a distribution $G_W$ with positive density everywhere on the real line. Then, period $t$'s cohort of potential entrants $\{t\}\times\{1,2,\ldots,\check{\jmath}\}$ make entry decisions in the order of the second component of their names. We denote firm $f$'s entry decision with $a^f_E\in\left\{0,1\right\}$. An entrant ($a^f_E=1$) pays the sunk cost $\varphi \exp(W_{t})$, with $\varphi \gt 0$. A firm choosing not to enter ($a^f_E=0$) earns a payoff of zero and never has another entry opportunity. Such a refusal to enter also ends the entry stage, so firms remaining in this period's entry cohort that have not yet had an opportunity to enter never get to do so. Since the next firm in line faces exactly the same choice as did the firm that refused to enter, it will make the same refusal decision in any symmetric equilibrium. Since every period has at least one firm refusing an available entry opportunity, the model is one of free entry.

We denote the number of firms in the market after the entry stage, the sum of the incumbents and the actual entrants, with $N_{E,t}$. Suppose that the names of these active firms are $f_1,\ldots,f_{N_{E,t}}$. In the subsequent survival stage, they simultaneously decide on continuation with probabilities $a_S^{f_1},\ldots, a_S^{f_{N_{E,t}}}\in[0,1]$. After these decisions, all survival outcomes are realized independently across firms according to the chosen Bernoulli distributions. Firms that survive pay a fixed cost $\exp(W_t)$. A firm can avoid this cost by exiting to earn zero. Firms that have exited cannot reenter the market later. The $N_{t+1}$ surviving firms continue to the next period, $t+1$. The period ends with nature drawing a new demand state $C_{t+1}$ from the conditional distribution $G_C(\cdot\; | \;C_t)$. All firms discount future profits and costs with the discount factor $\rho\in[0,1)$.

We assume that, for each market, the data contain information on $N_t$, $C_t$, and possibly some time-invariant market characteristics $X$ that shift the market's primitives. The market-level cost shocks $W_t$ are not observed by the econometrician and serve as the model's structural econometric errors. Because they are observed by all firms and affect their payoffs from entry and survival, they make the relation between the observed demand state $C_t$ and the market structure $N_t$ statistically nondegenerate.

The assumptions on $\{C_t, W_t\}$ make it a first-order Markov process satisfying a conditional independence assumption. This ensures that the distribution of $(N_{t},C_{t})$ conditional on $(N_{t^\star},C_{t^\star})$ for all ${t^\star} \lt t$ depends only on $(N_{t-1},C_{t-1})$, so we require only the model's transition rules to calculate the conditional likelihood function.

#### 11Value Functions and Entry Rules

We begin by implementing the algorithm that computes the equilibrium value functions, as presented in the paper. The post-survival value function $v_S(n,c)$ is computed recursively by iterating on a sequence of Bellman equations.

Recall the definitions of the entry thresholds in the paper,

(1)

$\overline w_{E}(n,c) = \log v_{S}\left(n, c\right) - \log\left(1 + \varphi\right).$
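To see where this threshold comes from (a sketch; the paper gives the full argument): a firm that enters to become the $n$th active firm pays the entry cost $\varphi\exp(w)$ and, if it then survives for sure, the continuation cost $\exp(w)$, in exchange for the post-survival value $v_S(n,c)$. Entry is therefore profitable if and only if

$-\varphi \exp(w) - \exp(w) + v_S(n, c) \geq 0 \quad\Longleftrightarrow\quad w \leq \log v_S(n, c) - \log(1 + \varphi) = \overline w_{E}(n, c).$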

The post-survival value function is given by

(2)

$\begin{split} v_S(n, c) = \rho \mathbb E\Big[ \pi(n, C') &+ \int_{\overline{w}_E(n + 1, C')}^{\log v_S(n, C')} \left(- \exp(w) + v_S(n, C')\right) d G_W(w) \\ &+ \sum_{n' = n+1}^{\check n}\int_{\overline{w}_E(n' + 1, C')}^{\overline{w}_E(n', C')} \left(- \exp(w) + v_S(n', C')\right) d G_W(w) \,\Big|\, C=c\Big]. \end{split} \label{vS_2}$

The above is the key equation that we will use to numerically compute the equilibrium. First, we consolidate the econometric error and obtain

(3)

$\begin{split} v_S(n, c) = \rho \mathbb E\Big[ \pi(n, C') &+ v_S(n, C') \int_{\overline{w}_E(n + 1, C')}^{\log v_S(n, C')} d G_W(w) + \sum_{n' = n+1}^{\check n} v_S(n', C')\int_{\overline{w}_E(n' + 1, C')}^{\overline{w}_E(n', C')} d G_W(w) \\ &- \int_{-\infty}^{\log v_S(n, C')} \exp(w)\, d G_W(w) \,\Big|\, C=c\Big]. \end{split} \label{vS_3}$

Second, we invoke the distributional assumption on $W$,

(4)

$W \sim N(-\frac{1}{2}\omega^2,\omega^2),$

which gives us a closed-form solution for the partial expectation,

(5)

$\int_{-\infty}^{\log v_S(n, C')} \exp(w) d G_W(w) = \left[1 - \Phi\left(\frac{ \frac{1}{2}\omega^2 - \log v_S(n,C')}{\omega}\right)\right] \label{partialExpectation}$

where $\Phi(\cdot)$ refers to the standard normal cumulative distribution function. The remaining two integrals in equation (3) can be expressed using the cumulative distribution function of $W$:

(6)

$\int_{\overline{w}_E(n + 1, C')}^{\log v_S(n, C')} d G_W(w) = G_W\left[\log v_S(n,C')\right] - G_W\left[\log v_S(n+1,C') - \log (1 + \varphi)\right] \label{pSureSurvivalNoEntry}$

(7)

$\int_{\overline{w}_E(n' + 1, C')}^{\overline{w}_E(n', C')} d G_W(w) = G_W\left[\log v_S(n',C')-\log(1 + \varphi)\right] - G_W\left[\log v_S(n'+1,C') - \log(1 + \varphi)\right] \label{pEntrySet}$
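As a quick sanity check on the closed form in equation (5), one can compare it against brute-force Monte Carlo integration. The following sketch (in Python, outside the Matlab package; the function names are our own) does this for illustrative values of $v_S$ and $\omega$:

```python
import math
import random

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def partial_expectation(v, omega):
    """Closed form for E[exp(W) 1{W < log v}] with W ~ N(-omega^2/2, omega^2),
    as in equation (5)."""
    return 1.0 - phi((0.5 * omega ** 2 - math.log(v)) / omega)

def partial_expectation_mc(v, omega, draws=200_000, seed=1):
    """Monte Carlo version of the same partial expectation."""
    rng = random.Random(seed)
    mu = -0.5 * omega ** 2
    total = 0.0
    for _ in range(draws):
        w = rng.gauss(mu, omega)
        if w < math.log(v):
            total += math.exp(w)
    return total / draws
```

With, say, $v_S = 2$ and $\omega = 1$, the two computations agree up to Monte Carlo error.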

We now implement the value function iteration on equation (3) in Matlab.

The function valueFunctionIteration requires the arguments Settings and Param. It returns the following four matrices:

• vS is a matrix of dimension $(\check n + 1) \times \check c$ that stores the equilibrium post-survival value functions. By definition, the last row consists of zeros (and exists mostly for computational convenience).
• pEntry is a matrix of dimension $(\check n + 1) \times \check c$ that stores the equilibrium entry probabilities: element $(n, c)$ contains the probability that at least $n$ firms are active post-entry in demand state $c$. Again, the last row consists of zeros. Thus, the $n^{\text{th}}$ row of pEntry stores $\Pr\left(w \lt \overline w_E(n, :)\right)$.
• pEntrySet is a matrix of dimension $(\check n + 1) \times \check c$ that stores the equilibrium probabilities that exactly $n$ firms are active post-entry in demand state $c$. Thus, the $n^{\text{th}}$ row of pEntrySet stores $\Pr\left(\overline w_E(n + 1, :) \leq w \lt \overline w_E(n, :)\right)$.
• pStay is a matrix of dimension $\check n \times \check c$ that stores the equilibrium probabilities that $n$ active firms find certain survival profitable in demand state $c$. Thus, the $n^{\text{th}}$ row of pStay stores $\Pr\left(w \lt \overline w_S(n, :)\right)$.

The goal is to construct the $(\check n + 1) \times \check c$ matrix vS, in which the last row is set to zero by construction. Each column in vS constitutes a fixed point to the Bellman equation characterized by equation (3). The procedure follows a backward recursion on the number of firms, from $\check n$ back to 1. For $n = \check n$, entry cannot occur, so the Bellman equation has only $v_S(\check n, c)$ on both sides. For $n \lt \check n$, the Bellman equation depends on $v_S(n, \cdot)$ itself and on $v_S(n', \cdot)$ for $n' \gt n$; the latter values are already in memory because we iterate backwards.

```matlab
function [vS, pEntry, pEntrySet, pStay] = ...
    valueFunctionIteration(Settings, Param)
```

Allocate and initialize the post-survival value functions, the entry probabilities, and the probability of sure survival.

```matlab
vS = zeros(Settings.nCheck + 1, Settings.cCheck);
pEntry = zeros(Settings.nCheck + 1, Settings.cCheck);
pEntrySet = zeros(Settings.nCheck + 1, Settings.cCheck);
pStay = zeros(Settings.nCheck, Settings.cCheck);
```

Now we begin the backward recursion, starting from nCheck. We will iterate on vS(n, :) using equation (3), which we map into the relevant Matlab variables below:

(8)

$\begin{split} v_S(n,c) &= \rho \mathbb E\bigg[\; & \overbrace{ \pi(n,C')}^{\textbf{flowSurplus}} - \overbrace{\left[1 - \Phi\left(\frac{\frac{1}{2}\omega^2 - \log v_S(n,C')}{\omega}\right)\right]}^{\textbf{partialExp}} \\ & + & \overbrace{v_S(n, C')\left(G_W\left[\log v_S(n,C')\right] - G_W\left[\log v_S(n + 1,C') - \log(1+\varphi)\right]\right)}^{\textbf{valueSureSurvNoEntry}} \\ & + & \overbrace{\sum_{n'=n + 1}^{\check n} v_S(n', C')\left(G_W\left[\log v_S(n',C')-\log(1+\varphi)\right] - G_W\left[\log v_S(n' + 1,C') - \log(1 + \varphi)\right]\right)}^{\textbf{valueAdditionalEntry}} \bigg| C=c\bigg]. \end{split}$

The expectation operator with respect to $C'$ is implemented by left-multiplying the inside of the expectation in the equation above (a column vector with cCheck elements) by the transition probability matrix Param.demand.transMat. The iteration is complete when the difference vSdiff between vS(n, :) and its update vSnPrime does not exceed the stopping criterion, Settings.tolInner. Start by initializing vSdiff to 1 (which exceeds Settings.tolInner). We pre-compute $\omega ^ 2$ and the demand grid (transposed) at the beginning, so we do not have to do so repeatedly inside the loops below.
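The left-multiplication implements $\mathbb E[\,\cdot\,|C=c]$ because row $i$ of the transition matrix holds $\Pr(C'=c_{[j]}\,|\,C=c_{[i]})$. A minimal illustration with a hypothetical two-state chain (Python, for exposition only):

```python
# hypothetical two-state transition matrix: trans[i][j] = Pr(C' = j | C = i)
trans = [[0.9, 0.1], [0.2, 0.8]]
inside = [5.0, 7.0]  # the term inside the expectation, evaluated at each C'

# conditional expectation as a matrix-vector product
expect = [sum(trans[i][j] * inside[j] for j in range(2)) for i in range(2)]
```

Here expect[i] is $\mathbb E[\,\text{inside}(C') \mid C = c_{[i]}]$, roughly 5.2 and 6.6 for the two states.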

```matlab
thetaW2 = Param.omega ^ 2;
gridTrans = exp(Settings.logGrid)';

for n = Settings.nCheck:-1:1

    % initialize the iteration counter and the criterion value
    iter = 0;
    vSdiff = 1;

    % take the nth row out of vS, transpose it, and iterate on this column vector
    vSn = vS(n, :)';

    % pre-compute flow surplus so we don't have to do so repeatedly inside the while loop
    flowSurplus = gridTrans * Param.k(n) / n;

    % pre-compute value from additional entry, because this is known
    valueAdditionalEntry = ...
        sum(pEntrySet((n + 1):end, :) .* vS((n + 1):end, :))';

    % get row (n+1) out of pEntry and store it as column
    pEntrynPlus1 = pEntry(n + 1, :)';

    % iterate until convergence
    while (vSdiff > Settings.tolInner && iter < Settings.maxIter)

        % increment iteration counter
        iter = iter + 1;

        % only compute the log once
        logvSn = log(vSn);

        % compute the probability that incumbents survive with prob 1
        pStayn = normcdf(logvSn, -0.5 * thetaW2, Param.omega);

        % compute partial expectation of cost shock
        partialExp = 1 - normcdf(0.5 * thetaW2 - logvSn, 0, Param.omega);

        % assemble value function update
        valueSureSurvNoEntry = (pStayn - pEntrynPlus1) .* vSn;
        vSnPrime = (Param.rho * Param.demand.transMat * (flowSurplus ...
            - partialExp + valueSureSurvNoEntry + valueAdditionalEntry));
        vSdiff = max(abs(vSn - vSnPrime));

        % update value function
        vSn = vSnPrime;
    end

    if (iter == Settings.maxIter)
        error('value function iteration failed');
    end

    % store equilibrium objects
    vS(n, :) = vSn;
    pStay(n, :) = pStayn;
    pEntry(n, :) = normcdf(logvSn - log((1 + Param.phi)), -0.5 * thetaW2, Param.omega);
    pEntrySet(n, :) = pEntry(n, :) - pEntry(n + 1, :);

end
```

Note that the entry probabilities need to be computed only once, after the value function iteration has converged. We use the variable iter to keep track of the number of iterations for each value function iteration. Whenever iter reaches Settings.maxIter, the value function iteration is terminated and an error is thrown. This concludes valueFunctionIteration.

```matlab
end
```
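To see the recursion in isolation, the following stripped-down Python sketch solves the monopoly case $\check n = 1$ (so the entry terms drop out) on a made-up two-point demand grid. All parameter values are hypothetical and chosen for illustration only:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# hypothetical primitives: two demand states with pi(1, c) = c,
# W ~ N(-omega^2/2, omega^2), and discount factor rho
grid = [1.0, 2.0]
trans = [[0.8, 0.2], [0.3, 0.7]]   # trans[i][j] = Pr(C' = j | C = i)
rho, omega = 0.9, 1.0

def g_w(x):
    """CDF of W."""
    return phi((x + 0.5 * omega ** 2) / omega)

v = [1.0, 1.0]                      # v_S(1, c) on the two grid points
for _ in range(1000):
    inside = []
    for j in range(2):
        logv = math.log(v[j])
        partial_exp = 1.0 - phi((0.5 * omega ** 2 - logv) / omega)
        # with nCheck = 1 there is no further entry, so only the
        # flow surplus, sure-survival, and partial-expectation terms remain
        inside.append(grid[j] + v[j] * g_w(logv) - partial_exp)
    v_new = [rho * sum(trans[i][j] * inside[j] for j in range(2))
             for i in range(2)]
    if max(abs(a - b) for a, b in zip(v, v_new)) < 1e-12:
        v = v_new
        break
    v = v_new
```

The iteration converges because the Bellman operator is a contraction, and the resulting value is higher in the high-demand state.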

#### 1.2 Survival Rules

Our equilibrium computation algorithm valueFunctionIteration is fast because we do not need to completely characterize firms' survival strategies to compute the equilibrium value functions. We know that when firms mix between staying and exiting, they must receive an expected continuation value of zero. Firms use mixed strategies whenever the cost shock falls into the interval

(9)

$\overline w_S(n, c) \leq w \lt \overline w_S(1, c),$

which means that it is not profitable for all $n$ active firms to continue, but the market is profitable enough for at least one firm to continue.

What we have not computed thus far is how firms mix between continuation and exit when the cost shock falls into this interval. The mixing probabilities $a_S$ are implicitly defined by the indifference condition

(10)

$\label{eq:indifference1} \sum_{n'=1}^{n} {n - 1 \choose n' - 1} a_S^{n' - 1}\left(1-a_S\right)^{n-n'}\left(- \exp(w)+v_{S}(n',c)\right)=0.$

The indifference condition states that when all $n$ active firms use the survival rule $a_S$, each firm's expected payoff from continuing equals zero.

We compute the solution to (10) in mixingProbabilities.

This function computes the mixing probability for a market with N firms, demand state C, and cost shock W, where the post-survival value function is given by vS. The inputs N, C, and W are all of the same dimension.

The function returns the mixing probabilities aS which is of the same dimension as the inputs N, C, and W.

```matlab
function aS = mixingProbabilities(N, C, W, vS)
```

The function will solve (10) for $a_S$ using Matlab's roots function. For the roots function to work, we need to transform (10) into polynomial form, which is given by

(11)

$\sum_{n'=0}^{n_E-1} \underbrace{\left[\sum_{i=0}^{n'} \underbrace{ \underbrace{(-1)^{n'-i}}_{\textbf{signCoef}} \underbrace{\frac{(n_E-1)!}{i!(n_E-1-n')!(n'-i)!}}_{\textbf{nCk}} \underbrace{\left(-\exp(w) + v_S(i + 1,c) \right)}_{\textbf{continuationValue}}}_{\textbf{matCoef}}\right]}_{\textbf{vecCoef}} a_S^{n'} =0,$

where the relevant Matlab variables are marked in bold font.

We allocate a vector of NaNs for the survival strategies, which will subsequently be filled, and then loop over each element in N.

```matlab
aS = NaN(size(N));

for iX = 1:length(N)
```

Store the post-entry number of active firms in a scalar nE. Allocate the matrix matCoef to store the coefficients of the polynomial above. Then assemble the coefficients by looping from nE-1 to 0.

```matlab
    nE = N(iX);
    matCoef = zeros(nE);
    for jX = (nE - 1):-1:0
        signCoef = (-1) .^ (jX - (0:jX));
        nCk = factorial(nE - 1) / factorial(nE - 1 - jX) ...
            ./ (factorial(0:jX) .* factorial(jX - (0:jX)));
        continuationValue = (-exp(W(iX)) + vS(1:(jX + 1), C(iX)))';
        matCoef(nE - jX, 1:(jX + 1)) = signCoef .* nCk .* continuationValue;
    end

    vecCoef = sum(matCoef, 2);
```

We then compute the candidate values for the mixing probabilities using roots, and nullify ([]) all values that are smaller than 0 or larger than 1. When the only root is really close to 0 or to 1, Matlab may return a couple of undesired complex roots as a bonus. We nullify these candidates as well. Finally, we pick the remaining root (which we know exists and is unique).

```matlab
    mixprobcand = roots(vecCoef);
    mixprobcand(mixprobcand < 0 | mixprobcand > 1) = [];
    mixprobcand(real(mixprobcand) ~= mixprobcand) = [];
    if (length(mixprobcand) ~= 1)
        error('The number of roots between 0 and 1 is not equal to 1.');
    end
    aS(iX) = mixprobcand;

end
```

This concludes mixingProbabilities.

```matlab
end
```
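As an independent check on mixingProbabilities, one can solve the indifference condition (10) directly by bisection, since the expected payoff from continuing is decreasing in $a_S$. A Python sketch with hypothetical values (for $n_E = 2$ the root is also available in closed form, $a_S = (v_S(1,c) - \exp(w)) / (v_S(1,c) - v_S(2,c))$, which the bisection should reproduce):

```python
import math

def indifference(a, n, w, v):
    """LHS of the indifference condition (10); v[k-1] holds v_S(k, c)."""
    return sum(math.comb(n - 1, k - 1) * a ** (k - 1) * (1 - a) ** (n - k)
               * (-math.exp(w) + v[k - 1]) for k in range(1, n + 1))

def mixing_probability(n, w, v, tol=1e-12):
    """Bisect on [0, 1]; the expected payoff is decreasing in a."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if indifference(mid, n, w, v) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# hypothetical values: v_S(1, c) = 3, v_S(2, c) = 1, and a cost shock w
# inside the mixing region [log v_S(2, c), log v_S(1, c))
w, v = math.log(2.0), [3.0, 1.0]
a = mixing_probability(2, w, v)
```

Here the closed form gives $a_S = (3 - 2)/(3 - 1) = 0.5$.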

### 2 Likelihood

Before turning to the computation of the likelihood, we first review the likelihood construction described in the paper. Suppose we have data for $\check r$ markets. Each market is characterized by $\check t$ observations that include the demand state $C_{t,r}$ and the number of active firms $N_{t,r}$. We wish to estimate the parameter vector

(12)

$\theta \equiv (\theta_C, \theta_P, \theta_W) \equiv ( (\mu_C, \sigma_C), (k, \varphi), \omega).$

The likelihood contribution of a single market-level observation, i.e., a transition from $(c, n)$ to $(c', n')$ in market $r$, is given by

(13)

$\ell_{t+1,r}\left(\theta\right) = f_{N_{t+1,r},C_{t+1,r}}\left(n',c' | C_{t,r}=c,N_{t,r}=n;\theta_C,\theta_P,\theta_W\right)$

where $f_{N_{t+1,r},C_{t+1,r}}$ stands for the joint density of $(N_{t+1,r},C_{t+1,r})$ conditional on $(N_{t,r},C_{t,r})$. Notice that $C_{t+1,r}$ is drawn by nature according to $G_{C,r}(\cdot|C_{t,r})$ independently of $N_{t+1,r}$. Moreover, by the structure of the game, firms' decisions, which determine $N_{t+1,r}$, are made prior to the draw of $C_{t+1,r}$ and are therefore not affected by $C_{t+1,r}$. Hence, we can write the likelihood contribution as the product of the conditional densities:

(14)

$\ell_{t+1,r}\left(\theta\right) = f_{C_{t+1,r}}\left(c' | C_{t,r}=c;\theta_C\right) \cdot f_{N_{t+1,r}}\left(n' | C_{t,r}=c,N_{t,r}=n;\theta_C,\theta_P,\theta_W\right)$

where $f_{C_{t+1,r}}$ and $f_{N_{t+1,r}}$ denote the conditional densities. The expression for the conditional density of $N_{t+1,r}$ equals $p\left(N_{r,t+1}\;|\;N_{r,t},C_{r,t};\theta\right)$. That is,

(15)

$f_{N_{t+1,r}}\left(n' | C_{t,r}=c,N_{t,r}=n;\theta_C,\theta_P,\theta_W\right) = \Pr(N_{r,t+1}=n'|N_{r,t}=n,C_{r,t}=c;\theta) \equiv p\left(N_{r,t+1}\;|\;N_{r,t},C_{r,t};\theta\right)$

The conditional density of $N_{t+1,r}$ is the probability that market $r$ with $n$ firms in demand state $c$ has $n'$ firms next period. Also, the conditional density of the demand process is given in the paper by the function $g_C$. That is,

(16)

$f_{C_{t+1,r}}\left(c' | C_{t,r}=c;\theta_C\right) = g_C \left(C_{t+1,r} |C_{t,r}; \theta_C\right)$

The likelihood function is then defined as in the paper:

(17)

$\mathcal{L}\left(\theta\right)= \mathcal{L}_C\left( \theta_C\right) \cdot \mathcal{L}_N\left(\theta\right)$

where

(18)

$\mathcal{L}_C\left(\theta_C\right) = \prod_{r=1}^{\check r} \prod_{t=1}^{\check t-1} g_C\left(C_{t+1,r} |C_{t,r}; \theta_C \right)$

(19)

$\mathcal{L}_N\left(\theta\right) = \prod_{r=1}^{\check r} \prod_{t=1}^{\check t-1} p\left(N_{r,t+1}\; | \;N_{r,t},C_{r,t};\theta \right)$

$\mathcal{L}_C\left(\theta_C\right)$ can be calculated easily from demand data alone, with no need to solve the model, as $g_C\left(C_{t+1,r} |C_{t,r};\theta_C\right)$ translates to entries in the transition matrix of the demand process. In contrast, computing $\mathcal{L}_N\left(\theta \right)$ requires solving for the equilibrium of the model.

The three-step estimation procedure is as described in Rust (1987):

1. Estimate $\theta_C$ with $\tilde\theta_C\equiv\arg \max_{\theta_C}{\cal L}_C(\theta_C)$.
2. Estimate $(\theta_P, \theta_W)$ with $(\tilde\theta_P,\tilde\theta_W)\equiv\arg\max_{(\theta_P,\theta_W)}{\cal L}_N(\theta_P,\tilde\theta_C,\theta_W)$.
3. Estimate $\theta$ by maximizing the full likelihood function $\hat\theta\equiv\arg\max_\theta{\cal L}(\theta)$, using $\tilde\theta\equiv(\tilde\theta_P,\tilde\theta_C,\tilde\theta_W)$ as starting value.

The first two steps thus provide starting values for the full information maximum likelihood (FIML) estimation in Step 3. As can be seen from experimenting with the code, the first two steps provide very good starting values for the FIML, which therefore converges after only a small number of iterations. Note that it is the second step which gives the procedure the name NFXP (nested fixed point): solving the model entails solving for the fixed point of the value functions, and this is nested within the optimization procedure that maximizes the likelihood.
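The nesting can be caricatured in a few lines: every trial parameter value visited by the outer likelihood optimizer triggers an inner fixed-point computation. In the Python sketch below (purely illustrative; a toy contraction and a grid search stand in for the Bellman operator and the likelihood optimizer):

```python
def inner_fixed_point(theta, tol=1e-12):
    """Inner loop: solve a toy contraction v = theta + 0.9 * v by iteration."""
    v = 0.0
    while True:
        v_new = theta + 0.9 * v
        if abs(v_new - v) < tol:
            return v_new
        v = v_new

def outer_loss(theta, target_v=5.0):
    """Outer criterion: distance of the inner fixed point from a target,
    standing in for the negative log-likelihood."""
    return (inner_fixed_point(theta) - target_v) ** 2

# crude outer 'optimizer': grid search over candidate parameter values
grid = [i / 100 for i in range(101)]
best = min(grid, key=outer_loss)
```

Since the inner fixed point is $\theta/(1-0.9)$, the outer search settles at $\theta = 0.5$.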

The starting values for the first step are directly calculated from the data as the mean and standard deviation of the innovations of logged demand. The starting values for $\theta_P$ and $\theta_W$ in the second step are randomly drawn from a uniform distribution with support $[1,5]$.

The next subsections describe each of the three steps. Standard errors are computed using the outer-product-of-the-gradient method. Since the FIML is asymptotically efficient, while the estimators in the first two steps are not, we only discuss the computation of standard errors in the third step.

#### 2.1 Likelihood Step 1: Estimate $\theta_C$

This function computes the first-step likelihood function from demand data. It takes as inputs the Data structure (from which the matrix Data.C will be used), the Settings structure, and the estimates vector, which consists of $\mu_C$ and $\sigma_C$. The function returns the negative log-likelihood ll and the $(\check t -1)\cdot \check r \times1$ vector of likelihood contributions likeCont. Notice that the data contain $\check t$ time periods, which gives us $\check t -1$ transitions to construct the likelihood function.

```matlab
function [ll, likeCont] = likelihoodStep1(Data, Settings, estimates)
```

We look for transitions from $C_{t,r}$ to $C_{t+1,r}$ for $t=1,\ldots,\check t-1$. Start by constructing two matrices of size $(\check t -1)\times \check r$ named from and to, which include $C_{t,r}$ and $C_{t+1,r}$, respectively, for $t=1,\ldots,\check t-1$ and $r=1,\ldots,\check r$. Stacked into vectors, from and to thus have one element for each transition in the data.

```matlab
from = Data.C(1:(Settings.tCheck - 1), 1:Settings.rCheck);
to = Data.C(2:Settings.tCheck, 1:Settings.rCheck);
```

We allocate the $(\check t -1)\cdot \check r \times1$ likelihood contribution vector and set its value to NaN:

```matlab
likeCont = NaN(size(from(:), 1), 1);
```

Assign muC and sigmaC, the values with respect to which we will maximize this likelihood function from the input estimates.

```matlab
muC = estimates(1);
sigmaC = estimates(2);
```

Now, compute the likelihood for each transition that is observed in the data given muC and sigmaC. For all transitions, we calculate likelihood contributions following Tauchen (1986). We make a distinction between transitions to interior points of the demand grid and transitions to points on the boundary.

Transitions to an interior point yield a likelihood contribution of

(20)

$\Pi_{i,j} = \Pr\left[C'=c_{[j]} |C=c_{[i]}\right] = \Phi\left(\frac{\log c_{[j]} - \log c_{[i]} +\frac{d}{2}-\mu_C}{\sigma_C}\right) - \Phi\left(\frac{\log c_{[j]} - \log c_{[i]} -\frac{d}{2}-\mu_C}{\sigma_C}\right)$

Similarly, transitions to the lower and upper bounds yield likelihood contributions of

(21)

$\Pi_{i,1} = \Pr\left[C'=c_{[1]} |C=c_{[i]}\right] = \Phi\left(\frac{\log c_{[1]} - \log c_{[i]} +\frac{d}{2}-\mu_C}{\sigma_C}\right)$

and

(22)

$\Pi_{i,\check c} = \Pr\left[C'=c_{[\check c]} |C=c_{[i]}\right] = 1-\Phi\left(\frac{\log c_{[\check c]} - \log c_{[i]} -\frac{d}{2}-\mu_C}{\sigma_C}\right),$

respectively.
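To make equations (20)-(22) concrete, here is a small Python sketch (outside the package; the grid and parameter values are made up) that builds one row of the implied transition matrix and verifies that it sums to one:

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tauchen_row(i, log_grid, d, mu_c, sigma_c):
    """Transition probabilities from grid point i, following (20)-(22)."""
    c_check = len(log_grid)
    row = []
    for j in range(c_check):
        z = log_grid[j] - log_grid[i] - mu_c
        if j == 0:                      # lower bound, equation (21)
            p = phi((z + d / 2) / sigma_c)
        elif j == c_check - 1:          # upper bound, equation (22)
            p = 1.0 - phi((z - d / 2) / sigma_c)
        else:                           # interior point, equation (20)
            p = phi((z + d / 2) / sigma_c) - phi((z - d / 2) / sigma_c)
        row.append(p)
    return row

# hypothetical log demand grid with equal spacing d
d, mu_c, sigma_c = 0.2, 0.01, 0.1
log_grid = [k * d for k in range(5)]
row = tauchen_row(2, log_grid, d, mu_c, sigma_c)
```

Because the interior terms telescope and the boundary cells absorb the tails, each row sums to one by construction.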

We then take the log of the contributions, sum them up, and return the negative log-likelihood contribution.

```matlab
selectInteriorTransitions = to > 1 & to < Settings.cCheck;

likeCont(selectInteriorTransitions) ...
    = normcdf((Settings.logGrid(to(selectInteriorTransitions)) ...
    - Settings.logGrid(from(selectInteriorTransitions)) + Settings.d / 2 - muC) / sigmaC) ...
    - normcdf((Settings.logGrid(to(selectInteriorTransitions)) ...
    - Settings.logGrid(from(selectInteriorTransitions)) - Settings.d / 2 - muC) / sigmaC);

likeCont(to == 1) = normcdf((Settings.logGrid(1) ...
    - Settings.logGrid(from(to == 1)) + Settings.d / 2 - muC) / sigmaC);

likeCont(to == Settings.cCheck) = 1 - normcdf((Settings.logGrid(Settings.cCheck) ...
    - Settings.logGrid(from(to == Settings.cCheck)) - Settings.d / 2 - muC) / sigmaC);

ll = -sum(log(likeCont));
```

This concludes likelihoodStep1.

```matlab
end
```

#### 2.2 Likelihood Step 2: Estimate $(\theta_P,\theta_W)$

We now construct the likelihood contributions that result from the number of firms evolving from $n$ to $n'$. Recall that we obtain cost-shock thresholds for entry and survival, defined by $\overline w_{E}(n,c) \equiv \log v_{S}(n,c) - \log\left(1 + \varphi\right)$ and $\overline w_{S}(n,c)\equiv \log v_{S}(n,c)$. We consider five mutually exclusive cases.

• Case I: $\mathbf{n' \gt n}$. If the number of firms increases from $n$ to $n'$, then it must be profitable for the $n'$th firm to enter, but not for the $(n'+1)$th: $\overline w_{E}(n'+1,c)\leq w \lt \overline w_{E}(n',c)$. The probability of this event is

(23)

$\label{app:eq:llhcontr1} G_W\left[\overline w_{E}(n',c)\right]-G_W\left[\overline w_{E}(n'+1,c)\right].$

• Case II: $\mathbf{0 \lt n' \lt n}$. If the number of firms decreases from $n$ to $n'$, with $0 \lt n' \lt n$, then the realization of $W$ must be such that firms survive with probability $a_{S}(n,c,w)\in(0,1)$. Thus, the cost shock must be high enough so that $n$ firms cannot survive profitably, $w\geq\overline w_{S}(n,c)$, but low enough for a monopolist to survive profitably, $w \lt \overline w_{S}(1,c)$. Given such a value $w$, $N'$ is binomially distributed with success probability $a_{S}(n,c,w)$ and population size $n$. Hence, the probability of observing a transition from $n$ to $n'$ with $0 \lt n' \lt n$ equals

(24)

$\label{app:eq:llhcontr2} \int_{\overline w_{S}(n,c)}^{\overline w_{S}(1,c)} {n \choose n'} a_{S}(n,c,w)^{n'}\left[1-a_{S}(n,c,w)\right]^{n-n'} g_W(w)dw,$

where $g_W$ is the density of $G_W$. The integrand in (24) involves the mixing probabilities $a_{S}(n,c,w)$. We discuss how we compute this integral in detail below.

• Case III: $\mathbf{n'=0, n \gt 0}$. All firms exiting can be the result of two events. First, it is not profitable for even a single firm to continue, $w \geq \overline w_{S}(1,c)$. Second, it is profitable for some but not all firms to continue, $\overline w_{S}(n,c) \leq w \lt \overline w_{S}(1,c)$, firms survive with probability $a_S(n,c,w)\in(0,1)$ as in Case II, and by chance none of the $n$ firms survives. The probability of these events is

(25)

$\label{app:eq:llhcontr3} 1-G_W\left[\overline w_{S}(1,c)\right] +\int_{\overline w_{S}(n,c)}^{\overline w_{S}(1,c)}\left[1-a_{S}(n,c,w)\right]^{n} g_W(w)dw.$

• Case IV: $\mathbf{n'=0, n=0}$. In this case, the market is populated by zero firms and it is not profitable for a monopolist to enter. The probability of this event is given by

(26)

$\label{app:eq:llhcontr4} 1-G_W\left[\overline w_{E}(1,c)\right].$

• Case V: $\mathbf{n' = n \gt 0}$. If there is neither entry nor exit, then either no firm finds it profitable to enter and all $n$ incumbents find it profitable to stay, $\overline w_{E}(n+1,c) \leq w \lt \overline w_{S}(n,c),$ or the $n$ incumbents mix as in Cases II and III, but by chance end up all staying. The probability of these events is

(27)

$\label{app:eq:llhcontr5} G_W\left[\overline w_{S}(n,c)\right]-G_W\left[\overline w_{E}(n+1,c)\right] + \int_{\overline w_{S}(n,c)}^{\overline w_{S}(1,c)} a_{S}(n,c,w)^{n} g_W(w)dw.$
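These five cases exhaust all possible transitions, and their probabilities sum to one over $n'$, which is a useful check when debugging an implementation. For $n \gt 0$, summing Case I over $n' \gt n$ telescopes to $G_W\left[\overline w_{E}(n+1,c)\right] - G_W\left[\overline w_{E}(\check n + 1,c)\right]$, where $G_W\left[\overline w_{E}(\check n + 1,c)\right] = 0$ because $v_S(\check n + 1,\cdot) = 0$; and for each $w$ in the mixing region, the binomial probabilities appearing in Cases II, III, and V sum to one over $n' \in \{0,\ldots,n\}$, so the mixing integrals add up to $G_W\left[\overline w_{S}(1,c)\right]-G_W\left[\overline w_{S}(n,c)\right]$. Hence

$G_W\left[\overline w_{E}(n+1,c)\right] + \left(G_W\left[\overline w_{S}(n,c)\right]-G_W\left[\overline w_{E}(n+1,c)\right]\right) + \left(1-G_W\left[\overline w_{S}(1,c)\right]\right) + \left(G_W\left[\overline w_{S}(1,c)\right]-G_W\left[\overline w_{S}(n,c)\right]\right) = 1.$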

We compute the likelihood using the function likelihoodStep2, which requires as inputs the structures Data, Settings, and Param, and the vector estimates. It returns the scalar-valued negative log-likelihood function ll and a column vector of length $(\check t - 1) \cdot \check r$ containing the market-time-specific likelihood contributions, likeCont.

```matlab
function [ll, likeCont] = likelihoodStep2(Data, Settings, Param, estimates)
```

We start by mapping the vector estimates into the corresponding elements in the Param structure. We do this using anonymous functions that are defined in the structure Settings. By construction, Param.k is a vector of length $\check{n}$. Param.phi and Param.omega are scalars.

```matlab
Param.k = Settings.estimates2k(estimates);
Param.phi = Settings.estimates2phi(estimates);
Param.omega = Settings.estimates2omega(estimates);
```

Now we use valueFunctionIteration to solve the model by iterating on the post-survival value function. We also retrieve pStay, pEntry, and pEntrySet: the probability of certain survival and the entry probabilities, as described in valueFunctionIteration above.

```matlab
[vS, pEntry, pEntrySet, pStay] = valueFunctionIteration(Settings, Param);
```

Next we collect the transitions observed in the data and vectorize them. The column vectors from, to, and demand are all of length $(\check t - 1) \times \check r$.

```matlab
vec = @(x) x(:);
from = vec(Data.N(1:Settings.tCheck - 1, 1:Settings.rCheck));
to = vec(Data.N(2:Settings.tCheck, 1:Settings.rCheck));
demand = vec(Data.C(1:(Settings.tCheck - 1), 1:Settings.rCheck));
```

Here and throughout we will convert subscripts to linear indices using the Matlab function sub2ind.
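For readers porting this code, a hypothetical pure-Python replica of Matlab's sub2ind for the 2-D case (not part of the package) makes the convention explicit: subscripts are 1-based and the linear index counts down columns first, because Matlab stores arrays in column-major order.

```python
import numpy as np

# Hypothetical stand-in for Matlab's sub2ind, restricted to 2-D arrays.
# Subscripts and the returned linear index are 1-based, as in Matlab,
# and the index runs down each column first (column-major order).
def sub2ind(shape, rows, cols):
    n_rows, _ = shape
    rows = np.asarray(rows)
    cols = np.asarray(cols)
    return (cols - 1) * n_rows + rows

# In a 5-by-3 matrix, subscript (2, 3) is linear index 12: two full
# columns (10 elements) precede it, plus 2 rows into the third column.
print(sub2ind((5, 3), 2, 3))  # 12
```

The same mapping is what lets the likelihood code below pull one probability per observation out of a matrix with a single vectorized lookup.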

We store the likelihood contributions in five vectors, each corresponding to one of the five cases outlined above. We allocate these vectors here and set all of their elements to zero.

```matlab
llhContributionsCaseI = zeros(size(from));
llhContributionsCaseII = zeros(size(from));
llhContributionsCaseIII = zeros(size(from));
llhContributionsCaseIV = zeros(size(from));
llhContributionsCaseV = zeros(size(from));
```

Case I: We store all of the likelihood contributions resulting from entry in the vector llhContributionsCaseI.

```matlab
selectMarketsCaseI = to > from;
llhContributionsCaseI(selectMarketsCaseI) = ...
    pEntrySet(sub2ind(size(pEntrySet), ...
                      to(selectMarketsCaseI), ...
                      demand(selectMarketsCaseI)));
```

Case II: We store all of the likelihood contributions resulting from exit to a non-zero number of firms in the vector llhContributionsCaseII.

```matlab
selectMarketsCaseII = from > to & to > 0;
llhContributionsCaseII(selectMarketsCaseII) = ...
    mixingIntegral(from(selectMarketsCaseII), ...
                   to(selectMarketsCaseII), ...
                   demand(selectMarketsCaseII), ...
                   vS, Param, Settings);
```

Note that this case involves computing the integral over mixed strategy play, which we do in the function mixingIntegral. We document its content below.

Case III: We store all of the likelihood contributions resulting from transitions to zero (from a positive number of firms) in llhContributionsCaseIII.

```matlab
selectMarketsCaseIII = to == 0 & from > 0;
llhContributionsCaseIII(selectMarketsCaseIII) = ...
    1 - pStay(1, demand(selectMarketsCaseIII))' + ...
    mixingIntegral(from(selectMarketsCaseIII), ...
                   to(selectMarketsCaseIII), ...
                   demand(selectMarketsCaseIII), ...
                   vS, Param, Settings);
```

Case IV: We store all of the likelihood contributions resulting from when the number of active firms remains at zero in llhContributionsCaseIV.

```matlab
selectMarketsCaseIV = to == 0 & from == 0;
llhContributionsCaseIV(selectMarketsCaseIV) = ...
    1 - pEntry(1, demand(selectMarketsCaseIV))';
```

Case V: We store all of the likelihood contributions resulting from the number of firms staying the same in llhContributionsCaseV.

```matlab
selectMarketsCaseV = from == to & to > 0;
llhContributionsCaseV(selectMarketsCaseV) = ...
    pStay(sub2ind(size(pStay), ...
                  from(selectMarketsCaseV), ...
                  demand(selectMarketsCaseV))) - ...
    pEntry(sub2ind(size(pEntry), ...
                   from(selectMarketsCaseV) + 1, ...
                   demand(selectMarketsCaseV))) + ...
    mixingIntegral(from(selectMarketsCaseV), ...
                   to(selectMarketsCaseV), ...
                   demand(selectMarketsCaseV), ...
                   vS, Param, Settings);
```

Finally, we sum up the likelihood contributions from the five cases and return the negative log likelihood function. When ll is not real valued, the negative log likelihood is set to inf.

```matlab
ll = -sum(log(llhContributionsCaseI + ...
              llhContributionsCaseII + ...
              llhContributionsCaseIII + ...
              llhContributionsCaseIV + ...
              llhContributionsCaseV));

if (isnan(ll) || max(real(ll) ~= ll) == 1)
    ll = inf;
end
```

If two outputs are requested, we also return the likelihood contributions:

```matlab
if (nargout == 2)
    likeCont = llhContributionsCaseI + ...
               llhContributionsCaseII + ...
               llhContributionsCaseIII + ...
               llhContributionsCaseIV + ...
               llhContributionsCaseV;
end
```

This concludes likelihoodStep2.

We still need to specify what exactly happens in the function mixingIntegral.

Computing the integral resulting from mixed strategy play

For a given survival strategy $a_S(n,c,w)$, the likelihood contribution from (purely) mixed strategy play is given by

(28)

$\label{eq:likelihood_mixing_integral} \int_{\log v_S(n,c)}^{\log v_S(1,c)} {n \choose n'} a_S(n,c,w)^{n'} \left(1-a_S(n,c,w)\right)^{n-n'} g_W(w) dw.$

The term inside the integral is the probability mass function of a binomial distribution with success probability $a_S(n,c,w)$. The survival strategies are defined by the indifference condition

(29)

$\label{eq:indifference2} \sum_{n'=1}^{n} {n - 1 \choose n' - 1} a_S^{n' - 1}\left(1-a_S\right)^{n-n'} \left(- \exp(w)+v_{S}(n',c)\right)=0.$

In principle, we could compute the integral in (28) directly by numerically integrating over $W$. In practice, it is computationally convenient to apply a change of variables and integrate over the survival strategies $a_S(n,c,\cdot)$ instead. To distinguish explicitly between the survival strategy $a_S(n,c,w)$, which is a function of $n$, $c$, and $w$, and the variable of integration, which is just a scalar, we refer to the latter as $p$. Thus, for a given value of $p$, we need to find the value of $w$ such that $p = a_S(n,c,w)$.

Equation (29) defines the inverse $a_S^{-1}(p;c,n)$ for which

(30)

$a_S^{-1}(a_S(n,c,w);c,n) = w.$

This inverse function can be solved for analytically and it is given by

(31)

$\underbrace{a_S^{-1}(p;c,n)}_{\textbf{aSinv}} = \log\left(\underbrace{\sum_{n'=1}^{n} {n - 1 \choose n' - 1} p^{n' - 1}\left(1 - p\right)^{n-n'} v_{S}(n',c) }_{\textbf{expaSInv}}\right)$

Then note that $a_S^{-1}(1;c,n) = \log v_S(n,c)$ and $a_S^{-1}(0;c,n) = \log v_S(1,c)$. We can write the likelihood contribution as an integral over $p$:

(32)

$\begin{split} &\int_{1}^{0} {n \choose n'} p^{n'}\left(1 - p\right)^{n-n'} \times \frac{da_S^{-1}(p;c,n)}{dp}g_{W}\left[a_S^{-1}(p;c,n)\right] dp \\ = &-\int_{0}^{1} {n \choose n'} p^{n'}\left(1 - p\right)^{n-n'} \times \frac{da_S^{-1}(p;c,n)}{dp}g_{W}\left[a_S^{-1}(p;c,n)\right] dp \\ \approx &-\sum_{j=1}^J {n \choose n'} p_{j}^{n'} \left(1 - p_{j}\right)^{n-n'} \times \underbrace{\underbrace{\frac{da_S^{-1}(p_{j};c,n)}{dp}}_{\textbf{daSinvdP}} \underbrace{g_{W}\left[a_S^{-1}(p_{j};c,n)\right]}_{\textbf{normaSinv}} \underbrace{w_{j}}_{\textbf{intWeights}}}_{\textbf{mixingDensity}}, \end{split} \label{llContrMixing}$

where $p_{1}, ..., p_{J}$ refer to the Gauss-Legendre nodes and $w_{1}, ..., w_{J}$ to the corresponding weights. Notice that the integration bounds are now 0 and 1 since if $w \lt \log v_S(n,c)$ the firms surely survive and when $w \gt \log v_S(1,c)$ the firms surely exit. Differentiation of $a_S^{-1}(p;c,n)$ gives

(33)

$\underbrace{\frac{da_S^{-1}(p;c,n)}{dp}}_{\textbf{daSinvdP}} = \overbrace{ \sum_{n'=1}^{n} \overbrace{{n - 1 \choose n' - 1} \left(p^{n'-2} (1 - p)^{(n-n' - 1)}\left((n' - 1) (1 - p) - p (n-n')\right)\right)}^{\textbf{dbinomialPmfdP}} v_{S}(n',c)}^{\textbf{dexpaSInvdP}} \frac{1}{\underbrace{\exp(a_S^{-1}(p;c,n))}_{\textbf{expaSInv}}}$

We collect these terms in the matrix mixingDensity, which is of dimension Settings.integrationLength by Settings.cCheck by Settings.nCheck. It is defined as

(34)

$\text{mixingDensity(j,c,n)} = \underbrace{\frac{da_S^{-1}(p_{j};c,n)}{dp}}_{\textbf{daSinvdP}} \underbrace{g_{W}\left[a_S^{-1}(p_{j};c,n)\right]}_{\textbf{normaSinv}} \underbrace{w_{j}}_{\textbf{intWeights}}. \label{mixingDensity}$

Element $(j, c, n)$ gives us the density of the mixing probability $p_{j}$ when demand equals $c$ and the current number of incumbents is $n$.
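As a sanity check on the change of variables, the following Python sketch (illustrative only; the values $v_S(1,c)=3$, $v_S(2,c)=2$, and $\omega=1$ are made up) computes the mixed-strategy contribution for $n=2$, $n'=1$ both by direct quadrature over $w$ and via the transformed integral over $p$, and confirms that the two agree. For $n=2$ the indifference condition has the closed form $a_S(w) = (v_S(1) - e^w)/(v_S(1) - v_S(2))$, with inverse $a_S^{-1}(p) = \log((1-p)v_S(1) + p\,v_S(2))$.

```python
import numpy as np

# Made-up post-survival values (decreasing in n) and cost-shock scale.
vS1, vS2, omega = 3.0, 2.0, 1.0
mu = -0.5 * omega**2  # W ~ N(-omega^2/2, omega^2)

def g_W(w):  # normal density of W
    return np.exp(-0.5 * ((w - mu) / omega) ** 2) / (omega * np.sqrt(2 * np.pi))

def a_S(w):  # n = 2 mixing probability from the indifference condition
    return (vS1 - np.exp(w)) / (vS1 - vS2)

# Direct Gauss-Legendre integration over w on (log vS2, log vS1):
a, b = np.log(vS2), np.log(vS1)
x, wx = np.polynomial.legendre.leggauss(64)
w_grid = 0.5 * (b - a) * x + 0.5 * (b + a)
direct = 0.5 * (b - a) * np.sum(2 * a_S(w_grid) * (1 - a_S(w_grid))
                                * g_W(w_grid) * wx)

# Change of variables: integrate over p in (0, 1) instead.
nodes, wq = np.polynomial.legendre.leggauss(32)
p, wq = 0.5 * (nodes + 1), 0.5 * wq
aSinv = np.log((1 - p) * vS1 + p * vS2)            # a_S^{-1}(p)
daSinvdP = (vS2 - vS1) / ((1 - p) * vS1 + p * vS2)  # negative: a_S falls in w
cov = -np.sum(2 * p * (1 - p) * daSinvdP * g_W(aSinv) * wq)

print(direct, cov)  # the two values should agree to high accuracy
```

The leading minus sign mirrors the sign flip from reversing the integration bounds in (32); because daSinvdP is negative, the resulting probability is positive.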

In the function mixingIntegral we compute the integral in equation (28) for a range of different combinations of $n$, $n'$, and $c$ using the change of variable introduced above. The function mixingIntegral takes as arguments the vectors from, to, and demand, which correspond to $n$, $n'$, and $c$, respectively. The function also takes as argument vS, the equilibrium post-survival value functions, and the Param and Settings structures. The function returns a vector llhContributionsMixing that is of the same dimension as the inputs from, to, and demand.

```matlab
function [llhContributionsMixing] = ...
    mixingIntegral(from, to, demand, vS, Param, Settings)
```

Note that mixed strategy play is only relevant for markets with at least two firms. We define the auxiliary variable expaSInv, which equals exp(aSinv(:, :, n)), and dexpaSInvdP, which stores the derivative of expaSInv with respect to p. We assemble expaSInv and dexpaSInvdP by looping over the possible post-survival outcomes nPrime = 1,...,n, where nPrime = 1 refers to the case in which only the mixing firm itself survives and nPrime = n to the case in which all firms stay. We then take logarithms of expaSInv to obtain aSinv, compute its derivative with respect to p, and store that derivative as daSinvdP. Lastly, we compute the mixing density.

We pre-compute nchoosekMatrixPlusOne, a matrix of size $\check n + 1$ by $\check n + 1$ whose element $(i,j)$ contains ${i - 1 \choose j - 1}$. This naming and indexing convention reflects the fact that Matlab indexing starts at one, not zero, so element $(1,1)$ corresponds to ${0 \choose 0}$. Pre-computing this matrix is helpful because factorial operations are computationally demanding.

```matlab
nchoosekMatrixPlusOne = ones(Settings.nCheck + 1);

for nX = 2:Settings.nCheck
    for iX = 0:(nX - 1)
        nchoosekMatrixPlusOne(nX + 1, iX + 1) = nchoosek(nX, iX);
    end
end
```
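The same table can be built without any factorial or nchoosek calls at all via Pascal's rule, ${i \choose j} = {i-1 \choose j-1} + {i-1 \choose j}$; a small Python sketch (illustrative only, not the package's code):

```python
import numpy as np

n_check = 5
# pascal[i, j] holds C(i, j), filled in by Pascal's rule: each entry is
# the sum of the two entries above it; no factorials are ever computed.
pascal = np.zeros((n_check + 1, n_check + 1), dtype=int)
pascal[:, 0] = 1
for i in range(1, n_check + 1):
    for j in range(1, i + 1):
        pascal[i, j] = pascal[i - 1, j - 1] + pascal[i - 1, j]

print(pascal[5, 2])  # C(5, 2) = 10
```

Either construction is computed once and reused throughout, which is the point of the pre-computation.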

We then set up to compute the matrix mixingDensity, as given by (34).

```matlab
mixingDensity = zeros(Settings.integrationLength, ...
                      Settings.cCheck, ...
                      Settings.nCheck);

aSinv = zeros(Settings.integrationLength, ...
              Settings.cCheck, ...
              Settings.nCheck);

daSinvdP = zeros(Settings.integrationLength, ...
                 Settings.cCheck, ...
                 Settings.nCheck);

p = Settings.integrationNodes;
w = Settings.integrationWeights;

for n = 2:Settings.nCheck

    expaSInv = zeros(length(p), Settings.cCheck);
    dexpaSInvdP = zeros(length(p), Settings.cCheck);

    for nPrime = 1:n
        nChoosek = nchoosekMatrixPlusOne(n, nPrime);
        binomialPmf = nChoosek .* repmat(p .^ (nPrime - 1) ...
            .* (1 - p) .^ (n - nPrime), 1, Settings.cCheck);
        dbinomialPmfdP = nChoosek .* repmat(p .^ (nPrime - 2) ...
            .* (1 - p) .^ (n - nPrime - 1) ...
            .* ((nPrime - 1) .* (1 - p) - p .* (n - nPrime)), 1, Settings.cCheck);
        repvS = repmat(vS(nPrime, :), Settings.integrationLength, 1);
        expaSInv = expaSInv + binomialPmf .* repvS;
        dexpaSInvdP = dexpaSInvdP + dbinomialPmfdP .* repvS;
    end

    aSinv(:, :, n) = log(expaSInv);
    daSinvdP(:, :, n) = dexpaSInvdP ./ expaSInv;

    intWeights = repmat(w, 1, Settings.cCheck);
    normaSinv = normpdf(aSinv(:, :, n), -.5 * Param.omega ^ 2, Param.omega);
    mixingDensity(:, :, n) = daSinvdP(:, :, n) .* normaSinv .* intWeights;
end
```

With the matrix mixingDensity in hand, we can compute the likelihood contributions from mixed strategy play using (32). Note that this time, we cannot avoid the use of loops altogether. However, we only need to loop over all those observations where “purely” mixed strategy play does in fact occur.

```matlab
llhContributionsMixing = zeros(length(from), 1);

for jX = 1:length(from)
    if (from(jX) > 1)
        nChoosek = nchoosekMatrixPlusOne(from(jX) + 1, to(jX) + 1);
        llhContributionsMixing(jX) = ...
            - sum(nChoosek .* p .^ to(jX) .* (1 - p) .^ (from(jX) - to(jX)) ...
                  .* mixingDensity(:, demand(jX), from(jX)));
    end
end
```

Note that it is possible to improve the performance of the code by only calling this function once instead of three times as we currently do in the likelihood function computation. However, that makes the code somewhat more difficult to follow, which is why we opted for the slightly slower version.

```matlab
end
```

#### 2.3 Likelihood Step 3: Estimate $(\theta_C, \theta_P, \theta_W)$

This function computes the full information likelihood, which only involves taking the product of the likelihood contributions from the first two steps. Here we also go over the computation of standard errors. The function likelihoodStep3 requires the structures Data, Settings, and Param, and the vector estimates as inputs. It returns the negative log likelihood ll, a vector of standard errors se, a vector of the likelihood contributions likeCont, and the covariance matrix covariance.

```matlab
function [ll, se, likeCont, covariance] = likelihoodStep3(data, Settings, Param, estimates)
```

Create the transition probability matrix based on the estimates for muC and sigmaC which are stored as the last two elements in estimates. Then retrieve the vectors of likelihood contributions from the first two steps and compute the negative log likelihood function of the third step:

```matlab
Param = markov(Param, Settings, estimates(end - 1), estimates(end));
[~, likeCont1] = likelihoodStep1(data, Settings, estimates(end - 1:end));
[~, likeCont2] = likelihoodStep2(data, Settings, Param, estimates(1:end - 2));
ll = -sum(log(likeCont1) + log(likeCont2));
```

This completes the construction of the full information negative log likelihood function. When only one output argument is requested, the function is done.

```matlab
if (nargout == 1)
    return;
end
```

When three output arguments are requested, the function returns the likelihood contributions, which are simply the element-by-element product of the likelihood contributions vectors from the first two steps. In this case, we return no standard errors, i.e. we set se=[].

```matlab
if (nargout == 3)
    se = [];
    likeCont = likeCont1 .* likeCont2;
    return;
end
```

Now consider the case when exactly two output arguments are requested, so that we compute standard errors in addition to the log likelihood. As discussed in the paper, standard errors are computed using the outer-product-of-the-gradient estimator of the information matrix. We define the matrix of perturbations, which is simply a diagonal matrix with Settings.fdStep on each diagonal element.

```matlab
epsilon = eye(length(estimates)) * Settings.fdStep;
```

Next, get the likelihood contributions at the parameter values in estimates:

```matlab
likeCont = likeCont1 .* likeCont2;
```

Now, given the likelihood contribution $\ell(\theta) \equiv \ell(\theta_j, \theta_{-j})$ we compute for each parameter $\theta_j$ the positively and negatively perturbed likelihood contributions $\ell(\theta_j+\epsilon,\theta_{-j})$ and $\ell(\theta_j-\epsilon,\theta_{-j})$. The gradients of the negative log likelihood contributions are then computed using central finite differences:

(35)

$\frac{\partial \log\left(\ell\left(\theta\right)\right)}{\partial \theta_j} = \frac{\partial \log\left(\ell\left(\theta\right)\right)}{\partial \ell\left(\theta\right)} \cdot \frac{\partial \ell\left(\theta\right)}{\partial \theta_j} \approx \frac{\partial \log\left(\ell\left(\theta\right)\right)}{\partial \ell\left(\theta\right)} \cdot \frac{\ell\left(\theta_j+\epsilon,\theta_{-j}\right) - \ell\left(\theta_j-\epsilon,\theta_{-j}\right)}{2\epsilon}$

The matrix of gradient contributions gradCont has $(\check t - 1)\cdot \check r$ rows and one column for each parameter with respect to which we are differentiating the logged likelihood contributions.

```matlab
gradCont = zeros(Settings.rCheck * (Settings.tCheck - 1), length(estimates));

for j = 1:length(estimates)
    [~, ~, likeContribPlus] = likelihoodStep3(data, Settings, Param, estimates + epsilon(j, :));
    [~, ~, likeContribMinus] = likelihoodStep3(data, Settings, Param, estimates - epsilon(j, :));
    gradCont(:, j) = (likeContribPlus - likeContribMinus) / (2 * Settings.fdStep) ./ likeCont;
end
```

We now have the matrix gradCont where each column is the score with respect to an estimable parameter. This matrix is used to compute the Hessian (full information matrix). We take the sum of the outer-product of the market specific gradients over all markets. A market specific gradient is a row in gradCont. Looping over all rows, in each iteration we compute the outer product of a market specific gradient and add them all up:

```matlab
Hessian = zeros(length(estimates));
for iX = 1:size(gradCont, 1)
    Hessian = Hessian + gradCont(iX, :)' * gradCont(iX, :);
end
```

The covariance matrix is obtained by inverting the Hessian. The standard errors are the square roots of the diagonal entries of the covariance matrix.

```matlab
covariance = inv(Hessian);
se = sqrt(diag(covariance))';
```
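The outer-product-of-the-gradient recipe is easy to sanity-check in a toy setting. The Python sketch below (illustrative only, not the package's code) estimates the mean of normal data with known variance, builds observation-level scores by central finite differences exactly as above, and compares the resulting OPG standard error to the textbook value $\sigma/\sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 5000, 2.0
x = rng.normal(1.0, sigma, size=n)

def like_cont(mu):  # per-observation likelihood contributions
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

mu_hat = x.mean()   # ML estimator of the mean
h = 1e-7            # finite-difference step, analogous to Settings.fdStep

# Scores of the log likelihood contributions by central differences:
# d log(l)/d mu = (1 / l) * dl/d mu, as in the formula above.
scores = (like_cont(mu_hat + h) - like_cont(mu_hat - h)) / (2 * h) / like_cont(mu_hat)

# OPG information: sum of outer products of observation-level scores.
# With one parameter this is just a sum of squared scores.
info = scores @ scores
se = np.sqrt(1.0 / info)

print(se, sigma / np.sqrt(n))  # the two should be close
```

The same logic carries over to the multi-parameter case, where each row's outer product is a small matrix and the standard errors come from the diagonal of the inverse information matrix.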

This concludes likelihoodStep3.

```matlab
end
```

### 3 Data

Here we describe how to generate a synthetic sample with data on the number of active firms and the number of consumers for $\check r$ markets and $\check t$ time periods. The data generation process consists of two functions.

• randomFirms simulates firms' entry and exit decisions conditional on the current number of active firms, the realization of the demand state, and the realization of the cost shock.
• dgp generates a synthetic panel data set containing the demand state and number of active firms.

#### 3.1 Draw Number of Firms

The function randomFirms randomly draws a realization of the surviving number of active firms $N'$ given the current period's initial number of active firms $n$, demand state $c$ and cost shock $w$. The computation of the number of firms follows the construction of the unique symmetric Nash equilibrium of the one-shot survival game, which takes advantage of the monotonicity of $v_S(n,c)$. Given $n$, $c$, and $w$, the possible realizations of $N'$ can be classified using the following cases:

• $v_S(1,c) \leq \exp (w)$: In this case, even a monopolist would not be willing to remain active. Thus, all firms leave the market for sure and there is no entry. This case is only relevant when $n \gt 0$, i.e. when there is a positive number of incumbents. If the number of incumbents is equal to zero, and the above condition holds, then there will not be any entry and the number of firms remains at zero.
• $v_S(n,c) \leq \exp (w) \lt v_S(1,c)$: In this case, a monopolist finds survival profitable, but not all $n$ incumbents do. Thus, some firms will leave the market and firms' survival strategies satisfy $a_S(n,c, w) \in (0,1)$. $N'$ follows a binomial distribution with $n$ trials and success probability $a_S(n,c,w)$. This case is only well defined when $n \gt 1$.
• $v_S(n,c) \gt \exp (w)$: All $n$ incumbents find survival profitable. In addition, there may be some entry. We will use the entry strategies to compute $n'$. In fact, we will count for how many $n'$ the entry condition $v_S(n',c) \gt (1+\varphi) \exp (w)$ is satisfied. This case is only relevant if $n \lt \check n$, i.e. the current number of incumbents is strictly less than $\check n$. If $n=\check n$, there will not be any entry by construction.
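The three cases can be sketched in Python as follows. Everything here is a hypothetical stand-in: the vS column and phi are made-up values, and the bisection solver plays the role of the package's mixingProbabilities (which solves the same indifference condition).

```python
import numpy as np
from math import comb

rng = np.random.default_rng(7)

# Hypothetical post-survival values for one demand state, decreasing in n.
vS = np.array([3.0, 2.0, 1.5, 1.2, 1.05])  # vS[n - 1] stands in for v_S(n, c)
n_check, phi = len(vS), 10.0

def survival_prob(n, w):
    """Solve the indifference condition for a_S by bisection."""
    def payoff(a):  # expected continuation payoff of a firm mixing with prob. a
        return sum(comb(n - 1, k - 1) * a ** (k - 1) * (1 - a) ** (n - k)
                   * (vS[k - 1] - np.exp(w)) for k in range(1, n + 1))
    lo, hi = 0.0, 1.0  # payoff(0) > 0 >= payoff(1) on the mixing region
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if payoff(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def draw_n_prime(n, w):
    if n > 0 and vS[0] <= np.exp(w):       # not even a monopolist stays
        return 0
    if n > 1 and vS[n - 1] <= np.exp(w):   # some, but not all, firms stay
        return int(rng.binomial(n, survival_prob(n, w)))
    if n < n_check and vS[max(n, 1) - 1] > np.exp(w):  # all stay; maybe entry
        return n + int(np.sum(vS[n:] > (1 + phi) * np.exp(w)))
    return n                               # nothing changes
```

For instance, with these values draw_n_prime(3, np.log(4.0)) returns 0, since exp(w) = 4 exceeds even the monopoly value vS(1) = 3.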

The function takes as inputs last period's post-survival number of active firms N, the current number of consumers C, the realized cost shock W, and the equilibrium post-survival value function, vS. N, C, and W are column vectors of length rCheck containing one entry per market. The output is the column vector Nprime of length rCheck, which contains the post-survival number of active firms in each market.

```matlab
function Nprime = randomFirms(N, C, W, Settings, Param, vS)

Nprime = NaN(size(N));

for nX = 1:Settings.rCheck
    switch (true)
        % All firms leave (only relevant if N(nX) > 0).
        case N(nX) > 0 && (vS(1, C(nX)) <= exp(W(nX)))
            Nprime(nX) = 0;
        % Some firms leave (only relevant if N(nX) > 1).
        case N(nX) > 1 && (vS(max(1, N(nX)), C(nX)) <= exp(W(nX)))
            aS = mixingProbabilities(N(nX), C(nX), W(nX), vS);
            Nprime(nX) = binornd(N(nX), aS);
        % All incumbents stay; there may be entry.
        case N(nX) < Settings.nCheck && (vS(max(1, N(nX)), C(nX)) > exp(W(nX)))
            Nprime(nX) = N(nX) + sum(vS((N(nX) + 1):Settings.nCheck, C(nX)) ...
                - (1 + Param.phi) .* exp(W(nX)) > 0);
        % Remaining cases: N(nX) = 0 and no entry, or N(nX) = Settings.nCheck and no exit.
        otherwise
            Nprime(nX) = N(nX);
    end
end
```

This concludes randomFirms.

```matlab
end
```

#### 3.2 Assemble Data Set

This function generates data $(N,C,W)$ for Settings.rCheck markets and Settings.tCheck time periods. Demand in the first period is drawn from the ergodic distribution. To eliminate initial condition effects, Settings.tBurn + Settings.tCheck periods are generated, but only the last Settings.tCheck periods are used. The function requires the structures Settings and Param as inputs and returns a Data structure with three $\check t \times \check r$ matrices N, C, and W, where each entry corresponds to some time period $t$ in some market $r$. N contains the number of active firms. C contains the index on the demand grid logGrid that corresponds to the demand state. W consists of realizations of the cost shocks. In practice, W is unobserved by the econometrician and thus we will not use it during the estimation. Here, we return W as additional information that can be used to debug the program or to compute additional statistics, such as the average realized fixed costs or the average realized entry costs.

```matlab
function Data = dgp(Settings, Param)

N = NaN(Settings.tBurn + Settings.tCheck, Settings.rCheck);
C = NaN(Settings.tBurn + Settings.tCheck, Settings.rCheck);
```

We now compute the post-survival equilibrium value functions using the function valueFunctionIteration:

```matlab
vS = valueFunctionIteration(Settings, Param);
```

The cost shocks are iid across markets and periods, so we can draw them all at once and store them in the matrix W which is of dimension Settings.tBurn + Settings.tCheck by Settings.rCheck.

```matlab
W = Param.omega * randn(Settings.tBurn + Settings.tCheck, Settings.rCheck) ...
    - 0.5 * Param.omega ^ 2;
```
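The mean shift of $-\tfrac{1}{2}\omega^2$ makes the cost shock lognormal with unit mean in levels, $E[\exp(W)] = 1$, by the standard lognormal mean formula. A quick simulation check (Python, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(42)
omega = 1.0
# W ~ N(-omega^2 / 2, omega^2), mirroring the draw in dgp.m.
W = omega * rng.standard_normal(2_000_000) - 0.5 * omega**2
print(np.exp(W).mean())  # close to 1
```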

Next, we draw an initial demand state from the ergodic distribution of the demand process for each market using the randomDiscr function, which is documented in the appendix. With the initial demand state for each market in hand, for each time period we use the transition matrix to draw the current demand state given the previous state:

```matlab
C(1, :) = randomDiscr(repmat(Param.demand.ergDist, 1, Settings.rCheck));

for t = 2:Settings.tBurn + Settings.tCheck
    C(t, :) = randomDiscr(Param.demand.transMat(C(t - 1, :), :)');
end
```
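For intuition, a hypothetical NumPy equivalent of such a discrete draw (not the package's randomDiscr, which is documented in the appendix) uses the inverse-CDF trick on each column of probabilities:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_discr(prob_columns):
    """Draw one state index (1-based, as in Matlab) per column of probabilities."""
    cdf = np.cumsum(prob_columns, axis=0)
    u = rng.random(prob_columns.shape[1])
    # first row whose cumulative probability exceeds the uniform draw
    return (u[None, :] < cdf).argmax(axis=0) + 1

# Example: 100,000 draws from a hypothetical three-state ergodic distribution.
erg = np.array([0.2, 0.5, 0.3])
draws = random_discr(np.tile(erg[:, None], (1, 100000)))
print(np.bincount(draws)[1:] / draws.size)  # roughly [0.2, 0.5, 0.3]
```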

The initial number of firms in each market is drawn from a discrete uniform distribution on $\{1,2,\ldots,\check n\}$. In each period following the first one, the number of firms is generated using the randomFirms function, which uses the realized cost shocks and the firms' equilibrium strategies to update the number of active firms.

```matlab
N(1, :) = randsample(Settings.nCheck, Settings.rCheck, true);

for t = 2:Settings.tBurn + Settings.tCheck
    N(t, :) = randomFirms(N(t - 1, :)', C(t - 1, :)', W(t - 1, :)', Settings, Param, vS);
end
```

Now we have the matrices N, C, and W; each of dimension Settings.tBurn + Settings.tCheck by Settings.rCheck. From this, we store only the last Settings.tCheck periods in the structure Data:

```matlab
Data.C = C((end - Settings.tCheck + 1):end, :);
Data.N = N((end - Settings.tCheck + 1):end, :);
Data.W = W((end - Settings.tCheck + 1):end, :);
```

This concludes the function dgp.

```matlab
end
```

### 4 Example Script: example.m

The script starts with setting the seed for the random number generators. Setting the seed ensures that the results of this program can be replicated regardless of where and when the code is run. That is, once the seed is set, users can make changes to the code and be sure that changes in the results are not because different random values were generated.

```matlab
s = RandStream('mlfg6331_64', 'Seed', 12345);
RandStream.setGlobalStream(s);
```

Next, we define two variable structures. The Settings structure collects various settings used in the remainder of the program. The Param structure collects the true parameter values to be used in the generation of the data.

The NFXP algorithm crucially depends on two tolerance parameters. The tolerance parameters tolInner and tolOuter refer to the inner tolerance for the iteration of the value function when the model is solved, and the outer tolerance for the likelihood optimizations, respectively. As discussed in the paper, we follow Dube et al. (2012) by ensuring that the inner tolerance is smaller than the outer tolerance to prevent false convergence. maxIter stores the maximum number of iteration steps that are allowed during the value function iteration.

```matlab
Settings.tolInner = 1e-10;
Settings.tolOuter = 1e-6;
Settings.maxIter = 1000;
```

The maximum number of firms that can ever be sustained in equilibrium is defined as nCheck.

```matlab
Settings.nCheck = 5;
```

There are three parameters that govern the support of the demand process. cCheck is the number of possible demand states. lowBndC and uppBndC are the lower and upper bounds of the demand grid, respectively.

```matlab
Settings.cCheck = 200;
Settings.lowBndC = 0.5;
Settings.uppBndC = 5;
```

We define the grid logGrid as a row vector with $\check c$ elements that are equidistant on the logarithmic scale. d is the distance between any two elements of logGrid.

```matlab
Settings.logGrid = ...
    linspace(log(Settings.lowBndC), log(Settings.uppBndC), Settings.cCheck);
Settings.d = Settings.logGrid(2) - Settings.logGrid(1);
```
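Because the grid is equidistant in logs, the ratio between consecutive demand levels is the constant $\exp(d)$. A short check (Python, illustrative, using the same settings):

```python
import numpy as np

low, upp, c_check = 0.5, 5.0, 200
log_grid = np.linspace(np.log(low), np.log(upp), c_check)
d = log_grid[1] - log_grid[0]

grid = np.exp(log_grid)          # demand levels
ratios = grid[1:] / grid[:-1]    # constant, equal to exp(d)
print(ratios[0], np.exp(d))
```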

Next, we define the number of markets rCheck and the number of time periods tCheck. In the data generating process (dgp), we will draw data for tBurn + tCheck periods, but only store the last tCheck time periods of data. tBurn denotes the number of burn-in periods that are used to ensure that the simulated data refers to the approximately ergodic distribution of the model.

```matlab
Settings.rCheck = 1000;
Settings.tCheck = 10;
Settings.tBurn = 100;
```

Standard errors are computed using the outer-product-of-the-gradient method. To compute the gradients (likelihood scores) we use two sided finite differences. The parameter fdStep refers to the step size of this approximation.

```matlab
Settings.fdStep = 1e-7;
```

To compute the likelihood contribution of purely mixed strategy play, we need to numerically integrate over the support of the survival strategies, $(0,1)$. We do so using Gauss-Legendre quadrature. integrationLength refers to the number of Gauss-Legendre nodes used. We document the function lgwt in the Appendix.

```matlab
Settings.integrationLength = 32;
[Settings.integrationNodes, Settings.integrationWeights] = ...
    lgwt(Settings.integrationLength, 0, 1);
```
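Readers without lgwt at hand can generate equivalent nodes and weights with NumPy by rescaling the standard Gauss-Legendre rule from $[-1,1]$ to $[0,1]$ (a sketch, not the package's lgwt):

```python
import numpy as np

m = 32
nodes, weights = np.polynomial.legendre.leggauss(m)  # rule on [-1, 1]
nodes = 0.5 * (nodes + 1.0)                          # shift to [0, 1]
weights = 0.5 * weights                              # rescale accordingly

# Sanity checks: the weights integrate the constant 1 to the interval
# length, and the rule is exact for polynomials of low degree, e.g.
# the integral of p^2 over [0, 1] is 1/3.
print(weights.sum(), np.sum(weights * nodes**2))
```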

We can now define the settings for the optimizer used during the estimation. Note that these options will be used for Matlab's constrained optimization function fmincon.

```matlab
options = optimset('Algorithm', 'interior-point', 'Display', 'iter', ...
                   'TolFun', Settings.tolOuter, 'TolX', Settings.tolOuter, ...
                   'GradObj', 'off');
```

We now define the true values listed in the paper for the parameters of the model, starting with the discount factor:

```matlab
Param.rho = 1 / 1.05;
```

The true values for the estimated parameters in $\theta$ are defined as

```matlab
Param.k = [1.8, 1.4, 1.2, 1, 0.9];
Param.phi = 10;
Param.omega = 1;
Param.demand.muC = 0;
Param.demand.sigmaC = 0.02;
```

We then collect the true parameter values into a vector for each of the three steps of the estimation procedure.

```matlab
Param.truth.step1 = [Param.demand.muC, Param.demand.sigmaC];
Param.truth.step2 = [Param.k, Param.phi, Param.omega];
Param.truth.step3 = [Param.truth.step2, Param.truth.step1];
```

We now generate a synthetic sample that we will then estimate using the three step estimation procedure. We begin the data generation by computing the transition matrix and the ergodic distribution of the demand process, using the true values for its parameters $(\mu_C, \sigma_C)$. This is done using the function markov, which creates the $\check c \times \check c$ transition matrix Param.demand.transMat and the $\check c \times 1$ ergodic distribution Param.demand.ergDist.

```matlab
Param = markov(Param, Settings);
```

Next, we generate the dataset $(n,c)$ using dgp. This creates the two $\check t \times \check r$ matrices Data.C and Data.N. We then construct the from and to matrices of size $(\check t -1)\times \check r$, as we did in likelihoodStep1.

```matlab
Data = dgp(Settings, Param);
from = Data.C(1:Settings.tCheck - 1, 1:Settings.rCheck);
to = Data.C(2:Settings.tCheck, 1:Settings.rCheck);
```

These matrices include $C_{t,r}$ and $C_{t+1,r}$, respectively, for $t=1,\ldots,\check t -1$ and $r=1,\ldots,\check r$, as given in the $\check{t}\times \check r$ matrix Data.C. As discussed above, we use the mean and standard deviation of the innovations in logged demand as starting values for $(\mu_C, \sigma_C)$. These are stored in startValues.step1:

```matlab
logTransitions = Settings.logGrid([from(:), to(:)]);
innovLogC = logTransitions(:, 2) - logTransitions(:, 1);
startValues.step1 = [mean(innovLogC), std(innovLogC)];
```

Declare the first step likelihood function to be the objective function. This is an anonymous function that takes the parameter vector estimates as its argument:

```matlab
objFunStep1 = @(estimates) likelihoodStep1(Data, Settings, estimates);
```

Store the negative log likelihood evaluated at the true values of $(\mu_C, \sigma_C)$. This will later allow us to compare the negative log likelihood function at the final estimates to the negative log likelihood function at the true parameter values (the former should always be smaller than the latter).

```matlab
llhTruth.step1 = objFunStep1(Param.truth.step1);
```

Next, maximize the likelihood function by minimizing the negative log likelihood with fmincon. The only constraint under which we are maximizing is that $\sigma_C \gt 0$. We impose this constraint by specifying the lower bound on $(\mu_C, \sigma_C)$ to be [-inf, 0]. The estimates of $(\mu_C, \sigma_C)$ are stored in Estimates.step1, the negative log likelihood at the optimum is stored in llh.step1, and the exit flag (the reason for which the optimization ended) is stored in exitFlag.step1.

```matlab
tic;
[Estimates.step1, llh.step1, exitFlag.step1] = fmincon(objFunStep1, ...
    startValues.step1, [], [], [], [], [-inf, 0], [], [], options);
computingTime.step1 = toc;
```
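The logic of this first step can be sketched outside Matlab. The snippet below is a hedged Python analogue (not part of the package): it maximizes a Gaussian likelihood for hypothetical log-demand innovations with `scipy.optimize.minimize`, imposing $\sigma_C \gt 0$ through a bound, much as the `[-inf, 0]` lower bound does in `fmincon`. Note that the package's actual first-step likelihood is the discretized Tauchen likelihood, not this continuous simplification.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
innov = rng.normal(0.0, 0.02, size=1000)  # hypothetical log-demand innovations

def neg_log_lik(theta):
    """Negative Gaussian log likelihood of the innovations."""
    mu, sigma = theta
    return -np.sum(norm.logpdf(innov, loc=mu, scale=sigma))

# Start from the sample moments, as example.m does, and keep sigma positive.
start = [innov.mean(), innov.std()]
res = minimize(neg_log_lik, start, method="L-BFGS-B",
               bounds=[(None, None), (1e-8, None)])
```

As in the text, the minimized objective should be no larger than the objective at the true parameter values.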

Now consider the second step, in which we estimate $(k, \varphi,\omega)$. Start by creating anonymous functions which will be used in likelihoodStep2 to map the vector of parameter estimates into the Param structure:

```matlab
Settings.estimates2k = @(x) x(1:Settings.nCheck);
Settings.estimates2phi = @(x) x(6);
Settings.estimates2omega = @(x) x(7);
```

All starting values are set to one and the same random draw from a uniform distribution on $[1,5]$:

```matlab
startValues.step2 = 1 + 4 * ones(1, length(Param.truth.step2)) * rand;
```

Using markov, we then generate the transition matrix and ergodic distribution using the estimated values $(\hat \mu_C,\hat \sigma_C)$ from the first step as the parameters for the demand process:

```matlab
Param = markov(Param, Settings, Estimates.step1(1), Estimates.step1(2));
```

Declare the objective function as in the first step:

```matlab
objFunStep2 = @(estimates) likelihoodStep2(Data, Settings, Param, estimates);
```

The maximization is constrained by imposing that $(\hat k,\hat \varphi, \hat \omega)$ are nonnegative. Store the negative log-likelihood at the true parameter values:

```matlab
lb = zeros(size(startValues.step2));
llhTruth.step2 = objFunStep2(Param.truth.step2);
```

Then, minimize the objective function:

```matlab
tic;
[Estimates.step2, llh.step2, exitFlag.step2] = fmincon(objFunStep2, ...
    startValues.step2, [], [], [], [], lb, [], [], options);
computingTime.step2 = toc;
```

The results are stored as in the first step. Now consider the third step, full information maximum likelihood (FIML). Start by declaring the estimates from the first two steps to be the starting values for the third step:

```matlab
startValues.step3 = [Estimates.step2, Estimates.step1];
```

Declare the objective function:

```matlab
objFunStep3 = @(estimates) likelihoodStep3(Data, Settings, Param, estimates);
```

The lower bound for all parameters is zero, except for $\mu_C$, which is unbounded; $\mu_C$ is the second-to-last entry in the parameter vector.

```matlab
lb = zeros(size(startValues.step3));
lb(length(startValues.step3) - 1) = -inf;
```

Store the negative log-likelihood at the true parameter values:

```matlab
llhTruth.step3 = objFunStep3(Param.truth.step3);
```

Obtain the estimates subject to the constraints lb:

```matlab
tic;
[Estimates.step3, llh.step3, exitFlag.step3] = fmincon(objFunStep3, ...
    startValues.step3, [], [], [], [], lb, [], [], options);
computingTime.step3 = toc;
```

Compute the standard errors by requesting two output arguments from likelihoodStep3 and store them in Estimates.se:

```matlab
[~, Estimates.se] = likelihoodStep3(Data, Settings, Param, Estimates.step3);
```

This concludes the estimation on the synthetic sample.

### 5 Appendix A: List of Structures

Throughout the Matlab code, the structures Settings, Param, and Data play an important role. We define their contents in this section.

The structure Settings contains parameters that govern the execution of the Matlab code. All elements in Settings need to be defined by hand and remain constant throughout the execution of the program.

• Settings.tolInner the real valued tolerance for the inner loop of the NFXP.
• Settings.tolOuter the real valued tolerance for the outer loop of the NFXP.
• Settings.maxIter the integer valued maximum number of iterations the inner loop of the NFXP may take before it throws an error.
• Settings.nCheck the integer valued maximum number of firms that the market can sustain, $\check n$.
• Settings.cCheck the integer valued number of support points that the demand process can take on, $\check c$.
• Settings.lowBndC is the real valued lower bound of the demand grid.
• Settings.uppBndC is the real valued upper bound of the demand grid.
• Settings.logGrid is a row vector of length Settings.cCheck with the logged demand grid.
• Settings.d is the real valued distance between two points on the logged demand grid.
• Settings.rCheck is the integer valued number of markets for which we have or simulate data, $\check r$.
• Settings.tCheck is the integer valued number of time periods for which we have or simulate data, $\check t$.
• Settings.tBurn is the integer valued number of burn in periods used during the simulation.
• Settings.fdStep is the real valued step size used to compute finite differences, when we compute the score of the likelihood function.
• Settings.integrationLength is the integer valued number of points used for the Gauss-Legendre integration.
• Settings.integrationNodes is the real valued row vector of length Settings.integrationLength with Gauss-Legendre nodes.
• Settings.integrationWeights is the real valued row vector of length Settings.integrationLength with Gauss-Legendre weights.
• Settings.estimates2k(x) is an anonymous function that maps a vector of estimates x into the vector valued outcome Param.k, which will be defined below.
• Settings.estimates2phi(x) is an anonymous function that maps a vector of estimates x into the real valued outcome Param.phi, which will be defined below.
• Settings.estimates2omega(x) is an anonymous function that maps a vector of estimates x into the real valued outcome Param.omega, which will be defined below.

The structure Param contains the primitives of the model.

• Param.rho is the real valued discount factor.
• Param.k is a real valued row vector of length Settings.nCheck that parameterizes the surplus function, $k(n)$.
• Param.phi is a real valued scalar that contains the entry costs, $\varphi$.
• Param.omega is a real valued scalar that parameterizes the scale, $\omega$, of the cost shock distribution.
• Param.demand.muC is a real valued scalar that parameterizes the mean, $\mu_C$, of the log innovations of the demand process.
• Param.demand.sigmaC is a real valued scalar that parameterizes the standard deviation, $\sigma_C$, of the log innovations of the demand process.
• Param.demand.transMat is a real valued transition probability matrix of the demand process, which is of size Settings.cCheck by Settings.cCheck.
• Param.demand.ergDist is a real valued column vector of length Settings.cCheck with the ergodic distribution of the demand process.
• Param.truth.step1 is a real valued row vector with the true parameter values for the first step in the three-step estimation procedure during the Monte Carlo simulation.
• Param.truth.step2 is a real valued row vector with the true parameter values for the second step in the three-step estimation procedure during the Monte Carlo simulation.
• Param.truth.step3 is a real valued row vector with the true parameter values for the third step in the three-step estimation procedure during the Monte Carlo simulation.

The structure Data contains the following elements.

• Data.C is an integer valued matrix of size Settings.tCheck by Settings.rCheck, where each element (t,r) contains the index of the logged demand grid that describes the demand state in market r at time t.
• Data.N is an integer valued matrix of size Settings.tCheck by Settings.rCheck, where each element (t,r) contains the number of active firms in market r at time t.
• Data.W is a real valued matrix of size Settings.tCheck by Settings.rCheck that contains the cost shocks that are generated in the Monte Carlo simulation.

### 6 Appendix B: Auxiliary Functions

This part of the appendix describes all auxiliary functions that were not documented above.

#### 6.1 Compute Markov Process

This function computes the transition probability matrix transMat and the ergodic distribution ergDist of the demand process. If this function is called using only Param and Settings as inputs, then Param.demand.transMat is computed using the true values of the demand process parameters $(\mu_C, \sigma_C)$ that are stored in Param.truth.step1. This is used in example.m when we generate synthetic data on the number of consumers in each market. In this case, we will also compute the ergodic distribution Param.demand.ergDist. Alternatively, if the function is called with the additional inputs muC and sigmaC, then these parameter values will be used to compute Param.demand.transMat. In this case, Param.demand.ergDist will not be computed.

```matlab
function Param = markov(Param, Settings, muC, sigmaC)
```

The code starts by extracting the relevant variables from the Settings structure for convenience. If only two input arguments are passed to the function, the true values for $(\mu_C, \sigma_C)$ are used.

```matlab
cCheck = Settings.cCheck;
logGrid = Settings.logGrid;
d = Settings.d;

if nargin == 2
    muC = Param.demand.muC;
    sigmaC = Param.demand.sigmaC;
end
```

The transition probability matrix is computed according to the method outlined by Tauchen (1986), which is used to discretize a continuous (and bounded) stochastic process. We use the standard convention for transition probability matrices: for a transition probability matrix $\Pi$, each entry $\Pi_{i,j}$ gives the probability of transitioning from state $i$ (row) to state $j$ (column). The transition matrix is of dimension $\check c \times \check c$. The idea of the Tauchen method is intuitive: we assume the growth of $C$ to be normally distributed with parameters muC and sigmaC. Since this is a continuous distribution, while the state space is discrete, we treat a transition to a neighborhood around $c_{[j]}$ as a transition to $c_{[j]}$ itself. These neighborhoods span from one mid-point between two nodes on logGrid to the next mid-point, i.e. $\left[\log c_{[j]}-\frac{d}{2},\log c_{[j]}+\frac{d}{2}\right]$; the end-points of logGrid additionally absorb all mass in their respective tails. Distinguishing between transitions to interior points ($j=2,\ldots,\check c -1$) and transitions to end-points ($j=1$ or $j=\check c$), we have three cases.

```matlab
transMat = NaN(cCheck, cCheck);
```

(36)

$\Pi_{i,j}=Pr\left[C'=c_{[j]} |C=c_{[i]}\right] = \Phi\left(\frac{\log c_{[j]} - \log c_{[i]} +\frac{d}{2}-\mu_C}{\sigma_C}\right)- \Phi\left(\frac{\log c_{[j]} - \log c_{[i]} -\frac{d}{2}-\mu_C}{\sigma_C}\right)$

```matlab
for jX = 2:(cCheck - 1)
    transMat(:, jX) = normcdf((logGrid(jX) - logGrid' + d / 2 - muC) / sigmaC) ...
        - normcdf((logGrid(jX) - logGrid' - d / 2 - muC) / sigmaC);
end
```

(37)

$\Pi_{i,1}=Pr\left[C'=c_{[1]} |C=c_{[i]}\right] = \Phi\left(\frac{\log c_{[1]} - \log c_{[i]} +\frac{d}{2}-\mu_C}{\sigma_C}\right)$

```matlab
transMat(:, 1) = normcdf((logGrid(1) - logGrid' + d / 2 - muC) / sigmaC);
```

(38)

$\Pi_{i,\check c}=Pr\left[C'=c_{[\check c]} |C=c_{[i]}\right] = 1-\Phi\left(\frac{\log c_{[\check c]} - \log c_{[i]} -\frac{d}{2}-\mu_C}{\sigma_C}\right)$

```matlab
transMat(:, cCheck) = 1 - normcdf((logGrid(cCheck) - logGrid' - d / 2 - muC) / sigmaC);
```

This completes the construction of the transition matrix transMat, which we now store in the Param structure.

```matlab
Param.demand.transMat = transMat;
```
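As a cross-check of equations (36)-(38), here is a hedged Python sketch of the same Tauchen construction (the function name and grid are illustrative, not from the package). With an equispaced log grid, the three cases telescope so that every row sums to one:

```python
import numpy as np
from scipy.stats import norm

def tauchen_trans_mat(log_grid, d, mu_c, sigma_c):
    """Tauchen (1986) transition matrix for log C' = log C + eps,
    eps ~ N(mu_c, sigma_c^2), on an equispaced grid with spacing d."""
    z = (log_grid[None, :] - log_grid[:, None] - mu_c) / sigma_c
    h = d / (2.0 * sigma_c)
    trans = norm.cdf(z + h) - norm.cdf(z - h)    # interior columns, eq. (36)
    trans[:, 0] = norm.cdf(z[:, 0] + h)          # left end-point, eq. (37)
    trans[:, -1] = 1.0 - norm.cdf(z[:, -1] - h)  # right end-point, eq. (38)
    return trans

log_grid = np.linspace(0.0, 2.0, 11)             # illustrative grid
P = tauchen_trans_mat(log_grid, log_grid[1] - log_grid[0], 0.0, 0.02)
```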

Next, we compute the ergodic distribution of the demand process ergDist. The ergodic distribution is only computed for the true parameter values, i.e. when the number of input arguments equals two. It is the eigenvector of the transpose of the transition matrix with eigenvalue equal to 1, normalized such that its entries sum to unity. The Perron-Frobenius Theorem guarantees that such an eigenvector exists and that no eigenvalue exceeds 1 in absolute value. We store the eigenvectors of the transition matrix as columns in eigenVecs, in decreasing order of eigenvalues from left to right. The eigenvector with unit eigenvalue is normalized by dividing each element by the sum of all elements in the vector, so that it sums to 1 and thus is the ergodic distribution, stored as ergDist in the Param structure.

```matlab
if nargin == 2
    [eigenVecs, eigenVals] = eigs(transMat');
    Param.demand.ergDist = eigenVecs(:, 1) / sum(eigenVecs(:, 1));
```

We conclude by checking for two numerical problems that may arise. Firstly, confirm that the greatest eigenvalue is sufficiently close to 1 and return an error if it is not. Secondly, due to approximation errors, it may be the case that not all elements in the eigenvector are of the same sign (one of them may be just under or above zero). This will undesirably result in an ergodic distribution with a negative entry.

```matlab
    if abs(max(eigenVals(:) - 1)) > 1e-5
        error('Warning: highest eigenvalue not sufficiently close to 1')
    end

    signDummy = eigenVecs(:, 1) > 0;
    if sum(signDummy) ~= 0 && sum(signDummy) ~= cCheck
        error('Not all elements in relevant eigenvector are of same sign');
    end

end
```
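The eigenvector computation and the two numerical sanity checks can be mirrored in Python. This is a sketch under the same assumptions; `np.linalg.eig` stands in for Matlab's `eigs`, and the function name is ours:

```python
import numpy as np

def ergodic_distribution(trans_mat, tol=1e-5):
    """Stationary distribution of a transition matrix: the eigenvector of
    trans_mat' with eigenvalue 1, normalized to sum to one."""
    vals, vecs = np.linalg.eig(trans_mat.T)
    i = int(np.argmax(vals.real))
    if abs(vals[i] - 1.0) > tol:
        raise ValueError("highest eigenvalue not sufficiently close to 1")
    v = vecs[:, i].real
    # All entries must share a sign, or normalization gives negative masses.
    if not (np.all(v >= -1e-12) or np.all(v <= 1e-12)):
        raise ValueError("eigenvector entries are not all of the same sign")
    return v / v.sum()

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
pi = ergodic_distribution(P)   # stationary: pi @ P equals pi
```

For this two-state example the stationary distribution is $(2/3, 1/3)$, which can be verified by hand from $\pi = \pi \Pi$.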

#### 6.2 Draw from Discrete Distribution

This function randomly draws realizations from discrete probability distributions using the inverse cumulative distribution function method. The idea is to first draw a realization from a uniform distribution and then map that draw, a probability, into the corresponding mass point of the discrete distribution. This function is written to draw one realization each from K distributions at a time. Each of the K distributions is assumed to be defined over M points.

The function takes as input a matrix P which is of dimension M times K. In this matrix, each column corresponds to a probability mass function.

```matlab
function iX = randomDiscr(P)
M = size(P, 1);
K = size(P, 2);
U = ones(M, 1) * rand(1, K); % draw uniform random variables
cdf = cumsum(P);             % compute the CDF of each column
iX = 1 + sum(U > cdf);       % find the mass point that corresponds to U
```

The result is a vector iX of length K with realizations from the discrete distributions described by P.

```matlab
end
```
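A hedged Python rendering of the same inverse-CDF draw (the name `random_discr` is ours; indices are kept 1-based to match the Matlab convention):

```python
import numpy as np

def random_discr(P, rng):
    """Draw one realization from each of the K discrete distributions
    stored column-wise in the M-by-K matrix P, via the inverse CDF."""
    _, K = P.shape
    u = rng.random(K)                          # one uniform draw per column
    cdf = np.cumsum(P, axis=0)                 # column-wise CDFs
    return 1 + (u[None, :] > cdf).sum(axis=0)  # 1-based mass-point index

rng = np.random.default_rng(7)
P = np.array([[0.1, 0.5],
              [0.9, 0.5]])   # two distributions over two mass points
draws = random_discr(P, rng)
```

A draw lands on point $j$ exactly when the uniform falls between the CDF at $j-1$ and the CDF at $j$, so the sampled frequencies match the column probabilities.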

#### 6.3 Compute Gauss-Legendre Weights

```matlab
function [x, w] = lgwt(N, a, b)
```

This function computes definite integrals using Legendre-Gauss quadrature: it returns the Legendre-Gauss nodes x and weights w on an interval [a,b] for a rule with N nodes.

Suppose you have a continuous function $f(x)$ defined on [a,b] which you can evaluate at any $x$ in [a,b]. Simply evaluate it at all of the values contained in the x vector to obtain a vector f. Then compute the definite integral as sum(f .* w).

```matlab
N = N - 1;
N1 = N + 1;
N2 = N + 2;
xu = linspace(-1, 1, N1)';
% Initial guess for the roots of the Legendre polynomial
y = cos((2 * (0:N)' + 1) * pi / (2 * N + 2)) + 0.27 / N1 * sin(pi * xu * N / N2);
L = NaN(N1, N2);
Lp = NaN(N1, N2);
y0 = 2;
% Newton iteration using the Legendre three-term recurrence
while max(abs(y - y0)) > eps
    L(:, 1) = 1;
    L(:, 2) = y;
    for k = 2:N1
        L(:, k + 1) = ((2 * k - 1) * y .* L(:, k) - (k - 1) * L(:, k - 1)) / k;
    end
    Lp = N2 * (L(:, N1) - y .* L(:, N2)) ./ (1 - y.^2);
    y0 = y;
    y = y0 - L(:, N2) ./ Lp;
end
x = (a * (1 - y) + b * (1 + y)) / 2;
w = (b - a) ./ ((1 - y.^2) .* Lp.^2) * (N2 / N1)^2;
end
```
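To illustrate the usage described above, the following Python sketch builds a 20-node rule on $[0, \pi]$ (NumPy's `leggauss` stands in for lgwt; the helper name is ours) and evaluates $\int_0^\pi \sin x \, dx = 2$ in the sum(f .* w) style:

```python
import numpy as np

def lgwt_np(n, a, b):
    """Legendre-Gauss nodes and weights on [a, b] (affine map from [-1, 1])."""
    x, w = np.polynomial.legendre.leggauss(n)
    return 0.5 * (b - a) * x + 0.5 * (b + a), 0.5 * (b - a) * w

x, w = lgwt_np(20, 0.0, np.pi)
f = np.sin(x)                    # evaluate f at the nodes
integral = float(np.sum(f * w))  # Matlab: sum(f .* w)
```

Gauss-Legendre quadrature converges extremely fast for smooth integrands, so 20 nodes already reach near machine precision here.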

Listing of valueFunctionIteration.m (excerpt):

```matlab
function [vS, pEntry, pEntrySet, pStay] = ...
    valueFunctionIteration(Settings, Param)

vS = zeros(Settings.nCheck + 1, Settings.cCheck);
pEntry = zeros(Settings.nCheck + 1, Settings.cCheck);
pEntrySet = zeros(Settings.nCheck + 1, Settings.cCheck);
pStay = zeros(Settings.nCheck, Settings.cCheck);

thetaW2 = Param.omega ^ 2;
gridTrans = exp(Settings.logGrid)';

for n = Settings.nCheck:-1:1

    iter = 0;
    vSdiff = 1;

    vSn = vS(n, :)';

    flowSurplus = gridTrans * Param.k(n) / n;

    valueAdditionalEntry = ...
        sum(pEntrySet((n + 1):end, :) .* vS((n + 1):end, :))';

    pEntrynPlus1 = pEntry(n + 1, :)';

    while (vSdiff > Settings.tolInner && iter < Settings.maxIter)

        iter = iter + 1;

        logvSn = log(vSn);

        pStayn = normcdf(logvSn, -0.5 * thetaW2, Param.omega);

        partialExp = 1 - normcdf(0.5 * thetaW2 - logvSn, 0, Param.omega);

        valueSureSurvNoEntry = (pStayn - pEntrynPlus1) .* vSn;
        vSnPrime = Param.rho * Param.demand.transMat * (flowSurplus ...
            - partialExp + valueSureSurvNoEntry + valueAdditionalEntry);

        vSdiff = max(abs(vSn - vSnPrime));

        vSn = vSnPrime;
    end

    if (iter == Settings.maxIter)
        error('value function iteration failed');
    end

    vS(n, :) = vSn;
    pStay(n, :) = pStayn;
    pEntry(n, :) = normcdf(logvSn - log(1 + Param.phi), -0.5 * thetaW2, Param.omega);
    pEntrySet(n, :) = pEntry(n, :) - pEntry(n + 1, :);

end

end
```
Listing of mixingProbabilities.m (excerpt):

```matlab
function aS = mixingProbabilities(N, C, W, vS)

aS = NaN(size(N));

for iX = 1:length(N)

    nE = N(iX);
    matCoef = zeros(nE);
    for jX = (nE - 1):-1:0
        signCoef = (-1) .^ (jX - (0:jX));
        nCk = factorial(nE - 1) / factorial(nE - 1 - jX) ./ ...
            (factorial(0:jX) .* factorial(jX - (0:jX)));
        continuationValue = (-exp(W(iX)) + vS(1:(jX + 1), C(iX)))';
        matCoef(nE - jX, 1:(jX + 1)) = signCoef .* nCk .* continuationValue;
    end

    vecCoef = sum(matCoef, 2);

    mixprobcand = roots(vecCoef);
    mixprobcand(mixprobcand < 0 | mixprobcand > 1) = [];
    mixprobcand(real(mixprobcand) ~= mixprobcand) = [];
    if (length(mixprobcand) ~= 1)
        error('The number of roots between 0 and 1 is not equal to 1.');
    end
    aS(iX) = mixprobcand;

end

end
```
Listing of likelihoodStep1.m (excerpt):

```matlab
function [ll, likeCont] = likelihoodStep1(Data, Settings, estimates)

from = Data.C(1:(Settings.tCheck - 1), 1:Settings.rCheck);
to = Data.C(2:Settings.tCheck, 1:Settings.rCheck);

likeCont = NaN(size(from(:), 1), 1);

muC = estimates(1);
sigmaC = estimates(2);

selectInteriorTransitions = to > 1 & to < Settings.cCheck;

likeCont(selectInteriorTransitions) ...
    = normcdf((Settings.logGrid(to(selectInteriorTransitions)) ...
    - Settings.logGrid(from(selectInteriorTransitions)) + Settings.d / 2 - muC) / sigmaC) ...
    - normcdf((Settings.logGrid(to(selectInteriorTransitions)) ...
    - Settings.logGrid(from(selectInteriorTransitions)) - Settings.d / 2 - muC) / sigmaC);

likeCont(to == 1) = normcdf((Settings.logGrid(1) ...
    - Settings.logGrid(from(to == 1)) + Settings.d / 2 - muC) / sigmaC);

likeCont(to == Settings.cCheck) = 1 - normcdf((Settings.logGrid(Settings.cCheck) ...
    - Settings.logGrid(from(to == Settings.cCheck)) - Settings.d / 2 - muC) / sigmaC);

ll = -sum(log(likeCont));

end
```
Listing of likelihoodStep2.m (excerpt):

```matlab
function [ll, likeCont] = likelihoodStep2(Data, Settings, Param, estimates)

Param.k = Settings.estimates2k(estimates);
Param.phi = Settings.estimates2phi(estimates);
Param.omega = Settings.estimates2omega(estimates);

[vS, pEntry, pEntrySet, pStay] = valueFunctionIteration(Settings, Param);

vec = @(x) x(:);
from = vec(Data.N(1:Settings.tCheck - 1, 1:Settings.rCheck));
to = vec(Data.N(2:Settings.tCheck, 1:Settings.rCheck));
demand = vec(Data.C(1:(Settings.tCheck - 1), 1:Settings.rCheck));

llhContributionsCaseI = zeros(size(from));
llhContributionsCaseII = zeros(size(from));
llhContributionsCaseIII = zeros(size(from));
llhContributionsCaseIV = zeros(size(from));
llhContributionsCaseV = zeros(size(from));
```