My first post! I apologize for how long it will be, but please bear with me.

I am looking for help in a topic related to Markov chain modeling with covariates.

I have a decent knowledge of the underlying theory, but I am not sure how to implement some aspects of it in practice.

First, a bit of exposition and background material that may be useful.

I want to build the transition matrix for a model with agents transitioning between states a and b, where each element of the transition matrix is parameterized by a logistic link connecting the individual features of the agents to the transitions.

Hence, say, for 2 different states a and b, one must build a stochastic matrix Theta with elements theta1, theta2, theta3, theta4, where theta1 represents the weighted transition from state a to itself, theta2 from a to b, and so on; each row sums to 1, so theta1 + theta2 = 1 and theta3 + theta4 = 1.

Theta = [theta1  theta2
         theta3  theta4]

According to the logistic link, and using the same convention in both rows (the diagonal "stay" probability takes the 1/(1 + exp(.)) form), we have that:

theta1 = 1/(1 + exp(Yab))

theta2 = exp(Yab)/(1 + exp(Yab))

theta3 = exp(Yba)/(1 + exp(Yba))

theta4 = 1/(1 + exp(Yba))

where Yij = Beta_ij,0 + Beta_ij,1*Feature1 + Beta_ij,2*Feature2, i.e., it is a logistic-regression linear predictor whose coefficients Beta are attached to the individual features of the agents transitioning from state i to state j.

The generalization to a higher number of states is easy enough.
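To fix ideas on that generalization, here is how I picture building one row of Theta for K states with a softmax (multinomial logit) link, taking the "stay" transition as the reference category. This is only a sketch: the function name and all the numbers are made up.

```r
# Sketch: one row of Theta for an agent with given features (K states).
# Convention: the "stay" transition i -> i is the reference category, so its
# linear predictor is fixed at 0 and theta_ii = 1 / (1 + sum_j exp(Y_ij)).
# For K = 2 this reduces to the logistic formulas above.
row_of_theta <- function(beta, features) {
  # beta: (K-1) x p matrix of coefficients, one row per destination j != i
  # features: length-p vector (first element 1 for the intercept)
  y     <- drop(beta %*% features)   # linear predictors Y_ij, j != i
  denom <- 1 + sum(exp(y))
  c(1 / denom, exp(y) / denom)       # (theta_ii, then theta_ij for j != i)
}

# Made-up example: 3 states, features = (intercept, Feature1, Feature2)
beta  <- rbind(c(-1.0, 0.0002,  0.5),   # coefficients for i -> second state
               c(-2.0, 0.0001, -0.3))   # coefficients for i -> third state
theta <- row_of_theta(beta, c(1, 10000, 1))
sum(theta)   # each row sums to 1 by construction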

From this you then derive, for each agent in your data frame, the individual log-likelihood.

The objective is to obtain the global log-likelihood as the sum of the agents' individual log-likelihoods, and to pass that function, along with initial parameters, to an optimizer to find the Beta_ij and, from there, the matrix Theta.

Let's say individual 1 transitions from state a to state b and then back to state a. Then, given this set of consecutive states, and letting Pi_a be the proportion of individuals initially in state a, its log-likelihood will be

LL1 = log(Pi_a) + log(theta2) + log(theta3)

= log(Pi_a) + log(exp(Yab)/(1 + exp(Yab))) + log(exp(Yba)/(1 + exp(Yba)))

= log(Pi_a) + log(exp(Beta_ab,0 + Beta_ab,1*Feature1 + Beta_ab,2*Feature2)/(1 + exp(Beta_ab,0 + Beta_ab,1*Feature1 + Beta_ab,2*Feature2))) +

log(exp(Beta_ba,0 + Beta_ba,1*Feature1 + Beta_ba,2*Feature2)/(1 + exp(Beta_ba,0 + Beta_ba,1*Feature1 + Beta_ba,2*Feature2)))

For your n agents, the log-likelihood will be

LL = Sum(LLi)
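To make the goal concrete, here is roughly what I imagine this global log-likelihood looking like in R for the two-state case. This is only my own sketch: the parameter packing, the column names, and dropping the initial-state term are all my assumptions, not working code I already have.

```r
# Sketch: global log-likelihood for the 2-state case.
# par packs the coefficients: par[1:3] = Beta_ab, par[4:6] = Beta_ba.
# The data frame needs columns Feature1, Feature2, and Transition
# (one of "aa", "ab", "ba", "bb").
global_loglik <- function(par, data) {
  X    <- cbind(1, data$Feature1, data$Feature2)  # design matrix with intercept
  y_ab <- drop(X %*% par[1:3])                    # linear predictor Y_ab
  y_ba <- drop(X %*% par[4:6])                    # linear predictor Y_ba
  ll <- ifelse(data$Transition == "ab",  y_ab - log1p(exp(y_ab)),  # log P(a -> b)
        ifelse(data$Transition == "aa", -log1p(exp(y_ab)),         # log P(a -> a)
        ifelse(data$Transition == "ba",  y_ba - log1p(exp(y_ba)),  # log P(b -> a)
                                        -log1p(exp(y_ba)))))       # log P(b -> b)
  # the log(Pi_a) terms do not involve the Betas, so I drop them here
  sum(ll)
}

# Toy rows in the same shape as my data frame below
toy <- data.frame(Feature1   = c(10000, 12000, 3000),
                  Feature2   = c(1, 0, 1),
                  Transition = c("ab", "ba", "ab"))
global_loglik(rep(0, 6), toy)   # at Beta = 0 every transition has prob 1/2

# What I would then feed to the optimizer (optim minimizes, hence the minus):
# fit <- optim(rep(0, 6), function(p) -global_loglik(p, toy), method = "BFGS")
```

Is this the right general shape, or is there a more standard way to set it up?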

OK, enough background material; on to more concrete things and some procedural doubts.

Say I have a data frame in panel-data format in R, from which I derive both the transitions and the set of features agents possess at the time they transition:

ID         Feature1  Feature2  Transition
120421006     10000         1  ab
120421006     12000         0  ba
120421006     10000         1  ab
123884392      3000         1  ab
123884392      2000         0  ba
908747738      1000         1  ab
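(In case it helps anyone reproduce this, the same rows as R code:)

```r
df <- data.frame(
  ID         = c(120421006, 120421006, 120421006, 123884392, 123884392, 908747738),
  Feature1   = c(10000, 12000, 10000, 3000, 2000, 1000),
  Feature2   = c(1, 0, 1, 1, 0, 1),
  Transition = c("ab", "ba", "ab", "ab", "ba", "ab")
)
```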

My questions:

1) How would you automate a function in R that returns the log-likelihood of each agent given its features, and then sums them into one global function whose unknowns are the Betas?

2) The function you feed into the MLE optimizer returns estimates of the Betas. How do you build the matrix Theta from the Betas, given that you need the Ys for the thetas? I suppose one approach is to evaluate at the mean features, but how exactly would you do that? And what if there are dummy variables among the features, like Feature2 above?

3) What software would you recommend for the MLE estimation if there are hundreds of states, thousands of possible transitions, and millions of agents? I tried a fully computational approach (namely, the msm package in R), but it chokes on this amount of data. Plus, it may be better to develop the code myself so I have more control over it.

Thanks in advance!