I have questions regarding how I can implement a custom balancing method.
I am currently trying to use the EBAL package with CohortMethod. Here’s a link to the paper.
It solves for a set of weights to apply to the control group so that its covariates are balanced.
It looks like the outcome model is fitted with the fitOutcomeModel function, and the population object contains everything, including the treatment indicator, the covariates, and the outcome.
Will I be getting the intended balancing and the final model with this modification?
Any help would be greatly appreciated. Thanks in advance!
edit:
I only roughly laid out the plan above. To be precise, I intend to use population$treatment == 0 as a filter and apply the weights to the control group only, possibly to covariateData or cyclopsData.
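To make the weighting step concrete, here is a minimal sketch of entropy balancing on the first moments only, written in plain Python rather than R so it runs without any packages. It is not the EBAL implementation: the function names (`solve_linear`, `entropy_balance`) and the toy data are made up for illustration, and it assumes the treated means lie strictly inside the range spanned by the controls. It finds control-group weights that sum to 1, reproduce the treated covariate means, and stay as close to uniform as possible.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def entropy_balance(controls, target, max_iter=100, tol=1e-10):
    """Return weights for the control rows (summing to 1) whose weighted
    covariate means equal `target`, via Newton's method on the dual."""
    k = len(target)
    # Deviation of every control row from the target means.
    dev = [[x[j] - target[j] for j in range(k)] for x in controls]

    def weights(lam):
        # Exponential tilting: w_i proportional to exp(lam . dev_i).
        scores = [sum(l * d for l, d in zip(lam, row)) for row in dev]
        top = max(scores)  # subtract the max for numerical stability
        expd = [math.exp(s - top) for s in scores]
        total = sum(expd)
        return [e / total for e in expd], top + math.log(total)

    lam = [0.0] * k
    w, loss = weights(lam)
    for _ in range(max_iter):
        # Gradient of the dual = remaining imbalance in each covariate.
        grad = [sum(wi * row[j] for wi, row in zip(w, dev)) for j in range(k)]
        if max(abs(g) for g in grad) < tol:
            break
        # Hessian of the dual = weighted covariance of the deviations.
        hess = [[sum(wi * row[a] * row[b] for wi, row in zip(w, dev))
                 - grad[a] * grad[b] for b in range(k)] for a in range(k)]
        step = solve_linear(hess, grad)
        t = 1.0
        while True:  # halve the step until the dual loss stops increasing
            cand = [l - t * s for l, s in zip(lam, step)]
            w_new, loss_new = weights(cand)
            if loss_new <= loss + 1e-12 or t < 1e-8:
                break
            t /= 2
        lam, w, loss = cand, w_new, loss_new
    return w

# Toy data (made up): each row is (age, has_prior_condition).
controls = [(30, 0), (40, 0), (50, 1), (60, 1), (70, 0), (45, 1)]
treated = [(55, 1), (65, 1), (50, 0), (60, 1)]
target = [sum(row[j] for row in treated) / len(treated) for j in range(2)]

control_weights = entropy_balance(controls, target)
```

Only the control rows receive estimated weights here; the treated rows keep their original (unit) weight, which matches the plan of filtering on treatment == 0.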
Hi @manjimin ! The population object does not contain the covariates. Those are kept separately in the CohortMethodData object, which is an Andromeda object that allows for very large data. This is because we use large-scale propensity scores (LSPS), where we include many baseline covariates (often > 100,000) in the propensity model.
There is an option to include covariates in the outcome model (possibly in addition to using the LSPS), but we typically don’t use that because the outcome is often rare.
Is the outcome model fitted with the treatment variable as the only feature then?
Would it make sense to compute the weights from the covariates in the CohortMethodData object, and then apply them to the population object before fitting the model, instead of PS matching?
Yes, we typically use variable-ratio PS matching, in which case we fit a Cox model conditioned on the matched set with the treatment as the only predictor. Alternatively, we may use 1-on-1 PS matching and not use the conditioning, which is more (statistically) efficient when both cohorts are of roughly the same size.
If I understand correctly, you want to evaluate a type of weighting of the population. The CohortMethod package supports weighting such as IPTW, but we consistently find that in real-world settings it leads to biased estimates. (Many other people use IPTW, but don’t empirically evaluate it. I guess ignorance is bliss.) For this reason we tend not to use weighting, but I welcome your research into different types of weighting.
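For readers unfamiliar with IPTW, a minimal sketch of the textbook weights follows, in plain Python. This is the standard definition, not necessarily the exact formula that CohortMethod uses internally; the function name `iptw_weights` and the `stabilized` option are illustrative assumptions.

```python
def iptw_weights(treatment, ps, stabilized=False):
    """Textbook inverse probability of treatment weights.

    treatment: list of 0/1 indicators.
    ps: list of propensity scores, P(treatment = 1 | covariates).
    """
    if len(treatment) != len(ps):
        raise ValueError("treatment and ps must have the same length")
    if any(p <= 0.0 or p >= 1.0 for p in ps):
        raise ValueError("propensity scores must lie strictly between 0 and 1")

    # Marginal probability of treatment, used only for stabilized weights.
    p_treated = sum(treatment) / len(treatment)

    weights = []
    for t, p in zip(treatment, ps):
        if t == 1:
            w = 1.0 / p            # treated: inverse of P(treated)
            if stabilized:
                w *= p_treated
        else:
            w = 1.0 / (1.0 - p)    # control: inverse of P(not treated)
            if stabilized:
                w *= 1.0 - p_treated
        weights.append(w)
    return weights
```

Propensity scores close to 0 or 1 produce very large weights, which is one commonly cited reason weighted estimates can be unstable.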
Normally, the createPs() function would add the weights as the iptw field to the population object, as you can see here. You could replace that with your own function that adds these weights. Then, when calling the fitOutcomeModel() function you can set inversePtWeighting = TRUE and it would use your weights.
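The logic of that replacement step can be sketched as follows. This is a language-agnostic illustration in plain Python, not CohortMethod code: a list of dicts stands in for the population data frame, `attach_weights` is a hypothetical helper, `rowId` is an assumed identifier column, and giving treated subjects a weight of 1 is an assumption that matches weighting the control group only.

```python
def attach_weights(population, control_weights):
    """Return a copy of `population` with an `iptw` field on every row.

    population: list of dicts with (assumed) keys 'rowId' and 'treatment'.
    control_weights: dict mapping the rowId of each control to its weight.
    """
    weighted = []
    for person in population:
        row = dict(person)  # copy so the input is left untouched
        if row["treatment"] == 1:
            row["iptw"] = 1.0  # assumption: treated subjects are not reweighted
        else:
            # A KeyError here means a control subject has no weight.
            row["iptw"] = control_weights[row["rowId"]]
        weighted.append(row)
    return weighted

# Toy population (made up) and externally computed control weights.
population = [
    {"rowId": 1, "treatment": 1},
    {"rowId": 2, "treatment": 0},
    {"rowId": 3, "treatment": 0},
]
weighted_population = attach_weights(population, {2: 0.4, 3: 1.6})
```

In R the equivalent would be a join on the identifier column before passing the population to the outcome model; the exact column names should be checked against the population object you actually have.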
I’m unsure how Entropy Balancing works. As mentioned before, we tend to use a lot of covariates when fitting the PS model. This has the advantage of not missing important confounders, and can even adjust for confounders that are only measured indirectly. Ideally, you would use the same approach with Entropy Balancing, but I’m not sure it scales to tens of thousands of covariates.