First of all, thank you for the helpful comment, @Christian_Reich.
Indeed, the index event would be the major problem with this design. Since I'll extract the incidence or prevalence of cancer patients quarterly, I can extract a large comparator set: an age/sex-matched population without cancer before and after the index date (the index date will be the first day of the quarter). These patients can then be matched with cancer patients by large-scale propensity score matching.
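To make the index date rule concrete, here is a minimal sketch of assigning the first day of the quarter to any calendar date. It assumes plain Python; `quarter_start` is a hypothetical helper name for illustration, not a function from any OHDSI package.

```python
from datetime import date


def quarter_start(d: date) -> date:
    """Return the first day of the calendar quarter that contains d.

    Hypothetical helper: quarters are assumed to be calendar quarters
    starting on 1 Jan, 1 Apr, 1 Jul and 1 Oct.
    """
    # Months 1-3 map to 1, 4-6 to 4, 7-9 to 7, 10-12 to 10.
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)
```

Both the cancer patients and the comparator candidates extracted for a given quarter would then share the same index date, for example `quarter_start(date(2020, 5, 17))` gives 1 April 2020.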
As you said, doctors usually target a higher glucose level in diabetic patients with cancer than in diabetic patients without cancer, which means diabetic patients with cancer require less anti-diabetic medication and fewer tests. I cannot overcome this problem...
As you know, there is no perfect way to estimate 'net cost'. The strength of this approach is its 'scalability'. We can apply this method to estimate the 'net medical cost' of any other chronic disease, such as gout, psoriasis, rheumatoid arthritis, lupus, and so on. Or we can estimate the cost caused by diabetes mellitus in cancer patients (in this case, though, we would need to set many excluded_concept_ids for matching).
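As a rough illustration of what I mean by 'net cost', here is a minimal sketch that takes the mean cost difference within matched pairs. This is only one possible definition, and it assumes 1:1 matching with one total cost per patient over the same follow-up window; `net_cost` and the input layout are hypothetical.

```python
from statistics import mean


def net_cost(matched_pairs):
    """Mean of (patient cost - matched comparator cost) over all pairs.

    matched_pairs: iterable of (case_cost, comparator_cost) tuples,
    one tuple per matched pair (hypothetical input layout).
    """
    diffs = [case_cost - comparator_cost
             for case_cost, comparator_cost in matched_pairs]
    if not diffs:
        raise ValueError("no matched pairs supplied")
    return mean(diffs)
```

The same function would apply unchanged to another disease, since only the cohort definitions and the matching step differ.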