Ok, makes sense. Well, your algorithm has to take into account that there is no such thing as a linear condition-drug-outcome chain of events. There are only probabilities between the steps. In real life, we don't know what they are. In a simulated life we do, but we don't know how realistic they are.
Think about it: The simplistic logic is that a drug gets prescribed to treat a certain condition. Often this is the case (high blood pressure leads to the prescription of an anti-hypertensive drug), but in other cases it is not (beta blockers also help you calm down, which may or may not be the intended effect). But we rarely capture any of this. The doctor just prescribes "what's good for you" and doesn't bother explaining too much. So much for the first half of the triple.
The second half, the relationship between treatment and outcome, is even murkier. Our bodies heal themselves, and it is very hard to say for sure what it was that made the patient get better. In some cases, we know for sure it wasn't the drug (like antibiotics prescribed for the common cold, which is viral in most cases). As a result, we rarely try to resolve this for an individual patient, but instead look for causal effects in the entire population of patients who share the features we are studying. It's called population-level estimation, and there is a whole part of OHDSI building methods and tools to help with that.
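To make the population idea concrete, here is a rough Python sketch. It is not an OHDSI method or tool, just a toy: every probability is invented, the function names are hypothetical, and treatment is assigned at random, so there is none of the confounding you would face in real data. It shows that you cannot tell why any single patient recovered, but you can compare recovery rates between treated and untreated groups.

```python
import random

def simulate_population(n, p_treated, p_recover_untreated, p_recover_treated, seed=0):
    """Return n (treated, recovered) pairs. All probabilities are invented inputs."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        # Assumption: treatment is assigned at random (no confounding).
        treated = rng.random() < p_treated
        # Untreated patients still recover at a baseline rate (self-healing).
        p_recover = p_recover_treated if treated else p_recover_untreated
        patients.append((treated, rng.random() < p_recover))
    return patients

def risk_difference(patients):
    """Recovery rate among treated minus recovery rate among untreated."""
    treated = [recovered for was_treated, recovered in patients if was_treated]
    untreated = [recovered for was_treated, recovered in patients if not was_treated]
    return sum(treated) / len(treated) - sum(untreated) / len(untreated)

# A drug that adds something on top of self-healing ...
effective = simulate_population(20000, 0.5, 0.60, 0.70, seed=1)
# ... and one that adds nothing (think antibiotics for a viral cold).
useless = simulate_population(20000, 0.5, 0.60, 0.60, seed=2)

print("effective drug:", round(risk_difference(effective), 3))
print("useless drug:  ", round(risk_difference(useless), 3))
```

In both runs most treated patients get better, so looking at individuals tells you nothing. Only the difference between the two groups separates the drug effect from self-healing.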
So much for the theory. In your particular case, you can use OSIM2 to play with this. Two things:
- There is no difference between Condition and Outcome. Outcome just means you get more or fewer Conditions (the rate changes), or a higher or lower degree of a Condition (the quantity of something changes), or faster or slower time to onset or end of a Condition. So, your triple is Condition-Drug-Condition.
- OSIM2 lets you model both halves. But you need the transition matrix in order to create realistic first halves. The transition matrix behind the data on the FTP site is gone and cannot be reproduced (it was based on a database from a few years ago, which nobody has anymore).
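To illustrate the two points above, here is a minimal Python sketch of a Condition-Drug-Condition triple driven by transition probabilities. This is not OSIM2's actual format or algorithm. The function names, the observed pairs and the state labels are all made up for illustration. The "matrix" is simply estimated by counting how often each target follows each source, which is why you need source data to build one.

```python
import random
from collections import Counter, defaultdict

def estimate_matrix(pairs):
    """Estimate transition probabilities from observed (source, target) pairs."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return {
        source: {target: c / sum(row.values()) for target, c in row.items()}
        for source, row in counts.items()
    }

def sample_next(matrix, state, rng):
    """Draw the next state according to the probabilities in matrix[state]."""
    row = matrix[state]
    targets = list(row)
    return rng.choices(targets, weights=[row[t] for t in targets], k=1)[0]

def simulate_triple(first_half, second_half, condition, rng):
    """Simulate one Condition-Drug-Condition triple."""
    drug = sample_next(first_half, condition, rng)
    outcome = sample_next(second_half, drug, rng)
    return condition, drug, outcome

# Invented observations standing in for a real source database.
condition_drug_pairs = (
    [("hypertension", "beta_blocker")] * 6
    + [("hypertension", "ace_inhibitor")] * 4
    + [("anxiety", "beta_blocker")] * 2
    + [("anxiety", "no_drug")] * 8
)
drug_condition_pairs = (
    [("beta_blocker", "hypertension")] * 3
    + [("beta_blocker", "no_condition")] * 5
    + [("ace_inhibitor", "hypertension")] * 2
    + [("ace_inhibitor", "cough")] * 2
    + [("no_drug", "anxiety")] * 8
)

first_half = estimate_matrix(condition_drug_pairs)   # Condition -> Drug
second_half = estimate_matrix(drug_condition_pairs)  # Drug -> Condition

rng = random.Random(42)
for _ in range(5):
    print(simulate_triple(first_half, second_half, "hypertension", rng))
```

Note that the outcome is drawn from the same kind of states as the starting point (Conditions), and that the simulated triples can only be as realistic as the pairs the matrices were estimated from.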
So, you need to recreate the whole thing. For that, you need access to data. Either you already have data, or you need to get it from organizations that sell it, like QuintilesIMS (where I work), Truven or Optum. These places usually want money for that, but can be convinced to give you access for free in exchange for some useful artifact, insight or paper. One such artifact could be the simulated data itself, which you would provide back to them, especially if it contains more than just Condition and Drug.
Does that help?