Last week there was a great discussion about how to make network studies easier.
Our group needs to teach an R expert some basics of creating a study.
I wanted to share some notes on the forum.
We had trouble following the skeleton for a comparative study posted earlier.
Instead, we used an endometriosis study as an example.
Some notes I ended up explaining (that apply to many studies) were:
- To create your cohorts, you use .sql files. You can add those to your package manually or with R code. The R code lives in the extras folder, in the PackageMaintenance.R file:
https://github.com/molliemckillop/Endometriosis-Phenotype-Characterization/blob/master/extras/PackageMaintenance.R
You don't have to use a cohort.csv file. You can simply copy and paste the SQL that public Atlas outputs for a cohort.
- To test your package (beyond the README snippets), you can inspect the code in 'CodeToRun.R', like here: https://github.com/molliemckillop/Endometriosis-Phenotype-Characterization/blob/master/extras/CodeToRun.R
- You end up picking a name for your cohort table (e.g., HIVCOHORTS) and also your cohort IDs.
- You can also assume a cohort has already been created for you and start using other features right away (e.g., follow this vignette to produce a Table 1: http://ohdsi.github.io/FeatureExtraction/articles/UsingFeatureExtraction.html#creating-a-table-1 ).
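To make the notes above a bit more concrete, here is a rough sketch of the two steps (running Atlas-exported cohort SQL, then producing a Table 1). It is not taken from the endometriosis package. The function names `instantiateCohort` and `makeTable1` are made up for illustration, the `@`-parameter names are the ones I would expect in SQL exported from Atlas (check your own file), and I am assuming the cohort table already exists and that SqlRender, DatabaseConnector and FeatureExtraction are installed.

```r
# Sketch only. Assumes SqlRender, DatabaseConnector and FeatureExtraction
# are installed; nothing runs until you call the functions yourself.

# Hypothetical helper: fill in the parameters of a cohort .sql file
# (e.g. pasted from public Atlas) and run it against your database.
# Assumption: the cohort table already exists in cohortDatabaseSchema.
instantiateCohort <- function(connection, dbms, sqlFile, cdmDatabaseSchema,
                              cohortDatabaseSchema, cohortTable, cohortId) {
  sql <- SqlRender::readSql(sqlFile)
  # Parameter names below are assumed; adjust them to match your SQL file.
  sql <- SqlRender::render(sql,
                           cdm_database_schema = cdmDatabaseSchema,
                           vocabulary_database_schema = cdmDatabaseSchema,
                           target_database_schema = cohortDatabaseSchema,
                           target_cohort_table = cohortTable,
                           target_cohort_id = cohortId)
  # Translate from OHDSI SQL to the dialect of your database.
  sql <- SqlRender::translate(sql, targetDialect = dbms)
  DatabaseConnector::executeSql(connection, sql)
  invisible(sql)
}

# Hypothetical helper: Table 1 for a cohort that already exists,
# following the FeatureExtraction vignette linked above.
makeTable1 <- function(connectionDetails, cdmDatabaseSchema,
                       cohortDatabaseSchema, cohortTable, cohortId) {
  covariateSettings <- FeatureExtraction::createTable1CovariateSettings()
  covariateData <- FeatureExtraction::getDbCovariateData(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    cohortDatabaseSchema = cohortDatabaseSchema,
    cohortTable = cohortTable,
    cohortId = cohortId, # argument name may differ in newer releases
    covariateSettings = covariateSettings,
    aggregated = TRUE)
  FeatureExtraction::createTable1(covariateData)
}
```

You would call these with your own connection details, schema names, the cohort table name you picked (e.g., HIVCOHORTS) and your cohort ID. If the package versions you have installed use different argument names, the function help pages are the place to check.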
If the SimpleSQLSkeleton GitHub repo and guidance are ever created (as discussed last week), I am happy to review them every 6 months for currency (i.e., that they still work after any package updates). We also hope to run a series of HIV studies of increasing complexity, starting with some descriptive studies, since our funding mandate for an HIV theme is informatics-ish in nature.