One more question (or should I start a new topic for that?)
Unpacking the CPT4 codes overwrites the downloaded CONCEPT.csv. Should it do that? We followed the instructions in this automated email from Athena:
Important: All vocabularies are fully represented in the downloaded files with the exception of CPT-4: OHDSI does not have a distribution license to ship CPT-4 codes together with the descriptions. Therefore, we provide you with a utility that downloads the descriptions separately and merges them together with everything else. After unpacking, simply open a command line in the directory you unpacked all the files into and run “java -Dumls-user=xxx -Dumls-password=xxx -jar cpt4.jar 5”. Please replace “xxx” with UMLS username and password.
and the result was as follows:
Updated CPT4 records: 15640, records to update: 0
All cpt4 concepts are processed.
Writing updated data to CONCEPT.csv
CPT successfully updated.
So CONCEPT.csv now contains only CPT4 concepts, and its size dropped from some 800 MB to 2 MB.
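In case it helps anyone reproduce this, here is a rough sketch of how the file contents can be checked. It assumes the Athena export is tab-delimited with a header row that includes a vocabulary_id column; the function name count_vocabularies is just something made up for this post, not part of any OHDSI tool.

```python
import csv
from collections import Counter


def count_vocabularies(path, delimiter="\t"):
    """Count rows per vocabulary_id in a CONCEPT.csv-style file.

    Assumption: the file is tab-delimited and has a header row
    containing a 'vocabulary_id' column.
    """
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: treat quote characters in concept names as plain text
        reader = csv.DictReader(f, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        for row in reader:
            counts[row["vocabulary_id"]] += 1
    return counts
```

Calling count_vocabularies("CONCEPT.csv") on a copy made before running the jar and again on the file afterwards should show whether anything other than CPT4 survived the update.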