Phoebe 2.0

Chris_Knoll · September 20, 2022, 3:38pm

Atlas 2.12.0 includes PHOEBE 2.0: concept recommendations. This function is available in the concept set editor under the ‘Recommend’ tab. However, for this function to work, a custom table, concept_recommended, must be created in your CDM schema alongside your other vocabulary tables and initialized with the concept recommendations.

These recommendations are provided as a .csv file and table DDL here.

If your environment does not have the concept_recommended table, Atlas will present a message directing you to this forum post, which has information about the latest version of the concept recommendations.

The current concept recommendations are in the file concept_recommended_20221006.zip.

Ajit_Londhe · October 7, 2022, 1:06pm

This is awesome work, and I know it will be a huge help for concept set authors both novice and expert!

Ajit_Londhe · March 22, 2023, 9:54pm

@Chris_Knoll could we host these files elsewhere, perhaps in a new OHDSI GitHub repo?

Chris_Knoll · March 23, 2023, 1:26pm

My understanding is that this will be provided as a standard part of the vocabulary download (i.e., along with concept, concept_ancestor, concept_relationship, etc.).

So, for now, this is a temporary arrangement until it becomes part of the formal process.

@Patrick_Ryan and @aostropolets, is there any ETA on when this table will be included?

Ajit_Londhe · March 23, 2023, 1:42pm

Gotcha. I have a new branch of Broadsea in progress that will load this into an OMOP vocabulary schema in Postgres. Any issues with these files being part of that for now?

Chris_Knoll · March 23, 2023, 1:57pm

No, I think it makes good sense to bundle those files/tables together in your broadsea deployment.

Christian_Reich · March 24, 2023, 10:44am

Will it, @Chris_Knoll? Isn’t the problem that the recommendations depend on the data that created them? How do we solve that problem? Or are you thinking of a generic or community set of recommendations?

Chris_Knoll · March 24, 2023, 2:24pm

A generic/community set of recommendations by default. The main post on this points to a file that was generated on some set of data, so the vocabulary providing this table by default would be no different. If the Phoebe authors want to release instructions on how to derive your own concept recommendations from local data, they could do that, and then data owners could simply replace or augment the default recommendations with their own.

aostropolets · March 24, 2023, 4:49pm

The only data-specific piece may be the patient context-derived recommendations. The rest is data-agnostic, and users can prioritize their review based on the network counts or their local data source counts.

On where to store the file and how to distribute it: there have been discussions, but we haven’t settled on anything yet. If users have a preference, I’d gladly hear it here.

Since I’m planning to work on enhancing the recommendations, that would be a good time to decide where the file should live as well.

Chris_Knoll · March 24, 2023, 5:52pm

My preference would be that the vocabulary includes this table as part of the vocabulary download, but with an option: if people want to get updated recommendations or generate their own patient-context recommendations, they can generate those for themselves (via a new R package that can be hosted in a git repo).

Maybe if there is going to be an R package to do this, the ‘default’ file can be saved inside this repo.

Ajit_Londhe · March 24, 2023, 7:02pm

This sounds good to me, very flexible. Plus we can pull these files into Broadsea for easier deployments.

aostropolets · March 24, 2023, 7:12pm

Ajit, are you using Phoebe? And is the current option of downloading the file from the forum not flexible enough? I want to make sure I get it right.

Ajit_Londhe · March 24, 2023, 7:32pm

I think since it is such a key feature in Atlas 2.12+ (with a dedicated UI), it’s important that sites know where to find it.

To me, a more formal place would be GitHub and/or the Athena downloads. Plus, that gives users a release lifecycle they can follow.

Ajit_Londhe · March 25, 2023, 1:07am

Okay, I forgot the CSV is 130 MB. But zipped, it’s 30 MB, which is fine for GitHub (the per-file max is 100 MB).

aostropolets · March 25, 2023, 4:17am

It makes sense in general that it would be a more formal place. I’m not sure what it would be. GitHub makes sense to me personally if it’s an R package. Vocab pack if it is not. Not sure if it requires additional tweaking on Athena side, gotta ask.

Ajit_Londhe · March 27, 2023, 1:43pm

Well, for now, I have the zip file of the CSV itself here in my fork of Broadsea, along with a bash script:

MaximMoinat · July 17, 2023, 5:50pm

@aostropolets Would you have more information about how the patient context works? All I could find is this snippet from the Phoebe 2.0 manual. It hints that a different source, rather than the network counts from Phoebe 1.0, was used for this.

PHOEBE 2.0:
CREATING COMPREHENSIVE CONCEPT SETS
Now with:
• enhanced lexical match (with lemmatization, bigrams, conversion to a common part of speech, tf-idf and pairwise cosine similarity)
• patient context (pairwise cosine similarity based on the vectors of the concept co-occurring with a given concept)
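
To make the "pairwise cosine similarity" over co-occurrence vectors in that snippet more concrete, here is a minimal Python sketch. It is not the Phoebe implementation, which is not shown in this thread. The concept labels, the counts, and the function name cosine_similarity are all made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as {key: weight} dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical co-occurrence counts: concept -> {co-occurring concept: count}.
cooccurrence = {
    "A": {"X": 2, "Y": 1},
    "B": {"X": 4, "Y": 2},  # same co-occurrence profile as A, twice the volume
    "C": {"Z": 3},          # shares no co-occurring concepts with A
}

sim_ab = cosine_similarity(cooccurrence["A"], cooccurrence["B"])
sim_ac = cosine_similarity(cooccurrence["A"], cooccurrence["C"])
print(sim_ab, sim_ac)
```

In this toy data, A and B co-occur with the same concepts in the same proportions, so their similarity is close to 1, while A and C share nothing and score 0. Under this reading, a concept would be recommended when its co-occurrence profile resembles that of a concept already in the set.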

Would be great to know e.g. on which dataset(s) the patient context was derived. Thanks in advance.

Linking in a post where another user had the same question: PHOEBE recommendations and standard codes

aostropolets · July 19, 2023, 6:27pm

Thanks, Maxim. I’m working on more extended docs along with code and will share the links once I’m done. The counts for all recommendations come from the network (or from your data source if you use Phoebe on your local Atlas instance). Lexical and ontology recommendations were generated from the vocabulary tables; patient context ones were generated from patient data in a collection of US claims and EHR data sources. We can run the code against your data source if you are interested in enhancing the recommendations with ones specific to it.

Chris_Knoll · May 28, 2024, 3:49pm

The recommendations file has been updated. For the latest recommendations, use this file:
concept_recommended_20240527.zip

I do not have permissions to edit the original post with this updated file.

jcabrerazuniga · May 29, 2024, 4:16pm

While loading the latest concept_recommended_20240527.csv, I ran into the following error:

ERROR: value too long for type character varying(20)
CONTEXT: COPY concept_recommended, line 4938309, column relationship_id: "Ontology-relationship"

So the following DDL needs to be updated, because character varying(20) is too narrow for the 21-character value “Ontology-relationship”:
CREATE TABLE {schema}.concept_recommended
(
concept_id_1 bigint,
concept_id_2 bigint,
relationship_id character varying(20)
)
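
One way to pick a safe width before re-creating the table is to scan the CSV for the longest relationship_id value. The Python sketch below does that, with some assumptions: the file is taken to be comma-delimited with a header row naming a relationship_id column, which is not confirmed in this thread. The helper names max_field_length and build_ddl, the schema name vocab, and the sample rows are hypothetical.

```python
import csv
import io


def max_field_length(csv_file, column):
    """Return the length of the longest value in `column` of a headered CSV."""
    reader = csv.DictReader(csv_file)
    return max((len(row[column]) for row in reader), default=0)


def build_ddl(schema, relationship_width):
    """Render the DDL from this post with a configurable relationship_id width."""
    return (
        f"CREATE TABLE {schema}.concept_recommended\n"
        "(\n"
        "    concept_id_1 bigint,\n"
        "    concept_id_2 bigint,\n"
        f"    relationship_id character varying({relationship_width})\n"
        ");"
    )


# Tiny in-memory stand-in for the real CSV (illustrative rows only).
sample = io.StringIO(
    "concept_id_1,concept_id_2,relationship_id\n"
    "1,2,example\n"
    "3,4,Ontology-relationship\n"
)
width = max_field_length(sample, "relationship_id")
print(build_ddl("vocab", width))
```

For the real file, the same function could be given a handle from open(path, newline=""). Adding some headroom to the measured width, or declaring the column as text in Postgres, would avoid the same error if a later release introduces longer values.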

Thanks