Korean Vocab EDI issue

Rijnbeek · February 9, 2025, 6:09pm

https://athena.ohdsi.org/search-terms/terms/42388897

Is it expected that the English concept name contains Korean characters like this? The same is true for many other concepts.

Christian_Reich · February 10, 2025, 4:42am

Should not. Concept names should be English. Synonyms should contain foreign languages. Thanks for catching.

aostropolets · February 10, 2025, 5:43am

The one you’re pointing to is deprecated. If there are other valid terms like this, you could create an issue on GitHub and we will direct it to the team that’s updating the vocabulary. Our Korean colleagues have been doing an amazing job refreshing the vocabulary as community contributors (vocabulary stewards can also be seen here).

SCYou · February 11, 2025, 10:51am

Thank you @aostropolets for the clear elaboration.

Rijnbeek · February 12, 2025, 7:43pm

To be clear I am not saying they are not doing a great job :).

I assume this issue has resulted in a test that makes sure concept names cannot contain these characters in the future, since these concepts were standard before (otherwise they would not be deprecated now).
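As a rough illustration of the kind of test meant here, a minimal sketch in Python follows. It assumes the concept rows are available as (concept_id, concept_name, standard_concept) tuples with 'S' marking standard concepts, as in the OMOP CONCEPT table. The function names and the sample rows are hypothetical, and this is not the check the vocabulary team actually runs.

```python
# Unicode blocks that cover Korean (Hangul) characters.
HANGUL_RANGES = (
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x3130, 0x318F),  # Hangul Compatibility Jamo
    (0xAC00, 0xD7A3),  # Hangul Syllables
)


def contains_hangul(text):
    """Return True if any character of text falls in a Hangul block."""
    return any(
        low <= ord(ch) <= high
        for ch in text
        for (low, high) in HANGUL_RANGES
    )


def find_offending_concepts(concepts):
    """Return the ids of standard concepts whose name contains Hangul.

    concepts: iterable of (concept_id, concept_name, standard_concept).
    """
    return [
        concept_id
        for concept_id, name, standard in concepts
        if standard == "S" and contains_hangul(name)
    ]


# Hypothetical sample rows, made up for illustration only.
sample_rows = [
    (1, "Blood glucose measurement", "S"),
    (2, "검사 (laboratory test)", "S"),
    (3, "검사", None),  # non-standard concept, ignored by the check
]

offending = find_offending_concepts(sample_rows)
```

A vocabulary release check could then fail whenever the returned list is not empty. Extending the ranges to any non-ASCII character would be stricter, but it would also flag legitimate symbols in English names.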

edburn · February 14, 2025, 8:36pm

@Christian_Reich not sure if I can reopen this: Non-english characters in name of standard concept 3008995 · Issue #916 · OHDSI/Vocabulary-v5.0 · GitHub, but it would be great if names of standard concepts were UTF-8.

Christian_Reich · February 15, 2025, 12:38pm

Isn’t that a setting of your database and loading scripts? In the Vocab server, it is.

edburn · February 19, 2025, 7:22pm

To some degree, but in a network study that would often then fall on the various data partners. Moreover, if you then want to collect data out of a server to incorporate in analytics, you often hit problems like these (problem with concept names that contain non utf-8 characters · Issue #232 · darwin-eu/CodelistGenerator · GitHub, getDrugIngredientCodes and non UTF-8 characters · Issue #233 · darwin-eu/CodelistGenerator · GitHub), where hard-to-reproduce problems can happen because of the data partner’s locale and so on (which is then “solved” by converting everything into UTF-8).
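To make the "convert everything into UTF-8" workaround concrete, here is a minimal sketch in Python. It assumes the concept names arrive as raw bytes and that the data partner's legacy encoding is known. The function names are hypothetical, and cp949 (a common Korean Windows code page) is only an assumed example of such an encoding, not what the linked issues actually involved.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw: bytes, fallback_encoding: str = "cp949") -> bytes:
    """Return raw re-encoded as UTF-8.

    Bytes that are already valid UTF-8 are returned unchanged. Anything
    else is decoded with fallback_encoding, which is an assumption about
    the data partner's locale and must be set per site.
    """
    if is_valid_utf8(raw):
        return raw
    return raw.decode(fallback_encoding).encode("utf-8")


# The same Korean text as it might arrive from two differently
# configured servers.
name_from_utf8_server = "한글".encode("utf-8")
name_from_legacy_server = "한글".encode("cp949")

normalized = [to_utf8(name_from_utf8_server), to_utf8(name_from_legacy_server)]
```

The catch is the one described above: the right fallback encoding depends on each partner's setup, so a wrong guess either raises an error or silently produces garbled names.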

Christian_Reich · February 21, 2025, 12:35pm

What’s the solution?