
Error in getDbCohortMethodData()

Hello. I'm one of the people who wants to study the CDM a lot, even though it's harder than I expected.

When I run this code, I get the error below.

cmData <- getDbCohortMethodData(connectionDetails = connDetails,
                                cdmDatabaseSchema = cdmDbSchema,
                                oracleTempSchema = NULL,
                                targetId = 1,
                                comparatorId = 2,
                                outcomeIds = 3,
                                studyStartDate = "",
                                studyEndDate = "",
                                exposureDatabaseSchema = cohortDbSchema,
                                exposureTable = cohortTable,
                                outcomeDatabaseSchema = cohortDbSchema,
                                outcomeTable = cohortTable,
                                cdmVersion = cdmVersion,
                                firstExposureOnly = FALSE,
                                removeDuplicateSubjects = FALSE,
                                restrictToCommonPeriod = FALSE,
                                washoutPeriod = 0,
                                covariateSettings = FALSE)

|===================================================================== | 50%
Error in nchar(object, type = "chars") :
  invalid multibyte string, element 4

What should I do??

@Minjin Could you share the error message in the log file?

Error in nchar(object, type = "chars") : invalid multibyte string, element 4

This is the error message. The table doesn't contain any languages other than English.

@SCYou The cohort table has 4 columns, and their names are Person_id, cohort_definition_id, start_date and end_date. All of these variables consist of numeric values…

@Minjin Usually, the cohort table has 4 columns: ‘subject_id’ (and not person_id), ‘cohort_definition_id’, ‘start_date’, and ‘end_date’.

Perhaps the number is not an actual half-width digit but a full-width digit in UTF-8 or Korean. I have had cases where a hidden character sat in the middle of a string. It was not displayable on an English-language computer, but it was there, and it blocked me from copying data into the database.

We cannot know unless you post the raw data, and you should not post it. You will likely have to figure it out yourself.

One way to figure it out is to go to that part of the source code and print the data line by line.
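
As a rough illustration of the "print the data and look for hidden characters" idea, here is a minimal R sketch. The helper name `find_suspect_strings` and the sample vector `ids` are hypothetical and not from this thread. It only uses base R (`validUTF8()` and `iconv()`), and it assumes the values can be pulled into R as a character vector first.

```r
# Hypothetical helper: flag elements that are not valid UTF-8 or that
# contain non-ASCII characters (for example full-width digits).
find_suspect_strings <- function(x) {
  x <- as.character(x)
  invalid_utf8 <- !is.na(x) & !validUTF8(x)
  # iconv() returns NA for elements it cannot convert to plain ASCII
  non_ascii <- !is.na(x) & is.na(iconv(x, from = "UTF-8", to = "ASCII"))
  flagged <- invalid_utf8 | non_ascii
  data.frame(index = which(flagged),
             invalid_utf8 = invalid_utf8[flagged],
             non_ascii = non_ascii[flagged])
}

# Made-up sample values: element 2 uses full-width digits,
# element 4 contains a byte that is not valid UTF-8.
ids <- c("1000001", "\uff11\uff10", "ok", "bad\xff")
suspects <- find_suspect_strings(ids)
print(suspects)
```

Running the same check over each column of the cohort table, and over its column names, would show which element R cannot read as a multibyte string.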

@SCYou I thought it was mapped automatically, but it still has the same problem… I need to figure out how to solve this…

@lychenus13 Thank you for your reply! The subject ID could be the problem. For instance, the ID is 1000001 but it is displayed as 1.00e+01. What do you think?

I think it could be solved by running the following command in R:
Sys.setlocale(category="LC_CTYPE", locale="C")
options(scipen=5)
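
For readers wondering what the `scipen` option changes: it is a penalty R applies when deciding between fixed and scientific notation for printing numbers. The following is a small base R sketch of that behavior only. The value `1e6` is just an example, and whether this affects the IDs in this thread is an assumption.

```r
# Remember the current setting so it can be restored afterwards
old <- options(scipen = 0)

sci <- format(1e6)      # with the default penalty, the shorter form wins
print(sci)

options(scipen = 5)     # a positive penalty favors fixed notation
fixed <- format(1e6)
print(fixed)

options(old)            # restore the previous setting
```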

@Jaehyeong_Cho Thank you for your reply. Unfortunately, when I run that code, I get another problem. The error below occurs.

Error in rJava::.jcall(p, "Ljava/lang/Object;", "setProperty", names(properties)[i], : Unable to start conversion to UTF-16

@Jaehyeong_Cho I think I have a problem with rJava or rlang. If you don't mind, could you tell me which versions of rJava and rlang you use?
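
When comparing setups like this, it can help to post the package versions and the active locale together. The snippet below is one possible way to collect them with base R only. `pkg_version` is a hypothetical helper that returns NA when a package is not installed.

```r
# Hypothetical helper: version string of an installed package, or NA
pkg_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

versions <- vapply(c("rJava", "rlang"), pkg_version, character(1))
print(versions)
print(R.version.string)           # R version in use
print(Sys.getlocale("LC_CTYPE"))  # character-type locale currently set
```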

The problem has been solved!! Thank you for all of your replies! @Jaehyeong_Cho @lychenus13 @SCYou

@Minjin Could you share how you solved this problem?

@SCYou The main problem was the column names. As you said, the cohort table has 4 columns: ‘subject_id’ (and not person_id), ‘cohort_definition_id’, ‘start_date’, and ‘end_date’. So I changed all of the names from
person_id, cohort_definition_id, start_date, end_date
to
subject_id, cohort_definition_id, cohort_start_date, cohort_end_date

and it works!
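
A minimal sketch of that renaming step in R, assuming the cohort data is held in a data frame before it is written to the database. The data frame `cohort` and its values are made up for illustration; only the old and new column names come from the post above.

```r
# Made-up cohort data using the original (problematic) column names
cohort <- data.frame(
  person_id = c(1000001, 1000002),
  cohort_definition_id = c(1, 2),
  start_date = as.Date(c("2015-01-01", "2015-02-01")),
  end_date = as.Date(c("2015-06-30", "2015-07-31"))
)

# Old name -> new name, as described in the post
rename_map <- c(person_id = "subject_id",
                start_date = "cohort_start_date",
                end_date = "cohort_end_date")

# Rename only the columns that appear in the map
hits <- names(cohort) %in% names(rename_map)
names(cohort)[hits] <- unname(rename_map[names(cohort)[hits]])

print(names(cohort))
```

If the table already lives in the database, the same renaming would have to be done there instead, using that database's own syntax.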

Plus, I ran this code too.

Sys.setlocale(category="LC_CTYPE", locale="us")
options(scipen=100)

Thanks to @Jaehyeong_Cho for the idea :)
