White Rabbit: frequency vs word count

Hi

While using White Rabbit on source data where one of the variables is a “text snippet” (consisting of several words), we see that the scan results do not show the frequency of each snippet but instead the “word count” for each word used in the snippets. When we reduce the source data to a set of 100 records and run White Rabbit on that, each snippet is shown with its frequency (and not the word count).

What is the “trigger” for White Rabbit to change “frequency” to “word count”?

Thanks.

Hi @Luc

From what I can tell from the White Rabbit code, the switch to word count is only made if the table contains more than 1000 records and the average length of the first 1000 values in the text column (excluding empty values) is more than 100 characters.
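
To make the rule concrete, here is a rough Python sketch of that check as I read it. This is not White Rabbit's actual code (which is Java); the function name, parameter names and the choice to take the first 1000 values before dropping empty ones are my own assumptions for illustration.

```python
def switches_to_word_count(values, row_count,
                           sample_size=1000,
                           min_rows=1000,
                           min_avg_length=100):
    """Sketch of the rule described above (hypothetical helper, not White Rabbit's API).

    values    -- the text column's values in table order
    row_count -- total number of records in the table
    """
    # Tables with 1000 records or fewer always keep per-value frequencies.
    if row_count <= min_rows:
        return False

    # Look at the first 1000 values only, ignoring empty ones (assumed order of steps).
    sample = [v for v in values[:sample_size] if v]
    if not sample:
        return False

    # Switch only if the average length is above 100 characters.
    average_length = sum(len(v) for v in sample) / len(sample)
    return average_length > min_avg_length
```

Under this reading, a reduced set of 100 records never reaches the first threshold, which would match the behaviour you saw: `switches_to_word_count(snippets, 100)` returns `False` no matter how long the snippets are.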
