OHDSI Home | Forums | Wiki | Github

Google BigQuery support update

We only keep the one with the int64

Thanks a lot Jose!

Hi @gregk and @jposada, I’m trying to connect to BigQuery using connection details. I’m able to get it to work if I have permissions to create a service account. Has anyone been able to get DatabaseConnector to work with BigQuery when using Using a Google Account?

1 Like

yes, I use the application default credentials.

You can use this small package I created to ease the process of using those credentials.

Thanks! I’ll check it out.

Thanks, @jposada. It worked!

I have another question :slight_smile: : Have you run into issues w/ casing with OHDSI packages and BigQuery? From my understanding, BigQuery is case-sensitive, but I’ve noticed that when using CohortDiagnostics, it calls DatabaseConnector:: renderTranslateQuerySql where it renders and then translates sql. I have a schema in BigQuery that is in upper case, but when translate is run, the schema in the query is converted to lower case. Have you experienced this? When I manually call translate and then render, I get the expected SQL with the proper casing.

Tagging @schuemie and @Gowtham_Rao since they know DatabaseConnector and CohortDiagnostics. Thanks.

Yes, there is an issue with casing for table names. It usually has not bitten me because we took the internal decision of lowercasing everything we produce in BQ because of this. This decision was made not only for OHDSI but all our other Research Data Warehouse assets.

That makes sense. I unfortunately do not have that option. I’m trying to get OHDSI tools working in the All of Us Research Workbench (and we’re very close). This is the last hurdle (I think). SqlRender handles cases from what I can tell; it’s just translate that does not.

Same for Colorado as for Stanford – we moved to all lower-case table names for the same reason. For some logic that only Googler’s understand, GBQ is case sensitive for table names but not for column names. Go figure…

Since we are using OHDSI naming convention, we are okay w/ the tables and columns, but for our schema (i.e. project space) it is case sensitive…

filenames for tables on Unix file systems?

1 Like

It is common for us, that build file structures in *nix systems, to only use lowercase directory/file names and use underscores in place of whitespace, for this exact reason.

And after you get used to it, it’s hard to try something else.

I know this is an ANCIENT issue but there now is an option that makes BQ case insensitive:

alter schema project_id.dataset
set options (is_case_insensitive = TRUE);

It has mildly adjusted my love-hate relationship with BigQuery…

3 Likes

Hi Michael,

Thank you for this tip. I’m trying to follow your instructions. But since I’m not used to work on google cloud platform where my database biquery is hosted. I don’t know where to run this code.
On google cloud I don’t have the permissions to modify the database. Is there a way to run this code in my R environment ?

Thanks,

@MeriemGrs If you do not have permissions, the only option is to request this change internally. I assume your IT folks could help you.

1 Like

Ok than I will do that. Thanks Konstantin !

@MeriemGrs: I am not sure what permissions are required to allow you to execute the “alter schema…” command. I executed that statement only once as a stand-alone command in the Google BigQuery SQL console. If I understand your post, you tried this (or something similar) and had the command rejected. I am pretty sure putting it into an R script (for example, adding it to the SQL statements in R) would result in the same rejection. I agree with @Konstantin_Yaroshove that if executing the commend in the Google console is rejected for you, your only option is to ask the BigQuery owners to change it for you. It is a one-time request but (unfortunately) for each projectid.dataset.

Your IT folks should not give you any push-back as it should not evoke any additional security concerns. If they refuse, your only remaining approach is to manually change the table names in all SQL scripts that you receive into the upper/lower case that you use. Prior to discovering the new command, I wrote a simple SED script that did this for me for all of the OMOP tables.

1 Like

Seems I cannot post files so here is my TRIVIAL script that converts OMOP table names that are in uppercase to lowercase because our OMOP tables are in lowercase. It assumes that the original table names are prepended with “Schema” (as is commonly used as “@cdmSchema” or “@resultsSchema”. It also assumes the three files ORIGINAL_bigquery.sql, lower_case_OMOP_tablenames.sh, and lc_tables.sed, are in the same directory. There are more sophisticated scripts that can do much more than this. I only created this utility for my personal use so I wasn’t fancy about it…

To be clear, after I was able to run the “update schema…” command, I no longer need to run this SED script.

lower_case_OMOP_tablenames.sh:

#!/bin/bash
sed -f ./lc_tables.sed ORIGINAL_bigquery.sql > ORIGINAL_bigquery_lc.sql

lc_tables.sed:

s/Schema.PERSON/Schema.person/g
s/Schema.OBSERVATION_PERIOD/Schema.observation_period/g
s/Schema.VISIT_OCCURRENCE/Schema.visit_occurrence/g
s/Schema.CONDITION_OCCURRENCE/Schema.condition_occurrence/g
s/Schema.DRUG_EXPOSURE/Schema.drug_exposure/g
s/Schema.PROCEDURE_OCCURRENCE/Schema.procedure_occurrence/g
s/Schema.MEASUREMENT/Schema.measurement/g
s/Schema.OBSERVATION/Schema.observation/g
s/Schema.LOCATION/Schema.location/g
s/Schema.CARE_SITE/Schema.care_site/g
s/Schema.PROVIDER/Schema.provider/g
s/Schema.DEATH/Schema.death/g
s/Schema.DRUG_ERA/Schema.drug_era/g
s/Schema.CONDITION_ERA/Schema.condition_era/g
s/Schema.OBSERVATION_PERIOD/Schema.observation_period/g
s/Schema.DRUG_ERA/Schema.drug_era/g
s/Schema.CONDITION_ERA/Schema.condition_era/g

1 Like

Thank you all for your answers and explanations.
I managed to run the “alter schema” and that was very helpfull.

t