Google BigQuery support update

jposada · March 15, 2021, 4:06pm

We only keep the one with the int64

anoop.yamsani · March 30, 2021, 3:39am

Thanks a lot Jose!

cukarthik · July 11, 2021, 10:49pm

Hi @gregk and @jposada, I’m trying to connect to BigQuery using connection details. I’m able to get it to work if I have permissions to create a service account. Has anyone been able to get DatabaseConnector to work with BigQuery when using Using a Google Account?

jposada · July 12, 2021, 10:57pm

yes, I use the application default credentials.

You can use this small package I created to ease the process of using those credentials.

cukarthik · July 12, 2021, 6:01pm

Thanks! I’ll check it out.

cukarthik · July 13, 2021, 2:57pm

Thanks, @jposada. It worked!

I have another question : Have you run into issues w/ casing with OHDSI packages and BigQuery? From my understanding, BigQuery is case-sensitive, but I’ve noticed that when using CohortDiagnostics, it calls DatabaseConnector:: renderTranslateQuerySql where it renders and then translates sql. I have a schema in BigQuery that is in upper case, but when translate is run, the schema in the query is converted to lower case. Have you experienced this? When I manually call translate and then render, I get the expected SQL with the proper casing.

Tagging @schuemie and @Gowtham_Rao since they know DatabaseConnector and CohortDiagnostics. Thanks.

jposada · July 13, 2021, 2:59pm

Yes, there is an issue with casing for table names. It usually has not bitten me because we took the internal decision of lowercasing everything we produce in BQ because of this. This decision was made not only for OHDSI but all our other Research Data Warehouse assets.

cukarthik · July 13, 2021, 3:01pm

That makes sense. I unfortunately do not have that option. I’m trying to get OHDSI tools working in the All of Us Research Workbench (and we’re very close). This is the last hurdle (I think). SqlRender handles cases from what I can tell; it’s just translate that does not.

mgkahn · July 13, 2021, 3:01pm

Same for Colorado as for Stanford – we moved to all lower-case table names for the same reason. For some logic that only Googler’s understand, GBQ is case sensitive for table names but not for column names. Go figure…

cukarthik · July 13, 2021, 3:02pm

Since we are using OHDSI naming convention, we are okay w/ the tables and columns, but for our schema (i.e. project space) it is case sensitive…

CRoeder · July 15, 2021, 4:50pm

filenames for tables on Unix file systems?

Mark · July 15, 2021, 5:15pm

It is common for us, that build file structures in *nix systems, to only use lowercase directory/file names and use underscores in place of whitespace, for this exact reason.

sofomon · September 2, 2021, 12:05pm

And after you get used to it, it’s hard to try something else.

mgkahn · July 20, 2023, 6:46pm

I know this is an ANCIENT issue but there now is an option that makes BQ case insensitive:

alter schema project_id.dataset
set options (is_case_insensitive = TRUE);

It has mildly adjusted my love-hate relationship with BigQuery…

MeriemGrs · March 14, 2024, 9:27am

Hi Michael,

Thank you for this tip. I’m trying to follow your instructions. But since I’m not used to work on google cloud platform where my database biquery is hosted. I don’t know where to run this code.
On google cloud I don’t have the permissions to modify the database. Is there a way to run this code in my R environment ?

Thanks,

Konstantin_Yaroshove · March 14, 2024, 10:55am

@MeriemGrs If you do not have permissions, the only option is to request this change internally. I assume your IT folks could help you.

MeriemGrs · March 14, 2024, 12:06pm

Ok than I will do that. Thanks Konstantin !

mgkahn · March 14, 2024, 3:51pm

@MeriemGrs: I am not sure what permissions are required to allow you to execute the “alter schema…” command. I executed that statement only once as a stand-alone command in the Google BigQuery SQL console. If I understand your post, you tried this (or something similar) and had the command rejected. I am pretty sure putting it into an R script (for example, adding it to the SQL statements in R) would result in the same rejection. I agree with @Konstantin_Yaroshove that if executing the commend in the Google console is rejected for you, your only option is to ask the BigQuery owners to change it for you. It is a one-time request but (unfortunately) for each projectid.dataset.

Your IT folks should not give you any push-back as it should not evoke any additional security concerns. If they refuse, your only remaining approach is to manually change the table names in all SQL scripts that you receive into the upper/lower case that you use. Prior to discovering the new command, I wrote a simple SED script that did this for me for all of the OMOP tables.

mgkahn · March 14, 2024, 4:08pm

Seems I cannot post files so here is my TRIVIAL script that converts OMOP table names that are in uppercase to lowercase because our OMOP tables are in lowercase. It assumes that the original table names are prepended with “Schema” (as is commonly used as “@cdmSchema” or “@resultsSchema”. It also assumes the three files ORIGINAL_bigquery.sql, lower_case_OMOP_tablenames.sh, and lc_tables.sed, are in the same directory. There are more sophisticated scripts that can do much more than this. I only created this utility for my personal use so I wasn’t fancy about it…

To be clear, after I was able to run the “update schema…” command, I no longer need to run this SED script.

lower_case_OMOP_tablenames.sh:

#!/bin/bash
sed -f ./lc_tables.sed ORIGINAL_bigquery.sql > ORIGINAL_bigquery_lc.sql

lc_tables.sed:

s/Schema.PERSON/Schema.person/g
s/Schema.OBSERVATION_PERIOD/Schema.observation_period/g
s/Schema.VISIT_OCCURRENCE/Schema.visit_occurrence/g
s/Schema.CONDITION_OCCURRENCE/Schema.condition_occurrence/g
s/Schema.DRUG_EXPOSURE/Schema.drug_exposure/g
s/Schema.PROCEDURE_OCCURRENCE/Schema.procedure_occurrence/g
s/Schema.MEASUREMENT/Schema.measurement/g
s/Schema.OBSERVATION/Schema.observation/g
s/Schema.LOCATION/Schema.location/g
s/Schema.CARE_SITE/Schema.care_site/g
s/Schema.PROVIDER/Schema.provider/g
s/Schema.DEATH/Schema.death/g
s/Schema.DRUG_ERA/Schema.drug_era/g
s/Schema.CONDITION_ERA/Schema.condition_era/g
s/Schema.OBSERVATION_PERIOD/Schema.observation_period/g
s/Schema.DRUG_ERA/Schema.drug_era/g
s/Schema.CONDITION_ERA/Schema.condition_era/g

MeriemGrs · March 19, 2024, 8:42am

Thank you all for your answers and explanations.
I managed to run the “alter schema” and that was very helpfull.