I have been trying to implement cohort diagnostics for a BgQuery dataset. I got multiple errors but one of them was mainly about not allowing an aggregation function in Group By. Anyone had a similar problem or error?
For instance running the simplest Cohort Diagnostic function:
runCohortDiagnostics(baseUrl = baseUrl,
cohortSetReference = cohortsToCreate,
connectionDetails = connectionDetails,
cdmDatabaseSchema = “SCHEMA”,
tempEmulationSchema = " SCHEMA ",
cohortDatabaseSchema = " SCHEMA ",
cohortTable = “cohorttable”,
inclusionStatisticsFolder = inclusionStatisticsFolder,
exportFolder = exportFolder,
databaseId = 'DatasetID,
runInclusionStatistics = FALSE,
runIncludedSourceConcepts = TRUE,
runOrphanConcepts = TRUE,
runTimeDistributions = TRUE,
runBreakdownIndexEvents = TRUE,
runIncidenceRate = FALSE,
runCohortOverlap = FALSE,
runCohortCharacterization = FALSE,
minCellCount = 0)
gave this error
Error:
java.sql.SQLException: [Simba]BigQueryJDBCDriver Error executing query job. Message: Column 6 contains an aggregation function, which is not allowed in GROUP BY at [53:19]
SQL:
select cohort_definition_id cohort_id
- ,period_begin*
- ,calendar_interval*
- ,count() records
- ,count(distinct subject_id) subjects*
- ,sum(date_diff(if(safe_cast(case *
-
when cohort_end_date >= period_end*
-
then period_end*
-
else cohort_end_date*
-
end as date) is null,parse_date('%Y%m%d', cast(case *
-
when cohort_end_date >= period_end*
-
then period_end*
-
else cohort_end_date*
-
end as string)),safe_cast(case *
-
when cohort_end_date >= period_end*
-
then period_end*
-
else cohort_end_date*
-
end as date)), if(safe_cast(case *
-
when cohort_start_date >= period_begin*
-
then cohort_start_date*
-
else period_begin*
-
end as date) is null,parse_date('%Y%m%d', cast(case *
-
when cohort_start_date >= period_begin*
-
then cohort_start_date*
-
else period_begin*
-
end as string)),safe_cast(case *
-
when cohort_start_date >= period_begin*
-
then cohort_start_date*
-
else period_begin*
-
end as date)), day) + 1) person_days*
- ,count(case *
-
when cohort_start_date >= period_begin*
-
and cohort_start_date <= period_end*
-
then subject_id*
-
else null*
-
end) records_incidence*
- ,count(distinct case *
-
when cohort_start_date >= period_begin*
-
and cohort_start_date <= period_end*
-
then subject_id*
-
else null*
-
end) subjects_incidence*
- from test_writable_sulieman.cohorttabletest*
inner join ofcjbawzcalendar_periods cp on ( -
cohort_start_date >= period_begin*
-
and cohort_start_date <= period_end*
-
)*
- or (*
-
cohort_end_date >= period_begin*
-
and cohort_end_date <= period_end*
-
)*
where cohort_definition_id in (101431)
- group by 2, 3, 6 ;*