OHDSI Home | Forums | Wiki | Github

Achilles Heel installation error

Hello

I got stuck when installing the Achilles with R. After long query of Achilles, Achilles Heel installation stopped with error message. Here, I would like to attach the R codes and Error report. Can you help me solve the problem? Thank you.

-R codes-
achillesResults <- Achilles::achilles(connectionDetails,
cdmDatabaseSchema = cdmDatabaseSchema,
oracleTempSchema = NULL,
resultsDatabaseSchema = resultsDatabaseSchema,
sourceName = db,
#analysisIds = c(691,791,891,1891),
createTable = TRUE,
smallcellcount = 5,
cdmVersion = “5”,
runHeel = TRUE,
validateSchema = TRUE,
vocabDatabaseSchema = vocabDatabaseSchema,
runCostAnalysis = FALSE,
sqlOnly = FALSE,
conceptHierarchy = TRUE,
createIndices = TRUE)

-Error message-
Executing Achilles Heel. This could take a while
|== | 5%Error: Error executing SQL:
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid column name ‘measure_id’.
An error report has been created at C:/Users/Administrator/Documents/errorReport.txt

-Error Report-
DBMS:
sql server

Error:
com.microsoft.sqlserver.jdbc.SQLServerException: Invalid column name ‘measure_id’.

SQL:
insert into RESULT.DBO.ACHILLES_results_derived (analysis_id, stratum_1, statistic_value,measure_id)
select
–100000+analysis_id,
NULL as analysis_id,
stratum_2 as stratum_1,
sum(count_value) as statistic_value,
CAST(CONCAT(‘ach_’,CAST(analysis_id as VARCHAR),’:GlobalCnt’) AS VARCHAR(255)) as measure_id
from RESULT.achilles_results
where analysis_id in(1805,705,605,805,405) group by analysis_id,stratum_2,measure_id

R version:
R version 3.4.3 (2017-11-30)

Platform:
x86_64-w64-mingw32

Attached base packages:

  • stats
  • graphics
  • grDevices
  • utils
  • datasets
  • methods
  • base

Other attached packages:

  • Achilles (1.5)
  • rjson (0.2.15)
  • DatabaseConnector (2.0.5)
  • SqlRender (1.4.8)
  • devtools (1.13.5)

So, this is interesting, and the SQL Server is correct: there’s no ‘measure_id’ to group by, the measure_id is the computed column. I think in sql server, you would have to group by the expression that produced the measure_id:

CAST(CONCAT('ach_',CAST(analysis_id as VARCHAR),':GlobalCnt') AS VARCHAR(255))

but I think this has problems on other paltforms.

I believe the recommended approach to handle this is: the computed columns that want to be grouped-by must be calculated in a sub-query and then referenced outside as the column ie:

insert into RESULT.DBO.ACHILLES_results_derived (analysis_id, stratum_1, statistic_value,measure_id) 
select 
--100000+analysis_id, 
NULL as analysis_id,
stratum_2 as stratum_1,
sum(count_value) as statistic_value,
CAST(CONCAT('ach_',CAST(analysis_id as VARCHAR),':GlobalCnt') AS VARCHAR(255)) as measure_id
from RESULT.achilles_results 
where analysis_id in(1805,705,605,805,405) group by analysis_id,stratum_2,measure_id

should be

insert into RESULT.DBO.ACHILLES_results_derived (analysis_id, stratum_1, statistic_value,measure_id) 
select 
--100000+analysis_id, 
NULL as analysis_id,
stratum_1,
sum(count_value) as statistic_value,
measure_id
FROM (
  select stratum_2 as stratum_1, count_value,  
    CAST(CONCAT('ach_',CAST(analysis_id as VARCHAR),':GlobalCnt') AS VARCHAR(255)) as measure_id
  FROM RESULT.achilles_results 
  where analysis_id in(1805,705,605,805,405)
) Q
group by stratum_1, measure_id

I think since analysis_id is a static value, you don’t need to provide it int he group_by clause

Thanks @Chris_Knoll.

@estone96 Thank you for letting us know about this. Could you open an issue here? Please include in your description the version of Achilles that were you installing.

Thank you very much.
As @Chris_Knoll suggested, I edited sql code and executed again. Fortunately, it seems to work. However, when executing, I got another error “There is already an object named ‘#tempResults’ in the database.”. But, there is no #tempResult table in my Tempdb database. How can I solve it?

Not sure what the #tempResult table is used for, but temp tables shoudl be cleared between connections. is it possible you could restart your R session and jsut execute the HEEL reports? It’s possible that the heel query is maybe re-using a temp table but it’s tryign to create the same temp table with the same name… if restarting the R session (which will restart the connection, drop any temp table you have created) does not resolve the problem, please let me know.

@Chris_Knoll, could you elaborate on why you think grouping on the expression has problems on other platforms?

Because that to me would seem the correct approach (that I think we use in most places).

In fact, in this case you should be able to just group by analysis_id.

I seem to recall an issue where the select statement had a column that was defined by a select statement (like select count(*) from person group by something) and they wanted to group by those counts in the final query, which had something like this:

select a, b, c, select count(*) from X GROUP BY Y as group_counts, count(*) as counts
FROM table
GROUP BY a,b,c, select count(*) from X GROUP BY Y as group_counts

The final GROUP BY couldn’t handle the select statement in the GROUP BY clause and you couldn’t refer to the column as ‘group_counts’ because that column doesn’t actually exist (as we see reported in MSSQL server from the original post)

So, the solution in that case was to break out the group counts into a separate subquery, that you join to the main FROM:

select a, b, c group_counts, count(*) as counts
FROM
(
select a, b, c, select count(*)  as group_counts
from table 
GROUP BY a, b, c
) t
GROUP BY a,b,c, group_counts

Something like that. My point is that not all DBMS have the same support for GROUP BY (sometimes you can use colum indexes, sometimes you can use complex expressions). By ‘hiding’ things beneath subqueries so that you get a ‘named’ column to group by, I feel like you ‘normalize’ the results into a form that is acceptable by all RDBMS.

Happy for you to prove me wrong tho!

-Chris

t