Hi Nihar,
While looking into scraping data from the OHDSI ATHENA website, I came across an API for querying ATHENA. Below is Python code that loads the results into a pandas DataFrame. The paged response includes the following fields:
size is the pageSize in the query
number is the page in the query
numberOfElements is the actual number of elements returned
empty is a boolean that is true when the page contains no elements, i.e. once you have paged past the last result
import pandas as pd
import requests
import json
import datetime
baseUrl = "https://athena.ohdsi.org/api/v1/concepts"
df = pd.DataFrame({'id': pd.Series(dtype=int),
                   'code': pd.Series(dtype='str'),
                   'name': pd.Series(dtype='str'),
                   'className': pd.Series(dtype='str'),
                   'standardConcept': pd.Series(dtype='str'),
                   'invalidReason': pd.Series(dtype='str'),
                   'domain': pd.Series(dtype='str'),
                   'vocabulary': pd.Series(dtype='str'),
                   'score': pd.Series(dtype='str')})
empty = False
pageNum=1
pageSize=10000
index=1
while not empty:
    # Request one page of search results
    parameters = {"pageSize": pageSize, "page": pageNum, "query": "aspirin"}
    response = requests.get(baseUrl, params=parameters)
    data = json.loads(response.content)
    # Stop as soon as the API reports an empty page
    empty = data['empty']
    if empty:
        break
    now = datetime.datetime.now()
    print(now.strftime("%H:%M:%S"), "page", pageNum, "contains", data['numberOfElements'], "elements")
    # Append each returned concept as a new row
    for i in range(0, data['numberOfElements']):
        df.loc[index] = data['content'][i]
        index = index + 1
    pageNum = pageNum + 1
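Adding rows one at a time with df.loc can get slow for large result sets, so here is a rough alternative sketch of the same paging loop that collects all rows in a list first. The names fetch_page, collect_concepts and max_pages are my own (hypothetical), the default page size of 1000 is an arbitrary assumption, and I am only relying on the empty and content fields described above.

```python
BASE_URL = "https://athena.ohdsi.org/api/v1/concepts"


def fetch_page(query, page, page_size):
    """Fetch one page of search results as a dict (performs a network request)."""
    # Imported here so the paging logic below can be exercised without
    # the requests dependency or network access.
    import requests

    params = {"pageSize": page_size, "page": page, "query": query}
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


def collect_concepts(query, page_size=1000, fetch=fetch_page, max_pages=100):
    """Collect the 'content' rows of every page until an empty page is seen.

    max_pages is a hypothetical safety cap so the loop always terminates.
    """
    rows = []
    for page in range(1, max_pages + 1):
        data = fetch(query, page, page_size)
        if data["empty"]:
            break
        rows.extend(data["content"])
    return rows
```

With pandas imported as pd, something like df = pd.DataFrame(collect_concepts("aspirin")) should then build the DataFrame in one step, with columns inferred from the keys the API returns rather than declared up front.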