join info from three databases, add tags, remove duplicates
join sci and sco
Join Info from SciF and Scopus
join wos and sci
Join Info from WoS and SciF
join info from wos and sci + sco
Join Info from WoS and SciF + Scopus
join wos and sco
Join Info from WoS and Scopus
Warning
All
ByName
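The join steps above combine records from SciFinder, Scopus and Web of Science and tag each record with the databases it came from. A minimal Python sketch of that tagging idea follows, assuming records are matched by DOI and that the tag is a comma-separated list such as "sci, sco, wos" (the format used in the origin_count formula). The function name `tag_origins` is hypothetical and not part of the workflow.

```python
def tag_origins(sci, sco, wos):
    """Map each DOI to a comma-separated list of the databases containing it.

    Assumption: each argument is an iterable of DOI strings for one database.
    """
    sources = {"sci": sci, "sco": sco, "wos": wos}
    tags = {}
    # Iterating the labels alphabetically yields tags like "sci, sco, wos".
    for name in sorted(sources):
        for doi in set(sources[name]):
            tags.setdefault(doi, []).append(name)
    return {doi: ", ".join(names) for doi, names in tags.items()}


if __name__ == "__main__":
    print(tag_origins({"10.1/a", "10.1/b"}, {"10.1/a"}, {"10.1/a", "10.1/c"}))
```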
remove duplicates
Remove potential duplicates
origin_count = IF [origin] = "sci, sco, wos" THEN 3
ELSEIF [origin]= "sci, wos...
sort by origin count (descending)
origin_count - Descending
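The origin_count formula above is truncated, so the sketch below does not reproduce it. It assumes the count equals the number of database labels in the `origin` string, which agrees with the one visible case ("sci, sco, wos" gives 3), and then sorts in descending order. The helper names are hypothetical.

```python
def origin_count(origin):
    """Count the database labels in a string like "sci, sco, wos".

    Assumption: labels are separated by commas; empty parts are ignored.
    """
    return len([part for part in origin.split(",") if part.strip()])


def sort_by_origin_count(rows):
    """Return rows sorted by origin count, highest first (stable for ties)."""
    return sorted(rows, key=lambda r: origin_count(r["origin"]), reverse=True)


if __name__ == "__main__":
    rows = [{"doi": "a", "origin": "wos"}, {"doi": "b", "origin": "sci, sco, wos"}]
    print(sort_by_origin_count(rows))
```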
add information from sco and wos to data set and calculate average citations
add wos data
add sco data
join wos and sco data
Join WoS and Scopus data
Warning
All
ByName
union all data
Union fully joined data (Sco, WoS), Sco and WoS data
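Joining two inputs and then unioning the matched rows with the unmatched rows from each side amounts to a full outer join. A rough Python sketch, assuming the join key is `doi` and that the other column names already carry `_sco` / `_wos` suffixes so they do not collide; `full_outer_join` is a hypothetical name.

```python
def full_outer_join(sco_rows, wos_rows, key="doi"):
    """Combine Scopus and WoS rows by key, keeping rows found in only one input.

    Assumption: each input holds at most one row per key.
    """
    sco = {row[key]: row for row in sco_rows}
    wos = {row[key]: row for row in wos_rows}
    joined = []
    for k in sorted(sco.keys() | wos.keys()):
        # Rows present on one side only keep just that side's columns.
        joined.append({**wos.get(k, {}), **sco.get(k, {})})
    return joined


if __name__ == "__main__":
    sco = [{"doi": "a", "citations_sco": 1}, {"doi": "b", "citations_sco": 2}]
    wos = [{"doi": "b", "citations_wos": 5}, {"doi": "c", "citations_wos": 7}]
    for row in full_outer_join(sco, wos):
        print(row)
```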
Simple
IsEmpty
title_wos
True
fixed
2024-02-19 14:05:23
0
2024-02-19 14:05:23
2024-02-19 14:05:23
IsEmpty([title_wos])
Warning
All
ByName
False
False
"doi","origin","origin_count","authors","title","year","journal","affiliations","citations_sco","citations_year_sco","citations_wos","citations_year_wos","citations_all_db_wos","citations_all_db_year_wos"
True
True
True
False
False
False
False
False
False
upper
citations_avg = round(([citations_sco]+[citations_wos]+[citations_all_db_wos])/3...
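The citations_avg expression above is cut off before its rounding argument, so the precision used in the workflow is unknown. The sketch below shows only the visible part, the mean of the three citation counts, and leaves the number of decimals as a required parameter (`ndigits`, a hypothetical name). Python's `round` may also resolve ties differently from the workflow tool.

```python
def citations_avg(citations_sco, citations_wos, citations_all_db_wos, ndigits):
    """Mean of the three citation counts, rounded to ndigits decimals.

    Assumption: all three counts are present (no missing values);
    ndigits must be supplied because the source formula is truncated.
    """
    total = citations_sco + citations_wos + citations_all_db_wos
    return round(total / 3, ndigits)


if __name__ == "__main__":
    print(citations_avg(3, 6, 9, 1))
```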
sco data cleaning
remove columns
Remove columns that are not required
...
Filter out potential duplicates via DOI
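One way to filter potential duplicates via DOI is to keep the first row seen for each DOI. The sketch below assumes rows are already in the preferred order and compares DOIs case-insensitively, since DOIs are not case-sensitive. Whether the workflow normalizes case in the same way is not visible here, and `drop_duplicate_dois` is a hypothetical name.

```python
def drop_duplicate_dois(rows):
    """Keep the first row for each DOI, comparing DOIs case-insensitively."""
    seen = set()
    unique = []
    for row in rows:
        key = row["doi"].strip().upper()  # assumption: normalize before comparing
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


if __name__ == "__main__":
    rows = [{"doi": "10.1/abc"}, {"doi": "10.1/ABC"}, {"doi": "10.1/xyz"}]
    print(drop_duplicate_dois(rows))
```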
\data_package\literature_analysis\database_download\scopus_full.csv
True
False
False
1
254
False
DoubleQuotes
,
False
28591
scopus_full.csv
citations per year calculation
years_since_published = 2024-[year_sco]
citations_year_sco = IF [years_since_pub...
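The citations-per-year formula above is truncated after its IF condition. The sketch below keeps the visible part (years since publication measured from 2024) and assumes, without confirmation from the source, that the IF guards against dividing by zero for records published in the reference year, returning the raw count in that case. The function name and the fallback behaviour are both hypothetical.

```python
REFERENCE_YEAR = 2024  # taken from the years_since_published formula


def citations_per_year(citations, year, reference_year=REFERENCE_YEAR):
    """Citations divided by years since publication.

    Assumption: when no full year has passed, return the raw citation count
    instead of dividing by zero (the source's actual branch is not visible).
    """
    years_since_published = reference_year - year
    if years_since_published <= 0:
        return float(citations)
    return citations / years_since_published


if __name__ == "__main__":
    print(citations_per_year(10, 2019))
```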
remove columns
wos data cleaning
citations per year calculation
years_since_published = 2024-[year_wos]
citations_year_wos = IF [years_since_pub...
\data_package\literature_analysis\database_download\webofscience_full.xlsx|||`savedrecs$`
False
1
webofscience_full.xlsx
Query=`savedrecs$`
input: web of science
\data_package\literature_analysis\database_download\webofscience.xlsx|||`savedrecs$`
False
1
webofscience.xlsx
Query=`savedrecs$`
Simple
IsNotEmpty
doi
True
fixed
2024-02-19 11:59:31
0
2024-02-19 11:59:31
2024-02-19 11:59:31
!IsEmpty([doi])
input: scopus
\data_package\literature_analysis\database_download\scopus.xlsx|||`Sheet1$`
False
1
scopus.xlsx
Query=`Sheet1$`
Simple
IsNotEmpty
doi
True
fixed
2024-02-19 11:59:31
0
2024-02-19 11:59:31
2024-02-19 11:59:31
!IsEmpty([doi])
input: scifinder
\data_package\literature_analysis\database_download\scifinder.xlsx
False
1
scifinder.xlsx
Simple
IsNotEmpty
doi
True
fixed
2024-02-19 11:59:31
0
2024-02-19 11:59:31
2024-02-19 11:59:31
!IsEmpty([doi])
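Each of the three inputs is filtered with `!IsEmpty([doi])`, so only records that have a DOI continue downstream. A small Python sketch of the same filter, assuming "empty" means a missing value or an empty string; whitespace-only values are not treated as empty here, and the helper names are hypothetical.

```python
def has_doi(row):
    """True when the row has a non-null, non-empty DOI."""
    doi = row.get("doi")
    return doi is not None and doi != ""


def keep_rows_with_doi(rows):
    """Drop rows whose DOI is missing or empty."""
    return [row for row in rows if has_doi(row)]


if __name__ == "__main__":
    rows = [{"doi": "10.1/a"}, {"doi": ""}, {"doi": None}, {"title": "no doi"}]
    print(keep_rows_with_doi(rows))
```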
output: cleaned literature data set
\data_package\literature_analysis\output_data\cleaned_literature_data.tsv
CRLF
\t
Never
True
28591
True
generate surf file output
cleaned_literature_data.tsv
citations_year_avg - Descending
origin_count - Descending
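The output settings above indicate a tab-delimited file with CRLF line endings, with records sorted by citations_year_avg and then origin_count, both descending. A sketch of an equivalent writer using Python's `csv` module follows. It writes to any text stream, uses the module's default quoting (the workflow's quoting setting is not reproduced), and `write_sorted_tsv` is a hypothetical name.

```python
import csv
import io


def write_sorted_tsv(rows, fieldnames, stream):
    """Write rows as TSV with CRLF line endings, sorted descending by
    citations_year_avg, then origin_count."""
    ordered = sorted(
        rows,
        key=lambda r: (r["citations_year_avg"], r["origin_count"]),
        reverse=True,
    )
    writer = csv.DictWriter(
        stream, fieldnames=fieldnames, delimiter="\t", lineterminator="\r\n"
    )
    writer.writeheader()
    writer.writerows(ordered)


if __name__ == "__main__":
    buffer = io.StringIO()
    rows = [
        {"doi": "a", "citations_year_avg": 1.5, "origin_count": 3},
        {"doi": "b", "citations_year_avg": 3.0, "origin_count": 2},
    ]
    write_sorted_tsv(rows, ["doi", "citations_year_avg", "origin_count"], buffer)
    print(repr(buffer.getvalue()))
```

When writing to a real file, opening it with `newline=""` keeps the `csv` module's CRLF line endings from being translated again by the platform.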
Horizontal
literature_analysis