library(basictabler)
library(data.table)
library(dplyr)
library(ggalluvial)
library(ggplot2)
library(ggpubr)
library(grid)
library(htmlwidgets)
library(openxlsx)
library(plotly)
library(purrr)
library(qcc)
library(readr)
library(svglite)
library(tidyverse)
library(writexl)
1 Introduction
1.1 Content & Aim
This R script contains data quality control, data processing and figures for the p-XRF data collected from pottery excavated in 2023 and 2024 in Assur in the scope of the project New Archaeological Exploration of Assur led by by Prof. Dr. Karen Radner (LMU Munich). The samples were selected in the field by Dr. Jana Richter, Dr. Andrea Squitieri (LMU Munich), Dr. Florian Janoscha Kreppner & Ellen Coster (University of Münster). p-XRF data was handled by Dr. Michaela Schauer (LMU Munich/VIAS), measurements undertaken by Marco Wolf (LMU Munich). Related petrographic analysis was performed by Dr. Silvia Amicone (University of Tübingen) (Schauer and Amicone 2024).
The dataset is part of the second Exploring Assur publication (Schauer and Amicone 2025).
1.2 Device & Measurement Parameters
The data was collected using the Niton XL3t (No.97390), previously owned by LMU Munich’s Department of Cultural Studies and Archaeology and, since late 2023, hosted by LMU Munich’s Department of Geology. Three measurements were taken per sample on different areas of a fresh break, using the TestAllGeo mode with an 8 mm collimator and a measurement time of 300 seconds (60 seconds each for the main, low, and high filters, and 120 seconds for the light filter).
Data collection was conducted by Marco Wolf from August 9 to 11, 2023 (temperature: 28-35°C, relative humidity: %35-42) at the Dep. for Cultural Studies (Schauer and Amicone 2024) and from November 13 to 25, 2024 (temperature: 20°C; relative humidity: 35–42%) in the geology lab.
1.3 Standards used
In 2023, Standard NIST 2709a (National Institute of Standards & Technology 2018), in 2024 Standard SdAR-M2 (International Association of Geoanalysts 2015) was used.
1.4 Data processing
The data of both measurement campaign has been compiled in one file and run though the precision assessment together. For accuracy assessment, due to the use of two different standards (see Section 1.3), the campaigns were analysed separately (see Section 5.1 and Section 5.2).
Following data quality assessment, the data was calibrated and major elements normalised to 100% (see Section 6).
1.4.1 Precision Assessment
To ensure the reliable application of the DQA procedures outlined here, LOD is replaced with ‘0’.
Precision is evaluated (see Section 4) through:
- Measurement Uncertainty (MU) is used to identify reliable elements and individual measurements (see Section 4.1). The benchmarks used is: Reported value ≥ ±2δ MU
- Coefficient of Variation (CV) is calculated for repeated measurements of the same sample to assess overall precision and determine which elements are reliable across the entire dataset (see Section 4.2). The benchmark is set based on the dataset. From the literature and the authors own work, a benchmark of 20% for archaeological pottery, stone, daub and in-situ measurements of soil are appropriate.
In both cases no more than 20% of individual measurements for any given chemical element and no more than 20% of elements considered reliable for a single measurement (MU) or sample (CV) may exceed the benchmarks.
1.4.2 Accuracy Assessment
For accuracy assessment (see Section 5), the following methods are employed to monitor the device performance based on the two standards (see Section 1.3) used for this project:
- Reference values (nominal values) are calculated to define the baseline of comparison (see Section 5.1.1 and Section 5.2.1). Control limits are set at mean ±2δSD.
- Shewhart Control Charts (SCCs) are constructed using said reference values to monitor device performance (see Section 5.1.2 and Section 5.2.2). Three out of five selected elements with CV ≤ 10% (1δSD) - Zn, Sr, Rb, Sr and Y - have to fall in the given value range.
- Relative Percentage Difference (%RD) is calculated to further evaluate accuracy (see Section 5.1.3 and Section 5.2.3). The %RD hast to be below 10% and above -10%.
1.4.3 Calibration & Normalisation
This dataset is calibrated using the coefficient correction coefcor IV (see Section 6) developed according to the Munich Procedure for this specific device using the Frankfurt pottery sample set (Schauer 2024). Major elements are normalised to a 100%.
1.5 Skript & Packages
This Quarto script (R Quarto v.1.5.55) (Allaire et al. 2024) was created using R v.4.4.1 (R Core Team 2024b) and RStudio v.2024.04.2 (RStudio Team 2024). The following R packages are used:
- basictabler (Bailiss 2021)
- data.table (Barrett et al. 2024)
- dplyr (Wickham 2023b)
- ggalluvial (Brunson and Read 2023)
- ggplot2 (Wilke 2024)
- ggpubr (Kassambara 2023)
- grid (R Core Team 2024a)
- htmlwidgets (Vaidyanathan et al. 2023)
- openxlsx (Schauberger and Walker 2024)
- plotly (Sievert 2020)
- purrr (Wickham 2023a)
- qcc (Scrucca 2004)
- readr (Wickham, Hester, and Bryan 2024)
- svglite (Wickham et al. 2023)
- tidyverse (Wickham et al. 2019)
- writexl (Ooms 2025)
Before running the analyses, all packages must be loaded (see Section 2) and the working directory set (see Section 3).
The script runs without errors, provided the predefined data structure is maintained.
2 Packages
3 Set working directory
::opts_knit$set(root.dir = "./") knitr
4 Precision Assessment
4.1 Measurement Uncertainty
4.1.1 Measurement Uncertainty of Measurements
# Load data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data1
# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
<- data1 %>%
data mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "Cl", "Sc", "V", "Cr", "Co", "Ni", "Cu", "Zn", "As", "Se", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Pd", "Ag", "Cd", "Sn", "Sb", "Te", "Cs", "Ba", "La", "Ce", "Hf", "Ta", "W", "Re", "Au", "Hg", "Pb", "Bi", "Th", "U", "Bal")
variables
# Calculate violations
for (variable in variables) {
<- data[data[, variable] < 2*data[, paste0(variable, ".Error")], ]
filter <- nrow(filter)
count assign(paste0("data", variable), filter)
assign(paste0("count", variable), count)
print(paste0("count", variable, ": ", count))}
[1] "countSi: 0"
[1] "countTi: 0"
[1] "countAl: 0"
[1] "countFe: 0"
[1] "countMn: 0"
[1] "countMg: 29"
[1] "countCa: 0"
[1] "countK: 0"
[1] "countP: 4"
[1] "countS: 2"
[1] "countCl: 30"
[1] "countSc: 43"
[1] "countV: 0"
[1] "countCr: 0"
[1] "countCo: 250"
[1] "countNi: 0"
[1] "countCu: 6"
[1] "countZn: 1"
[1] "countAs: 36"
[1] "countSe: 276"
[1] "countRb: 0"
[1] "countSr: 0"
[1] "countY: 0"
[1] "countZr: 0"
[1] "countNb: 0"
[1] "countMo: 223"
[1] "countPd: 236"
[1] "countAg: 109"
[1] "countCd: 75"
[1] "countSn: 17"
[1] "countSb: 16"
[1] "countTe: 18"
[1] "countCs: 9"
[1] "countBa: 0"
[1] "countLa: 11"
[1] "countCe: 9"
[1] "countHf: 283"
[1] "countTa: 283"
[1] "countW: 283"
[1] "countRe: 283"
[1] "countAu: 282"
[1] "countHg: 191"
[1] "countPb: 15"
[1] "countBi: 283"
[1] "countTh: 45"
[1] "countU: 142"
[1] "countBal: 0"
# Compile list of elements with at least one but less then 20% of violations
= list(dataMg,dataP,dataS,dataCl,dataSc,dataCu,dataZn,dataAs,dataSn,dataSb,dataTe,dataCs,dataLa,dataCe,dataPb,dataTh)
l <-rbindlist(l, use.names=TRUE, fill=TRUE)
Tab1
# Export to .csv
write.csv(Tab1,"../Data//Assur_Pottery_2024_SD_IndivMeas.csv",row.names=FALSE)
Interpretation: Following the data quality criteria defined in Section 1.4.1, elements showing more then 56 measurements (20% of 283 measurements) violating the criteria, should not be used - or if necessary - only with great caution. This is the case for Co, Se, Mo, Pd, Ag, Cd, Hf, Ta, W, Re, Au, Hg, Bi and U. Therefore, from the original 47 elements, 33 remain. This means each sample can’t violate the criterium for more then 6 elements (20% of 33). Individual measurements violating this bench marks are only 656 (AS085) and 691 (AS074). Therefore, the related samples have to be viewed with caution in the next steps of data quality checking.
4.1.2 Measurement Uncertainty of Samples
# Load data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data1
# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
<- data1 %>%
data mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# Calculate mean per sample
<-(data) %>%
Meangroup_by(SAMPLE) %>%
summarize(across(everything(),list(mean=mean)))
# Delete "_mean" from the column name
colnames(Mean) <- gsub("_mean", "", colnames(Mean))
# Define dataset
<- Mean
data
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "Cl", "Sc", "V", "Cr", "Co", "Ni", "Cu", "Zn", "As", "Se", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Pd", "Ag", "Cd", "Sn", "Sb", "Te", "Cs", "Ba", "La", "Ce", "Hf", "Ta", "W", "Re", "Au", "Hg", "Pb", "Bi", "Th", "U", "Bal")
variables
# Calculate violations
for (variable in variables) {
<- data[data[, variable] < 2 * data[, paste0(variable, ".Error")], ]
filter <- nrow(filter)
count assign(paste0("data", variable), filter)
assign(paste0("count", variable), count)
print(paste0("count", variable, ": ", count))}
[1] "countSi: 0"
[1] "countTi: 0"
[1] "countAl: 0"
[1] "countFe: 0"
[1] "countMn: 0"
[1] "countMg: 1"
[1] "countCa: 0"
[1] "countK: 0"
[1] "countP: 0"
[1] "countS: 0"
[1] "countCl: 1"
[1] "countSc: 2"
[1] "countV: 0"
[1] "countCr: 0"
[1] "countCo: 73"
[1] "countNi: 0"
[1] "countCu: 1"
[1] "countZn: 0"
[1] "countAs: 11"
[1] "countSe: 77"
[1] "countRb: 0"
[1] "countSr: 0"
[1] "countY: 0"
[1] "countZr: 0"
[1] "countNb: 0"
[1] "countMo: 73"
[1] "countPd: 72"
[1] "countAg: 40"
[1] "countCd: 22"
[1] "countSn: 3"
[1] "countSb: 3"
[1] "countTe: 2"
[1] "countCs: 0"
[1] "countBa: 0"
[1] "countLa: 3"
[1] "countCe: 1"
[1] "countHf: 77"
[1] "countTa: 77"
[1] "countW: 77"
[1] "countRe: 77"
[1] "countAu: 77"
[1] "countHg: 56"
[1] "countPb: 9"
[1] "countBi: 77"
[1] "countTh: 17"
[1] "countU: 46"
[1] "countBal: 0"
# Compile list of elements with at least one but less then 20% of violations
= list(dataMg,dataCl,dataSc,dataCu,dataAs,dataSn,dataSb,dataTe,dataLa,dataCe,dataPb)
l <-rbindlist(l, use.names=TRUE, fill=TRUE)
Tab1
# Export to .csv
write.csv(Tab1,"../Data//Assur_Pottery_2024_SD_Samples.csv",row.names=FALSE)
Interpretation: Following the data quality criteria defined in Section 1.4.1, elements showing more then 15 samples (20% of 77 samples) violating the criteria, should not be used - or if necessary - only with great caution. This is the case for Co, Se, Mo, Pd, Ag, Cd, Hf, Ta, W, Re, Au, Hg, Bi, Th and U. Therefore, from the original 47 elements, 32 remain. This means each sample can’t violate the criterium for more then 6 elements (20% of 32). No samples violate this criterium.
4.2 Coefficient of Variation
4.2.1 Create Number-Coded Spreadsheet
# Load data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data1
# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
<- data1 %>%
data mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# CV function
<- function(x) { (sd(x) / mean(x)) * 100 }
CV
# Create the SampleControl1 dataframe with CV for each group
<-(data) %>%
SampleControl1group_by(SAMPLE) %>%
summarize(across(everything(),list(CV=CV)))
# Define columns
<-SampleControl1[,c("SAMPLE","Si_CV","Ti_CV","Al_CV","Fe_CV","Mn_CV","Mg_CV","Ca_CV","K_CV","P_CV","V_CV","Cr_CV","Ni_CV","Cu_CV","Zn_CV","Rb_CV","Sr_CV","Y_CV","Zr_CV","Nb_CV","Ba_CV","Pb_CV")]
Control_CV
# Replace missing values with "99"
is.na(Control_CV)] <- 99
Control_CV[
# Replace values below 20 with 0, above 20 with 1 and define sample name as last column
<- cbind(apply(Control_CV[,2:22], 2, function(x) ifelse(x < 20, 0, x)), Control_CV[,1])
Control_CV_0 <- cbind(apply(Control_CV_0[,1:21], 2, function(x) ifelse(x > 20, 1, x)), Control_CV_0[,22])
Control_CV_0_1
# Create new data frame
<- as.data.frame(Control_CV_0_1)
Control_CV_2
# Rename last column as "SAMPLE"
<-Control_CV_2 %>% rename(SAMPLE=22)
Control_CV_2
# Define all other columns as numeric
1:21] <- sapply(Control_CV_2[,1:21],as.numeric)
Control_CV_2[,
# Calculate row sum excluding specific columns
<-Control_CV_2%>%mutate(rows_sum=rowSums(.[, !(names(.) %in% c("SAMPLE","Mg_CV","P_CV","Cu_CV","Zn_CV","Ba_CV","Pb_CV"))]))
Control_CV_3
# Calculate column sums except for column "SAMPLE"
<- colSums(Control_CV_3[, !colnames(Control_CV_3) %in% "SAMPLE"])
col_sum <- c(col_sum,0)
col_sum <- rbind(Control_CV_3,col_sum)
Control_CV_4
# Calculate column percent
<-(nrow(Control_CV_4)-1)
rown<-(100/rown*col_sum)
percent<- rbind(Control_CV_4,percent)
Control_CV_5
# Calculate row percent
<-Control_CV_5$rows_sum
rows_sum<-ncol(Control_CV_5)
coln<-(100/(coln-7)*rows_sum)
rowper<- cbind(Control_CV_5,rowper)
Control_CV_6
# Round values function
<- function(x) {if(is.numeric(x)) {return(round(x))} else {
Numform return(x)}}
# Create new dataframe
<- as.data.frame(lapply(Control_CV_6,Numform))
Control_CV_6
# Apply rounding function
<-Control_CV_6[order(Control_CV_6$rowper, decreasing = TRUE),]
Control_CV_6
# Add column sum and percent information in specific cell
<-"Column sum"
text78,22]<-text
Control_CV_6[
<-"Column percent"
text79,22]<-text
Control_CV_6[
# Export to .csv
write.csv(Control_CV_6,"../Data//Assur_Pottery_2024_CV_numcode.csv",row.names=FALSE)
Interpretation: Following the data control, the elements Si, Ti, Al, Fe, Mn, Ca, K, V, Cr, Ni, Rb, Sr, Y, Zr and Nb are adhering to the quality criteria defined in Section 1.4.1. The same is true for all samples.
4.2.2 Create Color-Coded Spreadsheet
# Create new spreadsheet
<- BasicTable$new()
tbl
# Add data from Control_CV
$addData(Control_CV, firstColumnAsRowHeaders=TRUE)
tbl
# Define Cells
<- tbl$getCells(rowNumbers=2:77, columnNumbers=2:22, matchMode="combinations")
cells
# Apply the styling
$mapStyling(cells=cells, styleProperty="background-color", valueType="color",mapType="logic",mappings=list("v>20", "red", "v<=20", "black"))
tbl
# Render the table with the applied styles
$renderTable() tbl
# Save the table to an Excel workbook
<- createWorkbook(creator = Sys.getenv("M. Schauer"))
wb addWorksheet(wb, "Data")
$writeToExcelWorksheet(wb=wb, wsName="Data",topRowNumber=1, leftMostColumnNumber=1, applyStyles=TRUE)
tbl
# Export to .xlsx
saveWorkbook(wb, file="../Data//Assur_Pottery_2024_CV_colcode.xlsx", overwrite = TRUE)
Comment: This spreadsheet is important during data interpretation to understand outlieres. If those can be defined, it is worth comming back to this spreadsheet to check whether the distinguishing elements for the specific sample are adhering to the quality criteria defined in Section 1.4.1. If not, the spread has to be checked to identify if the outlier is due to its specific chemistry or a too high variation in the measurements which might have been caused by high heterogeneity, roughness etc. of the sample.
5 Accuracy Assessment
5.1 Standard 2709a Accuracy
This procedure follows the steps outlined in Section 1.4.2.
5.1.1 Calculate reference value (nominal value)
5.1.1.1 Compute metrics of reference values
The name of the .csv-spreadsheet includes the purpose the measurements (reference value - ref_val_meas) were made for, the standard they are based on (2709a), month and year of measurement (Mai 2023 - mai2023) as well as the location (Munich): ref_val_2709a_mai2023_munich.csv. This makes the file easy to identify for future use.
# Load data
<- read.csv("../Data//ref_val_meas_2709a_mai2023_munich.csv")
data1
# Replace '< LOD' with 0 in each column
<- data1 %>%
data2 mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric)) # Ensure all values are numeric
# Remove first 16 columns (no elemental data)
<- data2[, -c(1:14)]
data
# Replace '.' with '_' in column names
<- data %>%
data rename_with(~ gsub("\\.", "_", .x))
# Define statistical functions
<- function(x) (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
CV <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
SEM <- function(x) mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
UCL <- function(x) mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
LCL <- function(x) (2*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
CV2 <- function(x) mean(x, na.rm = TRUE) + 2*sd(x, na.rm = TRUE)
UCL2 <- function(x) mean(x, na.rm = TRUE) - 2*sd(x, na.rm = TRUE)
LCL2 <- function(x) (3*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
CV3 <- function(x) mean(x, na.rm = TRUE) + 3*sd(x, na.rm = TRUE)
UCL3 <- function(x) mean(x, na.rm = TRUE) - 3*sd(x, na.rm = TRUE)
LCL3
# Compute summary statistics
<- data %>%
NominalSdARM2 summarize(across(everything(), list(
Mean = ~ mean(.x, na.rm = TRUE),
Sd = ~ sd(.x, na.rm = TRUE),
SEM = ~ SEM(.x),
CV = ~ CV(.x),
LCL = ~ LCL(.x),
UCL = ~ UCL(.x),
CV2 = ~ CV2(.x),
LCL2 = ~ LCL2(.x),
UCL2 = ~ UCL2(.x),
CV3 = ~ CV3(.x),
LCL3 = ~ LCL3(.x),
UCL3 = ~ UCL3(.x),
MD = ~ median(.x, na.rm = TRUE)
)))
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
Element "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Function to clean column names
<- function(df, prefix) {
cleanColumnNames colnames(df) <- gsub(paste0(prefix, "_"), "", colnames(df))
return(df)
}
# Function to extract and clean data for each element
<- function(element, df) {
extract_element_data <- grep(paste0("^", element, "_(Mean|Sd|SEM|CV|LCL|UCL|CV2|LCL2|UCL2|CV3|LCL3|UCL3|MD|Error_Mean|Error_Sd|Error_MD)"),
cols colnames(df), value = TRUE)
if (length(cols) > 0) {
<- df %>%
df_out select(all_of(cols)) %>%
cleanColumnNames(element)
# Add the element name as the first column
<- cbind(Element = element, df_out)
df_out return(df_out)
else {
} return(data.frame(Element = element, matrix(NA, nrow = 1, ncol = length(cols))))
}
}
# Process all elements & create final table
<- map_dfr(Element, extract_element_data, df = NominalSdARM2) Tab3
5.1.1.2 Create formated spreadsheet with all metrics
# Replace "Error" with "MU" in column names
colnames(Tab3) <- gsub("Error", "MU", colnames(Tab3))
# Rounding of values
<- c("Mean", "Sd", "SEM", "CV", "UCL", "LCL","CV2", "UCL2", "LCL2", "CV3",
numeric_cols "UCL3", "LCL3","MD", "MU_Mean", "MU_Sd", "MU_MD")
for (col in numeric_cols) {
if (col %in% colnames(Tab3)) {
<- round(Tab3[[col]],
Tab3[[col]] ifelse(col %in% c("CV", "CV2", "CV3"), 2,
ifelse(col %in% c("UCL", "LCL", "UCL2", "LCL2", "UCL3", "LCL3"), 0, 1)))
}
}
# Print table
print(Tab3)
Element Mean Sd SEM CV LCL UCL CV2 LCL2 UCL2
1 Si 295284.0 3205.2 966.4 1.09 292079 298489 2.17 288874 301694
2 Ti 3730.5 34.9 10.5 0.93 3696 3765 1.87 3661 3800
3 Al 87220.5 2338.5 705.1 2.68 84882 89559 5.36 82544 91898
4 Fe 34286.9 99.7 30.1 0.29 34187 34387 0.58 34088 34486
5 Mn 492.0 14.9 4.5 3.03 477 507 6.07 462 522
6 Mg 25548.6 4147.1 1250.4 16.23 21402 29696 32.46 17254 33843
7 Ca 19666.9 88.2 26.6 0.45 19579 19755 0.90 19490 19843
8 K 19497.5 167.1 50.4 0.86 19330 19665 1.71 19163 19832
9 P 384.9 32.4 9.8 8.42 353 417 16.85 320 450
10 S 441.2 9.7 2.9 2.21 431 451 4.42 422 461
11 V 137.6 8.8 2.6 6.37 129 146 12.73 120 155
12 Cr 107.6 4.8 1.4 4.42 103 112 8.84 98 117
13 Ni 90.2 4.9 1.5 5.43 85 95 10.86 80 100
14 Cu 36.3 2.9 0.9 8.06 33 39 16.12 30 42
15 Zn 88.3 2.1 0.6 2.42 86 90 4.83 84 93
16 As 10.0 0.7 0.2 7.10 9 11 14.21 9 11
17 Rb 85.8 1.6 0.5 1.89 84 87 3.79 83 89
18 Sr 223.2 1.6 0.5 0.74 222 225 1.47 220 226
19 Y 15.3 0.6 0.2 3.83 15 16 7.67 14 16
20 Zr 141.0 1.2 0.3 0.82 140 142 1.64 139 143
21 Nb 7.6 0.6 0.2 7.85 7 8 15.70 6 9
22 Mo 1.3 1.2 0.4 96.87 0 3 193.73 -1 4
23 Ba 733.1 18.5 5.6 2.53 715 752 5.06 696 770
24 Pb 16.3 1.1 0.3 6.81 15 17 13.62 14 19
25 Th 3.2 0.3 0.1 8.10 3 3 16.20 3 4
26 U 5.9 1.1 0.3 18.27 5 7 36.55 4 8
CV3 LCL3 UCL3 MD MU_Mean MU_Sd MU_MD
1 3.26 285668 304900 295988.7 1035.9 7.3 1036.5
2 2.80 3626 3835 3732.6 64.3 0.5 64.3
3 8.04 80205 94236 87879.1 1493.9 42.2 1492.1
4 0.87 33988 34586 34293.2 213.0 1.4 212.8
5 9.10 447 537 499.3 36.2 0.3 36.3
6 48.70 13107 37990 24942.2 3954.6 100.1 3958.3
7 1.35 19402 19932 19691.9 180.4 1.0 180.1
8 2.57 18996 19999 19429.5 253.5 2.2 253.3
9 25.27 288 482 396.5 67.8 0.8 67.6
10 6.62 412 470 439.8 16.6 0.2 16.5
11 19.10 111 164 136.7 19.1 0.2 19.0
12 13.26 93 122 107.3 9.4 0.1 9.4
13 16.29 76 105 91.0 14.2 0.1 14.2
14 24.18 28 45 36.7 7.6 0.1 7.6
15 7.25 82 95 88.1 5.9 0.1 5.9
16 21.31 8 12 10.1 2.3 0.0 2.3
17 5.68 81 91 85.5 2.4 0.0 2.4
18 2.21 218 228 223.0 3.1 0.0 3.1
19 11.50 14 17 15.4 1.5 0.0 1.5
20 2.46 138 144 141.3 2.8 0.0 2.8
21 23.55 6 9 7.5 1.6 0.0 1.6
22 290.60 -2 5 2.1 1.6 0.3 1.3
23 7.59 677 789 735.3 19.2 0.1 19.2
24 20.44 13 20 16.7 2.8 0.0 2.8
25 24.30 2 4 3.2 0.7 0.0 0.7
26 54.82 3 9 5.4 2.3 0.0 2.3
# Export to .csv
write.csv(Tab3, "../Data//metrics_ref_val_2709a_mai2023_munich.csv", row.names = FALSE)
5.1.1.3 Export reference values for daily device accuracy testing
This selection only includes the relevant elements and the values needed to define the test range in which the measurement values should lie: Element name with UCL and LCL of the 1δSD-range and UCL2/LCL2 of the 2δSD-range. You can choose the elements to export by adding them in the code for selected_elements.
# Select only necessary columns
<- Tab3 %>% select(Element, LCL,UCL,LCL2,UCL2,LCL3,UCL3)
dataset1
# Filter for the specific elements dynamically
<- c("Rb", "Sr", "Y", "Zn", "Zr")
selected_elements <- dataset1 %>% filter(Element %in% selected_elements)
dataset
# Export to .csv
write.csv(dataset, "../Data//test_range_ref_val_2709a_mai2023_munich.csv", row.names = FALSE)
Comment: This spreadsheet can be used to compare the mean of three standard measurements, as calculated by the device and displayed on the screen, with the test range of the selected elements.Three out of five elements have to be in the 2δSD-range for the test for accuracy to be successful.
5.1.1.4 Create spreadsheet for Shewhart Control Charts
# Transpose the data
<- as.data.frame(t(Tab3))
Tab4
# Store the original row names (before transposing)
<- rownames(Tab3)
row_names
# Store the original column names (before transposing)
<- colnames(Tab3)
col_names
# Assign the original row names as column names of the transposed data
colnames(Tab4) <- row_names
# Assign the original column names as row names of the transposed data
rownames(Tab4) <- col_names
# Set the first row as the column names (this will use the values of the first row as column headers)
colnames(Tab4) <- Tab4[1, ]
# Remove the first row (since it is now used as the column names)
<- Tab4[-1, ]
Tab4
# # Export to .csv
write.csv(Tab4, "../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv", row.names = TRUE)
5.1.2 Compute Shewhart Control Charts
# Load and filter data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data <- subset(data, SAMPLE == "2709a")
filtered_data
# Transform date field
<- filtered_data |>
data1 mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
<- data1 %>%
data2 mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Remove first 16 columns (no elemental data)
<- data2[, -c(1:14)]
data
# Load data
<- read.csv("../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv")
data1
# Create PDF for all SCC plots
pdf("../Figures//SCC_2709a_accuracy_plots.pdf", width = 8, height = 6)
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
selected_columns "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
for (col in selected_columns) {
# Extract the data for the current column
<- data[, col]
chartdata
# Dynamically get metrics for current column from data1
<- data1[1, col]
CL <- data1[2, col]
SD <- data1[5, col]
LCL <- data1[6, col]
UCL <- data1[8, col]
LCL2 <- data1[9, col]
UCL2 <- data1[11, col]
LCL3 <- data1[12, col]
UCL3
# Round to 0 decimal places
<- round(CL, 0)
CL <- round(SD, 0)
SD <- round(LCL, 0)
LCL <- round(UCL, 0)
UCL <- round(LCL2, 0)
LCL2 <- round(UCL2, 0)
UCL2 <- round(LCL3, 0)
LCL3 <- round(UCL3, 0)
UCL3
# Create the control chart
<- qcc(
qcc_result data = chartdata,
type = "xbar",
nsigmas = 1,
std.dev = SD,
center = CL,
rules = NULL,
plot = FALSE
)
# Create the plot for the current column
plot(qcc_result, restore.par = FALSE,
title = paste("SCC for", col),
ylab = "Concentration (ppm)",
xlab = "Values of 2709a throughout the 2023 measurement campaign")
# Add 3-Sigma control limits
abline(h = c(UCL3, LCL3), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL3, labels = "UCL3", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL3, labels = "LCL3", pos = 1, col = "black", cex = 0.8)
# Add 2-Sigma control limits
abline(h = c(UCL2, LCL2), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL2, labels = "UCL2", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL2, labels = "LCL2", pos = 1, col = "black", cex = 0.8)
# Highlight points based on thresholds
<- chartdata < UCL | chartdata > LCL
below_1sigma <- chartdata > UCL | chartdata < LCL
below_2sigma <- chartdata > UCL3 | chartdata < LCL3
above_3sigma <- (chartdata > UCL2 & chartdata <= UCL3) | (chartdata < LCL2 & chartdata >= LCL3)
above_2sigma
# Overlay red points for values in 1-sigma
points(which(below_1sigma), chartdata[below_1sigma], col = "lightgreen", pch = 19, cex = 1.2)
# Overlay yellow points for values between 1-sigma and 2-sigma
points(which(below_2sigma), chartdata[below_2sigma], col = "lightblue", pch = 19, cex = 1.2)
# Overlay yellow points for values between 2-sigma and 3-sigma
points(which(above_2sigma), chartdata[above_2sigma], col = "orange", pch = 19, cex = 1.2)
# Overlay red points for values beyond 3-sigma
points(which(above_3sigma), chartdata[above_3sigma], col = "red", pch = 19, cex = 1.2)
}
# Close the PDF file
dev.off()
Shewhart Control Charts
Interpretation: Skimming through the SCCs of the same standard throughout the 2023 measurement campaign reveals that the measurement values do vary. Many values fall within the ±1δSD range (green), with some extending into the ±1δSD to ±2δSD range (blue) but also ±2δSD to ±3δSD range (orange). Especially for the light elements, some major drifts and values beyond the quality criterium are visible, making it clear that all interpretation has to be checked in detail with other information such as petrography and archaeological background information (see also Schauer and Amicone 2024).
5.1.3 Compute Relative Percentage Difference (%RD)
5.1.3.1 Calculate %RD
# Load and filter data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data <- subset(data, SAMPLE == "2709a")
filtered_data
# Transform date field
<- filtered_data |>
data1 mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
<- data1[, -c(1:14)] %>%
data2 mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Merge data
<-cbind(data1[, c(1:14)],data2)
data3
# Calculate mean
<- data3 %>%
mean_data group_by(Time) %>%
summarize(across(everything(), mean, na.rm = TRUE))
# Remove first 16 columns (no elemental data)
<- mean_data[, -c(1:14)]
data4
# Merge data
<-cbind(mean_data[, c(1)],data4)
data
# Load second dataset
<- read.csv("../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv")
data1
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
selected_columns "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Loop over each selected column
for (col in selected_columns) {
# Extract the data for the current column by name
<- data[, col] # This extracts the column by name (which is a string)
mean_val
# Dynamically get the reference value for the current column from data1 (assuming row 1 contains reference values)
<- data1[1, col] # Reference value for the column from data1
CL
# Define the percentage difference function (element-wise operation)
<- function(x) {
percentage_difference return(((x - CL) / CL) * 100) # Apply the percentage difference formula
}
# Apply the percentage difference function to each element in mean_val
<- sapply(mean_val, percentage_difference)
percentage_diff
# Optionally, store the result in a new column in the data
paste(col, "percentage_diff", sep = "_")] <- percentage_diff
data[,
}
# Select necessary columns, round values (except 'Time'), and remove '_percentage_diff' from column names
<- data[, c(1, 143:168)] %>%
data_RD mutate(across(-which(names(.) == "Time"), round, 0)) %>% # Round all columns except 'Date'
setNames(gsub("_percentage_diff", "", colnames(.))) %>% # Remove '_percentage_diff' from column names
setNames(gsub("^Time$", "%RD", colnames(.))) # Replace "Time" with "%RD"
5.1.3.2 Format %RD-spreadsheet
# Create new spreadsheet
<- BasicTable$new()
tbl $addData(data_RD, firstColumnAsRowHeaders=TRUE)
tbl
# Get the number of rows and columns in the dataset
<- nrow(data_RD)
num_rows <- ncol(data_RD)
num_columns
# Find the index of the column with the header "%RD"
<- which(names(data_RD) == "%RD")
rd_column_index
# Define the cells for which the styling needs to be applied (excluding the '%RD' column)
<- setdiff(2:num_columns, rd_column_index)
column_indices_to_style
# Ensure the last row is included in styling
<- tbl$getCells(
cells rowNumbers = 2:(num_rows + 1),
columnNumbers = column_indices_to_style,
matchMode = "combinations"
)
# Apply the styling
$mapStyling(
tblcells = cells,
styleProperty = "background-color",
valueType = "color",
mapType = "logic",
mappings = list(
"v >= -3 & v <= 3", "lightgreen", # Values between -3 and 3
"v > -7 & v < -3", "lightblue", # Values between -7 and -3
"v > 3 & v <= 7", "lightblue", # Values between 3 and 7
"v > 7 & v <= 10", "yellow", # Values between 7 and 10
"v > -10 & v <= -7", "yellow", # Values between -10 and -7
"v >= 10", "firebrick", # Values greater than 10
"v <= -10", "firebrick" # Values less than -10
)
)
# Render the table with the applied styles
$renderTable() tbl
# Save the table to an Excel workbook
<- createWorkbook(creator = Sys.getenv("M. Schauer"))
wb addWorksheet(wb, "Data")
$writeToExcelWorksheet(wb = wb, wsName = "Data", topRowNumber = 1, leftMostColumnNumber = 1, applyStyles = TRUE)
tbl
# Export to .xlsx
saveWorkbook(wb, file = "../Data//relRD_2709a_mai2023_munich.xlsx", overwrite = TRUE)
Interpretation: The %RD demonstrates excellent accuracy (between -3 and +3) for the elements Si, Fe, Ca, K, Zn, Rb and Sr, with some elements — Ti, Al, S, Cr, Ni, Nb, Ba and Pb — also including highly accurate %RD values (3 to 7, -3 to -7). Mn, V, As and Zr show some %RD values of good accuracy (7 to 10, -7 to -10), whereas for Mg, P, Cu, Y, Mo, Th and U, the %RD values are not accurate enough (higher than 10, lower than -10). Therefore, these latter elements are not measured with sufficient accuracy and should either not be used or, if used, be handled with great care in further analysis.
5.1.3.3 Create figures of %RD
# Load data
<- read.xlsx("../Data//relRD_2709a_mai2023_munich.xlsx")
data1
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
selected_columns "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Create a PDF file to save all plots
pdf("../Figures//relRD_2709a_mai2023_munich_plots.pdf", width = 10, height = 6)
# Loop through each selected column and create plots
for (col in selected_columns) {
# Extract data for plotting
<- data.frame(
plot_data Day = as.factor(data1$`%RD`), # Treat measurement days as categorical on X-axis
Value = data1[[col]]
)
# Create scatter plot with lines connecting measurement days
<- ggplot(plot_data, aes(x = Day, y = Value, group = 1)) +
p geom_point() +
geom_line() + # Connect points across measurement days
geom_hline(yintercept = 0, linetype = "dashed", color = "red") + # Reference line
labs(title = paste("%RD-Plot for", col), x = "Measurement Day", y = col) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Tilt x-axis labels
# Display the plot in R
print(p)
}
# Close the PDF device
dev.off()
%RD plots
5.2 Standard SdAR-M2 Accuracy
This procedure follows the steps outlined in Section 1.4.2.
5.2.1 Calculate reference value (nominal value)
5.2.1.1 Compute metrics of reference values
The name of the .csv-spreadsheet includes the purpose the measurements (reference value - ref_val_meas) were made for, the standard they are based on (SdAR-M2), month and year of measurement (November 2024 - nov2024) as well as the location (Munich): ref_val_SdAR-M2_nov2024_munich.csv. This makes the file easy to identify for future use.
# Load data
<- read.csv("../Data//ref_val_meas_SdAR-M2_nov2024_munich.csv")
data1
# Replace '< LOD' with 0 in each column
<- data1 %>%
data2 mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric)) # Ensure all values are numeric
# Remove first 16 columns (no elemental data)
<- data2[, -c(1:14)]
data
# Replace '.' with '_' in column names
<- data %>%
data rename_with(~ gsub("\\.", "_", .x))
# Define statistical functions
<- function(x) (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
CV <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
SEM <- function(x) mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
UCL <- function(x) mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
LCL <- function(x) (2*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
CV2 <- function(x) mean(x, na.rm = TRUE) + 2*sd(x, na.rm = TRUE)
UCL2 <- function(x) mean(x, na.rm = TRUE) - 2*sd(x, na.rm = TRUE)
LCL2 <- function(x) (3*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
CV3 <- function(x) mean(x, na.rm = TRUE) + 3*sd(x, na.rm = TRUE)
UCL3 <- function(x) mean(x, na.rm = TRUE) - 3*sd(x, na.rm = TRUE)
LCL3
# Compute summary statistics
<- data %>%
NominalSdARM2 summarize(across(everything(), list(
Mean = ~ mean(.x, na.rm = TRUE),
Sd = ~ sd(.x, na.rm = TRUE),
SEM = ~ SEM(.x),
CV = ~ CV(.x),
LCL = ~ LCL(.x),
UCL = ~ UCL(.x),
CV2 = ~ CV2(.x),
LCL2 = ~ LCL2(.x),
UCL2 = ~ UCL2(.x),
CV3 = ~ CV3(.x),
LCL3 = ~ LCL3(.x),
UCL3 = ~ UCL3(.x),
MD = ~ median(.x, na.rm = TRUE)
)))
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
Element "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Function to clean column names
<- function(df, prefix) {
cleanColumnNames colnames(df) <- gsub(paste0(prefix, "_"), "", colnames(df))
return(df)
}
# Function to extract and clean data for each element
<- function(element, df) {
extract_element_data <- grep(paste0("^", element, "_(Mean|Sd|SEM|CV|LCL|UCL|CV2|LCL2|UCL2|CV3|LCL3|UCL3|MD|Error_Mean|Error_Sd|Error_MD)"),
cols colnames(df), value = TRUE)
if (length(cols) > 0) {
<- df %>%
df_out select(all_of(cols)) %>%
cleanColumnNames(element)
# Add the element name as the first column
<- cbind(Element = element, df_out)
df_out return(df_out)
else {
} return(data.frame(Element = element, matrix(NA, nrow = 1, ncol = length(cols))))
}
}
# Process all elements & create final table
<- map_dfr(Element, extract_element_data, df = NominalSdARM2) Tab3
5.2.1.2 Create formated spreadsheet with all metrics
# Replace "Error" with "MU" in column names
colnames(Tab3) <- gsub("Error", "MU", colnames(Tab3))
# Rounding of values
<- c("Mean", "Sd", "SEM", "CV", "UCL", "LCL","CV2", "UCL2", "LCL2", "CV3",
numeric_cols "UCL3", "LCL3","MD", "MU_Mean", "MU_Sd", "MU_MD")
for (col in numeric_cols) {
if (col %in% colnames(Tab3)) {
<- round(Tab3[[col]],
Tab3[[col]] ifelse(col %in% c("CV", "CV2", "CV3"), 2,
ifelse(col %in% c("UCL", "LCL", "UCL2", "LCL2", "UCL3", "LCL3"), 0, 1)))
}
}
# Print table
print(Tab3)
Element Mean Sd SEM CV LCL UCL CV2 LCL2 UCL2
1 Si 377487.7 7948.9 2513.7 2.11 369539 385437 4.21 361590 393385
2 Ti 1578.7 43.7 13.8 2.77 1535 1622 5.54 1491 1666
3 Al 75023.3 3821.6 1208.5 5.09 71202 78845 10.19 67380 82666
4 Fe 17581.0 171.4 54.2 0.98 17410 17752 1.95 17238 17924
5 Mn 847.6 17.1 5.4 2.01 831 865 4.02 813 882
6 Mg 3280.3 4261.1 1347.5 129.90 -981 7541 259.80 -5242 11803
7 Ca 5233.6 134.4 42.5 2.57 5099 5368 5.14 4965 5502
8 K 41940.1 204.7 64.7 0.49 41735 42145 0.98 41531 42349
9 P 242.7 94.4 29.8 38.88 148 337 77.75 54 431
10 S 783.4 44.3 14.0 5.65 739 828 11.30 695 872
11 V 50.7 5.1 1.6 10.06 46 56 20.11 41 61
12 Cr 49.2 7.2 2.3 14.60 42 56 29.19 35 64
13 Ni 81.3 10.1 3.2 12.40 71 91 24.80 61 101
14 Cu 213.2 4.7 1.5 2.19 209 218 4.39 204 223
15 Zn 699.6 7.1 2.2 1.01 692 707 2.03 685 714
16 As 80.6 4.8 1.5 5.95 76 85 11.90 71 90
17 Rb 143.9 1.3 0.4 0.87 143 145 1.74 141 146
18 Sr 146.2 1.2 0.4 0.84 145 147 1.68 144 149
19 Y 30.8 0.9 0.3 3.07 30 32 6.14 29 33
20 Zr 251.0 5.0 1.6 1.99 246 256 3.98 241 261
21 Nb 25.2 0.5 0.2 1.99 25 26 3.98 24 26
22 Mo 13.8 1.0 0.3 7.16 13 15 14.31 12 16
23 Ba 783.6 14.6 4.6 1.87 769 798 3.74 754 813
24 Pb 850.9 7.3 2.3 0.86 844 858 1.72 836 865
25 Th 6.7 0.5 0.2 7.26 6 7 14.51 6 8
26 U 7.4 1.8 0.6 24.73 6 9 49.46 4 11
CV3 LCL3 UCL3 MD MU_Mean MU_Sd MU_MD
1 6.32 353641 401334 375499.0 1233.5 30.8 1235.4
2 8.31 1448 1710 1569.1 42.3 0.8 42.3
3 15.28 63559 86488 74631.7 1406.4 152.6 1377.2
4 2.93 17067 18095 17564.8 152.5 1.0 152.8
5 6.03 796 899 847.9 40.9 0.3 40.9
6 389.70 -9503 16064 0.0 5217.6 1617.8 5270.5
7 7.70 4830 5637 5174.2 83.6 1.4 83.8
8 1.46 41326 42554 41992.7 340.9 1.8 341.3
9 116.63 -40 526 209.9 78.3 1.8 78.3
10 16.95 651 916 776.4 19.4 0.4 19.4
11 30.17 35 66 51.8 13.1 0.2 13.2
12 43.79 28 71 47.7 6.9 0.1 6.9
13 37.19 51 111 80.6 14.1 0.1 14.0
14 6.58 199 227 213.5 10.8 0.1 10.8
15 3.04 678 721 698.7 13.7 0.1 13.7
16 17.84 66 95 79.9 9.9 0.0 9.9
17 2.61 140 148 144.1 3.1 0.0 3.1
18 2.52 142 150 145.9 2.6 0.0 2.6
19 9.21 28 34 30.8 2.3 0.0 2.3
20 5.96 236 266 250.8 3.4 0.0 3.4
21 5.97 24 27 25.1 2.1 0.0 2.1
22 21.47 11 17 13.8 1.5 0.0 1.5
23 5.60 740 828 786.0 19.3 0.1 19.3
24 2.57 829 873 853.3 13.1 0.1 13.1
25 21.77 5 8 6.7 1.2 0.0 1.2
26 74.18 2 13 7.6 2.7 0.0 2.7
# Save final table with all reference value metrics
write.csv(Tab3, "../Data//metrics_ref_val_SdAR-M2_nov2024_munich.csv", row.names = FALSE)
5.2.1.3 Export reference values for daily device accuracy testing
This selection only includes the relevant elements and the values needed to define the test range in which the measurement values should lie: Element name with UCL and LCL of the 1δSD-range and UCL2/LCL2 of the 2δSD-range. You can choose the elements to export by adding them in the code for selected_elements.
# Select only necessary columns
<- Tab3 %>% select(Element, LCL,UCL,LCL2,UCL2,LCL3,UCL3)
dataset1
# Filter for the specific elements dynamically
<- c("Rb", "Sr", "Y", "Zn", "Zr")
selected_elements <- dataset1 %>% filter(Element %in% selected_elements)
dataset
# Export to .csv
write.csv(dataset, "../Data//test_range_ref_val_SdAR-M2_nov2024_munich.csv", row.names = FALSE)
Comment: This spreadsheet can be used to compare the mean of three standard measurements, as calculated by the device and displayed on the screen, with the test range of the selected elements.Three out of five elements have to be in the 2δSD-range for the test for accuracy to be successful.
5.2.1.4 Create spreadsheet for Shewhart Control Charts
# Transpose the data
<- as.data.frame(t(Tab3))
Tab4
# Store the original row names (before transposing)
<- rownames(Tab3)
row_names
# Store the original column names (before transposing)
<- colnames(Tab3)
col_names
# Assign the original row names as column names of the transposed data
colnames(Tab4) <- row_names
# Assign the original column names as row names of the transposed data
rownames(Tab4) <- col_names
# Set the first row as the column names (this will use the values of the first row as column headers)
colnames(Tab4) <- Tab4[1, ]
# Remove the first row (since it is now used as the column names)
<- Tab4[-1, ]
Tab4
# # Export to .csv
write.csv(Tab4, "../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv", row.names = TRUE)
5.2.2 Compute Shewhart Control Charts
# Load and filter data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data <- subset(data, SAMPLE == "SdAR-M2")
filtered_data
<- filtered_data |>
data1 mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
<- data1 %>%
data2 mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Remove first 16 columns (no elemental data)
<- data2[, -c(1:14)]
data
# Load data
<- read.csv("../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv")
data1
# Create PDF for all SCC plots
pdf("../Figures//SCC_SdAR-M2_accuracy_plots.pdf", width = 8, height = 6)
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
selected_columns "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
for (col in selected_columns) {
# Extract the data for the current column
<- data[, col]
chartdata
# Dynamically get metrics for current column from data1
<- data1[1, col]
CL <- data1[2, col]
SD <- data1[5, col]
LCL <- data1[6, col]
UCL <- data1[8, col]
LCL2 <- data1[9, col]
UCL2 <- data1[11, col]
LCL3 <- data1[12, col]
UCL3
# Round to 0 decimal places
<- round(CL, 0)
CL <- round(SD, 0)
SD <- round(LCL, 0)
LCL <- round(UCL, 0)
UCL <- round(LCL2, 0)
LCL2 <- round(UCL2, 0)
UCL2 <- round(LCL3, 0)
LCL3 <- round(UCL3, 0)
UCL3
# Create the control chart
<- qcc(
qcc_result data = chartdata,
type = "xbar",
nsigmas = 1,
std.dev = SD,
center = CL,
rules = NULL,
plot = FALSE
)
# Create the plot for the current column
plot(qcc_result, restore.par = FALSE,
title = paste("SCC for", col),
ylab = "Concentration (ppm)",
xlab = "Values of SdAR-M2 throughout the 2024 measurement campaign")
# Add 3-Sigma control limits
abline(h = c(UCL3, LCL3), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL3, labels = "UCL3", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL3, labels = "LCL3", pos = 1, col = "black", cex = 0.8)
# Add 2-Sigma control limits
abline(h = c(UCL2, LCL2), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL2, labels = "UCL2", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL2, labels = "LCL2", pos = 1, col = "black", cex = 0.8)
# Highlight points based on thresholds
<- chartdata < UCL | chartdata > LCL
below_1sigma <- chartdata > UCL | chartdata < LCL
below_2sigma <- chartdata > UCL3 | chartdata < LCL3
above_3sigma <- (chartdata > UCL2 & chartdata <= UCL3) | (chartdata < LCL2 & chartdata >= LCL3)
above_2sigma
# Overlay red points for values in 1-sigma
points(which(below_1sigma), chartdata[below_1sigma], col = "lightgreen", pch = 19, cex = 1.2)
# Overlay yellow points for values between 1-sigma and 2-sigma
points(which(below_2sigma), chartdata[below_2sigma], col = "lightblue", pch = 19, cex = 1.2)
# Overlay yellow points for values between 2-sigma and 3-sigma
points(which(above_2sigma), chartdata[above_2sigma], col = "orange", pch = 19, cex = 1.2)
# Overlay red points for values beyond 3-sigma
points(which(above_3sigma), chartdata[above_3sigma], col = "red", pch = 19, cex = 1.2)
}
# Close the PDF file
dev.off()
Shewhart Control Charts
Interpretation: Skimming through the SCCs of the same standard throughout the 2024 measurement campaign reveals quite taccurate results for all elements. Many values fall within the ±1δSD range (green), with some extending into the ±1δSD to ±2δSD range (blue) but more rarely to the ±2δSD to ±3δSD range (orange). Especially for the light elements, some drifts and values beyond the quality criterium are visible, making it clear that all interpretation has to be checked in detail with other information such as petrography and archaeological background information (see also Schauer and Amicone 2024).
5.2.3 Compute Relative Percentage Difference (%RD)
5.2.3.1 Calculate %RD
# Load data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data
<- subset(data, SAMPLE == "SdAR-M2")
filtered_data
# Transform date field
<- filtered_data |>
data1 mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
<- data1[, -c(1:14)] %>%
data2 mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Merge data
<-cbind(data1[, c(1:14)],data2)
data3
# Calculate mean
<- data3 %>%
mean_data group_by(Time) %>%
summarize(across(everything(), mean, na.rm = TRUE))
# Remove first 16 columns (no elemental data)
<- mean_data[, -c(1:14)]
data4
# Merge data
<-cbind(mean_data[, c(1)],data4)
data
# Load second dataset
<- read.csv("../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv")
data1
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
selected_columns "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Loop over each selected column
for (col in selected_columns) {
# Extract the data for the current column by name
<- data[, col] # This extracts the column by name (which is a string)
mean_val
# Dynamically get the reference value for the current column from data1 (assuming row 1 contains reference values)
<- data1[1, col] # Reference value for the column from data1
CL
# Define the percentage difference function (element-wise operation)
<- function(x) {
percentage_difference return(((x - CL) / CL) * 100) # Apply the percentage difference formula
}
# Apply the percentage difference function to each element in mean_val
<- sapply(mean_val, percentage_difference)
percentage_diff
# Optionally, store the result in a new column in the data
paste(col, "percentage_diff", sep = "_")] <- percentage_diff
data[,
}
# Select necessary columns, round values (except 'Time'), and remove '_percentage_diff' from column names
<- data[, c(1, 143:168)] %>%
data_RD mutate(across(-which(names(.) == "Time"), round, 0)) %>% # Round all columns except 'Date'
setNames(gsub("_percentage_diff", "", colnames(.))) %>% # Remove '_percentage_diff' from column names
setNames(gsub("^Time$", "%RD", colnames(.))) # Replace "Time" with "%RD"
5.2.3.2 Format %RD-spreadsheet
# Create new spreadsheet
<- BasicTable$new()
tbl $addData(data_RD, firstColumnAsRowHeaders=TRUE)
tbl
# Get the number of rows and columns in the dataset
<- nrow(data_RD)
num_rows <- ncol(data_RD)
num_columns
# Find the index of the column with the header "%RD"
<- which(names(data_RD) == "%RD")
rd_column_index
# Define the cells for which the styling needs to be applied (excluding the '%RD' column)
<- setdiff(2:num_columns, rd_column_index)
column_indices_to_style
# Ensure the last row is included in styling
<- tbl$getCells(
cells rowNumbers = 2:(num_rows + 1),
columnNumbers = column_indices_to_style,
matchMode = "combinations"
)
# Apply the styling
$mapStyling(
tblcells = cells,
styleProperty = "background-color",
valueType = "color",
mapType = "logic",
mappings = list(
"v >= -3 & v <= 3", "lightgreen", # Values between -3 and 3
"v > -7 & v < -3", "lightblue", # Values between -7 and -3
"v > 3 & v <= 7", "lightblue", # Values between 3 and 7
"v > 7 & v <= 10", "yellow", # Values between 7 and 10
"v > -10 & v <= -7", "yellow", # Values between -10 and -7
"v >= 10", "firebrick", # Values greater than 10
"v <= -10", "firebrick" # Values less than -10
)
)
# Render the table with the applied styles
$renderTable() tbl
# Save the table to an Excel workbook
<- createWorkbook(creator = Sys.getenv("M. Schauer"))
wb addWorksheet(wb, "Data")
$writeToExcelWorksheet(wb = wb, wsName = "Data", topRowNumber = 1, leftMostColumnNumber = 1, applyStyles = TRUE)
tbl
# Export to .xlsx
saveWorkbook(wb, file = "../Data//relRD_SdAR-M2_nov2024_munich.xlsx", overwrite = TRUE)
Interpretation: The %RD demonstrates excellent accuracy (between -3 and +3) for the elements Si, Ti, Al, Fe, K, Cu, Zn, Rb, Sr, Y, Zr, Nb and Pb, with some elements — Mn, Ca, Ni, As, Mo and Ba — also including highly accurate %RD values (3 to 7, -3 to -7). V shows some %RD values of good accuracy (7 to 10, -7 to -10), whereas for Mg, P, S, Cr, Th and U, generally speaking, the %RD values are not accurate enough (higher than 10, lower than -10). Therefore, these latter elements are not measured with sufficient accuracy and should either not be used or, if used, be handled with great care in further analysis.
5.2.3.3 Create figures of %RD
# Load data
<- read.xlsx("../Data//relRD_SdAR-M2_nov2024_munich.xlsx")
data1
# Define columns
<- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
selected_columns "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Create a PDF file to save all plots
pdf("../Figures//relRD_SdAR-M2_nov2024_munich_plots.pdf", width = 10, height = 6)
# Loop through each selected column and create plots
for (col in selected_columns) {
# Extract data for plotting
<- data.frame(
plot_data Day = as.factor(data1$`%RD`), # Treat measurement days as categorical on X-axis
Value = data1[[col]]
)
# Create scatter plot with lines connecting measurement days
<- ggplot(plot_data, aes(x = Day, y = Value, group = 1)) +
p geom_point() +
geom_line() + # Connect points across measurement days
geom_hline(yintercept = 0, linetype = "dashed", color = "red") + # Reference line
labs(title = paste("%RD-Plot for", col), x = "Measurement Day", y = col) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Tilt x-axis labels
# Display the plot in R
print(p)
}
# Close the PDF device
dev.off()
%RD plots
6 Apply coefficient correction (coefcorIV)
More information on coefficient correction coefcor IV can be found in (Schauer 2024).
6.1 Compile Data
# Load data
<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
data1
# Replace '< LOD' with 0 in each column
<- data1 %>%
data mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# Define columns
<-data[, c("SAMPLE","Time","Si","Ti","Al","Fe","Mn","Mg","Ca","K","P","S","Cl","Sc","V","Cr","Co","Ni","Cu","Zn","As","Se","Rb","Sr","Y","Zr","Nb","Mo","Pd","Ag","Cd","Sn","Sb","Te","Cs","Ba","La","Ce","Hf","Ta","W","Re","Au","Hg","Pb","Bi","Th","U")]
dataset
# Load data
<- read.csv("../RawData//coefcorIV_factors.csv")
coefcor
# Define relevant variables
<-dataset$SAMPLE
SAMPLE<-dataset$Time Time
6.2 Process Coefcor IV
# Select element column from the measurement data
<-dataset$Si
Si
# Select slope correction
<-coefcor$Si_a
Si_a
# Select offset correction (if applicable)
<-coefcor$Si_b
Si_b
# Calculate coefficient correction and transform to oxide percent
<-(Si_a*Si+Si_b)*0.00021393
SiO2
<-dataset$Ti
Ti<-coefcor$Ti_a
Ti_a<-coefcor$Ti_b
Ti_b<-(Ti_a*Ti+Ti_b)*0.0001668
TiO2
<-dataset$Al
Al<-coefcor$Al_a
Al_a<-coefcor$Al_b
Al_b<-(Al_a*Al+Al_b)*0.00018895
Al2O3
<-dataset$Fe
Fe<-coefcor$Fe_a
Fe_a<-coefcor$Fe_b
Fe_b<-(Fe_a*Fe+Fe_b)*0.000143
Fe2O3
<-dataset$Mn
Mn<-coefcor$Mn_a
Mn_a<-coefcor$Mn_b
Mn_b<-(Mn_a*Mn+Mn_b)*0.00012912
MnO
<-dataset$Mg
Mg<-coefcor$Mg_a
Mg_a<-coefcor$Mg_b
Mg_b<-(Mg_a*Mg+Mg_b)*0.00016583
MgO
<-dataset$Ca
Ca<-coefcor$Ca_a
Ca_a<-coefcor$Ca_b
Ca_b<-(Ca_a*Ca+Ca_b)*0.00013992
CaO
<-dataset$K
K<-coefcor$K_a
K_a<-coefcor$K_b
K_b<-(K_a*K+K_b)*0.00012046
K2O
<-dataset$P
P<-coefcor$P_a
P_a<-coefcor$P_b
P_b<-(P_a*P+P_b)*0.00022914
P2O5
# Create new data frame
<-data.frame(SiO2,TiO2,Al2O3,Fe2O3,MnO,MgO,CaO,K2O,P2O5)
data_norm
# Calculate row sum
<-data_norm %>% rowwise() %>% mutate(sum = sum(c(SiO2,TiO2,Al2O3,
data_norm_withsum
Fe2O3,MnO,MgO,CaO,K2O,P2O5)))
# Define formular for row percent
<-100/data_norm_withsum$sum
sumpct
# Normalise major elements
<-SiO2*sumpct
SiO2<-TiO2*sumpct
TiO2<-Al2O3*sumpct
Al2O3<-Fe2O3*sumpct
Fe2O3<-MnO*sumpct
MnO<-MgO*sumpct
MgO<-CaO*sumpct
CaO<-K2O*sumpct
K2O<-P2O5*sumpct
P2O5
# Select element column from the measurement data
<-dataset$V
V
# Select slope correction
<-coefcor$V_a
V_a
# Select offset correction (if applicable)
<-coefcor$V_b
V_b
# Calculate coefficient correction
<-V_a*V+V_b
V
<-dataset$Cr
Cr<-coefcor$Cr_a
Cr_a<-coefcor$Cr_b
Cr_b<-Cr_a*Cr+Cr_b
Cr
<-dataset$Ni
Ni<-coefcor$Ni_a
Ni_a<-coefcor$Ni_b
Ni_b<-Ni_a*Ni+Ni_b
Ni
<-dataset$Cu
Cu<-coefcor$Cu_a
Cu_a<-coefcor$Cu_b
Cu_b<-Cu_a*Cu+Cu_b
Cu
<-dataset$Zn
Zn<-coefcor$Zn_a
Zn_a<-coefcor$Zn_b
Zn_b<-Zn_a*Zn+Zn_b
Zn
<-dataset$Rb
Rb<-coefcor$Rb_a
Rb_a<-coefcor$Rb_b
Rb_b<-Rb_a*Rb+Rb_b
Rb
<-dataset$Sr
Sr<-coefcor$Sr_a
Sr_a<-coefcor$Sr_b
Sr_b<-Sr_a*Sr+Sr_b
Sr
<-dataset$Y
Y<-coefcor$Y_a
Y_a<-coefcor$Y_b
Y_b<-Y_a*Y+Y_b
Y
<-dataset$Zr
Zr<-coefcor$Zr_a
Zr_a<-coefcor$Zr_b
Zr_b<-Zr_a*Zr+Zr_b
Zr
<-dataset$Nb
Nb<-coefcor$Nb_a
Nb_a<-coefcor$Nb_b
Nb_b<-Nb_a*Nb+Nb_b
Nb
<-dataset$Ba
Ba<-coefcor$Ba_a
Ba_a<-coefcor$Ba_b
Ba_b<-Ba_a*Ba+Ba_b
Ba
<-dataset$Pb
Pb<-coefcor$Pb_a
Pb_a<-coefcor$Pb_b
Pb_b<-Pb_a*Pb+Pb_b Pb
6.3 Compile Corrected Data
# Create new data frame
<-data.frame(SAMPLE,SiO2,TiO2,Al2O3,Fe2O3,MnO,MgO,CaO,
data
K2O,P2O5,V,Cr,Ni,Cu,Zn,Rb,Sr,Y,Zr,Nb,Ba,Pb)
# Round selected columns to .000 for visually better plotting
c("SiO2","TiO2","Al2O3","Fe2O3","MnO","MgO","CaO",
data[, "K2O","P2O5")] <- round(data[, c("SiO2","TiO2","Al2O3","Fe2O3",
"MnO","MgO","CaO","K2O","P2O5")], 3)
# Round selected columns to .00 for visually better plotting
c("V","Cr","Ni","Cu","Zn","Rb","Sr","Y","Zr","Nb",
data[, "Ba","Pb")] <- round(data[, c("V","Cr","Ni","Cu","Zn",
"Rb","Sr","Y","Zr","Nb","Ba","Pb")], 2)
# Export to .csv
write.csv(data,"../Data//Assur_Pottery_2024_Data_Coefcor.csv",row.names=TRUE)
6.4 Calculate Mean of Corrected Data
# Load data
<- read.csv("../Data//Assur_Pottery_2024_Data_Coefcor.csv")
data1
# Calculate mean
<-(data1) %>%
data4group_by(SAMPLE) %>%
summarize(across(everything(),list(mean=mean)))
# Delete "_mean" from the column name
colnames(data4) <- gsub("_mean", "", colnames(data4))
# Define all columns except for the first as numeric
-1] <- sapply(data4[, -1], as.numeric)
data4[,
# Round selected columns to .000 for visually better plotting
c("SiO2","TiO2","Al2O3","Fe2O3","MnO","MgO","CaO",
data4[, "K2O","P2O5")] <- round(data4[, c("SiO2","TiO2","Al2O3","Fe2O3",
"MnO","MgO","CaO","K2O","P2O5")], 3)
# Round selected columns to .00 for visually better plotting
c("V","Cr","Ni","Cu","Zn","Rb","Sr","Y","Zr","Nb",
data4[, "Ba","Pb")] <- round(data4[, c("V","Cr","Ni","Cu","Zn",
"Rb","Sr","Y","Zr","Nb","Ba","Pb")], 2)
# Select columns
<-data4[,c(1,3:7,9,10,12:14,15:21)]
data4
# Export to .csv
write.csv(data4,"../Data//Assur_Pottery_2024_Data_Coefcor_MW.csv",row.names=FALSE)
6.5 Merge with Archaeological Information
# Load data
<- read.csv("../Data//Assur_Pottery_2024_ArchData.csv")
data2
# Merge data
<-merge(data4,data2, by="SAMPLE", all=FALSE)
data3
# Export to .csv
write.csv(data3,"../Data//Assur_Pottery_2024_complete_MW.csv",row.names=FALSE)
7 Tab. D2.1
# Load data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data1
# Select columns
<-data1[,c(1:18,20,21,27,22,23,25)]
data
# Replace "A_Pathian" in column 23 with value from column 22 (same row)
23]] <- ifelse(
data[[23]] == "A_Parthian",
data[[22]],
data[[23]]
data[[
)
# Rename column 23 to "Group"
colnames(data)[23] <- "Group"
# Drop column 22
<- data[, -22]
data
# Round columns 2 to 8 to two decimal places
2:8] <- lapply(data[, 2:8], function(x) round(x, 2))
data[,
# Round columns 9 to 18 to whole numbers
9:18] <- lapply(data[, 9:18], function(x) round(x, 0))
data[,
# Export to .xlsx
setwd("../Figures")
write_xlsx(data, path = "Tab_D2-1.xlsx")
8 Hellenistic Chemical Groups
8.1 Fig. D2.1
# Load and filter data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
data <- data[order(data$ChemGroup_H), ]
data
# Define custom colors and shapes
<- c(
mycolors "HG1" = "dodgerblue1", "HG2" = "blue", "HG3" = "orange", "HG4" = "green",
"HG5" = "wheat", "HO1" = "firebrick", "HO2" = "pink", "HO3" = "magenta",
"HO4" = "red", "HO5" = "yellow3", "HO6" = "black", "HO7" = "lightblue",
"HO8" = "slategrey", "HO9" = "seagreen"
)
<- c(
myshapes "HG1" = 17, "HG2" = 19, "HG3" = 15, "HG4" = 1, "HG5" = 2,
"HO1" = 3, "HO2" = 4, "HO3" = 8, "HO4" = 9, "HO5" = 15,
"HO6" = 16, "HO7" = 18, "HO8" = 20, "HO9" = 10
)
# Define a reusable plot theme
<- theme_classic() +
base_theme theme(
axis.line = element_line(colour = "black", size = 0.25),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.title = element_blank(),
legend.text = element_text(size = 8),
legend.position = "bottom"
)
# Function to generate a single plot
<- function(x, y, x_lab, y_lab) {
make_plot ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = ChemGroup_H, shape = ChemGroup_H)) +
geom_point(size = 2) +
scale_color_manual(values = mycolors) +
scale_shape_manual(values = myshapes) +
labs(x = x_lab, y = y_lab) +
base_theme
}
# Create all plots
<- make_plot("Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %")
plot_1 <- make_plot("CaO", "K2O", "CaO in %", "K2O in %")
plot_2 <- make_plot("TiO2", "MnO", "TiO2 in %", "MnO in %")
plot_3 <- make_plot("Rb", "Sr", "Rb in ppm", "Sr in ppm")
plot_4 <- make_plot("V", "Cr", "V in ppm", "Cr in ppm")
plot_5 <- make_plot("Ni", "Nb", "Ni in ppm", "Nb in ppm")
plot_6 <- make_plot("Y", "Zr", "Y in ppm", "Zr in ppm")
plot_7
# Arrange the plots
<- ggarrange(
grid_plot
plot_1, plot_2, plot_3,
plot_4, plot_5, plot_6,NULL,
plot_7, ncol = 2, nrow = 4,
common.legend = TRUE,
legend = "bottom",
align = "hv"
)
# Export to .eps
ggsave(
filename = "Fig_D2-1.eps",
plot = grid_plot,
path = "../Figures",
device = cairo_ps,
width = 15.3,
height = 21,
units = "cm",
dpi = 1200
)
8.2 Fig. D2.1 - Interactive with Parthian material
# Load data
<- read.csv("../Data//Assur_Pottery_2024_complete_MW_ed.csv")
data
# Define custom colors and shapes
<- c("HG1" = "dodgerblue1", "HG2" = "blue","HG3"="orange","HG4"="green","HG5"="wheat","HO1"="firebrick","HO2"="pink","HO3"="magenta","HO4"="red","HO5"="yellow3","HO6"="black","HO7"="lightblue","HO8"="slategrey","HO9"="seagreen","A_Parthian"="grey85")
mycolors
<- c(
myshapes "HG1" = 17, "HG2" = 19, "HG3" = 15, "HG4" = 1, "HG5" = 2,
"HO1" = 3, "HO2" = 4, "HO3" = 8, "HO4" = 9, "HO5" = 15,
"HO6" = 16, "HO7" = 18, "HO8" = 20, "HO9" = 10, "A_Parthian"=17
)
# Sort data
<- data[order(data$ChemGroup_H), ]
data
# Create scatterplot function with hover support
<- function(data4, x_col, y_col, x_label, y_label, title) {
create_interactive_plot <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'ChemGroup_H', shape = 'ChemGroup_H', text = 'SAMPLE')) +
p geom_point(size = 2) +
scale_shape_manual(values = myshapes) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
<- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot1 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot2 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot3 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot4 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot5 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot6 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Fig. D2.1 - Interactive with Parthian Material")
plot7
# Arrange the plots in a grid layout with individual axis titles
<- subplot(
grid
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Fig_D2-1_interactive.html", selfcontained = TRUE)
8.3 Interactive plots of Subfabric Groups
# Load and filter data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
data
# Define custom colors and shapes
<- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
mycolors
)
<- c("1A"="17","1B"="17","2A" = "19","2B" = "19","2C"="19","2E"="19","3A"="2","3B"="2","4A"="6","4B"="6","4C"="6","5"="15","6"="16")
mysymbols
# Sort data
<- data[order(data$ChemGroup_H), ]
data
# Create scatterplot function with hover support
<- function(data4, x_col, y_col, x_label, y_label, title) {
create_interactive_plot <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Subfabric', shape = 'Subfabric', text = 'SAMPLE')) +
p geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
<- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot1 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot2 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot3 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot4 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot5 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot6 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Scatterplot of Hellenistic Subfabric groups")
plot7
# Arrange the plots in a grid layout with individual axis titles
<- subplot(
grid
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Hellenistic_Subfabricgroups.html", selfcontained = TRUE)
8.4 Fig. D2.5
# Load and filter data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
data
# Define color palettes
<- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
subfabric_colors
)
<- c("HG1" = "dodgerblue1", "HG2" = "blue","HG3"="orange","HG4"="green","HG5"="wheat","HO1"="firebrick","HO2"="pink","HO3"="magenta","HO4"="red","HO5"="yellow3","HO6"="black","HO7"="lightblue","HO8"="slategrey","HO9"="seagreen","A_Parthian"="grey85")
chemgroup_colors
# Sort and aggregate data
<- data |>
plot_data filter(!is.na(ChemGroup_H), !is.na(Vessel.type), !is.na(Subfabric)) |>
arrange(ChemGroup_H) |>
count(ChemGroup_H, Vessel.type, Subfabric)
# Calculate label widths
<- function(label, size = 2.5) {
get_text_width pushViewport(viewport())
<- textGrob(label, gp = gpar(fontsize = size * 10))
g <- convertWidth(grobWidth(g), "npc", valueOnly = TRUE)
width popViewport()
return(width)
}
# Define unique strata and associated axis
<- tibble(
stratum_df stratum = unique(c(plot_data$ChemGroup_H, plot_data$Vessel.type, plot_data$Subfabric))
|>
) mutate(
axis = case_when(
%in% plot_data$ChemGroup_H ~ "axis1",
stratum %in% plot_data$Vessel.type ~ "axis2",
stratum %in% plot_data$Subfabric ~ "axis3"
stratum
)
)
# Estimate text width
<- stratum_df |>
stratum_df rowwise() |>
mutate(text_width = get_text_width(stratum, size = 2.5)) |>
ungroup()
# Compute max width per axis + padding
<- stratum_df |>
axis_widths group_by(axis) |>
summarise(max_width = max(text_width) + 0.02) # Add padding
# Join width to strata
<- stratum_df |>
stratum_df left_join(axis_widths, by = "axis")
# Plot with custom widths
<- ggplot(plot_data, aes(
alluvial_plot axis1 = ChemGroup_H,
axis2 = Vessel.type,
axis3 = Subfabric,
y = n
+
)) geom_alluvium(
aes(fill = ChemGroup_H),
width = 0.1, # constant for flow thickness
alpha = 0.6
+
) geom_stratum(
aes(
fill = after_stat(stratum),
width = case_when(
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis1"] ~ axis_widths$max_width[axis_widths$axis == "axis1"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis2"] ~ axis_widths$max_width[axis_widths$axis == "axis2"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis3"] ~ axis_widths$max_width[axis_widths$axis == "axis3"]
)
),color = "black"
+
) geom_text(
stat = "stratum",
aes(label = after_stat(stratum)),
size = 2.5,
color = "black"
+
) scale_x_discrete(
limits = c("ChemGroup_H", "Vessel.type", "Subfabric"),
expand = c(0.05, 0.05)
+
) scale_fill_manual(
values = c(chemgroup_colors, subfabric_colors),
na.value = "grey",
guide = "none"
+
) labs(
title = "Fig. D2.5",
y = "Sample Count"
+
) theme_minimal()
# Display the plot
alluvial_plot
# Export to .svg
svglite("../Figures/Fig_D2-5.svg", width = 12, height = 8)
print(alluvial_plot)
dev.off()
png
2
8.5 Interactive plots of Vessel Types
# Load data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
data
# Define color palettes
<- c("beaker" = "dodgerblue1", "beaker/jar" = "blue","bottle"="orange","body sherd"="slategrey","carinated bowl"="green","cooking pot"="wheat","echinus bowl"="firebrick","folded rim jar"="pink","jar"="magenta","large-mouth pot"="red","large storage jar"="yellow3","medium jar"="black","rolled rim plate"="lightblue","small jar"="brown4","thin walled beaker"="grey")
mycolors
# Sort data
<- data[order(data$Vessel.type), ]
data
# Create scatterplot function with hover support
<- function(data4, x_col, y_col, x_label, y_label, title) {
create_interactive_plot <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Vessel.type', shape = 'Vessel.type', text = 'SAMPLE')) +
p geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
<- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot1 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot2 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot3 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot4 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot5 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot6 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")
plot7
# Arrange the plots in a grid layout with individual axis titles
<- subplot(
grid
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Hellenistic_VesselTypes.html", selfcontained = TRUE)
9 Parthian Chemical Groups
9.1 Fig. D2.2
# Load and filter data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
data <- data[order(data$ChemGroup_P), ]
data
# Define custom colors and shapes
<- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")
mycolors
<- c(
myshapes "PG1" = 17, "PG2" = 19, "PG3" = 15, "PG4" = 1, "PG5" = 2,
"PO1" = 3, "PO2" = 4, "PO3" = 8, "PO4" = 9, "PO5" = 15,
"PO6" = 16, "PO7" = 18, "PO8" = 20, "PO9" = 10
)
# Define a reusable plot theme
<- theme_classic() +
base_theme theme(
axis.line = element_line(colour = "black", size = 0.25),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.title = element_blank(),
legend.text = element_text(size = 8),
legend.position = "bottom"
)
# Function to generate a single plot
<- function(x, y, x_lab, y_lab) {
make_plot ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = ChemGroup_P, shape = ChemGroup_P)) +
geom_point(size = 2) +
scale_color_manual(values = mycolors) +
scale_shape_manual(values = myshapes) +
labs(x = x_lab, y = y_lab) +
base_theme
}
# Create all plots
<- make_plot("Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %")
plot_1 <- make_plot("CaO", "K2O", "CaO in %", "K2O in %")
plot_2 <- make_plot("TiO2", "MnO", "TiO2 in %", "MnO in %")
plot_3 <- make_plot("Rb", "Sr", "Rb in ppm", "Sr in ppm")
plot_4 <- make_plot("V", "Cr", "V in ppm", "Cr in ppm")
plot_5 <- make_plot("Ni", "Nb", "Ni in ppm", "Nb in ppm")
plot_6 <- make_plot("Y", "Zr", "Y in ppm", "Zr in ppm")
plot_7
# Arrange the plots
<- ggarrange(
grid_plot
plot_1, plot_2, plot_3,
plot_4, plot_5, plot_6,NULL,
plot_7, ncol = 2, nrow = 4,
common.legend = TRUE,
legend = "bottom",
align = "hv"
)
# Export to .eps
ggsave(
filename = "Fig_D2-2.eps",
plot = grid_plot,
path = "../Figures",
device = cairo_ps,
width = 15.3,
height = 21,
units = "cm",
dpi = 1200
)
9.2 Fig. D2.2 - Interactive with Hellenistic material
# Load data
<- read.csv("../Data//Assur_Pottery_2024_complete_MW_ed.csv")
data
# Define custom colors
<- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")
mycolors
# Sort data
<- data[order(data$ChemGroup_P), ]
data
# Create scatterplot function with hover support
<- function(data4, x_col, y_col, x_label, y_label, title) {
create_interactive_plot <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'ChemGroup_P', shape = 'ChemGroup_P', text = 'SAMPLE')) +
p geom_point(size = 2) +
scale_shape_manual(values = c(10, 11,17, 19, 15, 1, 2, 3, 4, 8, 9, 15, 16, 18, 20)) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
<- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot1 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot2 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot3 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot4 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot5 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot6 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Fig. D2.1 - Interactive with Hellenistic Material")
plot7
# Arrange the plots in a grid layout with individual axis titles
<- subplot(
grid
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Fig_D2-2_interactive.html", selfcontained = TRUE)
9.3 Interactive plots of Subfabric Groups
# Load data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
data
# Define color palettes
<- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
mycolors
)
<- c("1A"="17","1B"="17","2A" = "19","2B" = "19","2C"="19","2E"="19","3A"="2","3B"="2","4A"="6","4B"="6","4C"="6","5"="15","6"="16")
mysymbols
# Sort data
<- data[order(data$ChemGroup_P), ]
data
# Create scatterplot function with hover support
<- function(data4, x_col, y_col, x_label, y_label, title) {
create_interactive_plot <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Subfabric', shape = 'Subfabric', text = 'SAMPLE')) +
p geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
<- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot1 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot2 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot3 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot4 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot5 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot6 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")
plot7
# Arrange the plots in a grid layout with individual axis titles
<- subplot(
grid
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Parthian_Subfabricgroups.html", selfcontained = TRUE)
9.4 Fig. D2.6
# Load data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
data
# Define your color palettes
<- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
subfabric_colors
)
<- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")
chemgroup_colors
# Sort and aggregate data
<- data |>
plot_data filter(!is.na(ChemGroup_P), !is.na(Vessel.type), !is.na(Subfabric)) |>
arrange(ChemGroup_P) |>
count(ChemGroup_P, Vessel.type, Subfabric)
# Calculate label widths
<- function(label, size = 2.5) {
get_text_width pushViewport(viewport())
<- textGrob(label, gp = gpar(fontsize = size * 10))
g <- convertWidth(grobWidth(g), "npc", valueOnly = TRUE)
width popViewport()
return(width)
}
# Define unique strata and associated axis
<- tibble(
stratum_df stratum = unique(c(plot_data$ChemGroup_P, plot_data$Vessel.type, plot_data$Subfabric))
|>
) mutate(
axis = case_when(
%in% plot_data$ChemGroup_P ~ "axis1",
stratum %in% plot_data$Vessel.type ~ "axis2",
stratum %in% plot_data$Subfabric ~ "axis3"
stratum
)
)
# Estimate text width
<- stratum_df |>
stratum_df rowwise() |>
mutate(text_width = get_text_width(stratum, size = 2.5)) |>
ungroup()
# Compute max width per axis + padding
<- stratum_df |>
axis_widths group_by(axis) |>
summarise(max_width = max(text_width) + 0.02) # Add padding
# Join width to strata
<- stratum_df |>
stratum_df left_join(axis_widths, by = "axis")
# Plot with custom widths
<- ggplot(plot_data, aes(
alluvial_plot axis1 = ChemGroup_P,
axis2 = Vessel.type,
axis3 = Subfabric,
y = n
+
)) geom_alluvium(
aes(fill = ChemGroup_P),
width = 0.1, # constant for flow thickness
alpha = 0.6
+
) geom_stratum(
aes(
fill = after_stat(stratum),
width = case_when(
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis1"] ~ axis_widths$max_width[axis_widths$axis == "axis1"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis2"] ~ axis_widths$max_width[axis_widths$axis == "axis2"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis3"] ~ axis_widths$max_width[axis_widths$axis == "axis3"]
)
),color = "black"
+
) geom_text(
stat = "stratum",
aes(label = after_stat(stratum)),
size = 2.5,
color = "black"
+
) scale_x_discrete(
limits = c("ChemGroup_P", "Vessel.type", "Subfabric"),
expand = c(0.05, 0.05)
+
) scale_fill_manual(
values = c(chemgroup_colors, subfabric_colors),
na.value = "grey",
guide = "none"
+
) labs(
title = "Fig. D2.6",
y = "Sample Count"
+
) theme_minimal()
# Display the plot
alluvial_plot
# Export to .svg
svglite("../Figures/Fig_D2-6.svg", width = 12, height = 8)
print(alluvial_plot)
dev.off()
png
2
9.5 Interactive plots of Vessel Types
# Load data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
data
# Define color palettes
<- c("bottle"="orange","body sherd"="slategrey","bottle or jar" = "dodgerblue1", "bowl " = "blue","bowl or jug"="green","closed bowl"="wheat","deep bowl"="firebrick","jar or pot"="pink","jar"="magenta","hole-mouth jar or pot"="red","krater"="yellow3","large flat bowl"="black","large vessel"="lightblue","open bowl"="brown4","rim sherd"="grey","small jar or juglet"="darkgreen")
mycolors
# Sort data
<- data[order(data$Vessel.type), ]
data
# Create scatterplot function with hover support
<- function(data4, x_col, y_col, x_label, y_label, title) {
create_interactive_plot <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Vessel.type', shape = 'Vessel.type', text = 'SAMPLE')) +
p geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
<- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot1 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot2 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot3 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot4 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot5 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot6 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")
plot7
# Arrange the plots in a grid layout with individual axis titles
<- subplot(
grid
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Parthian_VesselTypes.html", selfcontained = TRUE)
10 Tab. D2.3 - Basic Information
# Load data
<- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data
# Create Hellenistic period table
<- data %>%
table_h filter(Period == "Hellenistic") %>%
count(ChemGroup_H, Subfabric) %>%
pivot_wider(names_from = Subfabric, values_from = n, values_fill = 0) %>%
mutate(Period = "Hellenistic")
# Create Parthian period table
<- data %>%
table_p filter(Period == "Parthian") %>%
count(ChemGroup_P, Subfabric) %>%
pivot_wider(names_from = Subfabric, values_from = n, values_fill = 0) %>%
mutate(Period = "Parthian")
# Standardizing column names for merging
colnames(table_h)[1] <- "ChemGroup"
colnames(table_p)[1] <- "ChemGroup"
# Stack the tables together
<- bind_rows(table_h, table_p)
stacked_table
# Export to .xlsx
setwd("../Figures")
write_xlsx(stacked_table, path = "Tab_D2-3.xlsx")