p-XRF Data Processing and Interpretation of Hellenistic and Parthian Pottery from the 2023 and 2024 Excavations at Assur

Data Generated with the Niton XL3t No. 97390

Author
Affiliations

Michaela Schauer

Department of Earth and Environmental Sciences, Institute for Geology, Ludwig-Maximilians-University, Luisenstr. 37, 80333 Munich, Germany

Vienna Institute for Archaeological Science (VIAS), University of Vienna, Franz-Klein-Gasse 1, 1190 Vienna, Austria

Research Network Human Evolution und Archaeological Sciences (HEAS), University of Vienna, Djerassiplatz 1, 1030 Vienna, Austria

Published

May 21, 2025

1 Introduction

1.1 Content & Aim

This R script contains data quality control, data processing and figures for the p-XRF data collected from pottery excavated in 2023 and 2024 in Assur in the scope of the project New Archaeological Exploration of Assur led by by Prof. Dr. Karen Radner (LMU Munich). The samples were selected in the field by Dr. Jana Richter, Dr. Andrea Squitieri (LMU Munich), Dr. Florian Janoscha Kreppner & Ellen Coster (University of Münster). p-XRF data was handled by Dr. Michaela Schauer (LMU Munich/VIAS), measurements undertaken by Marco Wolf (LMU Munich). Related petrographic analysis was performed by Dr. Silvia Amicone (University of Tübingen) (Schauer and Amicone 2024).

The dataset is part of the second Exploring Assur publication (Schauer and Amicone 2025).

1.2 Device & Measurement Parameters

The data was collected using the Niton XL3t (No.97390), previously owned by LMU Munich’s Department of Cultural Studies and Archaeology and, since late 2023, hosted by LMU Munich’s Department of Geology. Three measurements were taken per sample on different areas of a fresh break, using the TestAllGeo mode with an 8 mm collimator and a measurement time of 300 seconds (60 seconds each for the main, low, and high filters, and 120 seconds for the light filter).

Data collection was conducted by Marco Wolf from August 9 to 11, 2023 (temperature: 28-35°C, relative humidity: %35-42) at the Dep. for Cultural Studies (Schauer and Amicone 2024) and from November 13 to 25, 2024 (temperature: 20°C; relative humidity: 35–42%) in the geology lab.

1.3 Standards used

In 2023, Standard NIST 2709a (National Institute of Standards & Technology 2018), in 2024 Standard SdAR-M2 (International Association of Geoanalysts 2015) was used.

1.4 Data processing

The data of both measurement campaign has been compiled in one file and run though the precision assessment together. For accuracy assessment, due to the use of two different standards (see Section 1.3), the campaigns were analysed separately (see Section 5.1 and Section 5.2).

Following data quality assessment, the data was calibrated and major elements normalised to 100% (see Section 6).

1.4.1 Precision Assessment

To ensure the reliable application of the DQA procedures outlined here, LOD is replaced with ‘0’.

Precision is evaluated (see Section 4) through:

  • Measurement Uncertainty (MU) is used to identify reliable elements and individual measurements (see Section 4.1). The benchmarks used is: Reported value ≥ ±2δ MU
  • Coefficient of Variation (CV) is calculated for repeated measurements of the same sample to assess overall precision and determine which elements are reliable across the entire dataset (see Section 4.2). The benchmark is set based on the dataset. From the literature and the authors own work, a benchmark of 20% for archaeological pottery, stone, daub and in-situ measurements of soil are appropriate.

In both cases no more than 20% of individual measurements for any given chemical element and no more than 20% of elements considered reliable for a single measurement (MU) or sample (CV) may exceed the benchmarks.

1.4.2 Accuracy Assessment

For accuracy assessment (see Section 5), the following methods are employed to monitor the device performance based on the two standards (see Section 1.3) used for this project:

  • Reference values (nominal values) are calculated to define the baseline of comparison (see Section 5.1.1 and Section 5.2.1). Control limits are set at mean ±2δSD.
  • Shewhart Control Charts (SCCs) are constructed using said reference values to monitor device performance (see Section 5.1.2 and Section 5.2.2). Three out of five selected elements with CV ≤ 10% (1δSD) - Zn, Sr, Rb, Sr and Y - have to fall in the given value range.
  • Relative Percentage Difference (%RD) is calculated to further evaluate accuracy (see Section 5.1.3 and Section 5.2.3). The %RD hast to be below 10% and above -10%.

1.4.3 Calibration & Normalisation

This dataset is calibrated using the coefficient correction coefcor IV (see Section 6) developed according to the Munich Procedure for this specific device using the Frankfurt pottery sample set (Schauer 2024). Major elements are normalised to a 100%.

1.5 Skript & Packages

This Quarto script (R Quarto v.1.5.55) (Allaire et al. 2024) was created using R v.4.4.1 (R Core Team 2024b) and RStudio v.2024.04.2 (RStudio Team 2024). The following R packages are used:

Before running the analyses, all packages must be loaded (see Section 2) and the working directory set (see Section 3).

The script runs without errors, provided the predefined data structure is maintained.

2 Packages

library(basictabler)
library(data.table)
library(dplyr)
library(ggalluvial)
library(ggplot2)
library(ggpubr)
library(grid)
library(htmlwidgets)
library(openxlsx)
library(plotly)
library(purrr)
library(qcc)
library(readr)
library(svglite)
library(tidyverse)
library(writexl)

3 Set working directory

knitr::opts_knit$set(root.dir = "./")

4 Precision Assessment

4.1 Measurement Uncertainty

4.1.1 Measurement Uncertainty of Measurements

# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")

# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
data <- data1 %>%
  mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>%  # Ensure < LOD is replaced with 0
  mutate(across(15:155, as.numeric)) # Ensure all values are numeric

# Define columns
variables <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "Cl", "Sc", "V", "Cr", "Co", "Ni", "Cu", "Zn", "As", "Se", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Pd", "Ag", "Cd", "Sn", "Sb", "Te", "Cs", "Ba", "La", "Ce", "Hf", "Ta", "W", "Re", "Au", "Hg", "Pb", "Bi", "Th", "U", "Bal")

# Calculate violations
for (variable in variables) {
  filter <- data[data[, variable] < 2*data[, paste0(variable, ".Error")], ]
  count <- nrow(filter)
  assign(paste0("data", variable), filter)
  assign(paste0("count", variable), count)
  print(paste0("count", variable, ": ", count))}
[1] "countSi: 0"
[1] "countTi: 0"
[1] "countAl: 0"
[1] "countFe: 0"
[1] "countMn: 0"
[1] "countMg: 29"
[1] "countCa: 0"
[1] "countK: 0"
[1] "countP: 4"
[1] "countS: 2"
[1] "countCl: 30"
[1] "countSc: 43"
[1] "countV: 0"
[1] "countCr: 0"
[1] "countCo: 250"
[1] "countNi: 0"
[1] "countCu: 6"
[1] "countZn: 1"
[1] "countAs: 36"
[1] "countSe: 276"
[1] "countRb: 0"
[1] "countSr: 0"
[1] "countY: 0"
[1] "countZr: 0"
[1] "countNb: 0"
[1] "countMo: 223"
[1] "countPd: 236"
[1] "countAg: 109"
[1] "countCd: 75"
[1] "countSn: 17"
[1] "countSb: 16"
[1] "countTe: 18"
[1] "countCs: 9"
[1] "countBa: 0"
[1] "countLa: 11"
[1] "countCe: 9"
[1] "countHf: 283"
[1] "countTa: 283"
[1] "countW: 283"
[1] "countRe: 283"
[1] "countAu: 282"
[1] "countHg: 191"
[1] "countPb: 15"
[1] "countBi: 283"
[1] "countTh: 45"
[1] "countU: 142"
[1] "countBal: 0"
# Compile list of elements with at least one but less then 20% of violations
l = list(dataMg,dataP,dataS,dataCl,dataSc,dataCu,dataZn,dataAs,dataSn,dataSb,dataTe,dataCs,dataLa,dataCe,dataPb,dataTh)
Tab1<-rbindlist(l, use.names=TRUE, fill=TRUE)

# Export to .csv
write.csv(Tab1,"../Data//Assur_Pottery_2024_SD_IndivMeas.csv",row.names=FALSE)

Interpretation: Following the data quality criteria defined in Section 1.4.1, elements showing more then 56 measurements (20% of 283 measurements) violating the criteria, should not be used - or if necessary - only with great caution. This is the case for Co, Se, Mo, Pd, Ag, Cd, Hf, Ta, W, Re, Au, Hg, Bi and U. Therefore, from the original 47 elements, 33 remain. This means each sample can’t violate the criterium for more then 6 elements (20% of 33). Individual measurements violating this bench marks are only 656 (AS085) and 691 (AS074). Therefore, the related samples have to be viewed with caution in the next steps of data quality checking.

4.1.2 Measurement Uncertainty of Samples

# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")

# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
data <- data1 %>%
  mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>%  # Ensure < LOD is replaced with 0
  mutate(across(15:155, as.numeric)) # Ensure all values are numeric

# Calculate mean per sample
Mean<-(data) %>%
  group_by(SAMPLE) %>%
  summarize(across(everything(),list(mean=mean))) 

# Delete "_mean" from the column name
colnames(Mean) <- gsub("_mean", "", colnames(Mean))

# Define dataset
data<- Mean

# Define columns
variables <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "Cl", "Sc", "V", "Cr", "Co", "Ni", "Cu", "Zn", "As", "Se", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Pd", "Ag", "Cd", "Sn", "Sb", "Te", "Cs", "Ba", "La", "Ce", "Hf", "Ta", "W", "Re", "Au", "Hg", "Pb", "Bi", "Th", "U", "Bal")

# Calculate violations
for (variable in variables) {
  filter <- data[data[, variable] < 2 * data[, paste0(variable, ".Error")], ]
  count <- nrow(filter)
  assign(paste0("data", variable), filter)
  assign(paste0("count", variable), count)
  print(paste0("count", variable, ": ", count))}
[1] "countSi: 0"
[1] "countTi: 0"
[1] "countAl: 0"
[1] "countFe: 0"
[1] "countMn: 0"
[1] "countMg: 1"
[1] "countCa: 0"
[1] "countK: 0"
[1] "countP: 0"
[1] "countS: 0"
[1] "countCl: 1"
[1] "countSc: 2"
[1] "countV: 0"
[1] "countCr: 0"
[1] "countCo: 73"
[1] "countNi: 0"
[1] "countCu: 1"
[1] "countZn: 0"
[1] "countAs: 11"
[1] "countSe: 77"
[1] "countRb: 0"
[1] "countSr: 0"
[1] "countY: 0"
[1] "countZr: 0"
[1] "countNb: 0"
[1] "countMo: 73"
[1] "countPd: 72"
[1] "countAg: 40"
[1] "countCd: 22"
[1] "countSn: 3"
[1] "countSb: 3"
[1] "countTe: 2"
[1] "countCs: 0"
[1] "countBa: 0"
[1] "countLa: 3"
[1] "countCe: 1"
[1] "countHf: 77"
[1] "countTa: 77"
[1] "countW: 77"
[1] "countRe: 77"
[1] "countAu: 77"
[1] "countHg: 56"
[1] "countPb: 9"
[1] "countBi: 77"
[1] "countTh: 17"
[1] "countU: 46"
[1] "countBal: 0"
# Compile list of elements with at least one but less then 20% of violations
l = list(dataMg,dataCl,dataSc,dataCu,dataAs,dataSn,dataSb,dataTe,dataLa,dataCe,dataPb)
Tab1<-rbindlist(l, use.names=TRUE, fill=TRUE)

# Export to .csv
write.csv(Tab1,"../Data//Assur_Pottery_2024_SD_Samples.csv",row.names=FALSE)

Interpretation: Following the data quality criteria defined in Section 1.4.1, elements showing more then 15 samples (20% of 77 samples) violating the criteria, should not be used - or if necessary - only with great caution. This is the case for Co, Se, Mo, Pd, Ag, Cd, Hf, Ta, W, Re, Au, Hg, Bi, Th and U. Therefore, from the original 47 elements, 32 remain. This means each sample can’t violate the criterium for more then 6 elements (20% of 32). No samples violate this criterium.

4.2 Coefficient of Variation

4.2.1 Create Number-Coded Spreadsheet

# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")

# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
data <- data1 %>%
  mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>%  # Ensure < LOD is replaced with 0
  mutate(across(15:155, as.numeric)) # Ensure all values are numeric

# CV function
CV <- function(x) { (sd(x) / mean(x)) * 100 }

# Create the SampleControl1 dataframe with CV for each group
SampleControl1<-(data) %>%
  group_by(SAMPLE) %>%
  summarize(across(everything(),list(CV=CV))) 

# Define columns 
Control_CV<-SampleControl1[,c("SAMPLE","Si_CV","Ti_CV","Al_CV","Fe_CV","Mn_CV","Mg_CV","Ca_CV","K_CV","P_CV","V_CV","Cr_CV","Ni_CV","Cu_CV","Zn_CV","Rb_CV","Sr_CV","Y_CV","Zr_CV","Nb_CV","Ba_CV","Pb_CV")]

# Replace missing values with "99"
Control_CV[is.na(Control_CV)] <- 99

# Replace values below 20 with 0, above 20 with 1 and define sample name as last column
Control_CV_0 <- cbind(apply(Control_CV[,2:22], 2, function(x) ifelse(x < 20, 0, x)), Control_CV[,1])
Control_CV_0_1 <- cbind(apply(Control_CV_0[,1:21], 2, function(x) ifelse(x > 20, 1, x)), Control_CV_0[,22])

# Create new data frame
Control_CV_2 <- as.data.frame(Control_CV_0_1)

# Rename last column as "SAMPLE"
Control_CV_2<-Control_CV_2 %>% rename(SAMPLE=22)

# Define all other columns as numeric
Control_CV_2[,1:21] <- sapply(Control_CV_2[,1:21],as.numeric)

# Calculate row sum excluding specific columns
Control_CV_3<-Control_CV_2%>%mutate(rows_sum=rowSums(.[, !(names(.) %in% c("SAMPLE","Mg_CV","P_CV","Cu_CV","Zn_CV","Ba_CV","Pb_CV"))]))

# Calculate column sums except for column "SAMPLE"
col_sum <- colSums(Control_CV_3[, !colnames(Control_CV_3) %in% "SAMPLE"])
col_sum <- c(col_sum,0)
Control_CV_4 <- rbind(Control_CV_3,col_sum)

# Calculate column percent
rown<-(nrow(Control_CV_4)-1)
percent<-(100/rown*col_sum)
Control_CV_5 <- rbind(Control_CV_4,percent)

# Calculate row percent
rows_sum<-Control_CV_5$rows_sum
coln<-ncol(Control_CV_5)
rowper<-(100/(coln-7)*rows_sum)
Control_CV_6 <- cbind(Control_CV_5,rowper)

# Round values function
Numform <- function(x) {if(is.numeric(x)) {return(round(x))} else {
    return(x)}}

# Create new dataframe
Control_CV_6 <- as.data.frame(lapply(Control_CV_6,Numform))

# Apply rounding function
Control_CV_6<-Control_CV_6[order(Control_CV_6$rowper, decreasing = TRUE),]

# Add column sum and percent information in specific cell
text<-"Column sum"
Control_CV_6[78,22]<-text

text<-"Column percent"
Control_CV_6[79,22]<-text

# Export to .csv
write.csv(Control_CV_6,"../Data//Assur_Pottery_2024_CV_numcode.csv",row.names=FALSE)

Interpretation: Following the data control, the elements Si, Ti, Al, Fe, Mn, Ca, K, V, Cr, Ni, Rb, Sr, Y, Zr and Nb are adhering to the quality criteria defined in Section 1.4.1. The same is true for all samples.

4.2.2 Create Color-Coded Spreadsheet

# Create new spreadsheet
tbl <- BasicTable$new()

# Add data from Control_CV
tbl$addData(Control_CV, firstColumnAsRowHeaders=TRUE)

# Define Cells
cells <- tbl$getCells(rowNumbers=2:77, columnNumbers=2:22, matchMode="combinations")

# Apply the styling
tbl$mapStyling(cells=cells, styleProperty="background-color", valueType="color",mapType="logic",mappings=list("v>20", "red", "v<=20", "black"))

# Render the table with the applied styles
tbl$renderTable()
# Save the table to an Excel workbook
wb <- createWorkbook(creator = Sys.getenv("M. Schauer"))
addWorksheet(wb, "Data")
tbl$writeToExcelWorksheet(wb=wb, wsName="Data",topRowNumber=1, leftMostColumnNumber=1, applyStyles=TRUE)

# Export to .xlsx
saveWorkbook(wb, file="../Data//Assur_Pottery_2024_CV_colcode.xlsx", overwrite = TRUE)

Comment: This spreadsheet is important during data interpretation to understand outlieres. If those can be defined, it is worth comming back to this spreadsheet to check whether the distinguishing elements for the specific sample are adhering to the quality criteria defined in Section 1.4.1. If not, the spread has to be checked to identify if the outlier is due to its specific chemistry or a too high variation in the measurements which might have been caused by high heterogeneity, roughness etc. of the sample.

5 Accuracy Assessment

5.1 Standard 2709a Accuracy

This procedure follows the steps outlined in Section 1.4.2.

5.1.1 Calculate reference value (nominal value)

5.1.1.1 Compute metrics of reference values

The name of the .csv-spreadsheet includes the purpose the measurements (reference value - ref_val_meas) were made for, the standard they are based on (2709a), month and year of measurement (Mai 2023 - mai2023) as well as the location (Munich): ref_val_2709a_mai2023_munich.csv. This makes the file easy to identify for future use.

# Load data
data1 <- read.csv("../Data//ref_val_meas_2709a_mai2023_munich.csv")

# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
  mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
  mutate(across(everything(), as.numeric))  # Ensure all values are numeric

# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]

# Replace '.' with '_' in column names
data <- data %>%
  rename_with(~ gsub("\\.", "_", .x))

# Define statistical functions
CV <- function(x) (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
SEM <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
UCL <- function(x) mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
LCL <- function(x) mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
CV2 <- function(x) (2*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL2 <- function(x) mean(x, na.rm = TRUE) + 2*sd(x, na.rm = TRUE)
LCL2 <- function(x) mean(x, na.rm = TRUE) - 2*sd(x, na.rm = TRUE)
CV3 <- function(x) (3*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL3 <- function(x) mean(x, na.rm = TRUE) + 3*sd(x, na.rm = TRUE)
LCL3 <- function(x) mean(x, na.rm = TRUE) - 3*sd(x, na.rm = TRUE)

# Compute summary statistics
NominalSdARM2 <- data %>%
  summarize(across(everything(), list(
    Mean = ~ mean(.x, na.rm = TRUE),
    Sd = ~ sd(.x, na.rm = TRUE),
    SEM = ~ SEM(.x),
    CV = ~ CV(.x),
    LCL = ~ LCL(.x),
    UCL = ~ UCL(.x),
    CV2 = ~ CV2(.x),
    LCL2 = ~ LCL2(.x),
    UCL2 = ~ UCL2(.x),
    CV3 = ~ CV3(.x),
    LCL3 = ~ LCL3(.x),
    UCL3 = ~ UCL3(.x),
    MD = ~ median(.x, na.rm = TRUE)
  )))

# Define columns
Element <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
             "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
             "Pb", "Th", "U")

# Function to clean column names
cleanColumnNames <- function(df, prefix) {
  colnames(df) <- gsub(paste0(prefix, "_"), "", colnames(df))
  return(df)
}

# Function to extract and clean data for each element
extract_element_data <- function(element, df) {
  cols <- grep(paste0("^", element, "_(Mean|Sd|SEM|CV|LCL|UCL|CV2|LCL2|UCL2|CV3|LCL3|UCL3|MD|Error_Mean|Error_Sd|Error_MD)"), 
               colnames(df), value = TRUE)
  
  if (length(cols) > 0) {
    df_out <- df %>%
      select(all_of(cols)) %>%
      cleanColumnNames(element)
    
    # Add the element name as the first column
    df_out <- cbind(Element = element, df_out)
    return(df_out)
  } else {
    return(data.frame(Element = element, matrix(NA, nrow = 1, ncol = length(cols))))
  }
}

# Process all elements & create final table
Tab3 <- map_dfr(Element, extract_element_data, df = NominalSdARM2)
5.1.1.2 Create formated spreadsheet with all metrics
# Replace "Error" with "MU" in column names
colnames(Tab3) <- gsub("Error", "MU", colnames(Tab3))

# Rounding of values
numeric_cols <- c("Mean", "Sd", "SEM", "CV", "UCL", "LCL","CV2", "UCL2", "LCL2", "CV3", 
                  "UCL3", "LCL3","MD", "MU_Mean", "MU_Sd", "MU_MD")

for (col in numeric_cols) {
  if (col %in% colnames(Tab3)) {
    Tab3[[col]] <- round(Tab3[[col]], 
                         ifelse(col %in% c("CV", "CV2", "CV3"), 2, 
                                ifelse(col %in% c("UCL", "LCL", "UCL2", "LCL2", "UCL3", "LCL3"), 0, 1)))
  }
}

# Print table
print(Tab3)
   Element     Mean     Sd    SEM    CV    LCL    UCL    CV2   LCL2   UCL2
1       Si 295284.0 3205.2  966.4  1.09 292079 298489   2.17 288874 301694
2       Ti   3730.5   34.9   10.5  0.93   3696   3765   1.87   3661   3800
3       Al  87220.5 2338.5  705.1  2.68  84882  89559   5.36  82544  91898
4       Fe  34286.9   99.7   30.1  0.29  34187  34387   0.58  34088  34486
5       Mn    492.0   14.9    4.5  3.03    477    507   6.07    462    522
6       Mg  25548.6 4147.1 1250.4 16.23  21402  29696  32.46  17254  33843
7       Ca  19666.9   88.2   26.6  0.45  19579  19755   0.90  19490  19843
8        K  19497.5  167.1   50.4  0.86  19330  19665   1.71  19163  19832
9        P    384.9   32.4    9.8  8.42    353    417  16.85    320    450
10       S    441.2    9.7    2.9  2.21    431    451   4.42    422    461
11       V    137.6    8.8    2.6  6.37    129    146  12.73    120    155
12      Cr    107.6    4.8    1.4  4.42    103    112   8.84     98    117
13      Ni     90.2    4.9    1.5  5.43     85     95  10.86     80    100
14      Cu     36.3    2.9    0.9  8.06     33     39  16.12     30     42
15      Zn     88.3    2.1    0.6  2.42     86     90   4.83     84     93
16      As     10.0    0.7    0.2  7.10      9     11  14.21      9     11
17      Rb     85.8    1.6    0.5  1.89     84     87   3.79     83     89
18      Sr    223.2    1.6    0.5  0.74    222    225   1.47    220    226
19       Y     15.3    0.6    0.2  3.83     15     16   7.67     14     16
20      Zr    141.0    1.2    0.3  0.82    140    142   1.64    139    143
21      Nb      7.6    0.6    0.2  7.85      7      8  15.70      6      9
22      Mo      1.3    1.2    0.4 96.87      0      3 193.73     -1      4
23      Ba    733.1   18.5    5.6  2.53    715    752   5.06    696    770
24      Pb     16.3    1.1    0.3  6.81     15     17  13.62     14     19
25      Th      3.2    0.3    0.1  8.10      3      3  16.20      3      4
26       U      5.9    1.1    0.3 18.27      5      7  36.55      4      8
      CV3   LCL3   UCL3       MD MU_Mean MU_Sd  MU_MD
1    3.26 285668 304900 295988.7  1035.9   7.3 1036.5
2    2.80   3626   3835   3732.6    64.3   0.5   64.3
3    8.04  80205  94236  87879.1  1493.9  42.2 1492.1
4    0.87  33988  34586  34293.2   213.0   1.4  212.8
5    9.10    447    537    499.3    36.2   0.3   36.3
6   48.70  13107  37990  24942.2  3954.6 100.1 3958.3
7    1.35  19402  19932  19691.9   180.4   1.0  180.1
8    2.57  18996  19999  19429.5   253.5   2.2  253.3
9   25.27    288    482    396.5    67.8   0.8   67.6
10   6.62    412    470    439.8    16.6   0.2   16.5
11  19.10    111    164    136.7    19.1   0.2   19.0
12  13.26     93    122    107.3     9.4   0.1    9.4
13  16.29     76    105     91.0    14.2   0.1   14.2
14  24.18     28     45     36.7     7.6   0.1    7.6
15   7.25     82     95     88.1     5.9   0.1    5.9
16  21.31      8     12     10.1     2.3   0.0    2.3
17   5.68     81     91     85.5     2.4   0.0    2.4
18   2.21    218    228    223.0     3.1   0.0    3.1
19  11.50     14     17     15.4     1.5   0.0    1.5
20   2.46    138    144    141.3     2.8   0.0    2.8
21  23.55      6      9      7.5     1.6   0.0    1.6
22 290.60     -2      5      2.1     1.6   0.3    1.3
23   7.59    677    789    735.3    19.2   0.1   19.2
24  20.44     13     20     16.7     2.8   0.0    2.8
25  24.30      2      4      3.2     0.7   0.0    0.7
26  54.82      3      9      5.4     2.3   0.0    2.3
# Export to .csv
write.csv(Tab3, "../Data//metrics_ref_val_2709a_mai2023_munich.csv", row.names = FALSE)
5.1.1.3 Export reference values for daily device accuracy testing

This selection only includes the relevant elements and the values needed to define the test range in which the measurement values should lie: Element name with UCL and LCL of the 1δSD-range and UCL2/LCL2 of the 2δSD-range. You can choose the elements to export by adding them in the code for selected_elements.

# Select only necessary columns
dataset1 <- Tab3 %>% select(Element, LCL,UCL,LCL2,UCL2,LCL3,UCL3)

# Filter for the specific elements dynamically
selected_elements <- c("Rb", "Sr", "Y", "Zn", "Zr")
dataset <- dataset1 %>% filter(Element %in% selected_elements)

# Export to .csv
write.csv(dataset, "../Data//test_range_ref_val_2709a_mai2023_munich.csv", row.names = FALSE)

Comment: This spreadsheet can be used to compare the mean of three standard measurements, as calculated by the device and displayed on the screen, with the test range of the selected elements.Three out of five elements have to be in the 2δSD-range for the test for accuracy to be successful.

5.1.1.4 Create spreadsheet for Shewhart Control Charts
# Transpose the data
Tab4 <- as.data.frame(t(Tab3))

# Store the original row names (before transposing)
row_names <- rownames(Tab3)

# Store the original column names (before transposing)
col_names <- colnames(Tab3)

# Assign the original row names as column names of the transposed data
colnames(Tab4) <- row_names

# Assign the original column names as row names of the transposed data
rownames(Tab4) <- col_names

# Set the first row as the column names (this will use the values of the first row as column headers)
colnames(Tab4) <- Tab4[1, ]

# Remove the first row (since it is now used as the column names)
Tab4 <- Tab4[-1, ]

# # Export to .csv
write.csv(Tab4, "../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv", row.names = TRUE)

5.1.2 Compute Shewhart Control Charts

# Load and filter data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "2709a")

# Transform date field
data1 <- filtered_data |>
  mutate(
    Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"),  # 24-hour format
    Time = format(Time, "%m/%d/%Y")  # Output as mm/dd/yyyy
  )

# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
  mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
  mutate(across(everything(), as.numeric)) 

# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]

# Load data
data1 <- read.csv("../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv")

# Create PDF for all SCC plots
pdf("../Figures//SCC_2709a_accuracy_plots.pdf", width = 8, height = 6)

# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
             "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
             "Pb", "Th", "U")

for (col in selected_columns) {
  
  # Extract the data for the current column
  chartdata <- data[, col]
  
  # Dynamically get metrics for current column from data1
  CL <- data1[1, col]        
  SD <- data1[2, col]       
  LCL <- data1[5, col]
  UCL <- data1[6, col]
  LCL2 <- data1[8, col]
  UCL2 <- data1[9, col]
  LCL3 <- data1[11, col]
  UCL3 <- data1[12, col]
  
  # Round to 0 decimal places
  CL <- round(CL, 0)
  SD <- round(SD, 0)
  LCL <- round(LCL, 0)
  UCL <- round(UCL, 0)
  LCL2 <- round(LCL2, 0)
  UCL2 <- round(UCL2, 0)
  LCL3 <- round(LCL3, 0)
  UCL3 <- round(UCL3, 0)

  # Create the control chart
  qcc_result <- qcc(
    data = chartdata, 
    type = "xbar", 
    nsigmas = 1,  
    std.dev = SD,  
    center = CL,
    rules = NULL,
    plot = FALSE  
  )

  # Create the plot for the current column
    plot(qcc_result, restore.par = FALSE, 
       title = paste("SCC for", col),
       ylab = "Concentration (ppm)", 
       xlab = "Values of 2709a throughout the 2023 measurement campaign")

  # Add 3-Sigma control limits
  abline(h = c(UCL3, LCL3), lty = 3, col = "black")  
  text(x = length(chartdata) * 1, y = UCL3, labels = "UCL3", pos = 3, col = "black", cex = 0.8)
  text(x = length(chartdata) * 1, y = LCL3, labels = "LCL3", pos = 1, col = "black", cex = 0.8)

  # Add 2-Sigma control limits
  abline(h = c(UCL2, LCL2), lty = 3, col = "black")
  text(x = length(chartdata) * 1, y = UCL2, labels = "UCL2", pos = 3, col = "black", cex = 0.8)
  text(x = length(chartdata) * 1, y = LCL2, labels = "LCL2", pos = 1, col = "black", cex = 0.8)

  # Highlight points based on thresholds
  below_1sigma <- chartdata < UCL | chartdata > LCL
  below_2sigma <- chartdata > UCL | chartdata < LCL
  above_3sigma <- chartdata > UCL3 | chartdata < LCL3
  above_2sigma <- (chartdata > UCL2 & chartdata <= UCL3) | (chartdata < LCL2 & chartdata >= LCL3)

    # Overlay red points for values in 1-sigma
  points(which(below_1sigma), chartdata[below_1sigma], col = "lightgreen", pch = 19, cex = 1.2)
  
  # Overlay yellow points for values between 1-sigma and 2-sigma
  points(which(below_2sigma), chartdata[below_2sigma], col = "lightblue", pch = 19, cex = 1.2)

  # Overlay yellow points for values between 2-sigma and 3-sigma
  points(which(above_2sigma), chartdata[above_2sigma], col = "orange", pch = 19, cex = 1.2)
  
  # Overlay red points for values beyond 3-sigma
  points(which(above_3sigma), chartdata[above_3sigma], col = "red", pch = 19, cex = 1.2)
  
  }

# Close the PDF file
dev.off()

Shewhart Control Charts

Interpretation: Skimming through the SCCs of the same standard throughout the 2023 measurement campaign reveals that the measurement values do vary. Many values fall within the ±1δSD range (green), with some extending into the ±1δSD to ±2δSD range (blue) but also ±2δSD to ±3δSD range (orange). Especially for the light elements, some major drifts and values beyond the quality criterium are visible, making it clear that all interpretation has to be checked in detail with other information such as petrography and archaeological background information (see also Schauer and Amicone 2024).

5.1.3 Compute Relative Percentage Difference (%RD)

5.1.3.1 Calculate %RD
# Load and filter data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "2709a")

# Transform date field
data1 <- filtered_data |>
  mutate(
    Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"),  # 24-hour format
    Time = format(Time, "%m/%d/%Y")  # Output as mm/dd/yyyy
  )

# Replace '< LOD' with 0 in each column
data2 <- data1[, -c(1:14)] %>%
  mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
  mutate(across(everything(), as.numeric)) 

# Merge data
data3<-cbind(data1[, c(1:14)],data2)

# Calculate mean
mean_data <- data3 %>%
  group_by(Time) %>%
  summarize(across(everything(), mean, na.rm = TRUE))

# Remove first 16 columns (no elemental data)
data4 <- mean_data[, -c(1:14)]

# Merge data
data<-cbind(mean_data[, c(1)],data4)

# Load second dataset
data1 <- read.csv("../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv")

# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
                      "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
                      "Pb", "Th", "U")

# Loop over each selected column
for (col in selected_columns) {
  
  # Extract the data for the current column by name
  mean_val <- data[, col]  # This extracts the column by name (which is a string)
  
  # Dynamically get the reference value for the current column from data1 (assuming row 1 contains reference values)
  CL <- data1[1, col]  # Reference value for the column from data1
  
  # Define the percentage difference function (element-wise operation)
  percentage_difference <- function(x) {
    return(((x - CL) / CL) * 100)  # Apply the percentage difference formula
  }
  
  # Apply the percentage difference function to each element in mean_val
  percentage_diff <- sapply(mean_val, percentage_difference)
  
  # Optionally, store the result in a new column in the data
  data[, paste(col, "percentage_diff", sep = "_")] <- percentage_diff
}

# Select necessary columns, round values (except 'Time'), and remove '_percentage_diff' from column names
data_RD <- data[, c(1, 143:168)] %>%
  mutate(across(-which(names(.) == "Time"), round, 0)) %>%  # Round all columns except 'Date'
  setNames(gsub("_percentage_diff", "", colnames(.))) %>%     # Remove '_percentage_diff' from column names
  setNames(gsub("^Time$", "%RD", colnames(.)))  # Replace "Time" with "%RD"
5.1.3.2 Format %RD-spreadsheet
# Create new spreadsheet
tbl <- BasicTable$new()
tbl$addData(data_RD, firstColumnAsRowHeaders=TRUE)

# Get the number of rows and columns in the dataset
num_rows <- nrow(data_RD)
num_columns <- ncol(data_RD)

# Find the index of the column with the header "%RD"
rd_column_index <- which(names(data_RD) == "%RD")

# Define the cells for which the styling needs to be applied (excluding the '%RD' column)
column_indices_to_style <- setdiff(2:num_columns, rd_column_index)

# Ensure the last row is included in styling
cells <- tbl$getCells(
  rowNumbers = 2:(num_rows + 1),  
  columnNumbers = column_indices_to_style,
  matchMode = "combinations"
)

# Apply the styling
tbl$mapStyling(
  cells = cells,
  styleProperty = "background-color",
  valueType = "color",
  mapType = "logic",
  mappings = list(
    "v >= -3 & v <= 3", "lightgreen",   # Values between -3 and 3
    "v > -7 & v < -3", "lightblue",     # Values between -7 and -3
    "v > 3 & v <= 7", "lightblue",      # Values between 3 and 7
    "v > 7 & v <= 10", "yellow",        # Values between 7 and 10
    "v > -10 & v <= -7", "yellow",      # Values between -10 and -7
    "v >= 10", "firebrick",              # Values greater than 10
    "v <= -10", "firebrick"            # Values less than -10
  )
)

# Render the table with the applied styles
tbl$renderTable()
# Save the table to an Excel workbook
wb <- createWorkbook(creator = Sys.getenv("M. Schauer"))
addWorksheet(wb, "Data")
tbl$writeToExcelWorksheet(wb = wb, wsName = "Data", topRowNumber = 1, leftMostColumnNumber = 1, applyStyles = TRUE)

# Export to .xlsx
saveWorkbook(wb, file = "../Data//relRD_2709a_mai2023_munich.xlsx", overwrite = TRUE)

Interpretation: The %RD demonstrates excellent accuracy (between -3 and +3) for the elements Si, Fe, Ca, K, Zn, Rb and Sr, with some elements — Ti, Al, S, Cr, Ni, Nb, Ba and Pb — also including highly accurate %RD values (3 to 7, -3 to -7). Mn, V, As and Zr show some %RD values of good accuracy (7 to 10, -7 to -10), whereas for Mg, P, Cu, Y, Mo, Th and U, the %RD values are not accurate enough (higher than 10, lower than -10). Therefore, these latter elements are not measured with sufficient accuracy and should either not be used or, if used, be handled with great care in further analysis.

5.1.3.3 Create figures of %RD
# Load data
data1 <- read.xlsx("../Data//relRD_2709a_mai2023_munich.xlsx")

# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
                      "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
                      "Pb", "Th", "U")

# Create a PDF file to save all plots
pdf("../Figures//relRD_2709a_mai2023_munich_plots.pdf", width = 10, height = 6)

# Loop through each selected column and create plots
for (col in selected_columns) {
  
  # Extract data for plotting
  plot_data <- data.frame(
    Day = as.factor(data1$`%RD`),  # Treat measurement days as categorical on X-axis
    Value = data1[[col]]
  )
  
  # Create scatter plot with lines connecting measurement days
  p <- ggplot(plot_data, aes(x = Day, y = Value, group = 1)) +
    geom_point() +  
    geom_line() +  # Connect points across measurement days
    geom_hline(yintercept = 0, linetype = "dashed", color = "red") +  # Reference line
    labs(title = paste("%RD-Plot for", col), x = "Measurement Day", y = col) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))  # Tilt x-axis labels

  # Display the plot in R
  print(p)
}

# Close the PDF device
dev.off()

%RD plots

5.2 Standard SdAR-M2 Accuracy

This procedure follows the steps outlined in Section 1.4.2.

5.2.1 Calculate reference value (nominal value)

5.2.1.1 Compute metrics of reference values

The name of the .csv-spreadsheet includes the purpose the measurements (reference value - ref_val_meas) were made for, the standard they are based on (SdAR-M2), month and year of measurement (November 2024 - nov2024) as well as the location (Munich): ref_val_SdAR-M2_nov2024_munich.csv. This makes the file easy to identify for future use.

# Load data
data1 <- read.csv("../Data//ref_val_meas_SdAR-M2_nov2024_munich.csv")

# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
  mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
  mutate(across(everything(), as.numeric))  # Ensure all values are numeric

# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]

# Replace '.' with '_' in column names
data <- data %>%
  rename_with(~ gsub("\\.", "_", .x))

# Define statistical functions
CV <- function(x) (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
SEM <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
UCL <- function(x) mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
LCL <- function(x) mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
CV2 <- function(x) (2*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL2 <- function(x) mean(x, na.rm = TRUE) + 2*sd(x, na.rm = TRUE)
LCL2 <- function(x) mean(x, na.rm = TRUE) - 2*sd(x, na.rm = TRUE)
CV3 <- function(x) (3*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL3 <- function(x) mean(x, na.rm = TRUE) + 3*sd(x, na.rm = TRUE)
LCL3 <- function(x) mean(x, na.rm = TRUE) - 3*sd(x, na.rm = TRUE)

# Compute summary statistics
NominalSdARM2 <- data %>%
  summarize(across(everything(), list(
    Mean = ~ mean(.x, na.rm = TRUE),
    Sd = ~ sd(.x, na.rm = TRUE),
    SEM = ~ SEM(.x),
    CV = ~ CV(.x),
    LCL = ~ LCL(.x),
    UCL = ~ UCL(.x),
    CV2 = ~ CV2(.x),
    LCL2 = ~ LCL2(.x),
    UCL2 = ~ UCL2(.x),
    CV3 = ~ CV3(.x),
    LCL3 = ~ LCL3(.x),
    UCL3 = ~ UCL3(.x),
    MD = ~ median(.x, na.rm = TRUE)
  )))

# Define columns
Element <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
             "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
             "Pb", "Th", "U")

# Function to clean column names
cleanColumnNames <- function(df, prefix) {
  colnames(df) <- gsub(paste0(prefix, "_"), "", colnames(df))
  return(df)
}

# Function to extract and clean data for each element
extract_element_data <- function(element, df) {
  cols <- grep(paste0("^", element, "_(Mean|Sd|SEM|CV|LCL|UCL|CV2|LCL2|UCL2|CV3|LCL3|UCL3|MD|Error_Mean|Error_Sd|Error_MD)"), 
               colnames(df), value = TRUE)
  
  if (length(cols) > 0) {
    df_out <- df %>%
      select(all_of(cols)) %>%
      cleanColumnNames(element)
    
    # Add the element name as the first column
    df_out <- cbind(Element = element, df_out)
    return(df_out)
  } else {
    return(data.frame(Element = element, matrix(NA, nrow = 1, ncol = length(cols))))
  }
}

# Process all elements & create final table
Tab3 <- map_dfr(Element, extract_element_data, df = NominalSdARM2)
5.2.1.2 Create formated spreadsheet with all metrics
# Replace "Error" with "MU" in column names
colnames(Tab3) <- gsub("Error", "MU", colnames(Tab3))

# Rounding of values
numeric_cols <- c("Mean", "Sd", "SEM", "CV", "UCL", "LCL","CV2", "UCL2", "LCL2", "CV3", 
                  "UCL3", "LCL3","MD", "MU_Mean", "MU_Sd", "MU_MD")

for (col in numeric_cols) {
  if (col %in% colnames(Tab3)) {
    Tab3[[col]] <- round(Tab3[[col]], 
                         ifelse(col %in% c("CV", "CV2", "CV3"), 2, 
                                ifelse(col %in% c("UCL", "LCL", "UCL2", "LCL2", "UCL3", "LCL3"), 0, 1)))
  }
}

# Print table
print(Tab3)
   Element     Mean     Sd    SEM     CV    LCL    UCL    CV2   LCL2   UCL2
1       Si 377487.7 7948.9 2513.7   2.11 369539 385437   4.21 361590 393385
2       Ti   1578.7   43.7   13.8   2.77   1535   1622   5.54   1491   1666
3       Al  75023.3 3821.6 1208.5   5.09  71202  78845  10.19  67380  82666
4       Fe  17581.0  171.4   54.2   0.98  17410  17752   1.95  17238  17924
5       Mn    847.6   17.1    5.4   2.01    831    865   4.02    813    882
6       Mg   3280.3 4261.1 1347.5 129.90   -981   7541 259.80  -5242  11803
7       Ca   5233.6  134.4   42.5   2.57   5099   5368   5.14   4965   5502
8        K  41940.1  204.7   64.7   0.49  41735  42145   0.98  41531  42349
9        P    242.7   94.4   29.8  38.88    148    337  77.75     54    431
10       S    783.4   44.3   14.0   5.65    739    828  11.30    695    872
11       V     50.7    5.1    1.6  10.06     46     56  20.11     41     61
12      Cr     49.2    7.2    2.3  14.60     42     56  29.19     35     64
13      Ni     81.3   10.1    3.2  12.40     71     91  24.80     61    101
14      Cu    213.2    4.7    1.5   2.19    209    218   4.39    204    223
15      Zn    699.6    7.1    2.2   1.01    692    707   2.03    685    714
16      As     80.6    4.8    1.5   5.95     76     85  11.90     71     90
17      Rb    143.9    1.3    0.4   0.87    143    145   1.74    141    146
18      Sr    146.2    1.2    0.4   0.84    145    147   1.68    144    149
19       Y     30.8    0.9    0.3   3.07     30     32   6.14     29     33
20      Zr    251.0    5.0    1.6   1.99    246    256   3.98    241    261
21      Nb     25.2    0.5    0.2   1.99     25     26   3.98     24     26
22      Mo     13.8    1.0    0.3   7.16     13     15  14.31     12     16
23      Ba    783.6   14.6    4.6   1.87    769    798   3.74    754    813
24      Pb    850.9    7.3    2.3   0.86    844    858   1.72    836    865
25      Th      6.7    0.5    0.2   7.26      6      7  14.51      6      8
26       U      7.4    1.8    0.6  24.73      6      9  49.46      4     11
      CV3   LCL3   UCL3       MD MU_Mean  MU_Sd  MU_MD
1    6.32 353641 401334 375499.0  1233.5   30.8 1235.4
2    8.31   1448   1710   1569.1    42.3    0.8   42.3
3   15.28  63559  86488  74631.7  1406.4  152.6 1377.2
4    2.93  17067  18095  17564.8   152.5    1.0  152.8
5    6.03    796    899    847.9    40.9    0.3   40.9
6  389.70  -9503  16064      0.0  5217.6 1617.8 5270.5
7    7.70   4830   5637   5174.2    83.6    1.4   83.8
8    1.46  41326  42554  41992.7   340.9    1.8  341.3
9  116.63    -40    526    209.9    78.3    1.8   78.3
10  16.95    651    916    776.4    19.4    0.4   19.4
11  30.17     35     66     51.8    13.1    0.2   13.2
12  43.79     28     71     47.7     6.9    0.1    6.9
13  37.19     51    111     80.6    14.1    0.1   14.0
14   6.58    199    227    213.5    10.8    0.1   10.8
15   3.04    678    721    698.7    13.7    0.1   13.7
16  17.84     66     95     79.9     9.9    0.0    9.9
17   2.61    140    148    144.1     3.1    0.0    3.1
18   2.52    142    150    145.9     2.6    0.0    2.6
19   9.21     28     34     30.8     2.3    0.0    2.3
20   5.96    236    266    250.8     3.4    0.0    3.4
21   5.97     24     27     25.1     2.1    0.0    2.1
22  21.47     11     17     13.8     1.5    0.0    1.5
23   5.60    740    828    786.0    19.3    0.1   19.3
24   2.57    829    873    853.3    13.1    0.1   13.1
25  21.77      5      8      6.7     1.2    0.0    1.2
26  74.18      2     13      7.6     2.7    0.0    2.7
# Save final table with all reference value metrics
write.csv(Tab3, "../Data//metrics_ref_val_SdAR-M2_nov2024_munich.csv", row.names = FALSE)
5.2.1.3 Export reference values for daily device accuracy testing

This selection only includes the relevant elements and the values needed to define the test range in which the measurement values should lie: Element name with UCL and LCL of the 1δSD-range and UCL2/LCL2 of the 2δSD-range. You can choose the elements to export by adding them in the code for selected_elements.

# Select only necessary columns
dataset1 <- Tab3 %>% select(Element, LCL,UCL,LCL2,UCL2,LCL3,UCL3)

# Filter for the specific elements dynamically
selected_elements <- c("Rb", "Sr", "Y", "Zn", "Zr")
dataset <- dataset1 %>% filter(Element %in% selected_elements)

# Export to .csv
write.csv(dataset, "../Data//test_range_ref_val_SdAR-M2_nov2024_munich.csv", row.names = FALSE)

Comment: This spreadsheet can be used to compare the mean of three standard measurements, as calculated by the device and displayed on the screen, with the test range of the selected elements.Three out of five elements have to be in the 2δSD-range for the test for accuracy to be successful.

5.2.1.4 Create spreadsheet for Shewhart Control Charts
# Transpose the data
Tab4 <- as.data.frame(t(Tab3))

# Store the original row names (before transposing)
row_names <- rownames(Tab3)

# Store the original column names (before transposing)
col_names <- colnames(Tab3)

# Assign the original row names as column names of the transposed data
colnames(Tab4) <- row_names

# Assign the original column names as row names of the transposed data
rownames(Tab4) <- col_names

# Set the first row as the column names (this will use the values of the first row as column headers)
colnames(Tab4) <- Tab4[1, ]

# Remove the first row (since it is now used as the column names)
Tab4 <- Tab4[-1, ]

# # Export to .csv
write.csv(Tab4, "../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv", row.names = TRUE)

5.2.2 Compute Shewhart Control Charts

# Load and filter data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "SdAR-M2")


data1 <- filtered_data |>
  mutate(
    Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"),  # 24-hour format
    Time = format(Time, "%m/%d/%Y")  # Output as mm/dd/yyyy
  )

# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
  mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
  mutate(across(everything(), as.numeric)) 

# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]

# Load data
data1 <- read.csv("../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv")

# Create PDF for all SCC plots
pdf("../Figures//SCC_SdAR-M2_accuracy_plots.pdf", width = 8, height = 6)

# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
             "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
             "Pb", "Th", "U")

for (col in selected_columns) {
  
  # Extract the data for the current column
  chartdata <- data[, col]
  
  # Dynamically get metrics for current column from data1
  CL <- data1[1, col]        
  SD <- data1[2, col]       
  LCL <- data1[5, col]
  UCL <- data1[6, col]
  LCL2 <- data1[8, col]
  UCL2 <- data1[9, col]
  LCL3 <- data1[11, col]
  UCL3 <- data1[12, col]
  
  # Round to 0 decimal places
  CL <- round(CL, 0)
  SD <- round(SD, 0)
  LCL <- round(LCL, 0)
  UCL <- round(UCL, 0)
  LCL2 <- round(LCL2, 0)
  UCL2 <- round(UCL2, 0)
  LCL3 <- round(LCL3, 0)
  UCL3 <- round(UCL3, 0)

  # Create the control chart
  qcc_result <- qcc(
    data = chartdata, 
    type = "xbar", 
    nsigmas = 1,  
    std.dev = SD,  
    center = CL,
    rules = NULL,
    plot = FALSE  
  )

  # Create the plot for the current column
    plot(qcc_result, restore.par = FALSE, 
       title = paste("SCC for", col),
       ylab = "Concentration (ppm)", 
       xlab = "Values of SdAR-M2 throughout the 2024 measurement campaign")

  # Add 3-Sigma control limits
  abline(h = c(UCL3, LCL3), lty = 3, col = "black")  
  text(x = length(chartdata) * 1, y = UCL3, labels = "UCL3", pos = 3, col = "black", cex = 0.8)
  text(x = length(chartdata) * 1, y = LCL3, labels = "LCL3", pos = 1, col = "black", cex = 0.8)

  # Add 2-Sigma control limits
  abline(h = c(UCL2, LCL2), lty = 3, col = "black")
  text(x = length(chartdata) * 1, y = UCL2, labels = "UCL2", pos = 3, col = "black", cex = 0.8)
  text(x = length(chartdata) * 1, y = LCL2, labels = "LCL2", pos = 1, col = "black", cex = 0.8)

  # Highlight points based on thresholds
  below_1sigma <- chartdata < UCL | chartdata > LCL
  below_2sigma <- chartdata > UCL | chartdata < LCL
  above_3sigma <- chartdata > UCL3 | chartdata < LCL3
  above_2sigma <- (chartdata > UCL2 & chartdata <= UCL3) | (chartdata < LCL2 & chartdata >= LCL3)

    # Overlay red points for values in 1-sigma
  points(which(below_1sigma), chartdata[below_1sigma], col = "lightgreen", pch = 19, cex = 1.2)
  
  # Overlay yellow points for values between 1-sigma and 2-sigma
  points(which(below_2sigma), chartdata[below_2sigma], col = "lightblue", pch = 19, cex = 1.2)

  # Overlay yellow points for values between 2-sigma and 3-sigma
  points(which(above_2sigma), chartdata[above_2sigma], col = "orange", pch = 19, cex = 1.2)
  
  # Overlay red points for values beyond 3-sigma
  points(which(above_3sigma), chartdata[above_3sigma], col = "red", pch = 19, cex = 1.2)
  
  }

# Close the PDF file
dev.off()

Shewhart Control Charts

Interpretation: Skimming through the SCCs of the same standard throughout the 2024 measurement campaign reveals quite taccurate results for all elements. Many values fall within the ±1δSD range (green), with some extending into the ±1δSD to ±2δSD range (blue) but more rarely to the ±2δSD to ±3δSD range (orange). Especially for the light elements, some drifts and values beyond the quality criterium are visible, making it clear that all interpretation has to be checked in detail with other information such as petrography and archaeological background information (see also Schauer and Amicone 2024).

5.2.3 Compute Relative Percentage Difference (%RD)

5.2.3.1 Calculate %RD
# Load data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")

filtered_data <- subset(data, SAMPLE == "SdAR-M2")

# Transform date field
data1 <- filtered_data |>
  mutate(
    Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"),  # 24-hour format
    Time = format(Time, "%m/%d/%Y")  # Output as mm/dd/yyyy
  )

# Replace '< LOD' with 0 in each column
data2 <- data1[, -c(1:14)] %>%
  mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
  mutate(across(everything(), as.numeric)) 

# Merge data
data3<-cbind(data1[, c(1:14)],data2)

# Calculate mean
mean_data <- data3 %>%
  group_by(Time) %>%
  summarize(across(everything(), mean, na.rm = TRUE))

# Remove first 16 columns (no elemental data)
data4 <- mean_data[, -c(1:14)]

# Merge data
data<-cbind(mean_data[, c(1)],data4)

# Load second dataset
data1 <- read.csv("../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv")

# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
                      "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
                      "Pb", "Th", "U")

# Loop over each selected column
for (col in selected_columns) {
  
  # Extract the data for the current column by name
  mean_val <- data[, col]  # This extracts the column by name (which is a string)
  
  # Dynamically get the reference value for the current column from data1 (assuming row 1 contains reference values)
  CL <- data1[1, col]  # Reference value for the column from data1
  
  # Define the percentage difference function (element-wise operation)
  percentage_difference <- function(x) {
    return(((x - CL) / CL) * 100)  # Apply the percentage difference formula
  }
  
  # Apply the percentage difference function to each element in mean_val
  percentage_diff <- sapply(mean_val, percentage_difference)
  
  # Optionally, store the result in a new column in the data
  data[, paste(col, "percentage_diff", sep = "_")] <- percentage_diff
}

# Select necessary columns, round values (except 'Time'), and remove '_percentage_diff' from column names
data_RD <- data[, c(1, 143:168)] %>%
  mutate(across(-which(names(.) == "Time"), round, 0)) %>%  # Round all columns except 'Date'
  setNames(gsub("_percentage_diff", "", colnames(.))) %>%     # Remove '_percentage_diff' from column names
  setNames(gsub("^Time$", "%RD", colnames(.)))  # Replace "Time" with "%RD"
5.2.3.2 Format %RD-spreadsheet
# Create new spreadsheet
tbl <- BasicTable$new()
tbl$addData(data_RD, firstColumnAsRowHeaders=TRUE)

# Get the number of rows and columns in the dataset
num_rows <- nrow(data_RD)
num_columns <- ncol(data_RD)

# Find the index of the column with the header "%RD"
rd_column_index <- which(names(data_RD) == "%RD")

# Define the cells for which the styling needs to be applied (excluding the '%RD' column)
column_indices_to_style <- setdiff(2:num_columns, rd_column_index)

# Ensure the last row is included in styling
cells <- tbl$getCells(
  rowNumbers = 2:(num_rows + 1),  
  columnNumbers = column_indices_to_style,
  matchMode = "combinations"
)

# Apply the styling
tbl$mapStyling(
  cells = cells,
  styleProperty = "background-color",
  valueType = "color",
  mapType = "logic",
  mappings = list(
    "v >= -3 & v <= 3", "lightgreen",   # Values between -3 and 3
    "v > -7 & v < -3", "lightblue",     # Values between -7 and -3
    "v > 3 & v <= 7", "lightblue",      # Values between 3 and 7
    "v > 7 & v <= 10", "yellow",        # Values between 7 and 10
    "v > -10 & v <= -7", "yellow",      # Values between -10 and -7
    "v >= 10", "firebrick",              # Values greater than 10
    "v <= -10", "firebrick"            # Values less than -10
  )
)

# Render the table with the applied styles
tbl$renderTable()
# Save the table to an Excel workbook
wb <- createWorkbook(creator = Sys.getenv("M. Schauer"))
addWorksheet(wb, "Data")
tbl$writeToExcelWorksheet(wb = wb, wsName = "Data", topRowNumber = 1, leftMostColumnNumber = 1, applyStyles = TRUE)

# Export to .xlsx
saveWorkbook(wb, file = "../Data//relRD_SdAR-M2_nov2024_munich.xlsx", overwrite = TRUE)

Interpretation: The %RD demonstrates excellent accuracy (between -3 and +3) for the elements Si, Ti, Al, Fe, K, Cu, Zn, Rb, Sr, Y, Zr, Nb and Pb, with some elements — Mn, Ca, Ni, As, Mo and Ba — also including highly accurate %RD values (3 to 7, -3 to -7). V shows some %RD values of good accuracy (7 to 10, -7 to -10), whereas for Mg, P, S, Cr, Th and U, generally speaking, the %RD values are not accurate enough (higher than 10, lower than -10). Therefore, these latter elements are not measured with sufficient accuracy and should either not be used or, if used, be handled with great care in further analysis.

5.2.3.3 Create figures of %RD
# Load data
data1 <- read.xlsx("../Data//relRD_SdAR-M2_nov2024_munich.xlsx")

# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr", 
                      "Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba", 
                      "Pb", "Th", "U")

# Create a PDF file to save all plots
pdf("../Figures//relRD_SdAR-M2_nov2024_munich_plots.pdf", width = 10, height = 6)

# Loop through each selected column and create plots
for (col in selected_columns) {
  
  # Extract data for plotting
  plot_data <- data.frame(
    Day = as.factor(data1$`%RD`),  # Treat measurement days as categorical on X-axis
    Value = data1[[col]]
  )
  
  # Create scatter plot with lines connecting measurement days
  p <- ggplot(plot_data, aes(x = Day, y = Value, group = 1)) +
    geom_point() +  
    geom_line() +  # Connect points across measurement days
    geom_hline(yintercept = 0, linetype = "dashed", color = "red") +  # Reference line
    labs(title = paste("%RD-Plot for", col), x = "Measurement Day", y = col) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, hjust = 1))  # Tilt x-axis labels

  # Display the plot in R
  print(p)
}

# Close the PDF device
dev.off()

%RD plots

6 Apply coefficient correction (coefcorIV)

More information on coefficient correction coefcor IV can be found in (Schauer 2024).

6.1 Compile Data

# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")

# Replace '< LOD' with 0 in each column
data <- data1 %>%
  mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>%  # Ensure < LOD is replaced with 0
  mutate(across(15:155, as.numeric)) # Ensure all values are numeric

# Define columns
dataset<-data[, c("SAMPLE","Time","Si","Ti","Al","Fe","Mn","Mg","Ca","K","P","S","Cl","Sc","V","Cr","Co","Ni","Cu","Zn","As","Se","Rb","Sr","Y","Zr","Nb","Mo","Pd","Ag","Cd","Sn","Sb","Te","Cs","Ba","La","Ce","Hf","Ta","W","Re","Au","Hg","Pb","Bi","Th","U")]

# Load data
coefcor<- read.csv("../RawData//coefcorIV_factors.csv")

# Define relevant variables
SAMPLE<-dataset$SAMPLE
Time<-dataset$Time

6.2 Process Coefcor IV

# Select element column from the measurement data
Si<-dataset$Si

# Select slope correction
Si_a<-coefcor$Si_a

# Select offset correction (if applicable)
Si_b<-coefcor$Si_b

# Calculate coefficient correction and transform to oxide percent
SiO2<-(Si_a*Si+Si_b)*0.00021393

Ti<-dataset$Ti
Ti_a<-coefcor$Ti_a
Ti_b<-coefcor$Ti_b
TiO2<-(Ti_a*Ti+Ti_b)*0.0001668

Al<-dataset$Al
Al_a<-coefcor$Al_a
Al_b<-coefcor$Al_b
Al2O3<-(Al_a*Al+Al_b)*0.00018895

Fe<-dataset$Fe
Fe_a<-coefcor$Fe_a
Fe_b<-coefcor$Fe_b
Fe2O3<-(Fe_a*Fe+Fe_b)*0.000143

Mn<-dataset$Mn
Mn_a<-coefcor$Mn_a
Mn_b<-coefcor$Mn_b
MnO<-(Mn_a*Mn+Mn_b)*0.00012912

Mg<-dataset$Mg
Mg_a<-coefcor$Mg_a
Mg_b<-coefcor$Mg_b
MgO<-(Mg_a*Mg+Mg_b)*0.00016583

Ca<-dataset$Ca
Ca_a<-coefcor$Ca_a
Ca_b<-coefcor$Ca_b
CaO<-(Ca_a*Ca+Ca_b)*0.00013992

K<-dataset$K
K_a<-coefcor$K_a
K_b<-coefcor$K_b
K2O<-(K_a*K+K_b)*0.00012046

P<-dataset$P
P_a<-coefcor$P_a
P_b<-coefcor$P_b
P2O5<-(P_a*P+P_b)*0.00022914

# Create new data frame
data_norm<-data.frame(SiO2,TiO2,Al2O3,Fe2O3,MnO,MgO,CaO,K2O,P2O5)

# Calculate row sum
data_norm_withsum<-data_norm %>% rowwise() %>%  mutate(sum = sum(c(SiO2,TiO2,Al2O3,
                                                      Fe2O3,MnO,MgO,CaO,K2O,P2O5)))

# Define formular for row percent
sumpct<-100/data_norm_withsum$sum

# Normalise major elements
SiO2<-SiO2*sumpct
TiO2<-TiO2*sumpct
Al2O3<-Al2O3*sumpct
Fe2O3<-Fe2O3*sumpct
MnO<-MnO*sumpct
MgO<-MgO*sumpct
CaO<-CaO*sumpct
K2O<-K2O*sumpct
P2O5<-P2O5*sumpct

# Select element column from the measurement data
V<-dataset$V

# Select slope correction
V_a<-coefcor$V_a

# Select offset correction (if applicable)
V_b<-coefcor$V_b

# Calculate coefficient correction
V<-V_a*V+V_b

Cr<-dataset$Cr
Cr_a<-coefcor$Cr_a
Cr_b<-coefcor$Cr_b
Cr<-Cr_a*Cr+Cr_b

Ni<-dataset$Ni
Ni_a<-coefcor$Ni_a
Ni_b<-coefcor$Ni_b
Ni<-Ni_a*Ni+Ni_b

Cu<-dataset$Cu
Cu_a<-coefcor$Cu_a
Cu_b<-coefcor$Cu_b
Cu<-Cu_a*Cu+Cu_b

Zn<-dataset$Zn
Zn_a<-coefcor$Zn_a
Zn_b<-coefcor$Zn_b
Zn<-Zn_a*Zn+Zn_b

Rb<-dataset$Rb
Rb_a<-coefcor$Rb_a
Rb_b<-coefcor$Rb_b
Rb<-Rb_a*Rb+Rb_b

Sr<-dataset$Sr
Sr_a<-coefcor$Sr_a
Sr_b<-coefcor$Sr_b
Sr<-Sr_a*Sr+Sr_b

Y<-dataset$Y
Y_a<-coefcor$Y_a
Y_b<-coefcor$Y_b
Y<-Y_a*Y+Y_b

Zr<-dataset$Zr
Zr_a<-coefcor$Zr_a
Zr_b<-coefcor$Zr_b
Zr<-Zr_a*Zr+Zr_b

Nb<-dataset$Nb
Nb_a<-coefcor$Nb_a
Nb_b<-coefcor$Nb_b
Nb<-Nb_a*Nb+Nb_b

Ba<-dataset$Ba
Ba_a<-coefcor$Ba_a
Ba_b<-coefcor$Ba_b
Ba<-Ba_a*Ba+Ba_b

Pb<-dataset$Pb
Pb_a<-coefcor$Pb_a
Pb_b<-coefcor$Pb_b
Pb<-Pb_a*Pb+Pb_b

6.3 Compile Corrected Data

# Create new data frame
data<-data.frame(SAMPLE,SiO2,TiO2,Al2O3,Fe2O3,MnO,MgO,CaO,
                                K2O,P2O5,V,Cr,Ni,Cu,Zn,Rb,Sr,Y,Zr,Nb,Ba,Pb)

# Round selected columns to .000 for visually better plotting
data[, c("SiO2","TiO2","Al2O3","Fe2O3","MnO","MgO","CaO",
"K2O","P2O5")] <- round(data[, c("SiO2","TiO2","Al2O3","Fe2O3",
                                        "MnO","MgO","CaO","K2O","P2O5")], 3)

# Round selected columns to .00 for visually better plotting
data[, c("V","Cr","Ni","Cu","Zn","Rb","Sr","Y","Zr","Nb",
            "Ba","Pb")] <- round(data[, c("V","Cr","Ni","Cu","Zn",
                                        "Rb","Sr","Y","Zr","Nb","Ba","Pb")], 2)

# Export to .csv
write.csv(data,"../Data//Assur_Pottery_2024_Data_Coefcor.csv",row.names=TRUE)

6.4 Calculate Mean of Corrected Data

# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_Data_Coefcor.csv")

# Calculate mean
data4<-(data1) %>%
  group_by(SAMPLE) %>%
  summarize(across(everything(),list(mean=mean))) 

# Delete "_mean" from the column name  
colnames(data4) <- gsub("_mean", "", colnames(data4))

# Define all columns except for the first as numeric
data4[, -1] <- sapply(data4[, -1], as.numeric)

# Round selected columns to .000 for visually better plotting
data4[, c("SiO2","TiO2","Al2O3","Fe2O3","MnO","MgO","CaO",
"K2O","P2O5")] <- round(data4[, c("SiO2","TiO2","Al2O3","Fe2O3",
                                        "MnO","MgO","CaO","K2O","P2O5")], 3)

# Round selected columns to .00 for visually better plotting
data4[, c("V","Cr","Ni","Cu","Zn","Rb","Sr","Y","Zr","Nb",
            "Ba","Pb")] <- round(data4[, c("V","Cr","Ni","Cu","Zn",
                                        "Rb","Sr","Y","Zr","Nb","Ba","Pb")], 2)

# Select columns
data4<-data4[,c(1,3:7,9,10,12:14,15:21)]

# Export to .csv
write.csv(data4,"../Data//Assur_Pottery_2024_Data_Coefcor_MW.csv",row.names=FALSE)

6.5 Merge with Archaeological Information

# Load data
data2<- read.csv("../Data//Assur_Pottery_2024_ArchData.csv")

# Merge data
data3<-merge(data4,data2, by="SAMPLE", all=FALSE)

# Export to .csv
write.csv(data3,"../Data//Assur_Pottery_2024_complete_MW.csv",row.names=FALSE)

7 Tab. D2.1

# Load data
data1 <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")

# Select columns
data<-data1[,c(1:18,20,21,27,22,23,25)]

# Replace "A_Pathian" in column 23 with value from column 22 (same row)
data[[23]] <- ifelse(
  data[[23]] == "A_Parthian",
  data[[22]],
  data[[23]]
)

# Rename column 23 to "Group"
colnames(data)[23] <- "Group"

# Drop column 22
data <- data[, -22]

# Round columns 2 to 8 to two decimal places
data[, 2:8] <- lapply(data[, 2:8], function(x) round(x, 2))

# Round columns 9 to 18 to whole numbers
data[, 9:18] <- lapply(data[, 9:18], function(x) round(x, 0))

# Export to .xlsx
setwd("../Figures")
write_xlsx(data, path = "Tab_D2-1.xlsx")

8 Hellenistic Chemical Groups

8.1 Fig. D2.1

# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
data <- data[order(data$ChemGroup_H), ]

# Define custom colors and shapes
mycolors <- c(
  "HG1" = "dodgerblue1", "HG2" = "blue", "HG3" = "orange", "HG4" = "green",
  "HG5" = "wheat", "HO1" = "firebrick", "HO2" = "pink", "HO3" = "magenta",
  "HO4" = "red", "HO5" = "yellow3", "HO6" = "black", "HO7" = "lightblue",
  "HO8" = "slategrey", "HO9" = "seagreen"
)

myshapes <- c(
  "HG1" = 17, "HG2" = 19, "HG3" = 15, "HG4" = 1, "HG5" = 2,
  "HO1" = 3, "HO2" = 4, "HO3" = 8, "HO4" = 9, "HO5" = 15,
  "HO6" = 16, "HO7" = 18, "HO8" = 20, "HO9" = 10
)

# Define a reusable plot theme
base_theme <- theme_classic() +
  theme(
    axis.line = element_line(colour = "black", size = 0.25),
    axis.title = element_text(size = 9),
    axis.text = element_text(size = 8, color = "black"),
    axis.ticks = element_line(size = 0.25, colour = "black"),
    legend.title = element_blank(),
    legend.text = element_text(size = 8),
    legend.position = "bottom"
  )

# Function to generate a single plot
make_plot <- function(x, y, x_lab, y_lab) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = ChemGroup_H, shape = ChemGroup_H)) +
    geom_point(size = 2) +
    scale_color_manual(values = mycolors) +
    scale_shape_manual(values = myshapes) +
    labs(x = x_lab, y = y_lab) +
    base_theme
}

# Create all plots
plot_1 <- make_plot("Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %")
plot_2 <- make_plot("CaO", "K2O", "CaO in %", "K2O in %")
plot_3 <- make_plot("TiO2", "MnO", "TiO2 in %", "MnO in %")
plot_4 <- make_plot("Rb", "Sr", "Rb in ppm", "Sr in ppm")
plot_5 <- make_plot("V", "Cr", "V in ppm", "Cr in ppm")
plot_6 <- make_plot("Ni", "Nb", "Ni in ppm", "Nb in ppm")
plot_7 <- make_plot("Y", "Zr", "Y in ppm", "Zr in ppm")

# Arrange the plots
grid_plot <- ggarrange(
  plot_1, plot_2, plot_3,
  plot_4, plot_5, plot_6,
  plot_7, NULL,
  ncol = 2, nrow = 4,
  common.legend = TRUE,
  legend = "bottom",
  align = "hv"
)

# Export to .eps
ggsave(
  filename = "Fig_D2-1.eps",
  plot = grid_plot,
  path = "../Figures",
  device = cairo_ps,
  width = 15.3,
  height = 21,
  units = "cm",
  dpi = 1200
)

8.2 Fig. D2.1 - Interactive with Parthian material

# Load data
data <- read.csv("../Data//Assur_Pottery_2024_complete_MW_ed.csv")

# Define custom colors and shapes
mycolors <- c("HG1" = "dodgerblue1", "HG2" = "blue","HG3"="orange","HG4"="green","HG5"="wheat","HO1"="firebrick","HO2"="pink","HO3"="magenta","HO4"="red","HO5"="yellow3","HO6"="black","HO7"="lightblue","HO8"="slategrey","HO9"="seagreen","A_Parthian"="grey85")

myshapes <- c(
  "HG1" = 17, "HG2" = 19, "HG3" = 15, "HG4" = 1, "HG5" = 2,
  "HO1" = 3, "HO2" = 4, "HO3" = 8, "HO4" = 9, "HO5" = 15,
  "HO6" = 16, "HO7" = 18, "HO8" = 20, "HO9" = 10, "A_Parthian"=17
)

# Sort data
data <- data[order(data$ChemGroup_H), ]

# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
  p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'ChemGroup_H', shape = 'ChemGroup_H', text = 'SAMPLE')) +
    geom_point(size = 2) +
    scale_shape_manual(values = myshapes) +
    scale_color_manual(values = mycolors) +
    labs(x = x_label, y = y_label, title = title) +
    theme_classic() +
    theme(
      axis.line = element_line(colour = "black", size = 0.25),
      legend.title = element_blank(),
      legend.text = element_text(size = 8),
      axis.title = element_text(size = 9),
      axis.text = element_text(size = 8, color = "black"),
      axis.ticks = element_line(size = 0.25, colour = "black"),
      legend.position = "bottom"
    )
  ggplotly(p, tooltip = c("text", "x", "y"))
}

# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Fig. D2.1 - Interactive with Parthian Material")

# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
  plot1, plot2, plot3,
  plot4, plot5, plot6,
  plot7,
  nrows = 4,
  shareX = FALSE, shareY = FALSE,
  titleX = TRUE, titleY = TRUE
)

# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Fig_D2-1_interactive.html", selfcontained = TRUE)

8.3 Interactive plots of Subfabric Groups

# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]

# Define custom colors and shapes
mycolors <- c(  "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854",  "6" = "#B3B3B3"   
)

mysymbols <- c("1A"="17","1B"="17","2A" = "19","2B" = "19","2C"="19","2E"="19","3A"="2","3B"="2","4A"="6","4B"="6","4C"="6","5"="15","6"="16")

# Sort data
data <- data[order(data$ChemGroup_H), ]

# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
  p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Subfabric', shape = 'Subfabric', text = 'SAMPLE')) +
    geom_point(size = 2) +
    scale_shape_manual(values = mysymbols) +
    scale_color_manual(values = mycolors) +
    labs(x = x_label, y = y_label, title = title) +
    theme_classic() +
    theme(
      axis.line = element_line(colour = "black", size = 0.25),
      legend.title = element_blank(),
      legend.text = element_text(size = 8),
      axis.title = element_text(size = 9),
      axis.text = element_text(size = 8, color = "black"),
      axis.ticks = element_line(size = 0.25, colour = "black"),
      legend.position = "bottom"
    )
  ggplotly(p, tooltip = c("text", "x", "y"))
}

# Create individual interactive plots 
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Scatterplot of Hellenistic Subfabric groups")

# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
  plot1, plot2, plot3,
  plot4, plot5, plot6,
  plot7,
  nrows = 4,
  shareX = FALSE, shareY = FALSE,
  titleX = TRUE, titleY = TRUE
)

# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Hellenistic_Subfabricgroups.html", selfcontained = TRUE)

8.4 Fig. D2.5

# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]

# Define color palettes
subfabric_colors <- c(  "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854",  "6" = "#B3B3B3"   
)

chemgroup_colors <- c("HG1" = "dodgerblue1", "HG2" = "blue","HG3"="orange","HG4"="green","HG5"="wheat","HO1"="firebrick","HO2"="pink","HO3"="magenta","HO4"="red","HO5"="yellow3","HO6"="black","HO7"="lightblue","HO8"="slategrey","HO9"="seagreen","A_Parthian"="grey85")

# Sort and aggregate data
plot_data <- data |>
  filter(!is.na(ChemGroup_H), !is.na(Vessel.type), !is.na(Subfabric)) |>
  arrange(ChemGroup_H) |>
  count(ChemGroup_H, Vessel.type, Subfabric)

# Calculate label widths
get_text_width <- function(label, size = 2.5) {
  pushViewport(viewport())
  g <- textGrob(label, gp = gpar(fontsize = size * 10))
  width <- convertWidth(grobWidth(g), "npc", valueOnly = TRUE)
  popViewport()
  return(width)
}

# Define unique strata and associated axis
stratum_df <- tibble(
  stratum = unique(c(plot_data$ChemGroup_H, plot_data$Vessel.type, plot_data$Subfabric))
) |>
  mutate(
    axis = case_when(
      stratum %in% plot_data$ChemGroup_H   ~ "axis1",
      stratum %in% plot_data$Vessel.type   ~ "axis2",
      stratum %in% plot_data$Subfabric     ~ "axis3"
    )
  )

# Estimate text width
stratum_df <- stratum_df |>
  rowwise() |>
  mutate(text_width = get_text_width(stratum, size = 2.5)) |>
  ungroup()

# Compute max width per axis + padding
axis_widths <- stratum_df |>
  group_by(axis) |>
  summarise(max_width = max(text_width) + 0.02)  # Add padding

# Join width to strata
stratum_df <- stratum_df |>
  left_join(axis_widths, by = "axis")

# Plot with custom widths
alluvial_plot <- ggplot(plot_data, aes(
  axis1 = ChemGroup_H,
  axis2 = Vessel.type,
  axis3 = Subfabric,
  y = n
)) +
  geom_alluvium(
    aes(fill = ChemGroup_H),
    width = 0.1,  # constant for flow thickness
    alpha = 0.6
  ) +
  geom_stratum(
    aes(
      fill = after_stat(stratum),
      width = case_when(
        after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis1"] ~ axis_widths$max_width[axis_widths$axis == "axis1"],
        after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis2"] ~ axis_widths$max_width[axis_widths$axis == "axis2"],
        after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis3"] ~ axis_widths$max_width[axis_widths$axis == "axis3"]
      )
    ),
    color = "black"
  ) +
  geom_text(
    stat = "stratum",
    aes(label = after_stat(stratum)),
    size = 2.5,
    color = "black"
  ) +
  scale_x_discrete(
    limits = c("ChemGroup_H", "Vessel.type", "Subfabric"),
    expand = c(0.05, 0.05)
  ) +
  scale_fill_manual(
    values = c(chemgroup_colors, subfabric_colors),
    na.value = "grey",
    guide = "none"
  ) +
  labs(
    title = "Fig. D2.5",
    y = "Sample Count"
  ) +
  theme_minimal()

# Display the plot
alluvial_plot

# Export to .svg
svglite("../Figures/Fig_D2-5.svg", width = 12, height = 8)
print(alluvial_plot)
dev.off()
png 
  2 

8.5 Interactive plots of Vessel Types

# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]

# Define color palettes
mycolors <- c("beaker" = "dodgerblue1", "beaker/jar" = "blue","bottle"="orange","body sherd"="slategrey","carinated bowl"="green","cooking pot"="wheat","echinus bowl"="firebrick","folded rim jar"="pink","jar"="magenta","large-mouth pot"="red","large storage jar"="yellow3","medium jar"="black","rolled rim plate"="lightblue","small jar"="brown4","thin walled beaker"="grey")

# Sort data
data <- data[order(data$Vessel.type), ]

# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
  p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Vessel.type', shape = 'Vessel.type', text = 'SAMPLE')) +
    geom_point(size = 2) +
    scale_shape_manual(values = mysymbols) +
    scale_color_manual(values = mycolors) +
    labs(x = x_label, y = y_label, title = title) +
    theme_classic() +
    theme(
      axis.line = element_line(colour = "black", size = 0.25),
      legend.title = element_blank(),
      legend.text = element_text(size = 8),
      axis.title = element_text(size = 9),
      axis.text = element_text(size = 8, color = "black"),
      axis.ticks = element_line(size = 0.25, colour = "black"),
      legend.position = "bottom"
    )
  ggplotly(p, tooltip = c("text", "x", "y"))
}

# Create individual interactive plots 
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")

# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
  plot1, plot2, plot3,
  plot4, plot5, plot6,
  plot7,
  nrows = 4,
  shareX = FALSE, shareY = FALSE,
  titleX = TRUE, titleY = TRUE
)

# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Hellenistic_VesselTypes.html", selfcontained = TRUE)

9 Parthian Chemical Groups

9.1 Fig. D2.2

# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
data <- data[order(data$ChemGroup_P), ]

# Define custom colors and shapes
mycolors <- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")

myshapes <- c(
  "PG1" = 17, "PG2" = 19, "PG3" = 15, "PG4" = 1, "PG5" = 2,
  "PO1" = 3, "PO2" = 4, "PO3" = 8, "PO4" = 9, "PO5" = 15,
  "PO6" = 16, "PO7" = 18, "PO8" = 20, "PO9" = 10
)

# Define a reusable plot theme
base_theme <- theme_classic() +
  theme(
    axis.line = element_line(colour = "black", size = 0.25),
    axis.title = element_text(size = 9),
    axis.text = element_text(size = 8, color = "black"),
    axis.ticks = element_line(size = 0.25, colour = "black"),
    legend.title = element_blank(),
    legend.text = element_text(size = 8),
    legend.position = "bottom"
  )

# Function to generate a single plot
make_plot <- function(x, y, x_lab, y_lab) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = ChemGroup_P, shape = ChemGroup_P)) +
    geom_point(size = 2) +
    scale_color_manual(values = mycolors) +
    scale_shape_manual(values = myshapes) +
    labs(x = x_lab, y = y_lab) +
    base_theme
}

# Create all plots
plot_1 <- make_plot("Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %")
plot_2 <- make_plot("CaO", "K2O", "CaO in %", "K2O in %")
plot_3 <- make_plot("TiO2", "MnO", "TiO2 in %", "MnO in %")
plot_4 <- make_plot("Rb", "Sr", "Rb in ppm", "Sr in ppm")
plot_5 <- make_plot("V", "Cr", "V in ppm", "Cr in ppm")
plot_6 <- make_plot("Ni", "Nb", "Ni in ppm", "Nb in ppm")
plot_7 <- make_plot("Y", "Zr", "Y in ppm", "Zr in ppm")

# Arrange the plots
grid_plot <- ggarrange(
  plot_1, plot_2, plot_3,
  plot_4, plot_5, plot_6,
  plot_7, NULL,
  ncol = 2, nrow = 4,
  common.legend = TRUE,
  legend = "bottom",
  align = "hv"
)

# Export to .eps
ggsave(
  filename = "Fig_D2-2.eps",
  plot = grid_plot,
  path = "../Figures",
  device = cairo_ps,  
  width = 15.3,
  height = 21,
  units = "cm",
  dpi = 1200
)

9.2 Fig. D2.2 - Interactive with Hellenistic material

# Load data
data <- read.csv("../Data//Assur_Pottery_2024_complete_MW_ed.csv")

# Define custom colors
mycolors <- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")

# Sort data
data <- data[order(data$ChemGroup_P), ]

# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
  p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'ChemGroup_P', shape = 'ChemGroup_P', text = 'SAMPLE')) +
    geom_point(size = 2) +
    scale_shape_manual(values = c(10, 11,17, 19, 15, 1, 2, 3, 4, 8, 9, 15, 16, 18, 20)) +
    scale_color_manual(values = mycolors) +
    labs(x = x_label, y = y_label, title = title) +
    theme_classic() +
    theme(
      axis.line = element_line(colour = "black", size = 0.25),
      legend.title = element_blank(),
      legend.text = element_text(size = 8),
      axis.title = element_text(size = 9),
      axis.text = element_text(size = 8, color = "black"),
      axis.ticks = element_line(size = 0.25, colour = "black"),
      legend.position = "bottom"
    )
  ggplotly(p, tooltip = c("text", "x", "y"))
}

# Create individual interactive plots 
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Fig. D2.1 - Interactive with Hellenistic Material")

# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
  plot1, plot2, plot3,
  plot4, plot5, plot6,
  plot7,
  nrows = 4,
  shareX = FALSE, shareY = FALSE,
  titleX = TRUE, titleY = TRUE
)

# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Fig_D2-2_interactive.html", selfcontained = TRUE)

9.3 Interactive plots of Subfabric Groups

# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]

# Define color palettes
mycolors <- c(  "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854",  "6" = "#B3B3B3"   
)

mysymbols <- c("1A"="17","1B"="17","2A" = "19","2B" = "19","2C"="19","2E"="19","3A"="2","3B"="2","4A"="6","4B"="6","4C"="6","5"="15","6"="16")

# Sort data
data <- data[order(data$ChemGroup_P), ]

# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
  p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Subfabric', shape = 'Subfabric', text = 'SAMPLE')) +
    geom_point(size = 2) +
    scale_shape_manual(values = mysymbols) +
    scale_color_manual(values = mycolors) +
    labs(x = x_label, y = y_label, title = title) +
    theme_classic() +
    theme(
      axis.line = element_line(colour = "black", size = 0.25),
      legend.title = element_blank(),
      legend.text = element_text(size = 8),
      axis.title = element_text(size = 9),
      axis.text = element_text(size = 8, color = "black"),
      axis.ticks = element_line(size = 0.25, colour = "black"),
      legend.position = "bottom"
    )
  ggplotly(p, tooltip = c("text", "x", "y"))
}

# Create individual interactive plots 
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")

# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
  plot1, plot2, plot3,
  plot4, plot5, plot6,
  plot7,
  nrows = 4,
  shareX = FALSE, shareY = FALSE,
  titleX = TRUE, titleY = TRUE
)

# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Parthian_Subfabricgroups.html", selfcontained = TRUE)

9.4 Fig. D2.6

# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]

# Define your color palettes
subfabric_colors <- c(  "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854",  "6" = "#B3B3B3"   
)

chemgroup_colors <- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")

# Sort and aggregate data
plot_data <- data |>
  filter(!is.na(ChemGroup_P), !is.na(Vessel.type), !is.na(Subfabric)) |>
  arrange(ChemGroup_P) |>
  count(ChemGroup_P, Vessel.type, Subfabric)

# Calculate label widths
get_text_width <- function(label, size = 2.5) {
  pushViewport(viewport())
  g <- textGrob(label, gp = gpar(fontsize = size * 10))
  width <- convertWidth(grobWidth(g), "npc", valueOnly = TRUE)
  popViewport()
  return(width)
}

# Define unique strata and associated axis
stratum_df <- tibble(
  stratum = unique(c(plot_data$ChemGroup_P, plot_data$Vessel.type, plot_data$Subfabric))
) |>
  mutate(
    axis = case_when(
      stratum %in% plot_data$ChemGroup_P   ~ "axis1",
      stratum %in% plot_data$Vessel.type   ~ "axis2",
      stratum %in% plot_data$Subfabric     ~ "axis3"
    )
  )

# Estimate text width
stratum_df <- stratum_df |>
  rowwise() |>
  mutate(text_width = get_text_width(stratum, size = 2.5)) |>
  ungroup()

# Compute max width per axis + padding
axis_widths <- stratum_df |>
  group_by(axis) |>
  summarise(max_width = max(text_width) + 0.02)  # Add padding

# Join width to strata
stratum_df <- stratum_df |>
  left_join(axis_widths, by = "axis")

# Plot with custom widths
alluvial_plot <- ggplot(plot_data, aes(
  axis1 = ChemGroup_P,
  axis2 = Vessel.type,
  axis3 = Subfabric,
  y = n
)) +
  geom_alluvium(
    aes(fill = ChemGroup_P),
    width = 0.1,  # constant for flow thickness
    alpha = 0.6
  ) +
  geom_stratum(
    aes(
      fill = after_stat(stratum),
      width = case_when(
        after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis1"] ~ axis_widths$max_width[axis_widths$axis == "axis1"],
        after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis2"] ~ axis_widths$max_width[axis_widths$axis == "axis2"],
        after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis3"] ~ axis_widths$max_width[axis_widths$axis == "axis3"]
      )
    ),
    color = "black"
  ) +
  geom_text(
    stat = "stratum",
    aes(label = after_stat(stratum)),
    size = 2.5,
    color = "black"
  ) +
  scale_x_discrete(
    limits = c("ChemGroup_P", "Vessel.type", "Subfabric"),
    expand = c(0.05, 0.05)
  ) +
  scale_fill_manual(
    values = c(chemgroup_colors, subfabric_colors),
    na.value = "grey",
    guide = "none"
  ) +
  labs(
    title = "Fig. D2.6",
    y = "Sample Count"
  ) +
  theme_minimal()

# Display the plot
alluvial_plot

# Export to .svg
svglite("../Figures/Fig_D2-6.svg", width = 12, height = 8)
print(alluvial_plot)
dev.off()
png 
  2 

9.5 Interactive plots of Vessel Types

# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]

# Define color palettes
mycolors <- c("bottle"="orange","body sherd"="slategrey","bottle or jar" = "dodgerblue1", "bowl " = "blue","bowl or jug"="green","closed bowl"="wheat","deep bowl"="firebrick","jar or pot"="pink","jar"="magenta","hole-mouth jar or pot"="red","krater"="yellow3","large flat bowl"="black","large vessel"="lightblue","open bowl"="brown4","rim sherd"="grey","small jar or juglet"="darkgreen")

# Sort data
data <- data[order(data$Vessel.type), ]

# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
  p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Vessel.type', shape = 'Vessel.type', text = 'SAMPLE')) +
    geom_point(size = 2) +
    scale_shape_manual(values = mysymbols) +
    scale_color_manual(values = mycolors) +
    labs(x = x_label, y = y_label, title = title) +
    theme_classic() +
    theme(
      axis.line = element_line(colour = "black", size = 0.25),
      legend.title = element_blank(),
      legend.text = element_text(size = 8),
      axis.title = element_text(size = 9),
      axis.text = element_text(size = 8, color = "black"),
      axis.ticks = element_line(size = 0.25, colour = "black"),
      legend.position = "bottom"
    )
  ggplotly(p, tooltip = c("text", "x", "y"))
}

# Create individual interactive plots 
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")

# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
  plot1, plot2, plot3,
  plot4, plot5, plot6,
  plot7,
  nrows = 4,
  shareX = FALSE, shareY = FALSE,
  titleX = TRUE, titleY = TRUE
)

# Display the grid
grid
# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Parthian_VesselTypes.html", selfcontained = TRUE)

10 Tab. D2.3 - Basic Information

# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")

# Create Hellenistic period table
table_h <- data %>%
  filter(Period == "Hellenistic") %>%
  count(ChemGroup_H, Subfabric) %>%
  pivot_wider(names_from = Subfabric, values_from = n, values_fill = 0) %>%
  mutate(Period = "Hellenistic")

# Create Parthian period table
table_p <- data %>%
  filter(Period == "Parthian") %>%
  count(ChemGroup_P, Subfabric) %>%
  pivot_wider(names_from = Subfabric, values_from = n, values_fill = 0) %>%
  mutate(Period = "Parthian")

# Standardizing column names for merging
colnames(table_h)[1] <- "ChemGroup"
colnames(table_p)[1] <- "ChemGroup"

# Stack the tables together
stacked_table <- bind_rows(table_h, table_p)

# Export to .xlsx
setwd("../Figures")
write_xlsx(stacked_table, path = "Tab_D2-3.xlsx")

11 References

Allaire, Joseph J., Charles Teague, Carlos Scheidegger, Yihui Xie, and Christophe Dervieux. 2024. “Quarto v.1.5.55.” Edited by P. B. C. Posit Software. Boston. https://quarto.org.
Bailiss, Christopher. 2021. “Basictabler: Construct Rich Tables for Output to ’HTML’/’excel’: R Package Version 1.0.2.” https://CRAN.R-project.org/package=basictabler.
Barrett, Tyson, Matt Dowle, Arun Srinivasan, Jan Gorecki, Michael Chirico, and Toby Hocking. 2024. “Data.table: Extension of ‘Data.frame‘: R Package Version 1.15.4.” https://CRAN.R-project.org/package=data.table.
Brunson, Jason Cory, and Quentin D. Read. 2023. “Ggalluvial: Alluvial Plots in ’Ggplot2: R Package Version 0.12.5.” http://corybrunson.github.io/ggalluvial/.
International Association of Geoanalysts. 2015. “Reference Material Data Sheet: SdAR-M2 Metal-Rich Sediment.” https://iageo.com/wp-content/uploads/2017/11/SdAR-M2_RM_data_sheet-1.pdf.
Kassambara, Alboukadel. 2023. “Ggpubr: ’Ggplot2’ Based Publication Ready Plots: R Package Version 0.6.0.” https://CRAN.R-project.org/package=ggpubr.
National Institute of Standards & Technology. 2018. “Certificate of Analysis: Standard Reference Material 2709a - San Joaquin Soil.” https://tsapps.nist.gov/srmext/certificates/2709a.pdf.
Ooms, Jeroen. 2025. “Writexl: Export Data Frames to Excel ’Xlsx’ Format: R Package Version 1.5.3.” https://CRAN.R-project.org/package=writexl.
R Core Team. 2024a. “Grid: The Grid Graphics Package: R Package Version 4.3.2.” https://stat.ethz.ch/R-manual/R-devel/library/grid/html/00Index.html.
———. 2024b. “R v.4.4.0: A Language and Environment for Statistical Computing.” Edited by R Foundation for Statistical Computing. Vienna. https://www.R-project.org/.
RStudio Team. 2024. “RStudio v. 2024.04.2: Integrated Development Environment for r.” Edited by P. B. C. Posit Software. Boston. https://www.rstudio.com/.
Schauberger, Philipp, and Alexander Walker. 2024. “Openxlsx: Read, Write and Edit Xlsx Files: R Package Version 4.2.6.1.” https://CRAN.R-project.org/package=openxlsx.
Schauer, Michaela. 2024. “Coefficient Corrections for Portable x-Ray Fluorescence Data of the Niton XL3t No. 97390 (Coefcor i-IV) Developed According to the Munich Procedure.” Data in Brief, no. 53: 109914. https://doi.org/10.1016/j.dib.2023.109914.
Schauer, Michaela, and Silvia Amicone. 2024. “E2. First Steps Towards a Fabric Classification for Assur: Portable x-Ray Fluorescence (p-XRF) and Petrographical Analyses on the 2023 Pottery.” In Assur 2023: Excavations and Other Research in the New Town, edited by Karen Radner and Andrea Squitieri, 150–59. Exploring Assur. Gladbeck: PeWe-Verlag.
———. 2025. “Chemical and Petrographical Analysis of Hellenistic and Parthian Pottery from Assur, 2023 and 2024.” In Assur 2024: Continuing the Excavations in the New Town and Other Research Across the Site, edited by Karen Radner and Andrea Squitieri. Exploring Assur. Gladbeck: PeWe-Verlag.
Scrucca, Luca. 2004. “Qcc: An r Package for Quality Control Charting and Statistical Process Control: R Package Version 1.0.2.” R News, no. 4/1: 11–17. https://cran.r-project.org/doc/Rnews/.
Sievert, Carson. 2020. Interactive Web-Based Data Visualization with r, Plotly, and Shiny. Chapman and Hall/CRC the r Series. Boca Raton, FL: CRC Press Taylor and Francis Group. https://search.ebscohost.com/login.aspx?direct=true&scope=site&db=nlebk&db=nlabk&AN=2366373.
Vaidyanathan, Ramnath, Xihui Xie, Joseph J. Allaire, Joe Cheng, Carson Sievert, and Kenton Russell. 2023. “Htmlwidgets: HTML Widgets for r: R Package Version 1.6.4.” https://CRAN.R-project.org/package=htmlwidgets.
Wickham, Hadley. 2023a. “Forcats: Tools for Working with Categorical Variables (Factors): R Package Version 1.0.0.” https://CRAN.R-project.org/package=forcats.
———. 2023b. “Stringr: Simple, Consistent Wrappers for Common String Operations: R Package Version 1.5.1.” https://CRAN.R-project.org/package=stringr.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software, no. 43/4: 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Lionel Henry, Thomas Lin Pedersen, T. Jake Luciani, Matthieu Decorde, and Vaudor Lise. 2023. “Svglite: An ’SVG’ Graphics Device: R Package Version 2.1.3.” https://CRAN.R-project.org/package=svglite.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2024. “Readr: Read Rectangular Text Data: R Package Version 2.1.5.” https://CRAN.R-project.org/package=readr.
Wilke, Claus. 2024. “Cowplot: Streamlined Plot Theme and Plot Annotations for ’Ggplot2’: R Package Version 1.1.3.” https://CRAN.R-project.org/package=cowplot.