library(basictabler)
library(data.table)
library(dplyr)
library(ggalluvial)
library(ggplot2)
library(ggpubr)
library(grid)
library(htmlwidgets)
library(openxlsx)
library(plotly)
library(purrr)
library(qcc)
library(readr)
library(svglite)
library(tidyverse)
library(writexl)1 Introduction
1.1 Content & Aim
This R script contains data quality control, data processing and figures for the p-XRF data collected from pottery excavated in 2023 and 2024 in Assur in the scope of the project New Archaeological Exploration of Assur led by by Prof. Dr. Karen Radner (LMU Munich). The samples were selected in the field by Dr. Jana Richter, Dr. Andrea Squitieri (LMU Munich), Dr. Florian Janoscha Kreppner & Ellen Coster (University of Münster). p-XRF data was handled by Dr. Michaela Schauer (LMU Munich/VIAS), measurements undertaken by Marco Wolf (LMU Munich). Related petrographic analysis was performed by Dr. Silvia Amicone (University of Tübingen) (Schauer and Amicone 2024).
The dataset is part of the second Exploring Assur publication (Schauer and Amicone 2025).
1.2 Device & Measurement Parameters
The data was collected using the Niton XL3t (No.97390), previously owned by LMU Munich’s Department of Cultural Studies and Archaeology and, since late 2023, hosted by LMU Munich’s Department of Geology. Three measurements were taken per sample on different areas of a fresh break, using the TestAllGeo mode with an 8 mm collimator and a measurement time of 300 seconds (60 seconds each for the main, low, and high filters, and 120 seconds for the light filter).
Data collection was conducted by Marco Wolf from August 9 to 11, 2023 (temperature: 28-35°C, relative humidity: %35-42) at the Dep. for Cultural Studies (Schauer and Amicone 2024) and from November 13 to 25, 2024 (temperature: 20°C; relative humidity: 35–42%) in the geology lab.
1.3 Standards used
In 2023, Standard NIST 2709a (National Institute of Standards & Technology 2018), in 2024 Standard SdAR-M2 (International Association of Geoanalysts 2015) was used.
1.4 Data processing
The data of both measurement campaign has been compiled in one file and run though the precision assessment together. For accuracy assessment, due to the use of two different standards (see Section 1.3), the campaigns were analysed separately (see Section 5.1 and Section 5.2).
Following data quality assessment, the data was calibrated and major elements normalised to 100% (see Section 6).
1.4.1 Precision Assessment
To ensure the reliable application of the DQA procedures outlined here, LOD is replaced with ‘0’.
Precision is evaluated (see Section 4) through:
- Measurement Uncertainty (MU) is used to identify reliable elements and individual measurements (see Section 4.1). The benchmarks used is: Reported value ≥ ±2δ MU
- Coefficient of Variation (CV) is calculated for repeated measurements of the same sample to assess overall precision and determine which elements are reliable across the entire dataset (see Section 4.2). The benchmark is set based on the dataset. From the literature and the authors own work, a benchmark of 20% for archaeological pottery, stone, daub and in-situ measurements of soil are appropriate.
In both cases no more than 20% of individual measurements for any given chemical element and no more than 20% of elements considered reliable for a single measurement (MU) or sample (CV) may exceed the benchmarks.
1.4.2 Accuracy Assessment
For accuracy assessment (see Section 5), the following methods are employed to monitor the device performance based on the two standards (see Section 1.3) used for this project:
- Reference values (nominal values) are calculated to define the baseline of comparison (see Section 5.1.1 and Section 5.2.1). Control limits are set at mean ±2δSD.
- Shewhart Control Charts (SCCs) are constructed using said reference values to monitor device performance (see Section 5.1.2 and Section 5.2.2). Three out of five selected elements with CV ≤ 10% (1δSD) - Zn, Sr, Rb, Sr and Y - have to fall in the given value range.
- Relative Percentage Difference (%RD) is calculated to further evaluate accuracy (see Section 5.1.3 and Section 5.2.3). The %RD hast to be below 10% and above -10%.
1.4.3 Calibration & Normalisation
This dataset is calibrated using the coefficient correction coefcor IV (see Section 6) developed according to the Munich Procedure for this specific device using the Frankfurt pottery sample set (Schauer 2024). Major elements are normalised to a 100%.
1.5 Skript & Packages
This Quarto script (R Quarto v.1.5.55) (Allaire et al. 2024) was created using R v.4.4.1 (R Core Team 2024b) and RStudio v.2024.04.2 (RStudio Team 2024). The following R packages are used:
- basictabler (Bailiss 2021)
- data.table (Barrett et al. 2024)
- dplyr (Wickham 2023b)
- ggalluvial (Brunson and Read 2023)
- ggplot2 (Wilke 2024)
- ggpubr (Kassambara 2023)
- grid (R Core Team 2024a)
- htmlwidgets (Vaidyanathan et al. 2023)
- openxlsx (Schauberger and Walker 2024)
- plotly (Sievert 2020)
- purrr (Wickham 2023a)
- qcc (Scrucca 2004)
- readr (Wickham, Hester, and Bryan 2024)
- svglite (Wickham et al. 2023)
- tidyverse (Wickham et al. 2019)
- writexl (Ooms 2025)
Before running the analyses, all packages must be loaded (see Section 2) and the working directory set (see Section 3).
The script runs without errors, provided the predefined data structure is maintained.
2 Packages
3 Set working directory
knitr::opts_knit$set(root.dir = "./")4 Precision Assessment
4.1 Measurement Uncertainty
4.1.1 Measurement Uncertainty of Measurements
# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
data <- data1 %>%
mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# Define columns
variables <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "Cl", "Sc", "V", "Cr", "Co", "Ni", "Cu", "Zn", "As", "Se", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Pd", "Ag", "Cd", "Sn", "Sb", "Te", "Cs", "Ba", "La", "Ce", "Hf", "Ta", "W", "Re", "Au", "Hg", "Pb", "Bi", "Th", "U", "Bal")
# Calculate violations
for (variable in variables) {
filter <- data[data[, variable] < 2*data[, paste0(variable, ".Error")], ]
count <- nrow(filter)
assign(paste0("data", variable), filter)
assign(paste0("count", variable), count)
print(paste0("count", variable, ": ", count))}[1] "countSi: 0"
[1] "countTi: 0"
[1] "countAl: 0"
[1] "countFe: 0"
[1] "countMn: 0"
[1] "countMg: 29"
[1] "countCa: 0"
[1] "countK: 0"
[1] "countP: 4"
[1] "countS: 2"
[1] "countCl: 30"
[1] "countSc: 43"
[1] "countV: 0"
[1] "countCr: 0"
[1] "countCo: 250"
[1] "countNi: 0"
[1] "countCu: 6"
[1] "countZn: 1"
[1] "countAs: 36"
[1] "countSe: 276"
[1] "countRb: 0"
[1] "countSr: 0"
[1] "countY: 0"
[1] "countZr: 0"
[1] "countNb: 0"
[1] "countMo: 223"
[1] "countPd: 236"
[1] "countAg: 109"
[1] "countCd: 75"
[1] "countSn: 17"
[1] "countSb: 16"
[1] "countTe: 18"
[1] "countCs: 9"
[1] "countBa: 0"
[1] "countLa: 11"
[1] "countCe: 9"
[1] "countHf: 283"
[1] "countTa: 283"
[1] "countW: 283"
[1] "countRe: 283"
[1] "countAu: 282"
[1] "countHg: 191"
[1] "countPb: 15"
[1] "countBi: 283"
[1] "countTh: 45"
[1] "countU: 142"
[1] "countBal: 0"
# Compile list of elements with at least one but less then 20% of violations
l = list(dataMg,dataP,dataS,dataCl,dataSc,dataCu,dataZn,dataAs,dataSn,dataSb,dataTe,dataCs,dataLa,dataCe,dataPb,dataTh)
Tab1<-rbindlist(l, use.names=TRUE, fill=TRUE)
# Export to .csv
write.csv(Tab1,"../Data//Assur_Pottery_2024_SD_IndivMeas.csv",row.names=FALSE)Interpretation: Following the data quality criteria defined in Section 1.4.1, elements showing more then 56 measurements (20% of 283 measurements) violating the criteria, should not be used - or if necessary - only with great caution. This is the case for Co, Se, Mo, Pd, Ag, Cd, Hf, Ta, W, Re, Au, Hg, Bi and U. Therefore, from the original 47 elements, 33 remain. This means each sample can’t violate the criterium for more then 6 elements (20% of 33). Individual measurements violating this bench marks are only 656 (AS085) and 691 (AS074). Therefore, the related samples have to be viewed with caution in the next steps of data quality checking.
4.1.2 Measurement Uncertainty of Samples
# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
data <- data1 %>%
mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# Calculate mean per sample
Mean<-(data) %>%
group_by(SAMPLE) %>%
summarize(across(everything(),list(mean=mean)))
# Delete "_mean" from the column name
colnames(Mean) <- gsub("_mean", "", colnames(Mean))
# Define dataset
data<- Mean
# Define columns
variables <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "Cl", "Sc", "V", "Cr", "Co", "Ni", "Cu", "Zn", "As", "Se", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Pd", "Ag", "Cd", "Sn", "Sb", "Te", "Cs", "Ba", "La", "Ce", "Hf", "Ta", "W", "Re", "Au", "Hg", "Pb", "Bi", "Th", "U", "Bal")
# Calculate violations
for (variable in variables) {
filter <- data[data[, variable] < 2 * data[, paste0(variable, ".Error")], ]
count <- nrow(filter)
assign(paste0("data", variable), filter)
assign(paste0("count", variable), count)
print(paste0("count", variable, ": ", count))}[1] "countSi: 0"
[1] "countTi: 0"
[1] "countAl: 0"
[1] "countFe: 0"
[1] "countMn: 0"
[1] "countMg: 1"
[1] "countCa: 0"
[1] "countK: 0"
[1] "countP: 0"
[1] "countS: 0"
[1] "countCl: 1"
[1] "countSc: 2"
[1] "countV: 0"
[1] "countCr: 0"
[1] "countCo: 73"
[1] "countNi: 0"
[1] "countCu: 1"
[1] "countZn: 0"
[1] "countAs: 11"
[1] "countSe: 77"
[1] "countRb: 0"
[1] "countSr: 0"
[1] "countY: 0"
[1] "countZr: 0"
[1] "countNb: 0"
[1] "countMo: 73"
[1] "countPd: 72"
[1] "countAg: 40"
[1] "countCd: 22"
[1] "countSn: 3"
[1] "countSb: 3"
[1] "countTe: 2"
[1] "countCs: 0"
[1] "countBa: 0"
[1] "countLa: 3"
[1] "countCe: 1"
[1] "countHf: 77"
[1] "countTa: 77"
[1] "countW: 77"
[1] "countRe: 77"
[1] "countAu: 77"
[1] "countHg: 56"
[1] "countPb: 9"
[1] "countBi: 77"
[1] "countTh: 17"
[1] "countU: 46"
[1] "countBal: 0"
# Compile list of elements with at least one but less then 20% of violations
l = list(dataMg,dataCl,dataSc,dataCu,dataAs,dataSn,dataSb,dataTe,dataLa,dataCe,dataPb)
Tab1<-rbindlist(l, use.names=TRUE, fill=TRUE)
# Export to .csv
write.csv(Tab1,"../Data//Assur_Pottery_2024_SD_Samples.csv",row.names=FALSE)Interpretation: Following the data quality criteria defined in Section 1.4.1, elements showing more then 15 samples (20% of 77 samples) violating the criteria, should not be used - or if necessary - only with great caution. This is the case for Co, Se, Mo, Pd, Ag, Cd, Hf, Ta, W, Re, Au, Hg, Bi, Th and U. Therefore, from the original 47 elements, 32 remain. This means each sample can’t violate the criterium for more then 6 elements (20% of 32). No samples violate this criterium.
4.2 Coefficient of Variation
4.2.1 Create Number-Coded Spreadsheet
# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
# Replace '< LOD' with 0 in each column with measurement values, ensuring NA handling
data <- data1 %>%
mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# CV function
CV <- function(x) { (sd(x) / mean(x)) * 100 }
# Create the SampleControl1 dataframe with CV for each group
SampleControl1<-(data) %>%
group_by(SAMPLE) %>%
summarize(across(everything(),list(CV=CV)))
# Define columns
Control_CV<-SampleControl1[,c("SAMPLE","Si_CV","Ti_CV","Al_CV","Fe_CV","Mn_CV","Mg_CV","Ca_CV","K_CV","P_CV","V_CV","Cr_CV","Ni_CV","Cu_CV","Zn_CV","Rb_CV","Sr_CV","Y_CV","Zr_CV","Nb_CV","Ba_CV","Pb_CV")]
# Replace missing values with "99"
Control_CV[is.na(Control_CV)] <- 99
# Replace values below 20 with 0, above 20 with 1 and define sample name as last column
Control_CV_0 <- cbind(apply(Control_CV[,2:22], 2, function(x) ifelse(x < 20, 0, x)), Control_CV[,1])
Control_CV_0_1 <- cbind(apply(Control_CV_0[,1:21], 2, function(x) ifelse(x > 20, 1, x)), Control_CV_0[,22])
# Create new data frame
Control_CV_2 <- as.data.frame(Control_CV_0_1)
# Rename last column as "SAMPLE"
Control_CV_2<-Control_CV_2 %>% rename(SAMPLE=22)
# Define all other columns as numeric
Control_CV_2[,1:21] <- sapply(Control_CV_2[,1:21],as.numeric)
# Calculate row sum excluding specific columns
Control_CV_3<-Control_CV_2%>%mutate(rows_sum=rowSums(.[, !(names(.) %in% c("SAMPLE","Mg_CV","P_CV","Cu_CV","Zn_CV","Ba_CV","Pb_CV"))]))
# Calculate column sums except for column "SAMPLE"
col_sum <- colSums(Control_CV_3[, !colnames(Control_CV_3) %in% "SAMPLE"])
col_sum <- c(col_sum,0)
Control_CV_4 <- rbind(Control_CV_3,col_sum)
# Calculate column percent
rown<-(nrow(Control_CV_4)-1)
percent<-(100/rown*col_sum)
Control_CV_5 <- rbind(Control_CV_4,percent)
# Calculate row percent
rows_sum<-Control_CV_5$rows_sum
coln<-ncol(Control_CV_5)
rowper<-(100/(coln-7)*rows_sum)
Control_CV_6 <- cbind(Control_CV_5,rowper)
# Round values function
Numform <- function(x) {if(is.numeric(x)) {return(round(x))} else {
return(x)}}
# Create new dataframe
Control_CV_6 <- as.data.frame(lapply(Control_CV_6,Numform))
# Apply rounding function
Control_CV_6<-Control_CV_6[order(Control_CV_6$rowper, decreasing = TRUE),]
# Add column sum and percent information in specific cell
text<-"Column sum"
Control_CV_6[78,22]<-text
text<-"Column percent"
Control_CV_6[79,22]<-text
# Export to .csv
write.csv(Control_CV_6,"../Data//Assur_Pottery_2024_CV_numcode.csv",row.names=FALSE)Interpretation: Following the data control, the elements Si, Ti, Al, Fe, Mn, Ca, K, V, Cr, Ni, Rb, Sr, Y, Zr and Nb are adhering to the quality criteria defined in Section 1.4.1. The same is true for all samples.
4.2.2 Create Color-Coded Spreadsheet
# Create new spreadsheet
tbl <- BasicTable$new()
# Add data from Control_CV
tbl$addData(Control_CV, firstColumnAsRowHeaders=TRUE)
# Define Cells
cells <- tbl$getCells(rowNumbers=2:77, columnNumbers=2:22, matchMode="combinations")
# Apply the styling
tbl$mapStyling(cells=cells, styleProperty="background-color", valueType="color",mapType="logic",mappings=list("v>20", "red", "v<=20", "black"))
# Render the table with the applied styles
tbl$renderTable()# Save the table to an Excel workbook
wb <- createWorkbook(creator = Sys.getenv("M. Schauer"))
addWorksheet(wb, "Data")
tbl$writeToExcelWorksheet(wb=wb, wsName="Data",topRowNumber=1, leftMostColumnNumber=1, applyStyles=TRUE)
# Export to .xlsx
saveWorkbook(wb, file="../Data//Assur_Pottery_2024_CV_colcode.xlsx", overwrite = TRUE)Comment: This spreadsheet is important during data interpretation to understand outlieres. If those can be defined, it is worth comming back to this spreadsheet to check whether the distinguishing elements for the specific sample are adhering to the quality criteria defined in Section 1.4.1. If not, the spread has to be checked to identify if the outlier is due to its specific chemistry or a too high variation in the measurements which might have been caused by high heterogeneity, roughness etc. of the sample.
5 Accuracy Assessment
5.1 Standard 2709a Accuracy
This procedure follows the steps outlined in Section 1.4.2.
5.1.1 Calculate reference value (nominal value)
5.1.1.1 Compute metrics of reference values
The name of the .csv-spreadsheet includes the purpose the measurements (reference value - ref_val_meas) were made for, the standard they are based on (2709a), month and year of measurement (Mai 2023 - mai2023) as well as the location (Munich): ref_val_2709a_mai2023_munich.csv. This makes the file easy to identify for future use.
# Load data
data1 <- read.csv("../Data//ref_val_meas_2709a_mai2023_munich.csv")
# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric)) # Ensure all values are numeric
# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]
# Replace '.' with '_' in column names
data <- data %>%
rename_with(~ gsub("\\.", "_", .x))
# Define statistical functions
CV <- function(x) (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
SEM <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
UCL <- function(x) mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
LCL <- function(x) mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
CV2 <- function(x) (2*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL2 <- function(x) mean(x, na.rm = TRUE) + 2*sd(x, na.rm = TRUE)
LCL2 <- function(x) mean(x, na.rm = TRUE) - 2*sd(x, na.rm = TRUE)
CV3 <- function(x) (3*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL3 <- function(x) mean(x, na.rm = TRUE) + 3*sd(x, na.rm = TRUE)
LCL3 <- function(x) mean(x, na.rm = TRUE) - 3*sd(x, na.rm = TRUE)
# Compute summary statistics
NominalSdARM2 <- data %>%
summarize(across(everything(), list(
Mean = ~ mean(.x, na.rm = TRUE),
Sd = ~ sd(.x, na.rm = TRUE),
SEM = ~ SEM(.x),
CV = ~ CV(.x),
LCL = ~ LCL(.x),
UCL = ~ UCL(.x),
CV2 = ~ CV2(.x),
LCL2 = ~ LCL2(.x),
UCL2 = ~ UCL2(.x),
CV3 = ~ CV3(.x),
LCL3 = ~ LCL3(.x),
UCL3 = ~ UCL3(.x),
MD = ~ median(.x, na.rm = TRUE)
)))
# Define columns
Element <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Function to clean column names
cleanColumnNames <- function(df, prefix) {
colnames(df) <- gsub(paste0(prefix, "_"), "", colnames(df))
return(df)
}
# Function to extract and clean data for each element
extract_element_data <- function(element, df) {
cols <- grep(paste0("^", element, "_(Mean|Sd|SEM|CV|LCL|UCL|CV2|LCL2|UCL2|CV3|LCL3|UCL3|MD|Error_Mean|Error_Sd|Error_MD)"),
colnames(df), value = TRUE)
if (length(cols) > 0) {
df_out <- df %>%
select(all_of(cols)) %>%
cleanColumnNames(element)
# Add the element name as the first column
df_out <- cbind(Element = element, df_out)
return(df_out)
} else {
return(data.frame(Element = element, matrix(NA, nrow = 1, ncol = length(cols))))
}
}
# Process all elements & create final table
Tab3 <- map_dfr(Element, extract_element_data, df = NominalSdARM2)5.1.1.2 Create formated spreadsheet with all metrics
# Replace "Error" with "MU" in column names
colnames(Tab3) <- gsub("Error", "MU", colnames(Tab3))
# Rounding of values
numeric_cols <- c("Mean", "Sd", "SEM", "CV", "UCL", "LCL","CV2", "UCL2", "LCL2", "CV3",
"UCL3", "LCL3","MD", "MU_Mean", "MU_Sd", "MU_MD")
for (col in numeric_cols) {
if (col %in% colnames(Tab3)) {
Tab3[[col]] <- round(Tab3[[col]],
ifelse(col %in% c("CV", "CV2", "CV3"), 2,
ifelse(col %in% c("UCL", "LCL", "UCL2", "LCL2", "UCL3", "LCL3"), 0, 1)))
}
}
# Print table
print(Tab3) Element Mean Sd SEM CV LCL UCL CV2 LCL2 UCL2
1 Si 295284.0 3205.2 966.4 1.09 292079 298489 2.17 288874 301694
2 Ti 3730.5 34.9 10.5 0.93 3696 3765 1.87 3661 3800
3 Al 87220.5 2338.5 705.1 2.68 84882 89559 5.36 82544 91898
4 Fe 34286.9 99.7 30.1 0.29 34187 34387 0.58 34088 34486
5 Mn 492.0 14.9 4.5 3.03 477 507 6.07 462 522
6 Mg 25548.6 4147.1 1250.4 16.23 21402 29696 32.46 17254 33843
7 Ca 19666.9 88.2 26.6 0.45 19579 19755 0.90 19490 19843
8 K 19497.5 167.1 50.4 0.86 19330 19665 1.71 19163 19832
9 P 384.9 32.4 9.8 8.42 353 417 16.85 320 450
10 S 441.2 9.7 2.9 2.21 431 451 4.42 422 461
11 V 137.6 8.8 2.6 6.37 129 146 12.73 120 155
12 Cr 107.6 4.8 1.4 4.42 103 112 8.84 98 117
13 Ni 90.2 4.9 1.5 5.43 85 95 10.86 80 100
14 Cu 36.3 2.9 0.9 8.06 33 39 16.12 30 42
15 Zn 88.3 2.1 0.6 2.42 86 90 4.83 84 93
16 As 10.0 0.7 0.2 7.10 9 11 14.21 9 11
17 Rb 85.8 1.6 0.5 1.89 84 87 3.79 83 89
18 Sr 223.2 1.6 0.5 0.74 222 225 1.47 220 226
19 Y 15.3 0.6 0.2 3.83 15 16 7.67 14 16
20 Zr 141.0 1.2 0.3 0.82 140 142 1.64 139 143
21 Nb 7.6 0.6 0.2 7.85 7 8 15.70 6 9
22 Mo 1.3 1.2 0.4 96.87 0 3 193.73 -1 4
23 Ba 733.1 18.5 5.6 2.53 715 752 5.06 696 770
24 Pb 16.3 1.1 0.3 6.81 15 17 13.62 14 19
25 Th 3.2 0.3 0.1 8.10 3 3 16.20 3 4
26 U 5.9 1.1 0.3 18.27 5 7 36.55 4 8
CV3 LCL3 UCL3 MD MU_Mean MU_Sd MU_MD
1 3.26 285668 304900 295988.7 1035.9 7.3 1036.5
2 2.80 3626 3835 3732.6 64.3 0.5 64.3
3 8.04 80205 94236 87879.1 1493.9 42.2 1492.1
4 0.87 33988 34586 34293.2 213.0 1.4 212.8
5 9.10 447 537 499.3 36.2 0.3 36.3
6 48.70 13107 37990 24942.2 3954.6 100.1 3958.3
7 1.35 19402 19932 19691.9 180.4 1.0 180.1
8 2.57 18996 19999 19429.5 253.5 2.2 253.3
9 25.27 288 482 396.5 67.8 0.8 67.6
10 6.62 412 470 439.8 16.6 0.2 16.5
11 19.10 111 164 136.7 19.1 0.2 19.0
12 13.26 93 122 107.3 9.4 0.1 9.4
13 16.29 76 105 91.0 14.2 0.1 14.2
14 24.18 28 45 36.7 7.6 0.1 7.6
15 7.25 82 95 88.1 5.9 0.1 5.9
16 21.31 8 12 10.1 2.3 0.0 2.3
17 5.68 81 91 85.5 2.4 0.0 2.4
18 2.21 218 228 223.0 3.1 0.0 3.1
19 11.50 14 17 15.4 1.5 0.0 1.5
20 2.46 138 144 141.3 2.8 0.0 2.8
21 23.55 6 9 7.5 1.6 0.0 1.6
22 290.60 -2 5 2.1 1.6 0.3 1.3
23 7.59 677 789 735.3 19.2 0.1 19.2
24 20.44 13 20 16.7 2.8 0.0 2.8
25 24.30 2 4 3.2 0.7 0.0 0.7
26 54.82 3 9 5.4 2.3 0.0 2.3
# Export to .csv
write.csv(Tab3, "../Data//metrics_ref_val_2709a_mai2023_munich.csv", row.names = FALSE)5.1.1.3 Export reference values for daily device accuracy testing
This selection only includes the relevant elements and the values needed to define the test range in which the measurement values should lie: Element name with UCL and LCL of the 1δSD-range and UCL2/LCL2 of the 2δSD-range. You can choose the elements to export by adding them in the code for selected_elements.
# Select only necessary columns
dataset1 <- Tab3 %>% select(Element, LCL,UCL,LCL2,UCL2,LCL3,UCL3)
# Filter for the specific elements dynamically
selected_elements <- c("Rb", "Sr", "Y", "Zn", "Zr")
dataset <- dataset1 %>% filter(Element %in% selected_elements)
# Export to .csv
write.csv(dataset, "../Data//test_range_ref_val_2709a_mai2023_munich.csv", row.names = FALSE)Comment: This spreadsheet can be used to compare the mean of three standard measurements, as calculated by the device and displayed on the screen, with the test range of the selected elements.Three out of five elements have to be in the 2δSD-range for the test for accuracy to be successful.
5.1.1.4 Create spreadsheet for Shewhart Control Charts
# Transpose the data
Tab4 <- as.data.frame(t(Tab3))
# Store the original row names (before transposing)
row_names <- rownames(Tab3)
# Store the original column names (before transposing)
col_names <- colnames(Tab3)
# Assign the original row names as column names of the transposed data
colnames(Tab4) <- row_names
# Assign the original column names as row names of the transposed data
rownames(Tab4) <- col_names
# Set the first row as the column names (this will use the values of the first row as column headers)
colnames(Tab4) <- Tab4[1, ]
# Remove the first row (since it is now used as the column names)
Tab4 <- Tab4[-1, ]
# # Export to .csv
write.csv(Tab4, "../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv", row.names = TRUE)5.1.2 Compute Shewhart Control Charts
# Load and filter data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "2709a")
# Transform date field
data1 <- filtered_data |>
mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]
# Load data
data1 <- read.csv("../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv")
# Create PDF for all SCC plots
pdf("../Figures//SCC_2709a_accuracy_plots.pdf", width = 8, height = 6)
# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
for (col in selected_columns) {
# Extract the data for the current column
chartdata <- data[, col]
# Dynamically get metrics for current column from data1
CL <- data1[1, col]
SD <- data1[2, col]
LCL <- data1[5, col]
UCL <- data1[6, col]
LCL2 <- data1[8, col]
UCL2 <- data1[9, col]
LCL3 <- data1[11, col]
UCL3 <- data1[12, col]
# Round to 0 decimal places
CL <- round(CL, 0)
SD <- round(SD, 0)
LCL <- round(LCL, 0)
UCL <- round(UCL, 0)
LCL2 <- round(LCL2, 0)
UCL2 <- round(UCL2, 0)
LCL3 <- round(LCL3, 0)
UCL3 <- round(UCL3, 0)
# Create the control chart
qcc_result <- qcc(
data = chartdata,
type = "xbar",
nsigmas = 1,
std.dev = SD,
center = CL,
rules = NULL,
plot = FALSE
)
# Create the plot for the current column
plot(qcc_result, restore.par = FALSE,
title = paste("SCC for", col),
ylab = "Concentration (ppm)",
xlab = "Values of 2709a throughout the 2023 measurement campaign")
# Add 3-Sigma control limits
abline(h = c(UCL3, LCL3), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL3, labels = "UCL3", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL3, labels = "LCL3", pos = 1, col = "black", cex = 0.8)
# Add 2-Sigma control limits
abline(h = c(UCL2, LCL2), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL2, labels = "UCL2", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL2, labels = "LCL2", pos = 1, col = "black", cex = 0.8)
# Highlight points based on thresholds
below_1sigma <- chartdata < UCL | chartdata > LCL
below_2sigma <- chartdata > UCL | chartdata < LCL
above_3sigma <- chartdata > UCL3 | chartdata < LCL3
above_2sigma <- (chartdata > UCL2 & chartdata <= UCL3) | (chartdata < LCL2 & chartdata >= LCL3)
# Overlay red points for values in 1-sigma
points(which(below_1sigma), chartdata[below_1sigma], col = "lightgreen", pch = 19, cex = 1.2)
# Overlay yellow points for values between 1-sigma and 2-sigma
points(which(below_2sigma), chartdata[below_2sigma], col = "lightblue", pch = 19, cex = 1.2)
# Overlay yellow points for values between 2-sigma and 3-sigma
points(which(above_2sigma), chartdata[above_2sigma], col = "orange", pch = 19, cex = 1.2)
# Overlay red points for values beyond 3-sigma
points(which(above_3sigma), chartdata[above_3sigma], col = "red", pch = 19, cex = 1.2)
}
# Close the PDF file
dev.off()Shewhart Control Charts
Interpretation: Skimming through the SCCs of the same standard throughout the 2023 measurement campaign reveals that the measurement values do vary. Many values fall within the ±1δSD range (green), with some extending into the ±1δSD to ±2δSD range (blue) but also ±2δSD to ±3δSD range (orange). Especially for the light elements, some major drifts and values beyond the quality criterium are visible, making it clear that all interpretation has to be checked in detail with other information such as petrography and archaeological background information (see also Schauer and Amicone 2024).
5.1.3 Compute Relative Percentage Difference (%RD)
5.1.3.1 Calculate %RD
# Load and filter data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "2709a")
# Transform date field
data1 <- filtered_data |>
mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
data2 <- data1[, -c(1:14)] %>%
mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Merge data
data3<-cbind(data1[, c(1:14)],data2)
# Calculate mean
mean_data <- data3 %>%
group_by(Time) %>%
summarize(across(everything(), mean, na.rm = TRUE))
# Remove first 16 columns (no elemental data)
data4 <- mean_data[, -c(1:14)]
# Merge data
data<-cbind(mean_data[, c(1)],data4)
# Load second dataset
data1 <- read.csv("../Data//metrics_SCC_ref_val_2709a_mai2023_munich.csv")
# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Loop over each selected column
for (col in selected_columns) {
# Extract the data for the current column by name
mean_val <- data[, col] # This extracts the column by name (which is a string)
# Dynamically get the reference value for the current column from data1 (assuming row 1 contains reference values)
CL <- data1[1, col] # Reference value for the column from data1
# Define the percentage difference function (element-wise operation)
percentage_difference <- function(x) {
return(((x - CL) / CL) * 100) # Apply the percentage difference formula
}
# Apply the percentage difference function to each element in mean_val
percentage_diff <- sapply(mean_val, percentage_difference)
# Optionally, store the result in a new column in the data
data[, paste(col, "percentage_diff", sep = "_")] <- percentage_diff
}
# Select necessary columns, round values (except 'Time'), and remove '_percentage_diff' from column names
data_RD <- data[, c(1, 143:168)] %>%
mutate(across(-which(names(.) == "Time"), round, 0)) %>% # Round all columns except 'Date'
setNames(gsub("_percentage_diff", "", colnames(.))) %>% # Remove '_percentage_diff' from column names
setNames(gsub("^Time$", "%RD", colnames(.))) # Replace "Time" with "%RD"5.1.3.2 Format %RD-spreadsheet
# Create new spreadsheet
tbl <- BasicTable$new()
tbl$addData(data_RD, firstColumnAsRowHeaders=TRUE)
# Get the number of rows and columns in the dataset
num_rows <- nrow(data_RD)
num_columns <- ncol(data_RD)
# Find the index of the column with the header "%RD"
rd_column_index <- which(names(data_RD) == "%RD")
# Define the cells for which the styling needs to be applied (excluding the '%RD' column)
column_indices_to_style <- setdiff(2:num_columns, rd_column_index)
# Ensure the last row is included in styling
cells <- tbl$getCells(
rowNumbers = 2:(num_rows + 1),
columnNumbers = column_indices_to_style,
matchMode = "combinations"
)
# Apply the styling
tbl$mapStyling(
cells = cells,
styleProperty = "background-color",
valueType = "color",
mapType = "logic",
mappings = list(
"v >= -3 & v <= 3", "lightgreen", # Values between -3 and 3
"v > -7 & v < -3", "lightblue", # Values between -7 and -3
"v > 3 & v <= 7", "lightblue", # Values between 3 and 7
"v > 7 & v <= 10", "yellow", # Values between 7 and 10
"v > -10 & v <= -7", "yellow", # Values between -10 and -7
"v >= 10", "firebrick", # Values greater than 10
"v <= -10", "firebrick" # Values less than -10
)
)
# Render the table with the applied styles
tbl$renderTable()# Save the table to an Excel workbook
wb <- createWorkbook(creator = Sys.getenv("M. Schauer"))
addWorksheet(wb, "Data")
tbl$writeToExcelWorksheet(wb = wb, wsName = "Data", topRowNumber = 1, leftMostColumnNumber = 1, applyStyles = TRUE)
# Export to .xlsx
saveWorkbook(wb, file = "../Data//relRD_2709a_mai2023_munich.xlsx", overwrite = TRUE)Interpretation: The %RD demonstrates excellent accuracy (between -3 and +3) for the elements Si, Fe, Ca, K, Zn, Rb and Sr, with some elements — Ti, Al, S, Cr, Ni, Nb, Ba and Pb — also including highly accurate %RD values (3 to 7, -3 to -7). Mn, V, As and Zr show some %RD values of good accuracy (7 to 10, -7 to -10), whereas for Mg, P, Cu, Y, Mo, Th and U, the %RD values are not accurate enough (higher than 10, lower than -10). Therefore, these latter elements are not measured with sufficient accuracy and should either not be used or, if used, be handled with great care in further analysis.
5.1.3.3 Create figures of %RD
# Load data
data1 <- read.xlsx("../Data//relRD_2709a_mai2023_munich.xlsx")
# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Create a PDF file to save all plots
pdf("../Figures//relRD_2709a_mai2023_munich_plots.pdf", width = 10, height = 6)
# Loop through each selected column and create plots
for (col in selected_columns) {
# Extract data for plotting
plot_data <- data.frame(
Day = as.factor(data1$`%RD`), # Treat measurement days as categorical on X-axis
Value = data1[[col]]
)
# Create scatter plot with lines connecting measurement days
p <- ggplot(plot_data, aes(x = Day, y = Value, group = 1)) +
geom_point() +
geom_line() + # Connect points across measurement days
geom_hline(yintercept = 0, linetype = "dashed", color = "red") + # Reference line
labs(title = paste("%RD-Plot for", col), x = "Measurement Day", y = col) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Tilt x-axis labels
# Display the plot in R
print(p)
}
# Close the PDF device
dev.off()%RD plots
5.2 Standard SdAR-M2 Accuracy
This procedure follows the steps outlined in Section 1.4.2.
5.2.1 Calculate reference value (nominal value)
5.2.1.1 Compute metrics of reference values
The name of the .csv-spreadsheet includes the purpose the measurements (reference value - ref_val_meas) were made for, the standard they are based on (SdAR-M2), month and year of measurement (November 2024 - nov2024) as well as the location (Munich): ref_val_SdAR-M2_nov2024_munich.csv. This makes the file easy to identify for future use.
# Load data
data1 <- read.csv("../Data//ref_val_meas_SdAR-M2_nov2024_munich.csv")
# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric)) # Ensure all values are numeric
# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]
# Replace '.' with '_' in column names
data <- data %>%
rename_with(~ gsub("\\.", "_", .x))
# Define statistical functions
CV <- function(x) (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
SEM <- function(x) sd(x, na.rm = TRUE) / sqrt(length(x))
UCL <- function(x) mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE)
LCL <- function(x) mean(x, na.rm = TRUE) - sd(x, na.rm = TRUE)
CV2 <- function(x) (2*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL2 <- function(x) mean(x, na.rm = TRUE) + 2*sd(x, na.rm = TRUE)
LCL2 <- function(x) mean(x, na.rm = TRUE) - 2*sd(x, na.rm = TRUE)
CV3 <- function(x) (3*sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)) * 100
UCL3 <- function(x) mean(x, na.rm = TRUE) + 3*sd(x, na.rm = TRUE)
LCL3 <- function(x) mean(x, na.rm = TRUE) - 3*sd(x, na.rm = TRUE)
# Compute summary statistics
NominalSdARM2 <- data %>%
summarize(across(everything(), list(
Mean = ~ mean(.x, na.rm = TRUE),
Sd = ~ sd(.x, na.rm = TRUE),
SEM = ~ SEM(.x),
CV = ~ CV(.x),
LCL = ~ LCL(.x),
UCL = ~ UCL(.x),
CV2 = ~ CV2(.x),
LCL2 = ~ LCL2(.x),
UCL2 = ~ UCL2(.x),
CV3 = ~ CV3(.x),
LCL3 = ~ LCL3(.x),
UCL3 = ~ UCL3(.x),
MD = ~ median(.x, na.rm = TRUE)
)))
# Define columns
Element <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Function to clean column names
cleanColumnNames <- function(df, prefix) {
colnames(df) <- gsub(paste0(prefix, "_"), "", colnames(df))
return(df)
}
# Function to extract and clean data for each element
extract_element_data <- function(element, df) {
cols <- grep(paste0("^", element, "_(Mean|Sd|SEM|CV|LCL|UCL|CV2|LCL2|UCL2|CV3|LCL3|UCL3|MD|Error_Mean|Error_Sd|Error_MD)"),
colnames(df), value = TRUE)
if (length(cols) > 0) {
df_out <- df %>%
select(all_of(cols)) %>%
cleanColumnNames(element)
# Add the element name as the first column
df_out <- cbind(Element = element, df_out)
return(df_out)
} else {
return(data.frame(Element = element, matrix(NA, nrow = 1, ncol = length(cols))))
}
}
# Process all elements & create final table
Tab3 <- map_dfr(Element, extract_element_data, df = NominalSdARM2)5.2.1.2 Create formated spreadsheet with all metrics
# Replace "Error" with "MU" in column names
colnames(Tab3) <- gsub("Error", "MU", colnames(Tab3))
# Rounding of values
numeric_cols <- c("Mean", "Sd", "SEM", "CV", "UCL", "LCL","CV2", "UCL2", "LCL2", "CV3",
"UCL3", "LCL3","MD", "MU_Mean", "MU_Sd", "MU_MD")
for (col in numeric_cols) {
if (col %in% colnames(Tab3)) {
Tab3[[col]] <- round(Tab3[[col]],
ifelse(col %in% c("CV", "CV2", "CV3"), 2,
ifelse(col %in% c("UCL", "LCL", "UCL2", "LCL2", "UCL3", "LCL3"), 0, 1)))
}
}
# Print table
print(Tab3) Element Mean Sd SEM CV LCL UCL CV2 LCL2 UCL2
1 Si 377487.7 7948.9 2513.7 2.11 369539 385437 4.21 361590 393385
2 Ti 1578.7 43.7 13.8 2.77 1535 1622 5.54 1491 1666
3 Al 75023.3 3821.6 1208.5 5.09 71202 78845 10.19 67380 82666
4 Fe 17581.0 171.4 54.2 0.98 17410 17752 1.95 17238 17924
5 Mn 847.6 17.1 5.4 2.01 831 865 4.02 813 882
6 Mg 3280.3 4261.1 1347.5 129.90 -981 7541 259.80 -5242 11803
7 Ca 5233.6 134.4 42.5 2.57 5099 5368 5.14 4965 5502
8 K 41940.1 204.7 64.7 0.49 41735 42145 0.98 41531 42349
9 P 242.7 94.4 29.8 38.88 148 337 77.75 54 431
10 S 783.4 44.3 14.0 5.65 739 828 11.30 695 872
11 V 50.7 5.1 1.6 10.06 46 56 20.11 41 61
12 Cr 49.2 7.2 2.3 14.60 42 56 29.19 35 64
13 Ni 81.3 10.1 3.2 12.40 71 91 24.80 61 101
14 Cu 213.2 4.7 1.5 2.19 209 218 4.39 204 223
15 Zn 699.6 7.1 2.2 1.01 692 707 2.03 685 714
16 As 80.6 4.8 1.5 5.95 76 85 11.90 71 90
17 Rb 143.9 1.3 0.4 0.87 143 145 1.74 141 146
18 Sr 146.2 1.2 0.4 0.84 145 147 1.68 144 149
19 Y 30.8 0.9 0.3 3.07 30 32 6.14 29 33
20 Zr 251.0 5.0 1.6 1.99 246 256 3.98 241 261
21 Nb 25.2 0.5 0.2 1.99 25 26 3.98 24 26
22 Mo 13.8 1.0 0.3 7.16 13 15 14.31 12 16
23 Ba 783.6 14.6 4.6 1.87 769 798 3.74 754 813
24 Pb 850.9 7.3 2.3 0.86 844 858 1.72 836 865
25 Th 6.7 0.5 0.2 7.26 6 7 14.51 6 8
26 U 7.4 1.8 0.6 24.73 6 9 49.46 4 11
CV3 LCL3 UCL3 MD MU_Mean MU_Sd MU_MD
1 6.32 353641 401334 375499.0 1233.5 30.8 1235.4
2 8.31 1448 1710 1569.1 42.3 0.8 42.3
3 15.28 63559 86488 74631.7 1406.4 152.6 1377.2
4 2.93 17067 18095 17564.8 152.5 1.0 152.8
5 6.03 796 899 847.9 40.9 0.3 40.9
6 389.70 -9503 16064 0.0 5217.6 1617.8 5270.5
7 7.70 4830 5637 5174.2 83.6 1.4 83.8
8 1.46 41326 42554 41992.7 340.9 1.8 341.3
9 116.63 -40 526 209.9 78.3 1.8 78.3
10 16.95 651 916 776.4 19.4 0.4 19.4
11 30.17 35 66 51.8 13.1 0.2 13.2
12 43.79 28 71 47.7 6.9 0.1 6.9
13 37.19 51 111 80.6 14.1 0.1 14.0
14 6.58 199 227 213.5 10.8 0.1 10.8
15 3.04 678 721 698.7 13.7 0.1 13.7
16 17.84 66 95 79.9 9.9 0.0 9.9
17 2.61 140 148 144.1 3.1 0.0 3.1
18 2.52 142 150 145.9 2.6 0.0 2.6
19 9.21 28 34 30.8 2.3 0.0 2.3
20 5.96 236 266 250.8 3.4 0.0 3.4
21 5.97 24 27 25.1 2.1 0.0 2.1
22 21.47 11 17 13.8 1.5 0.0 1.5
23 5.60 740 828 786.0 19.3 0.1 19.3
24 2.57 829 873 853.3 13.1 0.1 13.1
25 21.77 5 8 6.7 1.2 0.0 1.2
26 74.18 2 13 7.6 2.7 0.0 2.7
# Save final table with all reference value metrics
write.csv(Tab3, "../Data//metrics_ref_val_SdAR-M2_nov2024_munich.csv", row.names = FALSE)5.2.1.3 Export reference values for daily device accuracy testing
This selection only includes the relevant elements and the values needed to define the test range in which the measurement values should lie: Element name with UCL and LCL of the 1δSD-range and UCL2/LCL2 of the 2δSD-range. You can choose the elements to export by adding them in the code for selected_elements.
# Select only necessary columns
dataset1 <- Tab3 %>% select(Element, LCL,UCL,LCL2,UCL2,LCL3,UCL3)
# Filter for the specific elements dynamically
selected_elements <- c("Rb", "Sr", "Y", "Zn", "Zr")
dataset <- dataset1 %>% filter(Element %in% selected_elements)
# Export to .csv
write.csv(dataset, "../Data//test_range_ref_val_SdAR-M2_nov2024_munich.csv", row.names = FALSE)Comment: This spreadsheet can be used to compare the mean of three standard measurements, as calculated by the device and displayed on the screen, with the test range of the selected elements.Three out of five elements have to be in the 2δSD-range for the test for accuracy to be successful.
5.2.1.4 Create spreadsheet for Shewhart Control Charts
# Transpose the data
Tab4 <- as.data.frame(t(Tab3))
# Store the original row names (before transposing)
row_names <- rownames(Tab3)
# Store the original column names (before transposing)
col_names <- colnames(Tab3)
# Assign the original row names as column names of the transposed data
colnames(Tab4) <- row_names
# Assign the original column names as row names of the transposed data
rownames(Tab4) <- col_names
# Set the first row as the column names (this will use the values of the first row as column headers)
colnames(Tab4) <- Tab4[1, ]
# Remove the first row (since it is now used as the column names)
Tab4 <- Tab4[-1, ]
# # Export to .csv
write.csv(Tab4, "../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv", row.names = TRUE)5.2.2 Compute Shewhart Control Charts
# Load and filter data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "SdAR-M2")
data1 <- filtered_data |>
mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
data2 <- data1 %>%
mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Remove first 16 columns (no elemental data)
data <- data2[, -c(1:14)]
# Load data
data1 <- read.csv("../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv")
# Create PDF for all SCC plots
pdf("../Figures//SCC_SdAR-M2_accuracy_plots.pdf", width = 8, height = 6)
# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
for (col in selected_columns) {
# Extract the data for the current column
chartdata <- data[, col]
# Dynamically get metrics for current column from data1
CL <- data1[1, col]
SD <- data1[2, col]
LCL <- data1[5, col]
UCL <- data1[6, col]
LCL2 <- data1[8, col]
UCL2 <- data1[9, col]
LCL3 <- data1[11, col]
UCL3 <- data1[12, col]
# Round to 0 decimal places
CL <- round(CL, 0)
SD <- round(SD, 0)
LCL <- round(LCL, 0)
UCL <- round(UCL, 0)
LCL2 <- round(LCL2, 0)
UCL2 <- round(UCL2, 0)
LCL3 <- round(LCL3, 0)
UCL3 <- round(UCL3, 0)
# Create the control chart
qcc_result <- qcc(
data = chartdata,
type = "xbar",
nsigmas = 1,
std.dev = SD,
center = CL,
rules = NULL,
plot = FALSE
)
# Create the plot for the current column
plot(qcc_result, restore.par = FALSE,
title = paste("SCC for", col),
ylab = "Concentration (ppm)",
xlab = "Values of SdAR-M2 throughout the 2024 measurement campaign")
# Add 3-Sigma control limits
abline(h = c(UCL3, LCL3), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL3, labels = "UCL3", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL3, labels = "LCL3", pos = 1, col = "black", cex = 0.8)
# Add 2-Sigma control limits
abline(h = c(UCL2, LCL2), lty = 3, col = "black")
text(x = length(chartdata) * 1, y = UCL2, labels = "UCL2", pos = 3, col = "black", cex = 0.8)
text(x = length(chartdata) * 1, y = LCL2, labels = "LCL2", pos = 1, col = "black", cex = 0.8)
# Highlight points based on thresholds
below_1sigma <- chartdata < UCL | chartdata > LCL
below_2sigma <- chartdata > UCL | chartdata < LCL
above_3sigma <- chartdata > UCL3 | chartdata < LCL3
above_2sigma <- (chartdata > UCL2 & chartdata <= UCL3) | (chartdata < LCL2 & chartdata >= LCL3)
# Overlay red points for values in 1-sigma
points(which(below_1sigma), chartdata[below_1sigma], col = "lightgreen", pch = 19, cex = 1.2)
# Overlay yellow points for values between 1-sigma and 2-sigma
points(which(below_2sigma), chartdata[below_2sigma], col = "lightblue", pch = 19, cex = 1.2)
# Overlay yellow points for values between 2-sigma and 3-sigma
points(which(above_2sigma), chartdata[above_2sigma], col = "orange", pch = 19, cex = 1.2)
# Overlay red points for values beyond 3-sigma
points(which(above_3sigma), chartdata[above_3sigma], col = "red", pch = 19, cex = 1.2)
}
# Close the PDF file
dev.off()Shewhart Control Charts
Interpretation: Skimming through the SCCs of the same standard throughout the 2024 measurement campaign reveals quite taccurate results for all elements. Many values fall within the ±1δSD range (green), with some extending into the ±1δSD to ±2δSD range (blue) but more rarely to the ±2δSD to ±3δSD range (orange). Especially for the light elements, some drifts and values beyond the quality criterium are visible, making it clear that all interpretation has to be checked in detail with other information such as petrography and archaeological background information (see also Schauer and Amicone 2024).
5.2.3 Compute Relative Percentage Difference (%RD)
5.2.3.1 Calculate %RD
# Load data
data <- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
filtered_data <- subset(data, SAMPLE == "SdAR-M2")
# Transform date field
data1 <- filtered_data |>
mutate(
Time = as.POSIXct(Time, format = "%m/%d/%Y %H:%M"), # 24-hour format
Time = format(Time, "%m/%d/%Y") # Output as mm/dd/yyyy
)
# Replace '< LOD' with 0 in each column
data2 <- data1[, -c(1:14)] %>%
mutate(across(everything(), ~ ifelse(. == "< LOD", 0, .))) %>%
mutate(across(everything(), as.numeric))
# Merge data
data3<-cbind(data1[, c(1:14)],data2)
# Calculate mean
mean_data <- data3 %>%
group_by(Time) %>%
summarize(across(everything(), mean, na.rm = TRUE))
# Remove first 16 columns (no elemental data)
data4 <- mean_data[, -c(1:14)]
# Merge data
data<-cbind(mean_data[, c(1)],data4)
# Load second dataset
data1 <- read.csv("../Data//metrics_SCC_ref_val_SdAR-M2_nov2024_munich.csv")
# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Loop over each selected column
for (col in selected_columns) {
# Extract the data for the current column by name
mean_val <- data[, col] # This extracts the column by name (which is a string)
# Dynamically get the reference value for the current column from data1 (assuming row 1 contains reference values)
CL <- data1[1, col] # Reference value for the column from data1
# Define the percentage difference function (element-wise operation)
percentage_difference <- function(x) {
return(((x - CL) / CL) * 100) # Apply the percentage difference formula
}
# Apply the percentage difference function to each element in mean_val
percentage_diff <- sapply(mean_val, percentage_difference)
# Optionally, store the result in a new column in the data
data[, paste(col, "percentage_diff", sep = "_")] <- percentage_diff
}
# Select necessary columns, round values (except 'Time'), and remove '_percentage_diff' from column names
data_RD <- data[, c(1, 143:168)] %>%
mutate(across(-which(names(.) == "Time"), round, 0)) %>% # Round all columns except 'Date'
setNames(gsub("_percentage_diff", "", colnames(.))) %>% # Remove '_percentage_diff' from column names
setNames(gsub("^Time$", "%RD", colnames(.))) # Replace "Time" with "%RD"5.2.3.2 Format %RD-spreadsheet
# Create new spreadsheet
tbl <- BasicTable$new()
tbl$addData(data_RD, firstColumnAsRowHeaders=TRUE)
# Get the number of rows and columns in the dataset
num_rows <- nrow(data_RD)
num_columns <- ncol(data_RD)
# Find the index of the column with the header "%RD"
rd_column_index <- which(names(data_RD) == "%RD")
# Define the cells for which the styling needs to be applied (excluding the '%RD' column)
column_indices_to_style <- setdiff(2:num_columns, rd_column_index)
# Ensure the last row is included in styling
cells <- tbl$getCells(
rowNumbers = 2:(num_rows + 1),
columnNumbers = column_indices_to_style,
matchMode = "combinations"
)
# Apply the styling
tbl$mapStyling(
cells = cells,
styleProperty = "background-color",
valueType = "color",
mapType = "logic",
mappings = list(
"v >= -3 & v <= 3", "lightgreen", # Values between -3 and 3
"v > -7 & v < -3", "lightblue", # Values between -7 and -3
"v > 3 & v <= 7", "lightblue", # Values between 3 and 7
"v > 7 & v <= 10", "yellow", # Values between 7 and 10
"v > -10 & v <= -7", "yellow", # Values between -10 and -7
"v >= 10", "firebrick", # Values greater than 10
"v <= -10", "firebrick" # Values less than -10
)
)
# Render the table with the applied styles
tbl$renderTable()# Save the table to an Excel workbook
wb <- createWorkbook(creator = Sys.getenv("M. Schauer"))
addWorksheet(wb, "Data")
tbl$writeToExcelWorksheet(wb = wb, wsName = "Data", topRowNumber = 1, leftMostColumnNumber = 1, applyStyles = TRUE)
# Export to .xlsx
saveWorkbook(wb, file = "../Data//relRD_SdAR-M2_nov2024_munich.xlsx", overwrite = TRUE)Interpretation: The %RD demonstrates excellent accuracy (between -3 and +3) for the elements Si, Ti, Al, Fe, K, Cu, Zn, Rb, Sr, Y, Zr, Nb and Pb, with some elements — Mn, Ca, Ni, As, Mo and Ba — also including highly accurate %RD values (3 to 7, -3 to -7). V shows some %RD values of good accuracy (7 to 10, -7 to -10), whereas for Mg, P, S, Cr, Th and U, generally speaking, the %RD values are not accurate enough (higher than 10, lower than -10). Therefore, these latter elements are not measured with sufficient accuracy and should either not be used or, if used, be handled with great care in further analysis.
5.2.3.3 Create figures of %RD
# Load data
data1 <- read.xlsx("../Data//relRD_SdAR-M2_nov2024_munich.xlsx")
# Define columns
selected_columns <- c("Si", "Ti", "Al", "Fe", "Mn", "Mg", "Ca", "K", "P", "S", "V", "Cr",
"Ni", "Cu", "Zn", "As", "Rb", "Sr", "Y", "Zr", "Nb", "Mo", "Ba",
"Pb", "Th", "U")
# Create a PDF file to save all plots
pdf("../Figures//relRD_SdAR-M2_nov2024_munich_plots.pdf", width = 10, height = 6)
# Loop through each selected column and create plots
for (col in selected_columns) {
# Extract data for plotting
plot_data <- data.frame(
Day = as.factor(data1$`%RD`), # Treat measurement days as categorical on X-axis
Value = data1[[col]]
)
# Create scatter plot with lines connecting measurement days
p <- ggplot(plot_data, aes(x = Day, y = Value, group = 1)) +
geom_point() +
geom_line() + # Connect points across measurement days
geom_hline(yintercept = 0, linetype = "dashed", color = "red") + # Reference line
labs(title = paste("%RD-Plot for", col), x = "Measurement Day", y = col) +
theme_minimal() +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Tilt x-axis labels
# Display the plot in R
print(p)
}
# Close the PDF device
dev.off()%RD plots
6 Apply coefficient correction (coefcorIV)
More information on coefficient correction coefcor IV can be found in (Schauer 2024).
6.1 Compile Data
# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_AnalyticalData.csv")
# Replace '< LOD' with 0 in each column
data <- data1 %>%
mutate(across(15:155, ~ ifelse(. == "< LOD", 0, as.character(.)))) %>% # Ensure < LOD is replaced with 0
mutate(across(15:155, as.numeric)) # Ensure all values are numeric
# Define columns
dataset<-data[, c("SAMPLE","Time","Si","Ti","Al","Fe","Mn","Mg","Ca","K","P","S","Cl","Sc","V","Cr","Co","Ni","Cu","Zn","As","Se","Rb","Sr","Y","Zr","Nb","Mo","Pd","Ag","Cd","Sn","Sb","Te","Cs","Ba","La","Ce","Hf","Ta","W","Re","Au","Hg","Pb","Bi","Th","U")]
# Load data
coefcor<- read.csv("../RawData//coefcorIV_factors.csv")
# Define relevant variables
SAMPLE<-dataset$SAMPLE
Time<-dataset$Time6.2 Process Coefcor IV
# Select element column from the measurement data
Si<-dataset$Si
# Select slope correction
Si_a<-coefcor$Si_a
# Select offset correction (if applicable)
Si_b<-coefcor$Si_b
# Calculate coefficient correction and transform to oxide percent
SiO2<-(Si_a*Si+Si_b)*0.00021393
Ti<-dataset$Ti
Ti_a<-coefcor$Ti_a
Ti_b<-coefcor$Ti_b
TiO2<-(Ti_a*Ti+Ti_b)*0.0001668
Al<-dataset$Al
Al_a<-coefcor$Al_a
Al_b<-coefcor$Al_b
Al2O3<-(Al_a*Al+Al_b)*0.00018895
Fe<-dataset$Fe
Fe_a<-coefcor$Fe_a
Fe_b<-coefcor$Fe_b
Fe2O3<-(Fe_a*Fe+Fe_b)*0.000143
Mn<-dataset$Mn
Mn_a<-coefcor$Mn_a
Mn_b<-coefcor$Mn_b
MnO<-(Mn_a*Mn+Mn_b)*0.00012912
Mg<-dataset$Mg
Mg_a<-coefcor$Mg_a
Mg_b<-coefcor$Mg_b
MgO<-(Mg_a*Mg+Mg_b)*0.00016583
Ca<-dataset$Ca
Ca_a<-coefcor$Ca_a
Ca_b<-coefcor$Ca_b
CaO<-(Ca_a*Ca+Ca_b)*0.00013992
K<-dataset$K
K_a<-coefcor$K_a
K_b<-coefcor$K_b
K2O<-(K_a*K+K_b)*0.00012046
P<-dataset$P
P_a<-coefcor$P_a
P_b<-coefcor$P_b
P2O5<-(P_a*P+P_b)*0.00022914
# Create new data frame
data_norm<-data.frame(SiO2,TiO2,Al2O3,Fe2O3,MnO,MgO,CaO,K2O,P2O5)
# Calculate row sum
data_norm_withsum<-data_norm %>% rowwise() %>% mutate(sum = sum(c(SiO2,TiO2,Al2O3,
Fe2O3,MnO,MgO,CaO,K2O,P2O5)))
# Define formular for row percent
sumpct<-100/data_norm_withsum$sum
# Normalise major elements
SiO2<-SiO2*sumpct
TiO2<-TiO2*sumpct
Al2O3<-Al2O3*sumpct
Fe2O3<-Fe2O3*sumpct
MnO<-MnO*sumpct
MgO<-MgO*sumpct
CaO<-CaO*sumpct
K2O<-K2O*sumpct
P2O5<-P2O5*sumpct
# Select element column from the measurement data
V<-dataset$V
# Select slope correction
V_a<-coefcor$V_a
# Select offset correction (if applicable)
V_b<-coefcor$V_b
# Calculate coefficient correction
V<-V_a*V+V_b
Cr<-dataset$Cr
Cr_a<-coefcor$Cr_a
Cr_b<-coefcor$Cr_b
Cr<-Cr_a*Cr+Cr_b
Ni<-dataset$Ni
Ni_a<-coefcor$Ni_a
Ni_b<-coefcor$Ni_b
Ni<-Ni_a*Ni+Ni_b
Cu<-dataset$Cu
Cu_a<-coefcor$Cu_a
Cu_b<-coefcor$Cu_b
Cu<-Cu_a*Cu+Cu_b
Zn<-dataset$Zn
Zn_a<-coefcor$Zn_a
Zn_b<-coefcor$Zn_b
Zn<-Zn_a*Zn+Zn_b
Rb<-dataset$Rb
Rb_a<-coefcor$Rb_a
Rb_b<-coefcor$Rb_b
Rb<-Rb_a*Rb+Rb_b
Sr<-dataset$Sr
Sr_a<-coefcor$Sr_a
Sr_b<-coefcor$Sr_b
Sr<-Sr_a*Sr+Sr_b
Y<-dataset$Y
Y_a<-coefcor$Y_a
Y_b<-coefcor$Y_b
Y<-Y_a*Y+Y_b
Zr<-dataset$Zr
Zr_a<-coefcor$Zr_a
Zr_b<-coefcor$Zr_b
Zr<-Zr_a*Zr+Zr_b
Nb<-dataset$Nb
Nb_a<-coefcor$Nb_a
Nb_b<-coefcor$Nb_b
Nb<-Nb_a*Nb+Nb_b
Ba<-dataset$Ba
Ba_a<-coefcor$Ba_a
Ba_b<-coefcor$Ba_b
Ba<-Ba_a*Ba+Ba_b
Pb<-dataset$Pb
Pb_a<-coefcor$Pb_a
Pb_b<-coefcor$Pb_b
Pb<-Pb_a*Pb+Pb_b6.3 Compile Corrected Data
# Create new data frame
data<-data.frame(SAMPLE,SiO2,TiO2,Al2O3,Fe2O3,MnO,MgO,CaO,
K2O,P2O5,V,Cr,Ni,Cu,Zn,Rb,Sr,Y,Zr,Nb,Ba,Pb)
# Round selected columns to .000 for visually better plotting
data[, c("SiO2","TiO2","Al2O3","Fe2O3","MnO","MgO","CaO",
"K2O","P2O5")] <- round(data[, c("SiO2","TiO2","Al2O3","Fe2O3",
"MnO","MgO","CaO","K2O","P2O5")], 3)
# Round selected columns to .00 for visually better plotting
data[, c("V","Cr","Ni","Cu","Zn","Rb","Sr","Y","Zr","Nb",
"Ba","Pb")] <- round(data[, c("V","Cr","Ni","Cu","Zn",
"Rb","Sr","Y","Zr","Nb","Ba","Pb")], 2)
# Export to .csv
write.csv(data,"../Data//Assur_Pottery_2024_Data_Coefcor.csv",row.names=TRUE)6.4 Calculate Mean of Corrected Data
# Load data
data1<- read.csv("../Data//Assur_Pottery_2024_Data_Coefcor.csv")
# Calculate mean
data4<-(data1) %>%
group_by(SAMPLE) %>%
summarize(across(everything(),list(mean=mean)))
# Delete "_mean" from the column name
colnames(data4) <- gsub("_mean", "", colnames(data4))
# Define all columns except for the first as numeric
data4[, -1] <- sapply(data4[, -1], as.numeric)
# Round selected columns to .000 for visually better plotting
data4[, c("SiO2","TiO2","Al2O3","Fe2O3","MnO","MgO","CaO",
"K2O","P2O5")] <- round(data4[, c("SiO2","TiO2","Al2O3","Fe2O3",
"MnO","MgO","CaO","K2O","P2O5")], 3)
# Round selected columns to .00 for visually better plotting
data4[, c("V","Cr","Ni","Cu","Zn","Rb","Sr","Y","Zr","Nb",
"Ba","Pb")] <- round(data4[, c("V","Cr","Ni","Cu","Zn",
"Rb","Sr","Y","Zr","Nb","Ba","Pb")], 2)
# Select columns
data4<-data4[,c(1,3:7,9,10,12:14,15:21)]
# Export to .csv
write.csv(data4,"../Data//Assur_Pottery_2024_Data_Coefcor_MW.csv",row.names=FALSE)6.5 Merge with Archaeological Information
# Load data
data2<- read.csv("../Data//Assur_Pottery_2024_ArchData.csv")
# Merge data
data3<-merge(data4,data2, by="SAMPLE", all=FALSE)
# Export to .csv
write.csv(data3,"../Data//Assur_Pottery_2024_complete_MW.csv",row.names=FALSE)7 Tab. D2.1
# Load data
data1 <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
# Select columns
data<-data1[,c(1:18,20,21,27,22,23,25)]
# Replace "A_Pathian" in column 23 with value from column 22 (same row)
data[[23]] <- ifelse(
data[[23]] == "A_Parthian",
data[[22]],
data[[23]]
)
# Rename column 23 to "Group"
colnames(data)[23] <- "Group"
# Drop column 22
data <- data[, -22]
# Round columns 2 to 8 to two decimal places
data[, 2:8] <- lapply(data[, 2:8], function(x) round(x, 2))
# Round columns 9 to 18 to whole numbers
data[, 9:18] <- lapply(data[, 9:18], function(x) round(x, 0))
# Export to .xlsx
setwd("../Figures")
write_xlsx(data, path = "Tab_D2-1.xlsx")8 Hellenistic Chemical Groups
8.1 Fig. D2.1
# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
data <- data[order(data$ChemGroup_H), ]
# Define custom colors and shapes
mycolors <- c(
"HG1" = "dodgerblue1", "HG2" = "blue", "HG3" = "orange", "HG4" = "green",
"HG5" = "wheat", "HO1" = "firebrick", "HO2" = "pink", "HO3" = "magenta",
"HO4" = "red", "HO5" = "yellow3", "HO6" = "black", "HO7" = "lightblue",
"HO8" = "slategrey", "HO9" = "seagreen"
)
myshapes <- c(
"HG1" = 17, "HG2" = 19, "HG3" = 15, "HG4" = 1, "HG5" = 2,
"HO1" = 3, "HO2" = 4, "HO3" = 8, "HO4" = 9, "HO5" = 15,
"HO6" = 16, "HO7" = 18, "HO8" = 20, "HO9" = 10
)
# Define a reusable plot theme
base_theme <- theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.title = element_blank(),
legend.text = element_text(size = 8),
legend.position = "bottom"
)
# Function to generate a single plot
make_plot <- function(x, y, x_lab, y_lab) {
ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = ChemGroup_H, shape = ChemGroup_H)) +
geom_point(size = 2) +
scale_color_manual(values = mycolors) +
scale_shape_manual(values = myshapes) +
labs(x = x_lab, y = y_lab) +
base_theme
}
# Create all plots
plot_1 <- make_plot("Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %")
plot_2 <- make_plot("CaO", "K2O", "CaO in %", "K2O in %")
plot_3 <- make_plot("TiO2", "MnO", "TiO2 in %", "MnO in %")
plot_4 <- make_plot("Rb", "Sr", "Rb in ppm", "Sr in ppm")
plot_5 <- make_plot("V", "Cr", "V in ppm", "Cr in ppm")
plot_6 <- make_plot("Ni", "Nb", "Ni in ppm", "Nb in ppm")
plot_7 <- make_plot("Y", "Zr", "Y in ppm", "Zr in ppm")
# Arrange the plots
grid_plot <- ggarrange(
plot_1, plot_2, plot_3,
plot_4, plot_5, plot_6,
plot_7, NULL,
ncol = 2, nrow = 4,
common.legend = TRUE,
legend = "bottom",
align = "hv"
)
# Export to .eps
ggsave(
filename = "Fig_D2-1.eps",
plot = grid_plot,
path = "../Figures",
device = cairo_ps,
width = 15.3,
height = 21,
units = "cm",
dpi = 1200
)8.2 Fig. D2.1 - Interactive with Parthian material
# Load data
data <- read.csv("../Data//Assur_Pottery_2024_complete_MW_ed.csv")
# Define custom colors and shapes
mycolors <- c("HG1" = "dodgerblue1", "HG2" = "blue","HG3"="orange","HG4"="green","HG5"="wheat","HO1"="firebrick","HO2"="pink","HO3"="magenta","HO4"="red","HO5"="yellow3","HO6"="black","HO7"="lightblue","HO8"="slategrey","HO9"="seagreen","A_Parthian"="grey85")
myshapes <- c(
"HG1" = 17, "HG2" = 19, "HG3" = 15, "HG4" = 1, "HG5" = 2,
"HO1" = 3, "HO2" = 4, "HO3" = 8, "HO4" = 9, "HO5" = 15,
"HO6" = 16, "HO7" = 18, "HO8" = 20, "HO9" = 10, "A_Parthian"=17
)
# Sort data
data <- data[order(data$ChemGroup_H), ]
# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'ChemGroup_H', shape = 'ChemGroup_H', text = 'SAMPLE')) +
geom_point(size = 2) +
scale_shape_manual(values = myshapes) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)
ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Fig. D2.1 - Interactive with Parthian Material")
# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,
nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Fig_D2-1_interactive.html", selfcontained = TRUE)8.3 Interactive plots of Subfabric Groups
# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
# Define custom colors and shapes
mycolors <- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
)
mysymbols <- c("1A"="17","1B"="17","2A" = "19","2B" = "19","2C"="19","2E"="19","3A"="2","3B"="2","4A"="6","4B"="6","4C"="6","5"="15","6"="16")
# Sort data
data <- data[order(data$ChemGroup_H), ]
# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Subfabric', shape = 'Subfabric', text = 'SAMPLE')) +
geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)
ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Scatterplot of Hellenistic Subfabric groups")
# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,
nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Hellenistic_Subfabricgroups.html", selfcontained = TRUE)8.4 Fig. D2.5
# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
# Define color palettes
subfabric_colors <- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
)
chemgroup_colors <- c("HG1" = "dodgerblue1", "HG2" = "blue","HG3"="orange","HG4"="green","HG5"="wheat","HO1"="firebrick","HO2"="pink","HO3"="magenta","HO4"="red","HO5"="yellow3","HO6"="black","HO7"="lightblue","HO8"="slategrey","HO9"="seagreen","A_Parthian"="grey85")
# Sort and aggregate data
plot_data <- data |>
filter(!is.na(ChemGroup_H), !is.na(Vessel.type), !is.na(Subfabric)) |>
arrange(ChemGroup_H) |>
count(ChemGroup_H, Vessel.type, Subfabric)
# Calculate label widths
get_text_width <- function(label, size = 2.5) {
pushViewport(viewport())
g <- textGrob(label, gp = gpar(fontsize = size * 10))
width <- convertWidth(grobWidth(g), "npc", valueOnly = TRUE)
popViewport()
return(width)
}
# Define unique strata and associated axis
stratum_df <- tibble(
stratum = unique(c(plot_data$ChemGroup_H, plot_data$Vessel.type, plot_data$Subfabric))
) |>
mutate(
axis = case_when(
stratum %in% plot_data$ChemGroup_H ~ "axis1",
stratum %in% plot_data$Vessel.type ~ "axis2",
stratum %in% plot_data$Subfabric ~ "axis3"
)
)
# Estimate text width
stratum_df <- stratum_df |>
rowwise() |>
mutate(text_width = get_text_width(stratum, size = 2.5)) |>
ungroup()# Compute max width per axis + padding
axis_widths <- stratum_df |>
group_by(axis) |>
summarise(max_width = max(text_width) + 0.02) # Add padding
# Join width to strata
stratum_df <- stratum_df |>
left_join(axis_widths, by = "axis")
# Plot with custom widths
alluvial_plot <- ggplot(plot_data, aes(
axis1 = ChemGroup_H,
axis2 = Vessel.type,
axis3 = Subfabric,
y = n
)) +
geom_alluvium(
aes(fill = ChemGroup_H),
width = 0.1, # constant for flow thickness
alpha = 0.6
) +
geom_stratum(
aes(
fill = after_stat(stratum),
width = case_when(
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis1"] ~ axis_widths$max_width[axis_widths$axis == "axis1"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis2"] ~ axis_widths$max_width[axis_widths$axis == "axis2"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis3"] ~ axis_widths$max_width[axis_widths$axis == "axis3"]
)
),
color = "black"
) +
geom_text(
stat = "stratum",
aes(label = after_stat(stratum)),
size = 2.5,
color = "black"
) +
scale_x_discrete(
limits = c("ChemGroup_H", "Vessel.type", "Subfabric"),
expand = c(0.05, 0.05)
) +
scale_fill_manual(
values = c(chemgroup_colors, subfabric_colors),
na.value = "grey",
guide = "none"
) +
labs(
title = "Fig. D2.5",
y = "Sample Count"
) +
theme_minimal()
# Display the plot
alluvial_plot# Export to .svg
svglite("../Figures/Fig_D2-5.svg", width = 12, height = 8)
print(alluvial_plot)
dev.off()png
2
8.5 Interactive plots of Vessel Types
# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Hellenistic"), ]
# Define color palettes
mycolors <- c("beaker" = "dodgerblue1", "beaker/jar" = "blue","bottle"="orange","body sherd"="slategrey","carinated bowl"="green","cooking pot"="wheat","echinus bowl"="firebrick","folded rim jar"="pink","jar"="magenta","large-mouth pot"="red","large storage jar"="yellow3","medium jar"="black","rolled rim plate"="lightblue","small jar"="brown4","thin walled beaker"="grey")
# Sort data
data <- data[order(data$Vessel.type), ]
# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Vessel.type', shape = 'Vessel.type', text = 'SAMPLE')) +
geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)
ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")
# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,
nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Hellenistic_VesselTypes.html", selfcontained = TRUE)9 Parthian Chemical Groups
9.1 Fig. D2.2
# Load and filter data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
data <- data[order(data$ChemGroup_P), ]
# Define custom colors and shapes
mycolors <- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")
myshapes <- c(
"PG1" = 17, "PG2" = 19, "PG3" = 15, "PG4" = 1, "PG5" = 2,
"PO1" = 3, "PO2" = 4, "PO3" = 8, "PO4" = 9, "PO5" = 15,
"PO6" = 16, "PO7" = 18, "PO8" = 20, "PO9" = 10
)
# Define a reusable plot theme
base_theme <- theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.title = element_blank(),
legend.text = element_text(size = 8),
legend.position = "bottom"
)
# Function to generate a single plot
make_plot <- function(x, y, x_lab, y_lab) {
ggplot(data, aes(x = .data[[x]], y = .data[[y]], color = ChemGroup_P, shape = ChemGroup_P)) +
geom_point(size = 2) +
scale_color_manual(values = mycolors) +
scale_shape_manual(values = myshapes) +
labs(x = x_lab, y = y_lab) +
base_theme
}
# Create all plots
plot_1 <- make_plot("Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %")
plot_2 <- make_plot("CaO", "K2O", "CaO in %", "K2O in %")
plot_3 <- make_plot("TiO2", "MnO", "TiO2 in %", "MnO in %")
plot_4 <- make_plot("Rb", "Sr", "Rb in ppm", "Sr in ppm")
plot_5 <- make_plot("V", "Cr", "V in ppm", "Cr in ppm")
plot_6 <- make_plot("Ni", "Nb", "Ni in ppm", "Nb in ppm")
plot_7 <- make_plot("Y", "Zr", "Y in ppm", "Zr in ppm")
# Arrange the plots
grid_plot <- ggarrange(
plot_1, plot_2, plot_3,
plot_4, plot_5, plot_6,
plot_7, NULL,
ncol = 2, nrow = 4,
common.legend = TRUE,
legend = "bottom",
align = "hv"
)
# Export to .eps
ggsave(
filename = "Fig_D2-2.eps",
plot = grid_plot,
path = "../Figures",
device = cairo_ps,
width = 15.3,
height = 21,
units = "cm",
dpi = 1200
)9.2 Fig. D2.2 - Interactive with Hellenistic material
# Load data
data <- read.csv("../Data//Assur_Pottery_2024_complete_MW_ed.csv")
# Define custom colors
mycolors <- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")
# Sort data
data <- data[order(data$ChemGroup_P), ]
# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'ChemGroup_P', shape = 'ChemGroup_P', text = 'SAMPLE')) +
geom_point(size = 2) +
scale_shape_manual(values = c(10, 11,17, 19, 15, 1, 2, 3, 4, 8, 9, 15, 16, 18, 20)) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)
ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Fig. D2.1 - Interactive with Hellenistic Material")
# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,
nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Fig_D2-2_interactive.html", selfcontained = TRUE)9.3 Interactive plots of Subfabric Groups
# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
# Define color palettes
mycolors <- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
)
mysymbols <- c("1A"="17","1B"="17","2A" = "19","2B" = "19","2C"="19","2E"="19","3A"="2","3B"="2","4A"="6","4B"="6","4C"="6","5"="15","6"="16")
# Sort data
data <- data[order(data$ChemGroup_P), ]
# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Subfabric', shape = 'Subfabric', text = 'SAMPLE')) +
geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)
ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")
# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,
nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Parthian_Subfabricgroups.html", selfcontained = TRUE)9.4 Fig. D2.6
# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
# Define your color palettes
subfabric_colors <- c( "1A" = "#E41A1C", "1B" = "#4DAF4A","2A" = "#984EA3", "2B" = "#FFFF33","2C" = "#A65628","2E" = "#377EB8","3A" = "#F781BF","3B" = "#999999","4A" = "#FC8D62", "4B" = "#FFD92F", "4C" = "#FF7F00", "5" = "#A6D854", "6" = "#B3B3B3"
)
chemgroup_colors <- c("PG1" = "grey85", "PG2" = "blue","PG3"="orange","PG4"="green","PG5"="wheat","PO1"="firebrick","PO2"="pink","PO3"="magenta","PO4"="red","PO5"="yellow3","PO6"="black","Hellenistic"="lightblue","Hellenistic_out"="seagreen")
# Sort and aggregate data
plot_data <- data |>
filter(!is.na(ChemGroup_P), !is.na(Vessel.type), !is.na(Subfabric)) |>
arrange(ChemGroup_P) |>
count(ChemGroup_P, Vessel.type, Subfabric)
# Calculate label widths
get_text_width <- function(label, size = 2.5) {
pushViewport(viewport())
g <- textGrob(label, gp = gpar(fontsize = size * 10))
width <- convertWidth(grobWidth(g), "npc", valueOnly = TRUE)
popViewport()
return(width)
}
# Define unique strata and associated axis
stratum_df <- tibble(
stratum = unique(c(plot_data$ChemGroup_P, plot_data$Vessel.type, plot_data$Subfabric))
) |>
mutate(
axis = case_when(
stratum %in% plot_data$ChemGroup_P ~ "axis1",
stratum %in% plot_data$Vessel.type ~ "axis2",
stratum %in% plot_data$Subfabric ~ "axis3"
)
)
# Estimate text width
stratum_df <- stratum_df |>
rowwise() |>
mutate(text_width = get_text_width(stratum, size = 2.5)) |>
ungroup()# Compute max width per axis + padding
axis_widths <- stratum_df |>
group_by(axis) |>
summarise(max_width = max(text_width) + 0.02) # Add padding
# Join width to strata
stratum_df <- stratum_df |>
left_join(axis_widths, by = "axis")
# Plot with custom widths
alluvial_plot <- ggplot(plot_data, aes(
axis1 = ChemGroup_P,
axis2 = Vessel.type,
axis3 = Subfabric,
y = n
)) +
geom_alluvium(
aes(fill = ChemGroup_P),
width = 0.1, # constant for flow thickness
alpha = 0.6
) +
geom_stratum(
aes(
fill = after_stat(stratum),
width = case_when(
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis1"] ~ axis_widths$max_width[axis_widths$axis == "axis1"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis2"] ~ axis_widths$max_width[axis_widths$axis == "axis2"],
after_stat(stratum) %in% stratum_df$stratum[stratum_df$axis == "axis3"] ~ axis_widths$max_width[axis_widths$axis == "axis3"]
)
),
color = "black"
) +
geom_text(
stat = "stratum",
aes(label = after_stat(stratum)),
size = 2.5,
color = "black"
) +
scale_x_discrete(
limits = c("ChemGroup_P", "Vessel.type", "Subfabric"),
expand = c(0.05, 0.05)
) +
scale_fill_manual(
values = c(chemgroup_colors, subfabric_colors),
na.value = "grey",
guide = "none"
) +
labs(
title = "Fig. D2.6",
y = "Sample Count"
) +
theme_minimal()
# Display the plot
alluvial_plot# Export to .svg
svglite("../Figures/Fig_D2-6.svg", width = 12, height = 8)
print(alluvial_plot)
dev.off()png
2
9.5 Interactive plots of Vessel Types
# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
data <- data[data$Period %in% c("Parthian"), ]
# Define color palettes
mycolors <- c("bottle"="orange","body sherd"="slategrey","bottle or jar" = "dodgerblue1", "bowl " = "blue","bowl or jug"="green","closed bowl"="wheat","deep bowl"="firebrick","jar or pot"="pink","jar"="magenta","hole-mouth jar or pot"="red","krater"="yellow3","large flat bowl"="black","large vessel"="lightblue","open bowl"="brown4","rim sherd"="grey","small jar or juglet"="darkgreen")
# Sort data
data <- data[order(data$Vessel.type), ]
# Create scatterplot function with hover support
create_interactive_plot <- function(data4, x_col, y_col, x_label, y_label, title) {
p <- ggplot(data4, aes_string(x = x_col, y = y_col, color = 'Vessel.type', shape = 'Vessel.type', text = 'SAMPLE')) +
geom_point(size = 2) +
scale_shape_manual(values = mysymbols) +
scale_color_manual(values = mycolors) +
labs(x = x_label, y = y_label, title = title) +
theme_classic() +
theme(
axis.line = element_line(colour = "black", size = 0.25),
legend.title = element_blank(),
legend.text = element_text(size = 8),
axis.title = element_text(size = 9),
axis.text = element_text(size = 8, color = "black"),
axis.ticks = element_line(size = 0.25, colour = "black"),
legend.position = "bottom"
)
ggplotly(p, tooltip = c("text", "x", "y"))
}
# Create individual interactive plots
plot1 <- create_interactive_plot(data, "Al2O3", "SiO2", "Al2O3 in %", "SiO2 in %", "Al2O3 vs SiO2")
plot2 <- create_interactive_plot(data, "CaO", "K2O", "CaO in %", "K2O in %", "CaO vs K2O")
plot3 <- create_interactive_plot(data, "TiO2", "MnO", "TiO2 in %", "MnO in %", "TiO2 vs MnO")
plot4 <- create_interactive_plot(data, "Rb", "Sr", "Rb in ppm", "Sr in ppm", "Rb vs Sr")
plot5 <- create_interactive_plot(data, "V", "Cr", "V in ppm", "Cr in ppm", "V vs Cr")
plot6 <- create_interactive_plot(data, "Ni", "Nb", "Ni in ppm", "Nb in ppm", "Ni vs Nb")
plot7 <- create_interactive_plot(data, "Y", "Zr", "Y in ppm", "Zr in ppm", "Y vs Zr")
# Arrange the plots in a grid layout with individual axis titles
grid <- subplot(
plot1, plot2, plot3,
plot4, plot5, plot6,
plot7,
nrows = 4,
shareX = FALSE, shareY = FALSE,
titleX = TRUE, titleY = TRUE
)
# Display the grid
grid# Export to .html
setwd("../Figures")
saveWidget(grid, file = "Assur_Pottery_2024_Parthian_VesselTypes.html", selfcontained = TRUE)10 Tab. D2.3 - Basic Information
# Load data
data <- read.csv("../Data/Assur_Pottery_2024_complete_MW_ed.csv")
# Create Hellenistic period table
table_h <- data %>%
filter(Period == "Hellenistic") %>%
count(ChemGroup_H, Subfabric) %>%
pivot_wider(names_from = Subfabric, values_from = n, values_fill = 0) %>%
mutate(Period = "Hellenistic")
# Create Parthian period table
table_p <- data %>%
filter(Period == "Parthian") %>%
count(ChemGroup_P, Subfabric) %>%
pivot_wider(names_from = Subfabric, values_from = n, values_fill = 0) %>%
mutate(Period = "Parthian")
# Standardizing column names for merging
colnames(table_h)[1] <- "ChemGroup"
colnames(table_p)[1] <- "ChemGroup"
# Stack the tables together
stacked_table <- bind_rows(table_h, table_p)
# Export to .xlsx
setwd("../Figures")
write_xlsx(stacked_table, path = "Tab_D2-3.xlsx")