The synthesis of novel, complex drug molecules to establish structure-activity relationships (SAR) is often the limiting step in early drug discovery. To expedite SAR exploration and enhance the pharmacological profiles of lead structures within the design-make-test-analyze (DMTA) cycle, it is crucial to refine synthetic methodologies. Late-stage functionalization (LSF) offers an effective, step-saving approach for modifying advanced leads by directly substituting C–H bonds with other moieties, thereby facilitating chemical space exploration and modulating absorption, distribution, metabolism and excretion (ADME) properties. However, the similarity of C–H bonds within structurally intricate drug and drug-like molecules necessitates a detailed understanding of their reactivity for targeted functionalization, which complicates the standardization of experimental protocols. This complexity often results in resource-intensive wet lab explorations, which may conflict with the stringent timelines and budgets of drug discovery projects. High-throughput experimentation (HTE) has emerged as a key technology to streamline synthesis by efficiently evaluating reaction conditions in a plate format using automation equipment. Once certain remaining bottlenecks of HTE are addressed, specifically in software/hardware integration and data governance, the technology has the potential to efficiently assess LSF reaction methodologies with minimal material consumption. The LSF reaction data sets from HTE campaigns, combined with big data analytics and machine learning (ML), are expected to enable the development of predictive models for C–H bond transformations. This would allow reaction outcomes to be estimated before resource- and time-intensive experimentation is carried out in the laboratory, facilitating the synthesis of target molecules in an environmentally conscious and material-efficient manner.
Despite the potential to make LSF a more efficient methodology for fast drug diversification and, consequently, to speed up the development of novel medicines, a seamless connection between the three research fields, namely LSF, HTE and reactivity prediction, has not been made so far. This thesis presents the development of a digital, semi-automated HTE system designed to systematically evaluate LSF methodologies on drug-like molecules. Dolphin, the Data orchestrated laboratory platform harnessing innovative neural network, is an end-to-end platform tailored for LSF that incorporates automation, digitalization, and ML to enhance compound synthesis efficiency in early drug discovery. Advanced automated laboratory equipment, such as solid and liquid dosing robots, is employed to simultaneously initiate reactions and prepare controls, ensuring sample quality for subsequent analyses. A high level of software/hardware integration supports the workflow from literature analysis and reaction plate screening to scale-up planning and data management. In parallel with the development of Dolphin, a simple user-friendly reaction format (SURF) was developed to allow the extraction, curation, storage and analysis of reaction data from the literature. After evaluating current data-sharing practices and identifying bottlenecks, SURF was designed to be both human- and machine-readable, streamlining the use of reaction data in ML applications. Applying this format to curate data from selected publications enabled systematic HTE plate design and provided high-quality data sets for ML model development. Applying Dolphin and SURF in two case studies with different LSF reaction types enabled reactivity prediction. The first case study assessed the applicability of C–H borylation reactions for the late-stage diversification of complex molecules.
Hundreds of HTE reactions were performed on systematically chosen commercial drugs under a wide array of conditions. The data generated from these experiments were captured in SURF and used to support the development of an ML algorithm capable of predicting binary reaction outcomes, yields, and regioselectivity for novel substrates. The influence of steric and electronic effects on model performance was quantified by featurization of the input molecular graphs with 2D, 3D and quantum mechanics (QM) augmented information. The reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively, while computational models predicted reaction yields for diverse reaction conditions with a mean absolute error (MAE) of 4–5%. The platform delivered numerous starting points for the structural diversification of commercial pharmaceuticals and advanced drug-like fragments. The second case study investigated a library-type screening approach for determining the substrate scope of late-stage Minisci-type C–H alkylations to explore new exit vectors. This approach aimed to facilitate the in silico prediction of suitable substrates that can undergo coupling with a diverse array of sp3-rich carboxylic acids. Again, Dolphin and SURF provided the experimental data sets to train ML models for the described task. The algorithms predicted reaction yields with an MAE of 11–12% and suggested starting points for scale-up reactions of 3180 advanced heterocyclic building blocks with various carboxylic acid building blocks. From those, a set of promising candidates was chosen, reactions were scaled up to the 50 to 100 mg range, and products were isolated and characterized. This process led to the creation of 30 novel, functionally modified molecules that hold potential for further optimization.
The results from both case studies support the application of ML based on high-quality HTE data for reactivity prediction in the LSF space and beyond. \medskip In summary, this thesis established a semi-automated platform (Dolphin) and a new reaction format (SURF) that facilitate the development of ML models for LSF reaction screening, thereby contributing to more efficient compound synthesis in drug discovery through the strategic application of laboratory automation and artificial intelligence.