Resolution-vs.-Accuracy Dilemma in Machine Learning Modeling of Electronic Excitation Spectra

22 Oct 2021  ·  Prakriti Kayastha, Sabyasachi Chakraborty, Raghunathan Ramakrishnan

In this study, we explore the potential of machine learning for modeling molecular electronic spectral intensities as a continuous function of wavelength over a given range. Since presently available chemical space datasets provide excitation energies and corresponding oscillator strengths for only a few valence transitions, we present a new dataset, bigQM7$\omega$, with 12,880 molecules containing up to 7 CONF (C, O, N, F) atoms, and report their ground-state and excited-state properties. A publicly accessible web-based data-mining platform is presented to facilitate on-the-fly screening of several molecular properties, including harmonic vibrational and electronic spectra. We present all singlet electronic transitions from the ground state, calculated using the time-dependent density functional theory framework with the $\omega$B97XD exchange-correlation functional and a diffuse-function augmented basis set. The resulting spectra predominantly span the X-ray to deep-UV region (10--120 nm). To compare the target spectra with predictions based on small basis sets, we bin spectral intensities and show that good agreement is obtained only at the expense of resolution. In contrast, machine learning models with the latest structural representations, trained directly on $<10 \%$ of the target data, recover the spectra of the remaining molecules with higher accuracy at a desirable $<1$ nm wavelength resolution.
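The binning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a spectrum is given as a list of transition wavelengths (nm) with oscillator strengths, and that binning means summing the intensities that fall into equal-width wavelength windows over 10--120 nm. The function name `bin_spectrum`, its parameters, and the toy transitions are hypothetical and not taken from the paper; the authors' actual procedure may differ.

```python
import numpy as np

def bin_spectrum(wavelengths_nm, oscillator_strengths, lo=10.0, hi=120.0, width=1.0):
    """Sum oscillator strengths into equal-width wavelength bins on [lo, hi].

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n_bins = int(round((hi - lo) / width))
    edges = np.linspace(lo, hi, n_bins + 1)
    # np.histogram with weights adds up the intensity landing in each bin.
    binned, _ = np.histogram(wavelengths_nm, bins=edges, weights=oscillator_strengths)
    return edges, binned

# Toy stick spectrum (made-up values, for illustration only).
wl = np.array([12.3, 12.8, 45.0, 119.5])
f = np.array([0.10, 0.05, 0.30, 0.02])

# Fine bins (1 nm) keep resolution; coarse bins (10 nm) merge nearby lines.
edges_fine, fine = bin_spectrum(wl, f, width=1.0)
edges_coarse, coarse = bin_spectrum(wl, f, width=10.0)
```

Wider bins merge neighboring transitions, so two methods that place a line at slightly different wavelengths can still agree bin by bin. This is the sense in which agreement is bought at the expense of resolution.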


Categories


Chemical Physics