class Illuminate:
    def __init__(self, config) -> None:
        """
        Initialize the Illuminate class with the given configuration.

        The Illuminate class implements a graph-based Bayesian illumination
        algorithm for optimizing small molecules. The algorithm begins by
        initializing the population from a given database. Subsequent
        populations are formed by mutations and crossovers. Molecules are
        filtered based on structural criteria, and physicochemical descriptors
        are calculated for the remaining ones. These molecules are assigned to
        niches based on their descriptors. Surrogate models predict the fitness
        of molecules, and acquisition functions guide the selection of
        promising molecules. Selected molecules are compared in direct
        evolutionary competition with current niche occupants. The process
        continues until a predetermined fitness function budget is exhausted
        or a maximum number of generations is reached.

        Args:
            config: Configuration object containing settings for all components.
        """
        self.arbiter = Arbiter(config.arbiter)
        self.fitness = Fitness(config.fitness)
        self.generator = Generator(config.generator)
        self.descriptor = Descriptor(config.descriptor)
        self.surrogate = Surrogate(config.surrogate)
        self.acquisition = Acquisition(config.acquisition)
        self.controller = Controller(config.controller)
        self.archive = Archive(config.archive, self.descriptor.dimensionality)

        # The generator, controller and acquisition function share one archive.
        self.generator.set_archive(self.archive)
        self.controller.set_archive(self.archive)
        self.acquisition.set_archive(self.archive)
        return None

    def __call__(self) -> None:
        """
        Execute the Bayesian Illumination optimization process.

        This function initializes the population and then iteratively
        generates, processes, and evaluates molecules until the controller
        deactivates, which happens once the maximum number of fitness calls or
        generations is reached. It then stores the final archive of molecules
        on disk.
        """
        self.initial_population()
        while self.controller.active():
            molecules = self.generator()
            molecules = self.process_molecules(molecules)
            self.archive.add_to_archive(molecules)
            self.surrogate.add_to_prior_data(molecules)
            self.controller.update()
        self.controller.store_molecules()
        return None

    def process_molecules(self, molecules: List[Molecule]) -> List[Molecule]:
        """
        Process a list of molecules by filtering out unwanted or invalid
        structures, calculating physicochemical descriptors, applying the
        acquisition rules based on the surrogate model, and calculating the
        actual fitness for the remaining molecules.

        Args:
            molecules: List of molecules to be processed.

        Returns:
            List of processed molecules.
        """
        molecules = self.arbiter(molecules)
        molecules = self.calculate_descriptors(molecules)
        molecules = self.apply_acquisition(molecules)
        molecules = self.calculate_fitnesses(molecules)
        return molecules

    def calculate_descriptors(self, molecules: List[Molecule]) -> List[Molecule]:
        """
        Calculate descriptors for a list of molecules and update their niche
        index. Removes the molecules that fall outside the physicochemical
        ranges of the archive as specified in the configuration file.

        Args:
            molecules: List of molecules.

        Returns:
            List of molecules with valid descriptors and updated niche indices.
        """
        molecules = [self.descriptor(molecule) for molecule in molecules]
        # Keep only molecules whose descriptor values all lie strictly between 0 and 1.
        molecules = [molecule for molecule in molecules if all(0.0 < value < 1.0 for value in molecule.descriptor)]
        molecules = [self.archive.update_niche_index(molecule) for molecule in molecules]
        return molecules

    def calculate_fitnesses(self, molecules: List[Molecule]) -> List[Molecule]:
        """
        Calculate fitnesses for a list of molecules. Truncates the incoming
        list if evaluating all of it would exceed the maximum number of
        fitness calls.

        Args:
            molecules: List of molecules.

        Returns:
            List of molecules with calculated fitnesses.
        """
        if self.controller.remaining_fitness_calls >= len(molecules):
            molecules = [self.fitness(molecule) for molecule in molecules]
        else:
            # Not enough budget left: evaluate only as many molecules as the budget allows.
            molecules = molecules[: self.controller.remaining_fitness_calls]
            molecules = [self.fitness(molecule) for molecule in molecules]
        # Record the fitness calls that were actually spent.
        self.controller.add_fitness_calls(len(molecules))
        return molecules

    def apply_acquisition(self, molecules: List[Molecule]) -> List[Molecule]:
        """
        Apply the surrogate function to a list of molecules and filter the
        molecules based on their acquisition function values.

        Args:
            molecules: List of molecules.

        Returns:
            List of molecules after acquisition function application.
        """
        molecules = self.surrogate(molecules)
        molecules = self.acquisition(molecules)
        return molecules

    def initial_population(self) -> None:
        """
        Generate and process the initial population of molecules.

        This function loads initial molecules from a database, processes them
        through the arbiter, applies the descriptor and fitness calculations,
        adds them to the archive, and uses them as prior data for the
        surrogate model. Finally, it updates the controller state.
        """
        molecules = self.generator.load_from_database()
        molecules = self.arbiter(molecules)
        molecules = self.calculate_descriptors(molecules)
        molecules = self.calculate_fitnesses(molecules)
        self.archive.add_to_archive(molecules)
        self.surrogate.add_to_prior_data(molecules)
        self.controller.update()
        return None
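
The control flow of the class above can be tried out without any chemistry dependencies. The following is a minimal sketch under stated assumptions, not the package's implementation: `ToyMolecule`, `grid_niche_index`, `cap_to_budget`, `toy_fitness` and `run` are hypothetical names, descriptors are assumed to be already rescaled so that the archive range maps to (0, 1), niches are assumed to form a uniform grid (the real `Archive` may partition descriptor space differently), and the surrogate and acquisition steps are left out entirely. It shows only the range filter from `calculate_descriptors`, the budget truncation from `calculate_fitnesses`, and direct competition with the current niche occupant.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyMolecule:
    """Hypothetical stand-in for Molecule with rescaled descriptor values."""
    descriptor: List[float]
    fitness: Optional[float] = None
    niche_index: Optional[int] = None


def in_archive_range(molecule: ToyMolecule) -> bool:
    # Same open-interval test as calculate_descriptors: every value strictly in (0, 1).
    return all(0.0 < value < 1.0 for value in molecule.descriptor)


def grid_niche_index(molecule: ToyMolecule, bins: int) -> int:
    # Assumption: a uniform grid with `bins` cells per descriptor dimension.
    index = 0
    for value in molecule.descriptor:
        index = index * bins + min(int(value * bins), bins - 1)
    return index


def cap_to_budget(molecules: list, remaining_calls: int) -> list:
    # Mirrors the truncation in calculate_fitnesses; a negative budget yields an empty list.
    return molecules[: max(remaining_calls, 0)]


def toy_fitness(molecule: ToyMolecule) -> float:
    # Hypothetical objective: descriptors close to 0.5 score higher.
    deviation = sum(abs(value - 0.5) for value in molecule.descriptor)
    return 1.0 - deviation / len(molecule.descriptor)


def run(budget: int, batch_size: int, bins: int, max_generations: int,
        seed: int = 0) -> Tuple[Dict[int, ToyMolecule], int]:
    rng = random.Random(seed)
    archive: Dict[int, ToyMolecule] = {}
    used_calls = 0
    for _ in range(max_generations):
        if used_calls >= budget:
            break  # fitness budget exhausted, as the controller would signal
        # Random two-dimensional candidates stand in for mutation and crossover.
        batch = [ToyMolecule([rng.uniform(-0.1, 1.1), rng.uniform(-0.1, 1.1)])
                 for _ in range(batch_size)]
        batch = [molecule for molecule in batch if in_archive_range(molecule)]
        batch = cap_to_budget(batch, budget - used_calls)
        for molecule in batch:
            molecule.niche_index = grid_niche_index(molecule, bins)
            molecule.fitness = toy_fitness(molecule)
            used_calls += 1
            occupant = archive.get(molecule.niche_index)
            # A newcomer replaces the occupant only if it is strictly fitter.
            if occupant is None or molecule.fitness > occupant.fitness:
                archive[molecule.niche_index] = molecule
    return archive, used_calls


if __name__ == "__main__":
    archive, used_calls = run(budget=50, batch_size=16, bins=4, max_generations=100)
    print(f"{len(archive)} niches filled using {used_calls} fitness calls")
```

Capping the batch before evaluation, rather than checking the budget afterwards, is what keeps the number of fitness calls from ever exceeding the configured maximum, even when a generation produces more candidates than the remaining budget allows.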