RDKitEnumerateCompoundLibrary.py - Enumerate a virtual compound library
RDKitEnumerateCompoundLibrary.py [--compute2DCoords <yes or no>] [--infileParams <Name,Value,...>] [--mode <RxnByName or RxnBySMIRKS>] [--outfileParams <Name,Value,...>] [--overwrite] [--prodMolNames <UseReactants or Sequential>] [--rxnName <text>] [--rxnNamesFile <FileName or auto>] [--smirksRxn <text>] [--sanitize <yes or no>] [-w <dir>] -i <ReactantFile1,...> -o <outfile>
RDKitEnumerateCompoundLibrary.py [--rxnNamesFile <FileName or auto>] -l | --list
RDKitEnumerateCompoundLibrary.py -h | --help | -e | --examples
Perform a combinatorial enumeration of a virtual library of molecules for a reaction specified using a reaction name or SMIRKS pattern and reactant input files.
The SMIRKS patterns for supported reaction names [ Ref 134 ] are retrieved from the file ReactionNamesAndSMIRKS.csv, available in the MayaChemTools data directory. The current list of supported reaction names is shown below:
'1,2,4_triazole_acetohydrazide', '1,2,4_triazole_carboxylic_acid_ester', 3_nitrile_pyridine, Benzimidazole_derivatives_aldehyde, Benzimidazole_derivatives_carboxylic_acid_ester, Benzofuran, Benzothiazole, Benzothiophene, Benzoxazole_aromatic_aldehyde, Benzoxazole_carboxylic_acid, Buchwald_Hartwig, Decarboxylative_coupling, Fischer_indole, Friedlaender_chinoline, Grignard_alcohol, Grignard_carbonyl, Heck_non_terminal_vinyl, Heck_terminal_vinyl, Heteroaromatic_nuc_sub, Huisgen_Cu_catalyzed_1,4_subst, Huisgen_disubst_alkyne, Huisgen_Ru_catalyzed_1,5_subst, Imidazole, Indole, Mitsunobu_imide, Mitsunobu_phenole, Mitsunobu_sulfonamide, Mitsunobu_tetrazole_1, Mitsunobu_tetrazole_2, Mitsunobu_tetrazole_3, Mitsunobu_tetrazole_4, N_arylation_heterocycles, Negishi, Niementowski_quinazoline, Nucl_sub_aromatic_ortho_nitro, Nucl_sub_aromatic_para_nitro, Oxadiazole, Paal_Knorr_pyrrole, Phthalazinone, Pictet_Spengler, Piperidine_indole, Pyrazole, Reductive_amination, Schotten_Baumann_amide, Sonogashira, Spiro_chromanone, Stille, Sulfon_amide, Suzuki, Tetrazole_connect_regioisomer_1, Tetrazole_connect_regioisomer_2, Tetrazole_terminal, Thiazole, Thiourea, Triaryl_imidazole, Urea, Williamson_ether, Wittig
The supported input file formats are: SD (.sdf, .sd), SMILES (.smi, .csv, .tsv, .txt)
The supported output file formats are: SD (.sdf, .sd), SMILES (.smi)
--compute2DCoords <yes or no>
Compute 2D coordinates of product molecules before writing them out.
-i <ReactantFile1,...>
A comma-delimited list of reactant file names for enumerating a compound library using reaction SMIRKS. The number of reactant files must match the number of reaction components in the reaction SMIRKS. All reactant input files must have the same format.
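The file-count requirement above can be checked before any molecules are read. The following is a minimal sketch, not the script's own implementation: it assumes the SMIRKS has the form reactants>>products with no agents section, and counts reactant templates by splitting on dots that sit outside parentheses. The helper names and the amide SMIRKS are illustrative.

```python
def count_reactant_templates(smirks: str) -> int:
    """Count dot-separated reactant templates on the left of '>>'.

    Assumption: no agents section, and a dot inside parentheses joins
    fragments of a single template rather than separating two templates.
    """
    reactants, sep, _products = smirks.partition(">>")
    if not sep or not reactants.strip():
        raise ValueError("SMIRKS must look like 'reactants>>products'")
    depth = 0
    count = 1
    for ch in reactants:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "." and depth == 0:
            count += 1
    return count


def check_reactant_files(smirks: str, infiles_arg: str) -> list:
    """Split the comma-delimited -i value and verify the file count."""
    files = [name.strip() for name in infiles_arg.split(",") if name.strip()]
    expected = count_reactant_templates(smirks)
    if len(files) != expected:
        raise ValueError(
            f"{len(files)} reactant file(s) given, but the SMIRKS has "
            f"{expected} reactant component(s)"
        )
    return files


# Illustrative amide-forming SMIRKS with two reactant templates.
AMIDE_SMIRKS = "[O:2]=[C:1][OH].[N:3]>>[O:2]=[C:1][N:3]"
print(count_reactant_templates(AMIDE_SMIRKS))
print(check_reactant_files(AMIDE_SMIRKS, "SampleAcids.smi,SampleAmines.smi"))
```

A toolkit-based check (parsing the SMIRKS into a reaction object and asking it for the number of reactant templates) would be more robust than this string scan.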
--infileParams <Name,Value,...>
A comma-delimited list of parameter name and value pairs for reading molecules from files. The supported parameter names for different file formats, along with their default values, are shown below:
Possible values for smilesDelimiter: space, comma or tab. These parameters apply to all reactant input files, which must have the same file format.
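A Name,Value,... string of this kind can be turned into a dictionary with a few lines of Python. This is a hedged sketch of one way to do it, not the parser used by the script. The function names are hypothetical, and only smilesDelimiter and its three documented values are taken from the text above.

```python
# Documented smilesDelimiter keywords mapped to the characters they stand for.
DELIMITERS = {"space": " ", "comma": ",", "tab": "\t"}


def parse_name_value_pairs(text: str) -> dict:
    """Parse 'Name,Value,Name,Value,...' into a {name: value} dictionary."""
    if not text.strip():
        return {}
    tokens = [token.strip() for token in text.split(",")]
    if len(tokens) % 2 != 0:
        raise ValueError("expected an even number of comma-delimited items")
    params = {}
    for name, value in zip(tokens[0::2], tokens[1::2]):
        if not name or not value:
            raise ValueError("empty parameter name or value")
        params[name] = value
    return params


def smiles_delimiter_char(params: dict):
    """Return the delimiter character, or None if the parameter is absent."""
    keyword = params.get("smilesDelimiter")
    if keyword is None:
        return None
    if keyword.lower() not in DELIMITERS:
        raise ValueError("smilesDelimiter must be space, comma or tab")
    return DELIMITERS[keyword.lower()]


params = parse_name_value_pairs("smilesDelimiter,tab")
print(params)
print(repr(smiles_delimiter_char(params)))
```

Using the keywords space, comma and tab avoids having to pass a literal comma as a value inside a comma-delimited list.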
-e, --examples
Print examples.
-h, --help
Print this help message.
-l, --list
List available reaction names along with corresponding SMIRKS patterns without performing any enumeration.
-m, --mode <RxnByName or RxnBySMIRKS>
Indicate whether a reaction is specified by a reaction name or a SMIRKS pattern. Possible values: RxnByName or RxnBySMIRKS.
-o <outfile>
Output file name.
--outfileParams <Name,Value,...>
A comma-delimited list of parameter name and value pairs for writing molecules to files. The supported parameter names for different file formats, along with their default values, are shown below:
--prodMolNames <UseReactants or Sequential>
Generate names of product molecules using reactant names or assign names in sequential order. Possible values: UseReactants or Sequential. The formats of the molecule names are:
UseReactants: <ReactName1>_<ReactName2>..._Prod<Num>
Sequential: Prod<Num>
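The two naming formats can be illustrated with a small sketch over reactant names alone. It assumes exactly one product per reactant combination and a running counter for <Num> that starts at 1; how the script numbers multiple products from one combination is not stated here, so treat the counter as an assumption. The function name and the reactant names are hypothetical.

```python
from itertools import product


def product_names(reactant_names, mode="UseReactants"):
    """Build product names for every combination of reactant names.

    reactant_names: one list of names per reactant file, in file order.
    Assumption: one product per combination, numbered from 1.
    """
    if mode not in ("UseReactants", "Sequential"):
        raise ValueError("mode must be UseReactants or Sequential")
    names = []
    for num, combo in enumerate(product(*reactant_names), start=1):
        if mode == "UseReactants":
            names.append("_".join(combo) + f"_Prod{num}")
        else:
            names.append(f"Prod{num}")
    return names


acids = ["Acid1", "Acid2"]   # hypothetical reactant names
amines = ["Amine1"]
print(product_names([acids, amines]))
print(product_names([acids, amines], mode="Sequential"))
```

The combinatorial nature of the enumeration is visible in the length of the result: it equals the product of the reactant list sizes.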
--overwrite
Overwrite existing files.
--rxnName <text>
Name of a reaction to use for enumerating a compound library. This option is only used when the '-m, --mode' option is set to 'RxnByName'.
--rxnNamesFile <FileName or auto>
Specify the name of a file containing reaction names and SMIRKS patterns, or use the default file, ReactionNamesAndSMIRKS.csv, available in the MayaChemTools data directory.
Reactions SMIRKS file format: RxnName,RxnSMIRKS.
The format of the data in a local reaction names file must match the format of the reaction SMIRKS file available in the MayaChemTools data directory.
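A local reaction names file in the RxnName,RxnSMIRKS layout could be read as sketched below. This is an illustration under assumptions: standard CSV quoting is assumed for names that contain commas, an optional RxnName,RxnSMIRKS header line is skipped if present, and the sample rows, reaction names and helper names are hypothetical rather than taken from the distributed file.

```python
import csv
import io


def load_reaction_table(handle) -> dict:
    """Read RxnName,RxnSMIRKS rows into a {name: smirks} dictionary."""
    table = {}
    for row in csv.reader(handle):
        if not row or all(not cell.strip() for cell in row):
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row}")
        name, smirks = (cell.strip() for cell in row)
        if (name, smirks) == ("RxnName", "RxnSMIRKS"):
            continue  # assumed optional header line
        table[name] = smirks
    return table


def resolve_smirks(mode, table, rxn_name=None, smirks=None) -> str:
    """Pick the SMIRKS to use according to the mode value."""
    if mode == "RxnByName":
        if rxn_name not in table:
            raise KeyError(f"unknown reaction name: {rxn_name}")
        return table[rxn_name]
    if mode == "RxnBySMIRKS":
        if not smirks:
            raise ValueError("a SMIRKS pattern is required in RxnBySMIRKS mode")
        return smirks
    raise ValueError("mode must be RxnByName or RxnBySMIRKS")


# Hypothetical file contents; the second name is quoted because it has a comma.
SAMPLE = io.StringIO(
    "RxnName,RxnSMIRKS\n"
    "Example_amide,[O:2]=[C:1][OH].[N:3]>>[O:2]=[C:1][N:3]\n"
    '"Example_1,2_name",[C:1][OH]>>[C:1]Cl\n'
)
table = load_reaction_table(SAMPLE)
print(sorted(table))
print(resolve_smirks("RxnByName", table, rxn_name="Example_amide"))
```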
--smirksRxn <text>
SMIRKS pattern of a reaction to use for enumerating a compound library. This option is only used when the '-m, --mode' option is set to 'RxnBySMIRKS'.
--sanitize <yes or no>
Sanitize product molecules before writing them out.
-w <dir>
Location of the working directory, which defaults to the current directory.
To list all available reaction names along with their SMIRKS patterns, type:
To perform a combinatorial enumeration of a virtual compound library corresponding to the named amide reaction, Schotten_Baumann_amide, and write out a SMILES file, type:
To perform a combinatorial enumeration of a virtual compound library corresponding to an amide reaction specified using a SMIRKS pattern and write out an SD file containing sanitized molecules with computed 2D coordinates and molecule names generated from reactant names, type:
To perform a combinatorial enumeration of a virtual compound library corresponding to an amide reaction specified using a SMIRKS pattern and write out an SD file containing unsanitized molecules, without 2D coordinates and with sequentially generated molecule names, type:
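The four scenarios described above can be assembled into command lines using only the options listed in the synopsis. The sketch below builds them in Python and prints shell-quoted strings. The input and output file names and the amide SMIRKS are hypothetical placeholders, and every option is passed explicitly because the option defaults are not stated here.

```python
import shlex

SCRIPT = "RDKitEnumerateCompoundLibrary.py"
REACTANT_FILES = "SampleAcids.smi,SampleAmines.smi"        # hypothetical names
AMIDE_SMIRKS = "[O:2]=[C:1][OH].[N:3]>>[O:2]=[C:1][N:3]"   # illustrative pattern

commands = {
    # List reaction names and SMIRKS patterns only.
    "list": [SCRIPT, "--list"],
    # Named reaction, SMILES output.
    "by_name": [
        SCRIPT, "--mode", "RxnByName", "--rxnName", "Schotten_Baumann_amide",
        "-i", REACTANT_FILES, "-o", "SampleOutCmpdLibrary.smi",
    ],
    # SMIRKS reaction, sanitized SD output with 2D coordinates.
    "by_smirks_sanitized": [
        SCRIPT, "--mode", "RxnBySMIRKS", "--smirksRxn", AMIDE_SMIRKS,
        "--compute2DCoords", "yes", "--sanitize", "yes",
        "--prodMolNames", "UseReactants",
        "-i", REACTANT_FILES, "-o", "SampleOutCmpdLibrary.sdf",
    ],
    # SMIRKS reaction, unsanitized SD output with sequential names.
    "by_smirks_unsanitized": [
        SCRIPT, "--mode", "RxnBySMIRKS", "--smirksRxn", AMIDE_SMIRKS,
        "--compute2DCoords", "no", "--sanitize", "no",
        "--prodMolNames", "Sequential",
        "-i", REACTANT_FILES, "-o", "SampleOutCmpdLibrary.sdf",
    ],
}

for label, argv in commands.items():
    # shlex.join quotes the SMIRKS so the shell does not interpret [, ] or >.
    print(f"{label}: {shlex.join(argv)}")
```

Quoting matters for the SMIRKS argument in particular, since an unquoted > would be treated by the shell as output redirection.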
RDKitConvertFileFormat.py, RDKitFilterPAINS.py, RDKitSearchFunctionalGroups.py, RDKitSearchSMARTS.py
Copyright (C) 2023 Manish Sud. All rights reserved.
The functionality available in this script is implemented using RDKit, an open source toolkit for cheminformatics developed by Greg Landrum.
This file is part of MayaChemTools.
MayaChemTools is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.