The miscreen engine is written in Java and can therefore be used on any platform where a Java runtime (version 1.13 or higher) is installed. Java is currently supported on virtually all major platforms (Windows, Mac, Linux, Unix). The latest version of the Java runtime may be downloaded free of charge from various providers. (You can determine which version of Java is installed on your system with the command java -version.) No additional software or special installation is required to run miscreen.
Functions of the miscreen engine are available from the Windows command prompt or UNIX command line.
A set of inactive molecules is also required as a reference. In most cases, this consists simply of inactive compounds from an HTS campaign. When only active molecules are available and no information about inactive molecules can be obtained (for example, when using information from literature sources or competitor patents), a "background" set of representative drug-like molecules may be used as a reference.
java -jar miscreen.jar -act active_molecules -ina inactive_molecules > model_file
active_molecules and inactive_molecules are files containing lists of active and inactive molecules used to train the model (of course, any filenames may be used). Molecules are encoded as SMILES, with one molecule per line, separated by tabs from any additional information (such additional data are ignored).
The generated model is written to standard output, which the command above redirects to the file model_file (or any filename you choose).
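The input format and the training command can also be driven from a script. The following is a minimal Python sketch, not part of the miscreen package: the helper names write_smiles_file, build_train_command and train_model are hypothetical, and the sketch assumes that miscreen.jar is in the current directory and that java is on the PATH.

```python
import subprocess


def write_smiles_file(path, molecules):
    """Write (smiles, identifier) pairs, one molecule per line.

    SMILES comes first; the identifier follows after a tab and is
    ignored by miscreen during model building.
    """
    with open(path, "w", encoding="utf-8") as handle:
        for smiles, identifier in molecules:
            handle.write(f"{smiles}\t{identifier}\n")


def build_train_command(actives_path, inactives_path, jar="miscreen.jar"):
    """Return the training command from this page as an argument list."""
    return ["java", "-jar", jar, "-act", str(actives_path), "-ina", str(inactives_path)]


def train_model(actives_path, inactives_path, model_path, jar="miscreen.jar"):
    """Run the training command and redirect its standard output to model_path."""
    command = build_train_command(actives_path, inactives_path, jar)
    with open(model_path, "w", encoding="utf-8") as model_file:
        subprocess.run(command, stdout=model_file, check=True)
```

A call such as train_model("active_molecules", "inactive_molecules", "model_file") would then be equivalent to the command line shown above.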
Once a bioactivity model has been generated, the actual virtual screening can be performed using the command:
java -jar miscreen.jar -model model_file -screen file_to_screen [-minscore x] > results
where
model_file is the model file generated in the first step
file_to_screen is a file containing molecules to be screened, with SMILES as the first item followed by additional tab-separated data (such as a molecule identifier)
When using the option -minscore (for example -minscore 5.), only molecules with activity scores greater than the specified value will be sent to the output.
Results will be saved in the file results in the form of SMILES, original input data, and the calculated activity score, separated by tabs.
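For post-processing, the results file can be read with a few lines of Python. The sketch below is illustrative only: build_screen_command and read_results are hypothetical helper names, and the parser assumes, following the description above, that the activity score is the last tab-separated field on each line.

```python
def build_screen_command(model_path, screen_path, min_score=None, jar="miscreen.jar"):
    """Return the screening command from this page as an argument list."""
    command = ["java", "-jar", jar, "-model", str(model_path), "-screen", str(screen_path)]
    if min_score is not None:
        # Optional threshold: only molecules scoring above it are written out.
        command += ["-minscore", str(min_score)]
    return command


def read_results(path):
    """Parse a results file into (smiles, original_data, score) tuples.

    Assumption: SMILES is the first field and the score is the last one;
    everything in between is the original input data.
    """
    rows = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank lines and lines without a score
            rows.append((fields[0], fields[1:-1], float(fields[-1])))
    return rows


def top_hits(rows, count=10):
    """Return the highest-scoring molecules first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:count]
```

For example, top_hits(read_results("results"), 100) would list the 100 best-scoring molecules from a screening run.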
The procedure described above does not include cross-validation. To obtain a more reliable assessment of model performance, we strongly recommend using cross-validation during the model building process. In our internal projects, Molinspiration typically averages 10 cross-validation runs, with 80% of the data used for training and 20% for validation. The complete procedure can be easily implemented in Python by calling miscreen.jar, using the train_test_split function from sklearn.model_selection to prepare training and test sets and roc_curve from sklearn.metrics to evaluate performance. Molinspiration will be glad to assist with development of a tailored model-building protocol.
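One possible shape of such a cross-validation script is sketched below. It is not Molinspiration's internal protocol. The function names are hypothetical, scikit-learn must be installed, and the sketch assumes that screening without -minscore returns a score for every input molecule as the last tab-separated field. Test actives and test inactives are screened separately, so the labels do not depend on the order of the output.

```python
import subprocess
import tempfile
from pathlib import Path


def read_lines(path):
    """Read non-empty molecule lines (SMILES plus optional tab-separated data)."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\r\n") for line in handle if line.strip()]


def write_lines(path, lines):
    Path(path).write_text("".join(f"{line}\n" for line in lines), encoding="utf-8")


def parse_scores(text):
    """Extract scores, assumed to be the last tab-separated field of each line."""
    return [float(line.split("\t")[-1]) for line in text.splitlines() if line.strip()]


def score_file(jar, model_path, screen_path):
    """Screen one file with an existing model and return the scores."""
    command = ["java", "-jar", jar, "-model", str(model_path), "-screen", str(screen_path)]
    done = subprocess.run(command, capture_output=True, text=True, check=True)
    return parse_scores(done.stdout)


def cross_validate(actives, inactives, jar="miscreen.jar", runs=10, test_size=0.2):
    """Average ROC AUC over repeated random 80/20 splits."""
    # Imported here so the helpers above work without scikit-learn installed.
    from sklearn.metrics import auc, roc_curve
    from sklearn.model_selection import train_test_split

    aucs = []
    for seed in range(runs):
        act_train, act_test = train_test_split(actives, test_size=test_size, random_state=seed)
        ina_train, ina_test = train_test_split(inactives, test_size=test_size, random_state=seed)
        with tempfile.TemporaryDirectory() as tmp:
            tmp = Path(tmp)
            splits = {"act_train": act_train, "ina_train": ina_train,
                      "act_test": act_test, "ina_test": ina_test}
            for name, lines in splits.items():
                write_lines(tmp / name, lines)
            # Build the model on the training split only.
            with open(tmp / "model", "w", encoding="utf-8") as model_file:
                subprocess.run(
                    ["java", "-jar", jar, "-act", str(tmp / "act_train"),
                     "-ina", str(tmp / "ina_train")],
                    stdout=model_file, check=True)
            act_scores = score_file(jar, tmp / "model", tmp / "act_test")
            ina_scores = score_file(jar, tmp / "model", tmp / "ina_test")
        labels = [1] * len(act_scores) + [0] * len(ina_scores)
        fpr, tpr, _ = roc_curve(labels, act_scores + ina_scores)
        aucs.append(auc(fpr, tpr))
    return sum(aucs) / len(aucs)
```

A call such as cross_validate(read_lines("active_molecules"), read_lines("inactive_molecules")) would return the mean AUC over 10 runs. Using a different random_state for each run gives a different 80/20 split every time while keeping the experiment reproducible.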
java -mx1000m -jar miscreen.jar parameters
(details depend on your computer system; consult your local Java expert if necessary).
Do not manually edit data files generated by miscreen, as the program relies on a specific data format.
Do not hesitate to contact Molinspiration if you have additional questions, comments, or if you wish to evaluate the miscreen package.