
Core Gene Prediction Model



A Random Forest model was developed from the top 25 sequence features and gene structural features to classify whether a gene is Core or Non-Core. To classify genes, enter your data in the form, keeping each value within the range shown in the input placeholder, or autofill the required protein and DNA sequence features by entering the protein sequence and the coding sequence in the input boxes. [Note: The predicted value is shown in the footer of the table.]


For more details on the prediction module, see the video tutorial: Open Video.


Enter your protein sequence to generate protein sequence features (example):

Enter your coding sequence to generate DNA sequence features (example):
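As a rough illustration of what "generating sequence features" can involve, the sketch below derives a few simple values from a coding sequence and a protein sequence. The 25 features used by the model are not listed on this page, so the features computed here (length, GC content, amino acid fractions) and the function names `dna_features` and `protein_features` are assumptions chosen for illustration, not the form's actual autofill logic.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def dna_features(cds: str) -> dict:
    """Illustrative DNA features (hypothetical; not the model's real feature list)."""
    seq = "".join(cds.split()).upper()  # drop whitespace and line breaks
    if not seq:
        raise ValueError("empty coding sequence")
    counts = Counter(seq)
    gc_content = (counts["G"] + counts["C"]) / len(seq)
    return {"cds_length": len(seq), "gc_content": gc_content}


def protein_features(protein: str) -> dict:
    """Illustrative protein features (hypothetical; not the model's real feature list)."""
    seq = "".join(protein.split()).upper().rstrip("*")  # drop a trailing stop symbol
    if not seq:
        raise ValueError("empty protein sequence")
    counts = Counter(seq)
    features = {"protein_length": len(seq)}
    for aa in AMINO_ACIDS:
        # Fraction of the sequence made up of this residue.
        features[f"frac_{aa}"] = counts[aa] / len(seq)
    return features
```

For example, `dna_features("ATGGCC")` reports a length of 6 and a GC content of 4/6, since four of the six bases are G or C.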



1. Calculate the gene structural features (3 prime UTR length, isoforms count, 5 prime UTR length, average exon length, and canonical mRNA length) using the customized GFF analyzer script: GFF Analyzer

Generate the required gene structural features by running this command:

Please note: The GFF analyzer script only works with GFF files in a particular format. An example file in that format is available at the GitHub script link provided above, and the original data can be accessed through MaizeGDB.

2. Calculate the chromosomal distance using the customized chromosomal distance calculator script: Distance Calculator

Generate the required chromosomal distance feature by running this command:

Please note: An example structural BED file is available at the GitHub script link provided above, and the original data can be accessed through MaizeGDB JBrowse.
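To make the gene structural features from step 1 more concrete, here is a minimal sketch of how they could be derived from GFF3 records. It is not the GFF Analyzer script. It assumes tab-separated GFF3 lines with `mRNA`, `exon`, `five_prime_UTR`, and `three_prime_UTR` feature types linked by `ID` and `Parent` attributes, and it treats the transcript with the greatest total exon length as the canonical one, which is a stand-in assumption. The function name `gene_structural_features` and the output keys are hypothetical.

```python
from collections import defaultdict

CHILD_TYPES = ("exon", "five_prime_UTR", "three_prime_UTR")


def parse_attributes(field: str) -> dict:
    """Parse the GFF3 attribute column ("key=value;key=value") into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def gene_structural_features(gff_lines, gene_id: str) -> dict:
    transcripts = set()  # IDs of mRNA features whose Parent is gene_id
    # transcript ID -> feature type -> list of feature lengths
    lengths = defaultdict(lambda: defaultdict(list))

    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a standard 9-column GFF3 record
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attributes(cols[8])
        if ftype == "mRNA" and attrs.get("Parent") == gene_id and "ID" in attrs:
            transcripts.add(attrs["ID"])
        elif ftype in CHILD_TYPES:
            # GFF3 coordinates are 1-based and inclusive.
            for parent in attrs.get("Parent", "").split(","):
                lengths[parent][ftype].append(end - start + 1)

    if not transcripts:
        raise ValueError(f"no mRNA features found for gene {gene_id!r}")

    # Assumption: the canonical transcript is the one with the longest spliced length.
    canonical = max(sorted(transcripts), key=lambda t: sum(lengths[t]["exon"]))
    exons = lengths[canonical]["exon"]
    return {
        "isoforms_count": len(transcripts),
        "canonical_mrna_length": sum(exons),
        "average_exon_length": sum(exons) / len(exons) if exons else 0.0,
        "five_prime_utr_length": sum(lengths[canonical]["five_prime_UTR"]),
        "three_prime_utr_length": sum(lengths[canonical]["three_prime_UTR"]),
    }
```

A call such as `gene_structural_features(open("annotation.gff3"), "GENE_ID")` (both names are placeholders) would return one dictionary of values for that gene. If the annotation file marks the canonical transcript explicitly, that marker should take precedence over the longest-transcript rule used here.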

Submission Form