# assuming /path/to/CEGMA represents a valid path to your local CEGMA installation:
% cegma -g /path/to/CEGMA/sample/sample.dna -p /path/to/CEGMA/sample/sample.prot

********************************************************************************
** MAPPING PROTEINS TO GENOME (TBLASTN) **
********************************************************************************

RUNNING: genome_map -n genome -p 6 -o 5000 -c 2000 -t 16 /path/to/CEGMA/sample/sample.prot /path/to/CEGMA/sample/sample.dna 2>output.cegma.errors

Building a new DB, current time: 04/30/2012 11:42:26
New DB name: /tmp/genome86278.blastdb
New DB title: /path/to/CEGMA/sample/sample.dna
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; added 1 sequences in 0.0646069 seconds.
Found 86 candidate regions in /path/to/CEGMA/sample/sample.dna

********************************************************************************
** MAKING INITIAL GENE PREDICTIONS FOR CORE GENES (GENEWISE + GENEID) **
********************************************************************************

RUNNING: local_map -n local -f -h /path/to/CEGMA/data/hmm_profiles -i KOG genome.chunks.fa 2>output.cegma.errors

NOTE: created 23 geneid predictions

********************************************************************************
** FILTERING INITIAL PROTEINS PRODUCED BY GENEID (HMMER) **
********************************************************************************

RUNNING: hmm_select -i KOG -o local -t 16 /path/to/CEGMA/data/hmm_profiles local.geneid.fa /path/to/CEGMA/data/profiles_cutoff.tbl 2>output.cegma.errors

NOTE: Found 15 geneid predictions with scores above threshold value

********************************************************************************
** CALCULATING GENEID PARAMETERS FROM SELECTED GENEID PREDICTIONS **
********************************************************************************

RUNNING: geneid-train local.geneid.selected.gff local.geneid.selected.dna geneid_params 2>output.cegma.errors

DATA COLLECTED: 15 Coding sequences containing 48 introns

RUNNING: make_paramfile /path/to/CEGMA/data/self.param.template \
    geneid_params/coding.initial.5.logs geneid_params/coding.transition.5.logs \
    geneid_params/start.logs geneid_params/acc.logs geneid_params/don.logs \
    geneid_params/intron.max > geneid_params/self.param

********************************************************************************
** ACCURATE LOCAL MAPPING **
********************************************************************************

RUNNING: local_map -n local_self -g local.genewise.gff -d geneid_params/self.param -h /path/to/CEGMA/data/hmm_profiles -i KOG genome.chunks.fa 2>output.cegma.errors

NOTE: Will use specifed local.genewise.gff file instead of running genewise
NOTE: created 21 geneid predictions

********************************************************************************
** FINAL FILTERING **
********************************************************************************

RUNNING: hmm_select -i KOG -o local_self -t 16 /path/to/CEGMA/data/hmm_profiles local_self.geneid.fa /path/to/CEGMA/data/profiles_cutoff.tbl 2>output.cegma.errors

NOTE: Found 15 geneid predictions with scores above threshold value

********************************************************************************
** CONVERTING LOCAL COORDINATES INTO GENOME-WIDE COORDINATES **
********************************************************************************

********************************************************************************
** EVALUATING RESULTS AND COMPARING TO SET OF 248 HIGHLY CONSERVED CEGS **
********************************************************************************

RUNNING: completeness local_self.hmm_select.aln /path/to/CEGMA/data/completeness_cutoff.tbl > output.completeness_report

# if this has all worked correctly, you should see the following file sizes:
% ls -l
total 272
-rw-r--r--  1 keith  staff  99177 May 14 09:49 output.cegma.dna
-rw-r--r--  1 keith  staff      0 May 14 09:49 output.cegma.errors
-rw-r--r--  1 keith  staff   5484 May 14 09:49 output.cegma.fa
-rw-r--r--  1 keith  staff   8325 May 14 09:49 output.cegma.gff
-rw-r--r--  1 keith  staff    155 May 14 09:49 output.cegma.id
-rw-r--r--  1 keith  staff   8075 May 14 09:49 output.cegma.local.gff
-rw-r--r--  1 keith  staff   1330 May 14 09:49 output.completeness_report
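
The check against the listing above can also be scripted. The following is a minimal sketch in Python, not part of CEGMA itself: the function name check_cegma_outputs and its run_dir argument are hypothetical, and only the seven file names are taken from the listing. It reports a file as a problem if it is missing or empty, and expects output.cegma.errors to be zero bytes, as shown above. It deliberately does not compare exact byte counts, on the assumption that these may differ between CEGMA versions and installed dependencies.

```python
from pathlib import Path

# File names copied from the `ls -l` listing of the sample run.
EXPECTED_FILES = [
    "output.cegma.dna",
    "output.cegma.errors",
    "output.cegma.fa",
    "output.cegma.gff",
    "output.cegma.id",
    "output.cegma.local.gff",
    "output.completeness_report",
]

# The error log is the one file that should be empty after a clean run.
ERROR_LOG = "output.cegma.errors"


def check_cegma_outputs(run_dir="."):
    """Return a list of problems found in run_dir; an empty list means none.

    Hypothetical helper: it only checks presence and emptiness,
    not the exact sizes shown in the sample listing.
    """
    problems = []
    for name in EXPECTED_FILES:
        path = Path(run_dir) / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        size = path.stat().st_size
        if name == ERROR_LOG:
            if size != 0:
                problems.append(f"non-empty error log: {name} ({size} bytes)")
        elif size == 0:
            problems.append(f"empty: {name}")
    return problems
```

Calling check_cegma_outputs() from the directory where cegma was run returns an empty list if all seven files are present, the error log is empty, and every other file has content. Any entries it does return point to the files worth inspecting first.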