NAME
gt-ltrdigest - Identifies and annotates sequence features in LTR retrotransposon candidates.
SYNOPSIS
gt ltrdigest [option …] gff3_file
DESCRIPTION
- -outfileprefix [string]
- 
prefix for output files (e.g. foo will create files called foo_*.csv and foo_*.fas). Omit this option for GFF3 output only.
- -metadata [yes|no]
- 
output metadata (run conditions) to separate file (default: yes) 
- -seqnamelen [value]
- 
set maximal length of sequence names in FASTA headers (e.g. for clustalw or similar tools) (default: 20) 
- -pptlen [start end]
- 
required PPT length range (default: [8..30]) 
- -uboxlen [start end]
- 
required U-box length range (default: [3..30]) 
- -uboxdist [value]
- 
allowed U-box distance range from PPT (default: 0) 
- -pptradius [value]
- 
radius around beginning of 3' LTR to search for PPT (default: 30) 
- -pptrprob [value]
- 
purine emission probability inside PPT (default: 0.970000) 
- -pptyprob [value]
- 
pyrimidine emission probability inside PPT (default: 0.030000) 
- -pptgprob [value]
- 
background G emission probability outside PPT (default: 0.250000) 
- -pptcprob [value]
- 
background C emission probability outside PPT (default: 0.250000) 
- -pptaprob [value]
- 
background A emission probability outside PPT (default: 0.250000) 
- -ppttprob [value]
- 
background T emission probability outside PPT (default: 0.250000) 
- -pptuprob [value]
- 
U/T emission probability inside U-box (default: 0.910000) 
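
Before overriding the emission probabilities above, it can help to check that each set still forms a proper probability distribution. The following is a minimal sketch and not part of ltrdigest: the helper name `check_emissions` is hypothetical, and the grouping of options (purine with pyrimidine inside the PPT, the four background values outside it) is an assumption read off the option descriptions. Whether the tool itself enforces these sums is not stated here.

```python
import math

# Defaults copied from the option descriptions above.
PPT_INSIDE = {"pptrprob": 0.97, "pptyprob": 0.03}
BACKGROUND = {"pptgprob": 0.25, "pptcprob": 0.25,
              "pptaprob": 0.25, "ppttprob": 0.25}


def check_emissions(probs):
    """Return True if every value lies in [0, 1] and the values sum to 1.

    Assumption: each dict passed in is meant to be one complete
    distribution; this grouping is not confirmed by the man page.
    """
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    return in_range and math.isclose(sum(probs.values()), 1.0, abs_tol=1e-9)


if __name__ == "__main__":
    print(check_emissions(PPT_INSIDE))
    print(check_emissions(BACKGROUND))
```

With the documented defaults both groups pass the check; a custom pair such as 0.9 and 0.03 would not.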
- -trnas [filename]
- 
tRNA library in multiple FASTA format for PBS detection. Omit this option to disable PBS search.
- -pbsalilen [start end]
- 
required PBS/tRNA alignment length range (default: [11..30]) 
- -pbsoffset [start end]
- 
allowed PBS offset from LTR boundary range (default: [0..5]) 
- -pbstrnaoffset [start end]
- 
allowed PBS/tRNA 3' end alignment offset range (default: [0..5]) 
- -pbsmaxedist [value]
- 
maximal allowed PBS/tRNA alignment unit edit distance (default: 1) 
- -pbsradius [value]
- 
radius around end of 5' LTR to search for PBS (default: 30) 
- -hmms
- 
profile HMM models for domain detection in HMMER3 format (separate by spaces, finish with --). Omit this option to disable pHMM search.
- -pdomevalcutoff [value]
- 
global E-value cutoff for pHMM search (default: 1E-6)
- -pdomcutoff […]
- 
model-specific score cutoff; choose from TC (trusted cutoff) | GA (gathering cutoff) | NONE (no cutoffs) (default: NONE)
- -aliout [yes|no]
- 
output pHMM to amino acid sequence alignments (default: no) 
- -aaout [yes|no]
- 
output amino acid sequences for protein domain hits (default: no) 
- -allchains [yes|no]
- 
output features from all chains and unchained features, labeled with chain numbers (default: no) 
- -maxgaplen [value]
- 
maximal allowed gap size between fragments (in amino acids) when chaining pHMM hits for a protein domain (default: 50) 
- -force_recreate [yes|no]
- 
force recreation of hmmpressed profiles (default: no) 
- -pbsmatchscore [value]
- 
match score for PBS/tRNA alignments (default: 5) 
- -pbsmismatchscore [value]
- 
mismatch score for PBS/tRNA alignments (default: -10) 
- -pbsinsertionscore [value]
- 
insertion score for PBS/tRNA alignments (default: -20) 
- -pbsdeletionscore [value]
- 
deletion score for PBS/tRNA alignments (default: -20) 
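
To see how the four PBS/tRNA scoring options above combine, the sketch below scores a given gapped alignment column by column using the documented defaults. It is an illustration only, not ltrdigest's alignment code: the function name `alignment_score`, the simple additive per-column scoring, and the convention for which row's gap counts as an insertion or a deletion are all assumptions.

```python
# Default scores copied from the option descriptions above.
MATCH, MISMATCH, INSERTION, DELETION = 5, -10, -20, -20


def alignment_score(pbs_row, trna_row, gap="-"):
    """Score two equal-length gapped alignment rows column by column."""
    if len(pbs_row) != len(trna_row):
        raise ValueError("alignment rows must have equal length")
    score = 0
    for a, b in zip(pbs_row, trna_row):
        if a == gap and b == gap:
            raise ValueError("a column may not contain two gaps")
        if b == gap:
            # Extra base in the PBS row: treated as an insertion (assumed convention).
            score += INSERTION
        elif a == gap:
            # Base missing from the PBS row: treated as a deletion (assumed convention).
            score += DELETION
        elif a == b:
            score += MATCH
        else:
            score += MISMATCH
    return score


if __name__ == "__main__":
    # Eleven matching columns, the default minimum of -pbsalilen.
    print(alignment_score("ACGTACGTACG", "ACGTACGTACG"))
```

Under these assumptions a single mismatch costs as much as two matches earn, and a single gap as much as four, so the defaults strongly favour ungapped, near-exact alignments.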
- -v [yes|no]
- 
be verbose (default: no) 
- -o [filename]
- 
redirect output to specified file (default: undefined) 
- -gzip [yes|no]
- 
write gzip compressed output file (default: no) 
- -bzip2 [yes|no]
- 
write bzip2 compressed output file (default: no) 
- -force [yes|no]
- 
force writing to output file (default: no) 
- -seqfile [filename]
- 
set the sequence file from which to take the sequences (default: undefined) 
- -encseq [filename]
- 
set the encoded sequence indexname from which to take the sequences (default: undefined) 
- -seqfiles
- 
set the sequence files from which to extract the features; use -- to terminate the list of sequence files
- -matchdesc [yes|no]
- 
search the sequence descriptions from the input files for the desired sequence IDs (in GFF3), reporting the first match (default: no) 
- -matchdescstart [yes|no]
- 
exactly match the sequence descriptions from the input files for the desired sequence IDs (in GFF3) from the beginning to the first whitespace (default: no) 
- -usedesc [yes|no]
- 
use sequence descriptions to map the sequence IDs (in GFF3) to actual sequence entries. If a description contains a sequence range (e.g., III:1000001..2000000), the first part is used as sequence ID (III) and the first range position as offset (1000001) (default: no) 
- -regionmapping [string]
- 
set file containing sequence-region to sequence file mapping (default: undefined) 
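
The range handling described for -usedesc can be sketched as follows. This is a hedged illustration of the rule as worded above, not the tool's parser: the function name `parse_description` and the regular expression are assumptions, and descriptions without a range are simply reported as `None` because their treatment is not specified here.

```python
import re

# Matches "<id>:<start>..<end>" anywhere in a description,
# e.g. "III:1000001..2000000". The pattern itself is an assumption.
_RANGE = re.compile(r"(\S+?):(\d+)\.\.(\d+)")


def parse_description(desc):
    """Return (sequence_id, offset) if desc contains a range, else None.

    The part before the colon is used as the sequence ID and the first
    range position as the offset, following the -usedesc description.
    """
    match = _RANGE.search(desc)
    if match is None:
        return None
    seqid, start, _end = match.groups()
    return seqid, int(start)


if __name__ == "__main__":
    print(parse_description("III:1000001..2000000"))
```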
- -help
- 
display help for basic options and exit 
- -help+
- 
display help for all options and exit 
- -version
- 
display version information and exit 
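
As an illustration of how the options fit the synopsis, the sketch below assembles an argument list in which the -hmms list is finished with -- and the GFF3 file comes last. The helper name `ltrdigest_argv` and all file names are hypothetical; only the option names and the `gt ltrdigest [option …] gff3_file` layout are taken from this page. The sketch builds the list without running anything.

```python
def ltrdigest_argv(gff3_file, seqfile=None, hmms=(), trnas=None,
                   outfileprefix=None):
    """Assemble an argument list following 'gt ltrdigest [option ...] gff3_file'."""
    argv = ["gt", "ltrdigest"]
    if outfileprefix is not None:
        argv += ["-outfileprefix", outfileprefix]
    if trnas is not None:
        # Leaving -trnas out disables the PBS search.
        argv += ["-trnas", trnas]
    if hmms:
        # The -hmms list is space-separated and must be finished with "--".
        argv += ["-hmms", *hmms, "--"]
    if seqfile is not None:
        argv += ["-seqfile", seqfile]
    argv.append(gff3_file)
    return argv


if __name__ == "__main__":
    # All file names below are placeholders.
    print(ltrdigest_argv("candidates.gff3", seqfile="genome.fas",
                         hmms=["rt.hmm", "int.hmm"], trnas="trnas.fas",
                         outfileprefix="foo"))
```

If GenomeTools is installed, the resulting list could be passed to `subprocess.run(argv, check=True)`; passing a list rather than a shell string avoids quoting problems with file names.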
REPORTING BUGS
Report bugs to https://github.com/genometools/genometools/issues.