Difference between revisions of "Eland"

From Genome Technology Core (GTC) wiki - Sequencing and Microarray

Revision as of 11:27, 10 September 2009

ELAND (Efficient Large-Scale Alignment of Nucleotide Databases) is an alignment tool integrated into the Illumina-Solexa data processing package. It performs ungapped alignment of reads up to 32 bp long (Cox, unpublished). Although the core program is a binary executable, we have built wrapper scripts that run ELAND independently of the pipeline.

To run the program

Step 1: Copy the script "run_eland_v1.pl" from the "\\Gobo\wicmt_public\public_apps\ELAND" folder to your home directory.

Step 2: Copy the *_sequence.txt files from the "quality score" folders to your working directory where you have write permissions.

Step 3: Open "run_eland_v1.pl" in any text editor and modify the parameters according to the instructions in the file. Please email Sumeet Gupta, sgupta@wi.mit.edu, if you have any questions.

Step 4: Log on to your tak account.

Step 5: Run the Perl script "run_eland_v1.pl".
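
Step 2 can also be scripted. The following is a minimal sketch, assuming the sequence files sit directly inside one source folder. The function name stage_sequence_files and both directory arguments are hypothetical and are not part of the ELAND wrapper scripts.

```python
import shutil
from pathlib import Path


def stage_sequence_files(source_dir, work_dir):
    """Copy every *_sequence.txt file from source_dir into work_dir.

    Both paths are placeholders (assumptions): substitute your own
    "quality score" folder and a working directory you can write to.
    Returns the names of the files that were copied.
    """
    source = Path(source_dir)
    target = Path(work_dir)
    # Create the working directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for seq_file in sorted(source.glob("*_sequence.txt")):
        shutil.copy2(seq_file, target / seq_file.name)
        copied.append(seq_file.name)
    return copied
```

Only files whose names end in _sequence.txt are copied; anything else in the source folder is left alone.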

Output Files

1. [Prefix]_[GenomeUsed]_[SeedUsed]_eland_extended.txt - Lists all the positions a read maps to, if it matches fewer than 10 positions.

2. [Prefix]_[GenomeUsed]_[SeedUsed]_export.txt - Contains all read, quality value and alignment information for a lane of data.

3. [Prefix]_[GenomeUsed]_[SeedUsed].ylf - (Y)oung (L)ab (F)ormat files - These are used as input for the Young Lab error model for ChIP-Seq analysis.

4. [Prefix]_[GenomeUsed]_[SeedUsed]_sorted.txt - Contains all the uniquely mapped reads, sorted by position on the genome.

5. [Prefix]_[GenomeUsed]_[SeedUsed]_genomesize.xml - Summary of the genome used for the alignment.

6. TechnicalAnalysis-[Prefix]_[GenomeUsed]_[SeedUsed].xls - QC report for the ELAND alignment. For details, see http://jura.wi.mit.edu/genomecorewiki/index.php/QCOutputFormat

7. [Prefix]_[GenomeUsed]_[SeedUsed]_tagcount.txt - Lists each read with its count, its position on the genome, and the gene name if the position is in a transcribed region.
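
The naming patterns above can be turned into a quick check that a run produced every file. This is only a sketch: expected_outputs and missing_outputs are hypothetical helpers, not part of the wrapper scripts, and the prefix, genome, and seed values are assumed to be whatever you set in "run_eland_v1.pl".

```python
from pathlib import Path

# Suffixes taken from the output file list above (items 1-5 and 7).
SUFFIXES = [
    "_eland_extended.txt",
    "_export.txt",
    ".ylf",
    "_sorted.txt",
    "_genomesize.xml",
    "_tagcount.txt",
]


def expected_outputs(prefix, genome, seed):
    """Build the seven output file names for one ELAND run."""
    base = f"{prefix}_{genome}_{seed}"
    names = [base + suffix for suffix in SUFFIXES]
    # Item 6 (the QC report) puts its label in front of the base name.
    names.append(f"TechnicalAnalysis-{base}.xls")
    return names


def missing_outputs(work_dir, prefix, genome, seed):
    """Return the expected file names that are absent from work_dir."""
    work = Path(work_dir)
    return [
        name
        for name in expected_outputs(prefix, genome, seed)
        if not (work / name).exists()
    ]
```

For example, missing_outputs(".", "lane1", "hg18", "25") would list any of the seven files not found in the current directory; the three values shown are made-up placeholders.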