bionhosts.blogg.se

Xloc 000126






Question 9: How many single-exon transcripts were in the two sets?

Day8:$ cat transcripts.gtf |grep "exon_number \"2\""|wc -l
Day16:$ cat transcripts.gtf |grep "exon_number \"2\""|wc -l
$ cat transcripts.gtf | grep -v "transcipt" | cut -f9 | grep -v "exon_number \"\""|grep -v "exon_number \"\""|grep "exon_number \"1\""|wc -l
$ grep "exon" transcripts.gtf|cut -f 9|cut -d ' ' -f 2|cut -d '"' -f2|sort|cut -d '.' -f 2 |sort|uniq -d|wc -l

Question 10: How many multi-exon transcripts were reported for each set?

Run cuffcompare on the resulting cufflinks transcripts, using the reference gene annotations provided and selecting the option '-R' to consider only the reference transcripts that overlap some input transfrag. For this step, using the *.tmap files, answer the following for both sets. Use the same format as above for answers consisting of multiple parts.

X (the output lacks the reference gene IDs, possibly because the -r option was not set) $ cuffcompare -R /home/guest/Documents/Coursera_commandline/gencommand_proj4/athal_genes.gtf -o /home/guest/Documents/Coursera_commandline/gencommand_proj4/cufflink_day8/cuffcompare transcripts.gtf
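For Questions 9 and 10, one alternative to chaining grep and cut is to count the exon records per transcript directly. The following is only a sketch, not the course's reference solution: the function name exon_class_counts is made up for illustration, and it assumes a tab-separated GTF in which every exon line carries a transcript_id "..." attribute in column 9, as Cufflinks output normally does.

```shell
# Count single-exon and multi-exon transcripts in a GTF file.
# Assumption: tab-separated GTF; exon lines hold transcript_id "..." in column 9.
exon_class_counts() {
    awk -F '\t' '
        $3 == "exon" {
            # Extract the value between the quotes after transcript_id.
            if (match($9, /transcript_id "[^"]+"/)) {
                id = substr($9, RSTART + 15, RLENGTH - 16)
                exons[id]++
            }
        }
        END {
            for (id in exons) {
                if (exons[id] == 1) single++
                else multi++
            }
            printf "single=%d multi=%d\n", single, multi
        }
    ' "$1"
}
```

It would be called once per set, for example exon_class_counts cufflink_day8/transcripts.gtf, and prints both counts on one line.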


Question 4: How many spliced alignments were reported for each set?
$ samtools view accepted_hits.bam |cut -f 1,6 | grep "N"|wc -l

Question 5: How many reads were left unmapped from each set?

Assemble the aligned RNA-seq reads into genes and transcripts using cufflinks. For this portion of the analysis, answer the following questions. Same format as above for answers consisting of multiple values or parts.

$ cufflinks -o cufflink_day8/ -L day8 -p8 /home/guest/Documents/Coursera_commandline/gencommand_proj4/Day8/accepted_hits.bam
You are using Cufflinks v2.2.1, which is the most recent release.
Inspecting reads and determining fragment length distribution.
> Fragment Length Distribution: Truncated Gaussian (default)
Assembling transcripts and estimating abundances.
$ cufflinks -o cufflink_day16/ -L day16 -p8 /home/guest/Documents/Coursera_commandline/gencommand_proj4/Day16/accepted_hits.bam
Me/guest/Documents/Coursera_commandline/gencommand_proj4/Day16/accepted_hits.bam

Question 6: How many genes were generated by cufflinks for each set (Day8 and Day16)?
$ cut -f 1 genes.fpkm_tracking | grep "day"| wc -l

Question 8: How many single transcript genes were produced for both sets?
$ cut -f 1 isoforms.fpkm_tracking |grep "day" |grep -n '2$'
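For Questions 6 and 8, the gene and single-transcript-gene counts can also be derived by grouping isoforms by gene. This is a rough sketch under two assumptions: that isoforms.fpkm_tracking starts with one header row, and that gene_id sits in column 4 (the usual Cufflinks tracking layout). The function name single_isoform_genes is hypothetical.

```shell
# Count genes, and genes with exactly one isoform, in isoforms.fpkm_tracking.
# Assumptions: one header row; tab-separated; gene_id in column 4.
single_isoform_genes() {
    awk -F '\t' '
        NR > 1 { isoforms[$4]++ }          # tally isoforms per gene_id
        END {
            for (g in isoforms) {
                genes++
                if (isoforms[g] == 1) single++
            }
            printf "genes=%d single_isoform=%d\n", genes, single
        }
    ' "$1"
}
```

Running single_isoform_genes cufflink_day8/isoforms.fpkm_tracking and the same for cufflink_day16 gives one line per set; it is worth checking the header of the file first to confirm the column position.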


Include a copy of the reference genome with the name “athal.fa” in the index directory.
Align both RNA-seq data sets to the reference genome using tophat.
If multiple answers are required for one question, separate the answers

$ tophat2 -p 4 -o Day8/ -G athal_genes.gtf --transcriptome-index= athal Day8.fastq
$ tophat2 -p 4 -o Day16/ -G athal_genes.gtf --transcriptome-index= athal Day16.fastq
Of these: 10 ( 0.0%) have multiple alignments (0 have : 57985
Of these: 34 ( 0.1%) have multiple alignments (0 have >20)
99.9% overall read mapping -p 4 -o Day16/ -G athal_genes.gtf --transcriptome-index= athal Day16.fastq

Question 1: How many alignments were produced for the ‘Day8’ and ‘Day16’ RNA-seq data sets, respectively?
$ samtools view accepted_hits.bam |more|wc -l

Question 2: How many reads were mapped in each set?

Question 3: How many reads were uniquely aligned in each case?
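Questions 1 to 3 can be answered in one pass over the alignment records. The sketch below is an illustration rather than the expected course answer. It assumes single-end reads (so one read name means one read) and that the aligner writes the optional NH:i tag giving the number of reported alignments for a read; sam_summary is a made-up name.

```shell
# Summarise SAM records read from standard input.
# Assumptions: single-end reads; aligner emits the optional NH:i tag.
sam_summary() {
    awk -F '\t' '
        /^@/ { next }                      # skip header lines, if present
        int($2 / 4) % 2 == 1 { next }      # FLAG bit 0x4 set: unmapped record
        {
            alignments++
            if (!($1 in seen)) { seen[$1] = 1; reads++ }
            # Optional tags start at column 12; NH:i:1 marks a single reported hit.
            for (i = 12; i <= NF; i++)
                if ($i == "NH:i:1") { unique++; break }
        }
        END {
            printf "alignments=%d reads=%d unique=%d\n", alignments, reads, unique
        }
    '
}
```

A typical call would be samtools view accepted_hits.bam | sam_summary, run once in Day8/ and once in Day16/.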


All files are provided in the archive gencommand_.

NOTE: The genome and annotation data were obtained and modified from the Arabidopsis Information Resources (TAIR) Database, and the RNA-seq reads were extracted from GenBank’s Short Read Archive (SRA).

Create a bowtie index of the genome using bowtie2-build, with the prefix ‘athal’.
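The indexing step might look like the sketch below. The directory argument and both function names are placeholders chosen for illustration, and the check assumes a small (non-large) Bowtie 2 index, which consists of six .bt2 files sharing the prefix.

```shell
# Build a Bowtie 2 index with prefix "athal" and keep a copy of the genome
# beside it as athal.fa. The index directory is a caller-chosen placeholder.
build_athal_index() {
    genome=$1       # e.g. athal_chr.fa
    index_dir=$2    # hypothetical directory for the index files
    mkdir -p "$index_dir" &&
    cp "$genome" "$index_dir/athal.fa" &&
    bowtie2-build "$index_dir/athal.fa" "$index_dir/athal"
}

# Succeed only if all six files of a small Bowtie 2 index exist for a prefix.
index_is_complete() {
    prefix=$1
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        [ -f "$prefix.$suffix" ] || return 1
    done
}
```

After build_athal_index athal_chr.fa some_dir, calling index_is_complete some_dir/athal is a quick sanity check before starting the alignment.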


You are performing an RNA-seq experiment to determine genes that are differentially expressed at different stages in the development of Arabidopsis thaliana shoot apical meristem. You collected samples at day 8 and day 16 (files “Day8.fastq” and “Day16.fastq”), extracted and sequenced the cellular mRNA, and are now set to perform the bioinformatics analysis. The reference genome you will need for the analysis is “athal_chr.fa” and the reference gene annotations are in “”. Use default parameters unless otherwise specified. Sample command files that you can modify to create your own pipeline are provided in the file “”.





