FastQC official site: Babraham Bioinformatics - FastQC, A Quality Control tool for High Throughput Sequence Data
The command that worked
fastqc --noextract 201645A_200048_1_S1_L001_R1_001.fastq.gz
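The first thing to do with newly delivered sequencing data is usually quality control. FastQC is one of the more commonly used QC tools; its advantages are that it is small, easy to install, and almost effortless to operate.
FastQC can be used in two ways:
1. On Windows, through the interactive GUI. This is the most basic way to use FastQC and is very simple, but it only suits small batches of data. For data over 100 GB or in the terabyte range, try it only if you don't mind your computer crashing.
2. On Linux, from the command line, which suits QC of large batches of sequencing data.
Note that in my case, running it directly with default parameters produced an error, as follows: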
- fastqc 201645A_200048_1_S1_L001_R1_001.fastq.gz
- Error: Could not find or load main class uk.ac.babraham.FastQC.FastQCApplication
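Reading the help file gives the impression that the FastQC authors recommend Windows, because much of it is devoted to Running FastQC Interactively.
The help file also explains how to run FastQC on Linux, along with the available options: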
- To run non-interactively you simply have to specify a list of files to process
- on the commandline
-
- 1. fastqc somefile.txt someotherfile.txt (my first reading: create a txt file containing the FASTQ file names)
-
- You can specify as many files to process in a single run as you like. If you don't
- specify any files to process the program will try to open the interactive application
- which may result in an error if you're running in a non-graphical environment.
-
- There are a few extra options you can specify when running non-interactively. Full
- details of these can be found by running
-
- 2. fastqc --help
-
- By default, in non-interactive mode FastQC will create an HTML report with embedded
- graphs, but also a zip file containing individual graph files and additional data files
- containing the raw data from which plots were drawn. The zip file will not be extracted
- by default but you can enable this by adding:
-
- 3. --extract (unpack the zip file)
-
- To the launch command.
-
- If you want to save your reports in a folder other than the folder which contained
- your original FastQ files then you can specify an alternative location by setting a
- --outdir value:
-
- 4. --outdir=/some/other/dir/
-
- If you want to run fastqc on a stream of data to be read from standard input then you
- can do this by specifing 'stdin' as the name of the file to be processed and then
- streaming uncompressed fastq format data to the program. For example:
-
- zcat *fastq.gz | fastqc stdin
- If you want the results from a streamed analysis sent to a file with a name other than
- stdin then you can add a colon and put the file name you want, for example:
-
- zcat *fastq.gz | fastqc stdin:my_results
- ..would write results to my_result.html and my_results.zip.
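The non-interactive usage quoted above can be sketched as a small wrapper. This is a minimal sketch, not FastQC's own tooling: `run_fastqc_batch` and the `FASTQC_BIN` variable are hypothetical names introduced here, and the only FastQC options used are `--noextract` and `--outdir`, both taken from the help text. The sketch assumes FastQC does not create the output folder itself, so it creates it first.

```shell
# Command used to launch FastQC; overridable for testing (illustrative variable).
FASTQC_BIN="${FASTQC_BIN:-fastqc}"

# run_fastqc_batch OUTDIR FILE...
# Runs FastQC once on all given sequence files and writes reports to OUTDIR.
run_fastqc_batch() {
    local outdir="$1"
    shift
    # Assumption: the output folder must already exist, so create it up front.
    mkdir -p "$outdir" || return 1
    # --noextract keeps the zip archive packed; --outdir redirects the reports.
    "$FASTQC_BIN" --noextract --outdir="$outdir" "$@"
}

# Example call (commented out so the block only defines the function):
# run_fastqc_batch qc_reports ./*.fastq.gz
```

Passing every file in a single call matches the help text's "as many files to process in a single run as you like".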
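Following my reading of the help file, I created a file 200048.txt containing the name of the data file to check. This did not help and still failed, because FastQC parses every argument as a sequence file rather than as a list of file names, and the first line of the list does not start with '@':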
- cat 200048.txt
- 201645A_200048_1_S1_L001_R1_001.fastq.gz
-
- fastqc 200048.txt
- Failed to process 200048.txt
- uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
- at uk.ac.babraham.FastQC.Sequence.FastQFile.readNext(FastQFile.java:158)
- at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:89)
- at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
- at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
- at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:159)
- at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:121)
- at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
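If the file names really do live in a text file such as 200048.txt, the list has to be expanded into arguments before FastQC sees it. A minimal sketch under that assumption, with one file name per line; `fastqc_from_list` and `FASTQC_BIN` are hypothetical names, not part of FastQC:

```shell
# Command used to launch FastQC; overridable for testing (illustrative variable).
FASTQC_BIN="${FASTQC_BIN:-fastqc}"

# fastqc_from_list LIST_FILE
# Runs FastQC once per non-empty line of LIST_FILE, treating each line as a path.
fastqc_from_list() {
    local list_file="$1" fq
    # Read the list on file descriptor 3 so the launched program cannot consume it via stdin.
    # The "|| [ -n "$fq" ]" part also handles a last line without a trailing newline.
    while IFS= read -r fq <&3 || [ -n "$fq" ]; do
        [ -n "$fq" ] || continue          # skip blank lines
        "$FASTQC_BIN" --noextract "$fq"
    done 3< "$list_file"
}

# Example call (commented out so the block only defines the function):
# fastqc_from_list 200048.txt
```

Reading line by line keeps file names with spaces intact, at the cost of starting FastQC once per file.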
Reading the output of fastqc --help more carefully, I found this option:
- --noextract Do not uncompress the output file after creating it. You
- should set this option if you do not wish to uncompress
- the output when running in non-interactive mode.
After adding this option, the run succeeded:
- fastqc --noextract 201645A_200048_1_S1_L001_R1_001.fastq.gz
- Started analysis of 201645A_200048_1_S1_L001_R1_001.fastq.gz
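To confirm that a run finished, it helps to know which report files to look for. The sketch below assumes FastQC's usual naming, where the compression and FASTQ suffixes of the input name are replaced by `_fastqc` and the reports are the matching `.html` and `.zip` files; `fastqc_report_base` is a hypothetical helper, not a FastQC command.

```shell
# fastqc_report_base INPUT_PATH
# Prints the expected report base name (without .html/.zip) for an input file.
fastqc_report_base() {
    local name
    name=$(basename "$1")
    # Strip a compression suffix first, then the sequence-format suffix.
    name="${name%.gz}"
    name="${name%.bz2}"
    name="${name%.fastq}"
    name="${name%.fq}"
    printf '%s_fastqc\n' "$name"
}

# Example check (commented out so the block only defines the function):
# base=$(fastqc_report_base 201645A_200048_1_S1_L001_R1_001.fastq.gz)
# ls -l "$base.html" "$base.zip"
```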
Note: FastQC depends on Java, so a JRE must be installed first, whether you use it on Windows or on Linux.
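Since FastQC needs Java, a quick check that both commands are on the PATH can save a failed run. A minimal sketch; `have_cmd` and `check_fastqc_prereqs` are hypothetical helper names, and the check only confirms that the commands can be found, not that the Java version is suitable.

```shell
# have_cmd NAME: succeeds if NAME resolves to a command in the current shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_fastqc_prereqs: reports which of java and fastqc are missing.
# Returns 0 when both are found, 1 otherwise.
check_fastqc_prereqs() {
    local missing=0 tool
    for tool in java fastqc; do
        if ! have_cmd "$tool"; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so the block only defines the functions):
# check_fastqc_prereqs && java -version
```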