A newer version of this record is available.

Software, Open Access

roblanf/sarscov2phylo: 5-11-20

roblanf; Richard Mansfield

Citation and reuse

Please cite this release as:

Lanfear, Rob (2020). A global phylogeny of SARS-CoV-2 sequences from GISAID. Zenodo. DOI: 10.5281/zenodo.3958883

You can access this DOI here: https://doi.org/10.5281/zenodo.3958883

If you publish a paper that uses this tree, you must still abide by the GISAID data sharing and attribution rules.

Details

The tree in this release was generated with the following command line:

bash global_tree_gisaid_start_tree.sh -i [gisaid.fasta] -p [previous_iteration] -t 250

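  • [gisaid.fasta] is the fasta file of high-coverage, complete raw sequences from GISAID, up to and including the date in the release title, as determined by the "Submission date" filter on the GISAID data feed.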

  • [previous_iteration] is the file path of the previous release. It supplies the excluded_sequences.tsv and ft_SH.tree files that serve as the starting points for the current iteration.
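Because the run depends on two files from the previous release, it can help to confirm they exist before starting. The sketch below is not part of sarscov2phylo: check_previous_iteration is a hypothetical helper, and it assumes that [previous_iteration] is a directory holding both files directly.

```shell
#!/bin/sh
# Hypothetical pre-flight check, not part of the sarscov2phylo scripts.
# Assumption: the previous release is a directory that contains both
# starting-point files named in the release notes.
check_previous_iteration() {
    prev_dir="$1"
    for f in excluded_sequences.tsv ft_SH.tree; do
        if [ ! -f "$prev_dir/$f" ]; then
            # Report the first missing file and stop.
            echo "missing: $prev_dir/$f" >&2
            return 1
        fi
    done
    echo "ok"
}
```

If the check prints "ok" for the previous release's path, the command line above can be run against that path with more confidence.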

Filtering statistics
sequences downloaded from GISAID
136871
//
alignment stats of global alignment
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 134414
Alignment length:    29903
Total # residues:    4005663633
Smallest:            29105
Largest:             29903
Average length:      29800.9
Average identity:    100%
//
alignment stats of global alignment after masking sites
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 134414
Alignment length:    29903
Total # residues:    3987748459
Smallest:            29036
Largest:             29675
Average length:      29667.7
Average identity:    100%
//
alignment stats after filtering out short/ambiguous sequences
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 134370
Alignment length:    29903
Total # residues:    3986445916
Smallest:            29036
Largest:             29675
Average length:      29667.7
Average identity:    100%
//
alignment stats of global alignment after trimming sites that are >50% gaps
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 134370
Alignment length:    29646
Total # residues:    3976491642
Smallest:            28498
Largest:             29646
Average length:      29593.6
Average identity:    100%
//
After filtering sequences with TreeShrink
Type:   Phylogram
#nodes: 233128
#leaves:    134252
#dichotomies:   93693
#leaf labels:   134252
#inner labels:  87918
Number of new sequences added this iteration
5656 alignment_names_new.txt
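The counts above imply how much was lost at each step. The following sketch only does the subtraction. The numbers are copied from the statistics in this release, the variable names and the dropped helper are illustrative, and it assumes that each tree leaf corresponds to one sequence.

```shell
#!/bin/sh
# Counts copied from the filtering statistics in this release.
downloaded=136871    # sequences downloaded from GISAID
aligned=134414       # sequences in the global alignment
filtered=134370      # after filtering out short/ambiguous sequences
in_tree=134252       # leaves remaining after TreeShrink
cols_before=29903    # alignment length before trimming
cols_after=29646     # alignment length after trimming sites that are >50% gaps

# dropped A B: print how many items were lost going from A to B.
dropped() {
    echo $(( $1 - $2 ))
}

echo "absent from global alignment: $(dropped "$downloaded" "$aligned")"
echo "short/ambiguous sequences:    $(dropped "$aligned" "$filtered")"
echo "removed with TreeShrink:      $(dropped "$filtered" "$in_tree")"
echo "alignment columns trimmed:    $(dropped "$cols_before" "$cols_after")"
```

With these figures the four differences are 2457, 44, 118 and 257, respectively.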
Significant changes to the scripts in this release
  • None
Notable aspects of the tree
  • None

Files (10.0 MB)
Name Size
roblanf/sarscov2phylo-5-11-20.zip 10.0 MB
md5:b00c585992c8af308fb968665a603718
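The md5 checksum listed for the archive can be used to confirm that a download is intact. This is a minimal sketch, assuming a system that provides the md5sum utility (GNU coreutils). verify_md5 is a hypothetical helper, and the local file name in the usage comment is an assumption about where the archive was saved.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE has the expected md5 digest.
# Assumes md5sum (GNU coreutils) is installed.
verify_md5() {
    file="$1"
    expected="$2"
    # md5sum prints "<digest>  <file>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Usage (local file name is an assumption):
#   verify_md5 sarscov2phylo-5-11-20.zip b00c585992c8af308fb968665a603718 \
#       && echo "checksum matches"
```

A non-zero exit status means the digest differs from the one listed here, in which case the archive should be downloaded again.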