A newer version of this record is available.

Software Open Access

roblanf/sarscov2phylo: 11-11-20

roblanf; Richard Mansfield

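Citation and reuse

Please cite this release as:

Lanfear, Rob (2020). A global phylogeny of SARS-CoV-2 sequences from GISAID. Zenodo. DOI: 10.5281/zenodo.3958883

You can access the DOI here:

If you publish a paper that uses this tree, you must still abide by the GISAID data sharing and attribution rules.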

Details

The trees in this release were generated with the following command line:

bash global_tree_gisaid_start_tree.sh -i [gisaid.fasta] -p [previous_iteration] -t 250

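  • [gisaid.fasta] is the fasta file of high-coverage, complete raw sequences from GISAID up to and including the date in the release title, as determined by the "Submission date" filter on the GISAID data feed.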

  • [previous_iteration] is the filepath of the previous release. It provides the excluded_sequences.tsv and ft_SH.tree files that serve as the starting points for the current iteration.
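
As a rough illustration of how the two arguments fit together, the Python sketch below assembles the command line above after checking that the previous release supplies the two files named in the list. It assumes that [previous_iteration] is a directory holding excluded_sequences.tsv and ft_SH.tree at its top level. The helper name build_command, the parameter name t_value, and that directory layout are hypothetical and not taken from the repository.

```python
from pathlib import Path

# Files the previous release is expected to supply (named in the list above).
REQUIRED_FILES = ("excluded_sequences.tsv", "ft_SH.tree")


def build_command(gisaid_fasta, previous_iteration, t_value=250):
    """Return the argument list for the command line shown above.

    Assumption: previous_iteration is a directory that contains the
    required files at its top level.
    """
    prev = Path(previous_iteration)
    missing = [name for name in REQUIRED_FILES if not (prev / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {prev}: {', '.join(missing)}")
    return [
        "bash", "global_tree_gisaid_start_tree.sh",
        "-i", str(gisaid_fasta),
        "-p", str(prev),
        "-t", str(t_value),
    ]
```

The returned list could then be passed to subprocess.run(cmd, check=True) from the directory that contains the script.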

Filtering statistics
sequences downloaded from GISAID
146552
//
alignment stats of global alignment
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 143902
Alignment length:    29903
Total # residues:    4288314661
Smallest:            29105
Largest:             29903
Average length:      29800.2
Average identity:    100%
//
alignment stats of global alignment after masking sites
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 143902
Alignment length:    29903
Total # residues:    4269256702
Smallest:            29036
Largest:             29675
Average length:      29667.8
Average identity:    100%
//
alignment stats after filtering out short/ambiguous sequences
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 143858
Alignment length:    29903
Total # residues:    4267954159
Smallest:            29036
Largest:             29675
Average length:      29667.8
Average identity:    100%
//
alignment stats of global alignment after trimming sites that are >50% gaps
Alignment number:    1
Format:              aligned FASTA
Number of sequences: 143858
Alignment length:    29646
Total # residues:    4257455014
Smallest:            28337
Largest:             29646
Average length:      29594.8
Average identity:    100%
//
After filtering sequences with TreeShrink
Type:   Phylogram
#nodes: 249182
#leaves:    143782
#dichotomies:   99803
#leaf labels:   143782
#inner labels:  93540
Number of new sequences added this iteration
3299 alignment_names_new.txt
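
To make the effect of each filtering step easier to see, the short Python sketch below takes the sequence counts reported above and works out how many sequences were lost at each stage. The counts are copied from the statistics. The stage labels and the helper stage_losses are illustrative names only.

```python
# Sequence counts copied from the statistics above, in pipeline order.
STAGES = [
    ("downloaded from GISAID", 146552),
    ("global alignment", 143902),
    ("after filtering short/ambiguous sequences", 143858),
    ("after TreeShrink (leaves in the tree)", 143782),
]


def stage_losses(stages):
    """Return (stage name, sequences lost relative to the previous stage)."""
    return [
        (name, prev_count - count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]


losses = stage_losses(STAGES)
total_lost = STAGES[0][1] - STAGES[-1][1]
retained = STAGES[-1][1] / STAGES[0][1]

for name, lost in losses:
    print(f"{name}: -{lost}")
print(f"total removed: {total_lost} ({retained:.2%} retained)")
```

On these figures, 2650 sequences drop out before the global alignment, 44 in the short/ambiguous filter, and 76 in TreeShrink, so 2770 of the 146552 downloaded sequences are absent from the final tree. The masking and trimming steps change the alignment columns but not the number of sequences.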
Significant changes to the scripts in this release
  • None
Notable aspects of the trees
  • None

Files (10.1 MB)
Name Size
roblanf/sarscov2phylo-11-11-20.zip 10.1 MB
md5:17a276b142a2c13b8b482392ca06ba4f
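
One way to confirm that a downloaded copy of the archive is intact is to compare its MD5 digest with the checksum listed above. The Python sketch below does this with the standard hashlib module. The function names are hypothetical, and the local path of the downloaded file is whatever the reader chose when saving it.

```python
import hashlib

# Checksum listed for the archive in the file table above.
EXPECTED_MD5 = "17a276b142a2c13b8b482392ca06ba4f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the listed checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_listed_checksum on the saved zip file should return True for an uncorrupted download.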