Classic Tools | Predicting the Deleteriousness of Missense Mutations with SIFT
SIFT
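Website 1 (official)
Developers
(2) J. Craig Venter Institute (USA), Genomic Medicine
Prediction principle
SIFT predicts whether an amino acid substitution affects protein function, based on sequence homology and the physical properties of amino acids. It can be applied to naturally occurring nonsynonymous variants (polymorphisms) and to laboratory-induced missense mutations.
Website 2 (predictions for representative species)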
https://sift.bii.a-star.edu.sg/www/SIFT4G_vcf_submit.html
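First
Species coverage
A small number of representative animals, plants, fungi, protists, and prokaryotes (E. coli only).
VCF file (the 8th column, "INFO", is required), size < 5 MB
Submit a human VCF file (other species are submitted later in this article)
From within China, the online SIFT prediction service is not pleasant to use, possibly because of network issues: waits are long, or the service simply stalls. Local prediction, introduced later in this article, works better.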
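The two upload requirements above (an 8th "INFO" column and a size below 5 MB) can be checked locally before submitting. The following is a minimal sketch, not part of SIFT: `check_vcf_for_sift` is a hypothetical helper name, and it assumes that "5M" means 5 MiB and that the VCF is uncompressed and tab-separated.

```shell
# check_vcf_for_sift FILE
# Hypothetical pre-upload check (not a SIFT tool).
# Assumption: the "5M" limit is interpreted as 5 * 1024 * 1024 bytes.
check_vcf_for_sift() {
  vcf="$1"
  max_bytes=$((5 * 1024 * 1024))

  # File size in bytes; tr strips the padding some wc versions add.
  size=$(wc -c < "$vcf" | tr -d ' ')
  if [ "$size" -ge "$max_bytes" ]; then
    echo "too large: $size bytes"
    return 1
  fi

  # Count non-header, non-empty lines with fewer than 8 tab-separated columns
  # (column 8 is INFO in the VCF layout).
  bad=$(awk -F'\t' '!/^#/ && NF > 0 && NF < 8 { n++ } END { print n + 0 }' "$vcf")
  if [ "$bad" -gt 0 ]; then
    echo "$bad data line(s) have fewer than 8 columns"
    return 1
  fi

  echo "ok"
}

# Usage (my_variants.vcf is a placeholder file name):
#   check_vcf_for_sift my_variants.vcf
```

The function only checks the column count, not whether the INFO field is meaningful; a "." placeholder in column 8 passes the check.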
Website 3 (the extended SIFT 4G: which species are covered)
SIFT Databases
If the species you study is not listed in the table below, you can build your own SIFT prediction database.
Common Name | Scientific Name |
African bush elephant | Loxodonta africana |
African malaria mosquito | Anopheles gambiae |
African rice | Oryza glumaepatula |
Alpaca | Vicugna pacos |
Amebiasis protozoan parasite * | Entamoeba histolytica |
Amborella trichopoda | Amborella trichopoda |
American pika** | Ochotona princeps |
Anthracnose fungus | Colletotrichum gloeosporioides |
Arabidopsis | Arabidopsis thaliana |
Asian rice | Oryza sativa |
Aspergillus | Aspergillus clavatus |
Aspergillus | Aspergillus flavus |
Aspergillus | Aspergillus fumigatus |
Aspergillus | Aspergillus nidulans |
Aspergillus | Aspergillus niger |
Aspergillus | Aspergillus terreus |
Atlantic cod | Gadus morhua |
Bakanae and foot rot disease fungus | Fusarium fujikuroi |
Barley | Hordeum vulgare |
Barrel clover | Medicago truncatula |
Black cottonwood | Populus trichocarpa |
Blackleg fungus | Leptosphaeria maculans |
Bigelowiella natans** | Bigelowiella natans |
Blind cave tetra | Astyanax mexicanus |
Blood fluke* | Schistosoma mansoni |
Bottlenose dolphin** | Tursiops truncatus |
Bovine | Bos taurus |
Brownbeard rice | Oryza rufipogon |
Cat | Felis catus |
Campion anther smut | Microbotryum violaceum |
Candida lipolytica | Yarrowia lipolytica |
Carolina anole | Anolis carolinensis |
Chicken | Gallus gallus |
Chinese cabbage | Brassica rapa |
Chinese softshell turtle | Pelodiscus sinensis |
Chimpanzee | Pan troglodytes |
Collared flycatcher | Ficedula albicollis |
Comb jelly | Mnemiopsis leidyi |
Common marmoset | Callithrix jacchus |
Common shrew** | Sorex araneus |
Crucifer anthracnose fungus | Colletotrichum higginsianum |
Cucumber anthracnose fungus | Colletotrichum orbiculare |
Diplogastrid nematode | Pristionchus pacificus |
Dog | Canis familiaris |
Dothistroma needle blight | Dothistroma septosporum |
E.coli | Escherichia coli |
Encapsulated yeast* | Cryptococcus neoformans |
Eremothecium gossypii | Ashbya gossypii |
European centipede | Strigamia maritima |
European hedgehog | Erinaceus europaeus |
Eye worm | Loa loa |
Ferret | Mustela putorius furo |
Filarial nematode worm* | Brugia malayi |
Fission yeast | Schizosaccharomyces japonicus |
Fission yeast | Schizosaccharomyces cryophilus |
Fission yeast | Schizosaccharomyces octosporus |
Fission yeast | Schizosaccharomyces pombe |
Fly | Drosophila ananassae |
Fly | Drosophila erecta |
Fly | Drosophila grimshawi |
Fly | Drosophila melanogaster |
Fly | Drosophila mojavensis |
Fly | Drosophila persimilis |
Fly | Drosophila pseudoobscura |
Fly | Drosophila sechellia |
Fly | Drosophila simulans |
Fly | Drosophila virilis |
Fly | Drosophila willistoni |
Fly | Drosophila yakuba |
Foxtail millet | Setaria italica |
Freshwater leech | Helobdella robusta |
Fusarium vascular wilt | Fusarium oxysporum |
Giant panda | Ailuropoda melanoleuca |
Gemmiferous Spikemoss | Selaginella moellendorffii |
Gorilla | Gorilla gorilla |
Grape seed | Vitis vinifera |
Green alga* | Chlamydomonas reinhardtii |
Green monkey | Chlorocebus sabaeus |
Grey mouse lemur | Microcebus murinus |
Grey short-tailed opossum | Monodelphis domestica |
Guinea pig | Cavia porcellus |
Guillardia theta** | Guillardia theta |
Hoffmann's two-toed sloth | Choloepus hoffmanni |
Honey bee | Apis mellifera |
Horse | Equus caballus |
Human | Homo sapiens |
Humpbacked fly | Megaselia scalaris |
Indian rice | Oryza indica |
Indian wild rice* | Oryza nivara |
Japanese rice fish | Oryzias latipes |
Jewel wasp | Nasonia vitripennis |
Kangaroo rat** | Dipodomys ordii |
Kentucky bluegrass fungus | Magnaporthe poae |
Large flying fox** | Pteropus vampyrus |
Leaf cutter ant | Atta cephalotes |
Lesser hedgehog tenrec** | Echinops telfairi |
Little brown bat | Myotis lucifugus |
Lyre-leaved rock-cress | Arabidopsis lyrata |
Maize | Zea mays |
Maize ear and stalk rot fungus | Gibberella moniliformis |
Maize anthracnose fungus | Glomerella graminicola |
Maize head smut fungus* | Sporisorium reilianum |
Maize smut* | Ustilago maydis |
Malaria parasite* | Plasmodium falciparum |
Malaria parasite* | Plasmodium vivax |
Monarch Butterfly** | Danaus plexippus |
Mosquito | Anopheles darlingi |
Mountain Pine Beetle | Dendroctonus ponderosae |
Mouse | Mus musculus |
Mycobacterium tuberculosis | Mycobacterium tuberculosis |
Mycosphaerella graminicola | Zymoseptoria tritici |
Necrotrophic fungal pathogen | Pyrenophora teres |
Nematode | Onchocerca volvulus |
Neosartorya fischeri | Neosartorya fischeri |
Nile tilapia | Oreochromis niloticus |
Nine banded armadillo | Dasypus novemcinctus |
Noble rot fungus | Botryotinia fuckeliana |
Northern greater galago | Otolemur garnettii |
Northern white-cheeked gibbon | Nomascus leucogenys |
Orangutan | Pongo abelii |
Oryza meridionalis | Oryza meridionalis |
Owl limpet** | Lottia gigantea |
Pacific transparent sea squirt | Ciona savignyi |
Pacific oyster** | Crassostrea gigas |
Parasite* | Leishmania major |
Peach | Prunus persica |
Perigord black truffle | Tuber melanosporum |
Phaeodactylum tricornutum Bohlin | Phaeodactylum tricornutum |
Philippine tarsier** | Tarsius syrichta |
Pig | Sus scrofa |
Placozoan multicellular animal | Trichoplax adhaerens |
Plant pathogen* | Albugo laibachii |
Plant pathogen | Nectria haematococca |
Plant pathogen* | Pythium irregulare |
Platypus | Ornithorhynchus anatinus |
Polychaete worm** | Capitella teleta |
Poplar leaf rust fungus | Melampsora larici-populina |
Postman butterfly | Heliconius melpomene |
Potato | Solanum tuberosum |
Potato late blight fungus | Phytophthora infestans |
Powdery mildew | Blumeria graminis |
Primate malaria parasite* | Plasmodium knowlesi |
Puffer fish | Takifugu rubripes |
Purple false brome | Brachypodium distachyon |
Rabbit | Oryctolagus cuniculus |
Rat | Rattus norvegicus |
Red bread mold | Neurospora crassa |
Red flour beetle | Tribolium castaneum |
Red imported fire ant | Solenopsis invicta |
Red spider mite | Tetranychus urticae |
Rhesus macaque | Macaca mulatta |
Rice blast fungus | Magnaporthe oryzae |
Rock hyrax | Procavia capensis |
Round worm* | Caenorhabditis brenneri |
Round worm* | Caenorhabditis briggsae |
Round worm* | Caenorhabditis remanei |
Round worm | Caenorhabditis elegans |
Sea anemone | Nematostella vectensis |
Sea lamprey | Petromyzon marinus |
Sea squirt | Ciona intestinalis |
Sheep | Ovis aries |
Silkworm | Bombyx mori |
Slime mold | Dictyostelium discoideum |
Snow-rot disease causing pathogen* | Pythium iwayamai |
Sorghum | Sorghum bicolor |
Southern house mosquito | Culex quinquefasciatus |
Southern platyfish | Xiphophorus maculatus |
Soybean | Glycine max |
Soybean stem and root rot agent* | Phytophthora sojae |
Spotted gar | Lepisosteus oculatus |
Spotted green pufferfish | Tetraodon nigroviridis |
Stem rust fungus* | Puccinia graminis |
Tammar wallaby | Macropus eugenii |
Tasmanian devil | Sarcophilus harrisii |
Termite | Zootermopsis nevadensis |
Thirteen lined ground squirrel | Ictidomys tridecemlineatus |
Three spine stickleback | Gasterosteus aculeatus |
Tomato | Solanum lycopersicum |
Toxoplasmosis protozoan parasite* | Toxoplasma gondii |
Tree shrew** | Tupaia belangeri |
Trichinosis causing parasite** | Trichinella spiralis |
Trichoderma virens | Trichoderma virens |
Trichoderma reesei | Trichoderma reesei |
Trypanosomiasis parasite* | Trypanosoma brucei |
Verticillium wilt | Verticillium dahliae |
Water flea* | Daphnia pulex |
West Indian ocean coelacanth | Latimeria chalumnae |
Western clawed frog | Xenopus tropicalis |
Wheat | Triticum urartu |
Wheat and barley crown-rot fungus | Fusarium pseudograminearum |
Wheat and barley take-all root rot fungus | Gaeumannomyces graminis |
Wheat head blight fungus | Gibberella zeae |
Wheat fungal pathogen | Phaeosphaeria nodorum |
Wheat leaf rust** | Puccinia triticina |
Wheat tan spot fungus | Pyrenophora tritici-repentis |
White mold | Sclerotinia sclerotiorum |
Wild duck | Anas platyrhynchos |
Wild turkey | Meleagris gallopavo |
Yeast | Komagataella pastoris |
Yeast | Saccharomyces cerevisiae |
Yellow fever mosquito | Aedes aegypti |
Yellow koji mold | Aspergillus oryzae |
Zebra finch | Taeniopygia guttata |
Zebra fish | Danio rerio |
* High false-positive error rate in predictions
Website 4 (multi-species, feature-enhanced SIFT)
2. The VCF file must be sorted by chromosome and position to be annotated correctly.
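Sorting a VCF by chromosome and position can be done with standard command-line tools. The following is a minimal sketch under stated assumptions: `sort_vcf` is a hypothetical helper name, the file is uncompressed and tab-separated, and plain byte-order sorting of chromosome names is assumed to be acceptable (the article does not say which chromosome order the annotator expects).

```shell
# sort_vcf IN OUT
# Hypothetical helper: keep header lines in their original order on top,
# then sort variant lines by column 1 (chromosome) and column 2 (position).
# Assumption: byte-order sorting of chromosome names is sufficient,
# so "10" sorts before "2".
sort_vcf() {
  in="$1"
  out="$2"
  tab=$(printf '\t')
  {
    grep '^#' "$in"                                            # header lines
    grep -v '^#' "$in" | LC_ALL=C sort -t "$tab" -k1,1 -k2,2n  # variant lines
  } > "$out"
}

# Usage (file names are placeholders):
#   sort_vcf input.vcf input.sorted.vcf
```

Positions are compared numerically (`-k2,2n`), so position 20 correctly comes before position 300 on the same chromosome.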
Running predictions on the Linux command line (omitted)
https://sift.bii.a-star.edu.sg/sift4g/Commandline.html
Running predictions locally on Windows (Mac omitted)
Annotate using GUI (Mac/Windows)
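1. Download the SIFT4G database for a species
2. Download the local software
Set the Java environment variable so that java can be launched from "Git bash" or "cmd"
Go to the folder containing "SIFT4G_Annotator.jar" and right-click to launch "Git bash". (Alternatively, type the commands in the Windows cmd prompt, taking care to use the correct file path.)
Opening "Git bash" in the current directory
java -version # check which Java version the environment variable points to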
# java version "1.8.0_202"
# Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
# Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)
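java -jar SIFT4G_Annotator.jar # launch the local version of SIFT
The Java graphical interface opens automatically
Launching the graphical interface from the java command line
Loading files
7. Save the local SIFT prediction results:
Saving the results
8. Compare the files before and after prediction
Number of variant lines in the VCF file after prediction: 3559 = 3608 - 49
The VCF header gains two lines:
Looking up the circular codon table: Q = Gln, K = Lys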
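The line counts compared in step 8 can be obtained with grep, which makes it easy to run the same comparison on the input and output files. The following is a minimal sketch: `vcf_line_counts` is a hypothetical helper name, and the file names in the usage comment are placeholders rather than files from this article.

```shell
# vcf_line_counts FILE
# Hypothetical helper: report how many lines are header lines (starting
# with "#") and how many are variant lines (everything else).
vcf_line_counts() {
  # grep -c exits non-zero when the count is 0; "|| true" keeps that harmless.
  header=$(grep -c '^#' "$1" || true)
  variants=$(grep -vc '^#' "$1" || true)
  echo "header=$header variants=$variants total=$((header + variants))"
}

# Usage: run on the file before and after annotation and compare.
#   vcf_line_counts input.vcf
#   vcf_line_counts input_SIFTpredictions.vcf
```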
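SIFT usage summary
No further explanation is needed; see the figure below:
Workflow for assessing mutation deleteriousness with SIFT
Link: https://pan.baidu.com/s/1-bMjndANtjiKtLMXEIs3xw
Extraction code: ysx3 (Author: 宋红卫)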