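Mapping colors from a protein multiple sequence alignment onto a protein 3D structure
Use case
A multiple sequence alignment (MSA) of protein amino-acid sequences reflects the evolutionary relationships between proteins. It also helps identify which parts of a protein structure are conserved and which are prone to mutation.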
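To make the idea of conserved versus variable positions concrete, here is a minimal Python sketch that scores each alignment column by the fraction of sequences sharing the most common residue. The toy alignment and the function name `column_conservation` are illustrative assumptions. ConSurf itself relies on a more elaborate evolutionary-rate model, not on this simple frequency.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return, for each alignment column, the fraction of sequences
    that carry the most common non-gap residue in that column."""
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned_seqs):
        residues = [r for r in column if r != "-"]
        if not residues:
            # A column made only of gaps carries no residue information
            scores.append(0.0)
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Hypothetical toy alignment (not real protein data)
toy_alignment = ["MKVLA", "MKILA", "MRVL-"]
print(column_conservation(toy_alignment))
```

Columns scoring 1.0 are fully conserved in the toy data, while lower scores mark positions where the sequences differ or contain gaps.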
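Protein multiple sequence alignment
(1) First, search the UniProt website for the protein sequences to be aligned.
(2) Many programs and web servers can perform protein multiple sequence alignment, and they are easy to find with a web search. Clustal Omega is used here. Save the output as a .fas file, as shown in the figure below: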

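Before uploading the alignment, it can help to confirm that the saved .fas file is a valid aligned FASTA, meaning every record has the same length once gaps are counted. The sketch below is plain Python with no external dependencies. The file name `alignment.fas` and the helper names `parse_fasta` and `is_aligned` are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def is_aligned(records):
    """True if there is at least one record and all sequences share one length."""
    return len(records) > 0 and len({len(seq) for _, seq in records}) == 1


# Hypothetical example content; in practice read it from the saved file, e.g.
# with open("alignment.fas") as handle: text = handle.read()
example = """>seq1
MKV-LA
>seq2
MKILLA
"""
records = parse_fasta(example)
print(len(records), is_aligned(records))
```

If `is_aligned` returns False, the file most likely holds unaligned sequences and not the alignment output.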
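Mapping the multiple sequence alignment onto the protein 3D structure
(1) Download the most complete, highest-resolution crystal structure of the protein from the RCSB PDB (if no structure is available, predict one with AlphaFold2). In PyMOL, remove the solvent and small molecules, compare the sequence against the UniProt sequence by eye, and save a single chain as a .pdb file.
(2) Use the ConSurf web server to map the multiple sequence alignment onto the protein 3D structure. The steps are shown in the figures below:
Input:

Output: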

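The cleanup in step (1) is done interactively in PyMOL. As a rough cross-check outside PyMOL, the sketch below keeps only the ATOM records of one chain from PDB-format text, which drops waters and ligands stored as HETATM, and reads off the one-letter sequence for comparison against the UniProt entry. It assumes standard fixed-column PDB formatting. The function names and the toy coordinates are hypothetical, and modified residues stored as HETATM (for example selenomethionine) would be dropped by this simple filter.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def keep_chain_atoms(pdb_text, chain_id):
    """Keep only ATOM records whose chain identifier (column 22) is chain_id."""
    return [
        line for line in pdb_text.splitlines()
        if line.startswith("ATOM") and len(line) > 21 and line[21] == chain_id
    ]


def chain_sequence(atom_lines):
    """Build a one-letter sequence from CA atoms, one letter per residue.
    Only the first alternate location (blank or 'A') is counted."""
    letters = []
    for line in atom_lines:
        if line[12:16].strip() == "CA" and line[16] in (" ", "A"):
            letters.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(letters)


# Hypothetical toy structure: two residues in chain A, one in chain B, one water
toy_pdb = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "ATOM      3  CA  LYS A   2      24.000  26.000   3.000  1.00 11.00           C",
    "ATOM      4  CA  GLY B   1      10.000  10.000  10.000  1.00 12.00           C",
    "HETATM    5  O   HOH A 101      30.000  30.000  30.000  1.00 20.00           O",
])
chain_a = keep_chain_atoms(toy_pdb, "A")
print(chain_sequence(chain_a))
```

The printed sequence can then be compared against the UniProt sequence to spot missing or mutated residues in the deposited structure.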
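Getting a high-resolution image
(1) Download the output archive and unzip it. The .html file contains the multiple sequence alignment results, and _ATOMS_section_With_ConSurf.pdb contains the color mapping information.
(2) Load _ATOMS_section_With_ConSurf.pdb into PyMOL and run the script consurf_new.py below (run consurf_new.py).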
# consurf_new.py
from pymol import cmd


# Define a Python subroutine to colour atoms by B-factor, using predefined intervals
def colour_consurf(selection="all"):
    # Colour other chains gray, while maintaining
    # oxygen in red, nitrogen in blue and hydrogen in white
    cmd.color("gray", selection)
    cmd.util.cnc()

    # These are constants
    minimum = 0.0
    maximum = 9.0
    n_colours = 9

    # Colours are calculated by dividing the RGB colours by 255
    # RGB = [[10,125,130],[75,175,190],[165,220,230],[215,240,240],[255,255,255],
    #        [250,235,245],[250,200,220],[240,125,170],[160,40,95]]
    colours = [
        [0.039215686, 0.490196078, 0.509803922],
        [0.294117647, 0.68627451, 0.745098039],
        [0.647058824, 0.862745098, 0.901960784],
        [0.843137255, 0.941176471, 0.941176471],
        [1, 1, 1],
        [0.980392157, 0.921568627, 0.960784314],
        [0.980392157, 0.784313725, 0.862745098],
        [0.941176471, 0.490196078, 0.666666667],
        [0.62745098, 0.156862745, 0.37254902]]
    bin_size = (maximum - minimum) / n_colours

    # Loop through colour intervals
    for i in range(n_colours):
        lower = minimum + (i + 1) * bin_size
        upper = lower + bin_size
        colour = colours[i]

        # Print out B-factor limits and the colour for this group
        print(lower, " - ", upper, " = ", colour)

        # Define a unique name for the atoms which fall into this group
        group = selection + "_group_" + str(i + 1)

        # Compose a selection command which will select all atoms which are
        # a) in the original selection, AND
        # b) have B factor in range lower <= b < upper
        sel_string = selection + " & ! b < " + str(lower)

        # Note: i < n_colours always holds inside this loop, so every bin
        # uses the half-open upper bound and the else branch is never taken
        if i < n_colours:
            sel_string += " & b < " + str(upper)
        else:
            sel_string += " & ! b > " + str(upper)

        # Select the atoms
        cmd.select(group, sel_string)

        # Create a new colour
        colour_name = "colour_" + str(i + 1)
        cmd.set_color(colour_name, colour)

        # Colour them
        cmd.color(colour_name, group)

    # Create new colour for insufficient sequences
    # RGB_colour = [255,255,150]
    insuf_colour = [1, 1, 0.588235294]
    cmd.set_color("insufficient_colour", insuf_colour)

    # Colour atoms with B-factor of 10 using the new colour
    cmd.select("insufficient", selection + " & b = 10")
    cmd.color("insufficient_colour", "insufficient")


# Register the function as a PyMOL command and colour everything that is loaded
cmd.extend("colour_consurf", colour_consurf)
colour_consurf()
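Because the script colors by B-factor, the ConSurf grade of each residue can also be inspected without PyMOL. The following is a minimal sketch, assuming the grades are stored as 1 to 9 in the standard B-factor columns (61 to 66) of the ATOM records, with 10 marking positions with insufficient data, as the script's selections imply. The function name `count_grades` and the toy records are hypothetical.

```python
from collections import Counter


def count_grades(pdb_text):
    """Count residues per ConSurf grade, reading the grade from the
    B-factor field (columns 61-66) of each CA atom."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            grade = int(round(float(line[60:66])))
            counts[grade] += 1
    return counts


# Hypothetical toy records (not real ConSurf output)
toy_pdb = "\n".join([
    "ATOM      1  CA  MET A   1      26.266  25.413   2.842  1.00  9.00           C",
    "ATOM      2  CA  LYS A   2      24.000  26.000   3.000  1.00  1.00           C",
    "ATOM      3  CA  GLY A   3      22.000  27.000   4.000  1.00  9.00           C",
    "ATOM      4  CA  SER A   4      20.000  28.000   5.000  1.00 10.00           C",
])
for grade, n in sorted(count_grades(toy_pdb).items()):
    label = "insufficient data" if grade == 10 else f"grade {grade}"
    print(label, n)
```

In the script, atoms with grade g fall in the interval g <= b < g + 1 and receive the color named colour_g, so this tally shows how many residues end up in each color group.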