Sunday, January 14, 2018

software - How to unify all the subunits in a PDB file?


A PDB may contain a TER token. For example, hemoglobin has 4 subunits separated by a TER identifier.


If we take the PDB file of hemoglobin's biological assembly, we see:


ATOM   1067  NH1 ARG A 141      26.176   8.362  17.810  1.00 11.11           N
ATOM   1068  NH2 ARG A 141      24.650   9.068  16.200  1.00 13.86           N
ATOM   1069  OXT ARG A 141      26.697  14.784  20.720  1.00 10.99           O
TER    1070      ARG A 141
ATOM   1071  N   HIS B   2       3.670 -13.643  19.447  1.00 38.58           N
ATOM   1072  CA  HIS B   2       2.695 -14.734  19.744  1.00 32.83           C
ATOM   1073  C   HIS B   2       1.379 -14.140  20.199  1.00 30.79           C

Is there an easy way to merge all the subunits into a single VMD frame?



Answer



If you don't care about residue and atom numbering (which you probably do), then sed '/^TER/d' < 1A3N.pdb > 1A3N_combined.pdb will do. If the atom entries must be renumbered as well, it looks like MDAnalysis will strip the TER entries automatically when it writes the structure back out:


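#!/usr/bin/env python2

import MDAnalysis

# Load the structure; all chains end up in a single Universe.
u = MDAnalysis.Universe('1A3N.pdb')

# Write every atom back out to one PDB file.
with MDAnalysis.Writer('1A3N_combined.pdb') as writer:
    writer.write(u)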

So,


ATOM   1067  NH1 ARG A 141      26.176   8.362  17.810  1.00 11.11           N
ATOM   1068  NH2 ARG A 141      24.650   9.068  16.200  1.00 13.86           N
ATOM   1069  OXT ARG A 141      26.697  14.784  20.720  1.00 10.99           O
TER    1070      ARG A 141
ATOM   1071  N   HIS B   2       3.670 -13.643  19.447  1.00 38.58           N
ATOM   1072  CA  HIS B   2       2.695 -14.734  19.744  1.00 32.83           C
ATOM   1073  C   HIS B   2       1.379 -14.140  20.199  1.00 30.79           C

becomes


ATOM   1067  NH1 ARG A 141      26.176   8.362  17.810  1.00 11.11      A    N
ATOM   1068  NH2 ARG A 141      24.650   9.068  16.200  1.00 13.86      A    N
ATOM   1069  OXT ARG A 141      26.697  14.784  20.720  1.00 10.99      A    O
ATOM   1070  N   HIS B   2       3.670 -13.643  19.447  1.00 38.58      B    N
ATOM   1071  CA  HIS B   2       2.695 -14.734  19.744  1.00 32.83      B    C
ATOM   1072  C   HIS B   2       1.379 -14.140  20.199  1.00 30.79      B    C

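where the chain IDs now also appear in the segment identifier field (columns 73-76, just before the element symbol) and the atom serial numbers run continuously across the former TER record. Note that this does not renumber the residues: HIS is still residue 2 of chain B. To renumber them, modify the script; resids is a property, so assigning to it in place updates the residues properly: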


#!/usr/bin/env python2

import numpy as np
import MDAnalysis

u = MDAnalysis.Universe('1A3N.pdb')

# Number the residues 1..N consecutively across all chains.
u.residues.resids = np.arange(1, 1 + len(u.residues.resids))

with MDAnalysis.Writer('1A3N_combined.pdb') as writer:
    writer.write(u)

Finally:


ATOM   1067  NH1 ARG A 141      26.176   8.362  17.810  1.00 11.11      A    N
ATOM   1068  NH2 ARG A 141      24.650   9.068  16.200  1.00 13.86      A    N
ATOM   1069  OXT ARG A 141      26.697  14.784  20.720  1.00 10.99      A    O
ATOM   1070  N   HIS B 142       3.670 -13.643  19.447  1.00 38.58      B    N
ATOM   1071  CA  HIS B 142       2.695 -14.734  19.744  1.00 32.83      B    C
ATOM   1072  C   HIS B 142       1.379 -14.140  20.199  1.00 30.79      B    C
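
If MDAnalysis is not available, roughly the same result can be sketched in plain Python by editing the fixed-width columns directly. This is only an illustration, not what MDAnalysis does internally: the function name merge_chains is made up for this example, it assumes well-formed ATOM/HETATM records in the standard PDB column layout (serial in columns 7-11, residue number in columns 23-26), it does not fill in the segment ID column, and it does not handle files with more than 99999 atoms or 9999 residues.

```python
def merge_chains(lines):
    """Drop TER records and renumber atoms and residues consecutively.

    `lines` is an iterable of PDB lines; the result is a list of lines
    without trailing newlines. (Hypothetical helper for illustration.)
    """
    merged = []
    serial = 0       # running atom serial number
    resid = 0        # running residue number
    prev_key = None  # identifies the residue of the previous atom

    for raw in lines:
        line = raw.rstrip("\n")
        record = line[:6].strip()

        if record == "TER":
            # Same effect as sed '/^TER/d': the chain break is removed.
            continue

        if record in ("ATOM", "HETATM"):
            # Residue name, chain ID, residue number and insertion code
            # (columns 18-27) together identify one residue.
            key = line[17:27]
            if key != prev_key:
                resid += 1
                prev_key = key
            serial += 1
            # Overwrite columns 7-11 (serial) and 23-26 (residue number),
            # leaving every other column untouched.
            line = f"{line[:6]}{serial:>5}{line[11:22]}{resid:>4}{line[26:]}"

        merged.append(line)
    return merged


# Small in-memory demonstration using records from the excerpt above.
demo = [
    "ATOM   1069  OXT ARG A 141      26.697  14.784  20.720  1.00 10.99           O",
    "TER    1070      ARG A 141",
    "ATOM   1071  N   HIS B   2       3.670 -13.643  19.447  1.00 38.58           N",
    "ATOM   1072  CA  HIS B   2       2.695 -14.734  19.744  1.00 32.83           C",
]
for out_line in merge_chains(demo):
    print(out_line)
```

On this four-line excerpt the counters start at 1 because only a fragment is supplied; on the complete file they run over every ATOM and HETATM record, so the first residue of chain B follows directly after the last residue of chain A. To process a real file, pass an open file handle to merge_chains and write the returned lines out joined with newlines.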
