PDBx files offer two 'variants' of chain and residue IDs: the auth and label columns. Currently the auth columns are used by default for creating an AtomArray, to keep compatibility with the PDB format, which also uses the author annotations. However, the label fields are much more consistent and are referenced throughout different PDBx categories.
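To make the difference concrete, here is a minimal sketch in plain Python. It is not Biotite's implementation: the rows are made-up values in the style of the atom_site category, and select_annotations() is a hypothetical helper that only mimics the effect of the use_author_fields switch.

```python
# Made-up atom_site-style rows, for illustration only.
ATOM_SITE_ROWS = [
    {
        "label_asym_id": "A", "auth_asym_id": "H",
        "label_seq_id": "1", "auth_seq_id": "24",
        "label_comp_id": "GLY", "auth_comp_id": "GLY",
    },
    {
        "label_asym_id": "A", "auth_asym_id": "H",
        "label_seq_id": "2", "auth_seq_id": "25",
        "label_comp_id": "SER", "auth_comp_id": "SER",
    },
]


def select_annotations(rows, use_author_fields=True):
    """Hypothetical helper: pick (chain ID, residue ID, residue name)
    from either the auth_* or the label_* columns of each row."""
    prefix = "auth" if use_author_fields else "label"
    return [
        (
            row[f"{prefix}_asym_id"],
            row[f"{prefix}_seq_id"],
            row[f"{prefix}_comp_id"],
        )
        for row in rows
    ]


# Same atoms, two different sets of annotations
print(select_annotations(ATOM_SITE_ROWS, use_author_fields=True))
print(select_annotations(ATOM_SITE_ROWS, use_author_fields=False))
```

With the author fields the chain is reported as H with residue IDs 24 and 25; with the label fields the same atoms are reported as chain A with residue IDs 1 and 2.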
As the PDB format has now been deprecated for a long time, I propose to change the default to the label columns, at the cost of consistency with the PDB format.
However, one issue remains: hetero residues that are outside a sequence have a label_seq_id of '.'. If, in addition, two such consecutive residues also have the same residue name and chain ID, the residue-based functionality of Biotite cannot distinguish them as two separate residues (see also #553). So preferably this problem should be tackled before moving to label fields.
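The problem can be reproduced with a small plain-Python sketch. It is a simplified stand-in, not Biotite's actual code: it assumes that residue boundaries are detected from changes in chain ID, residue ID and residue name between consecutive atoms. count_residues() is a hypothetical helper and the water annotations are made up.

```python
def count_residues(atoms):
    """Count residues by detecting a change in
    (chain ID, residue ID, residue name) between consecutive atoms.

    Each atom is a tuple (chain_id, res_id, res_name, atom_name).
    """
    count = 0
    previous = None
    for chain_id, res_id, res_name, _atom_name in atoms:
        key = (chain_id, res_id, res_name)
        if key != previous:
            count += 1
            previous = key
    return count


# Two water molecules (made-up values), annotated with auth fields ...
waters_auth = [("A", "301", "HOH", "O"), ("A", "302", "HOH", "O")]
# ... and with label fields, where label_seq_id is '.' for both
waters_label = [("B", ".", "HOH", "O"), ("B", ".", "HOH", "O")]

print(count_residues(waters_auth))   # 2
print(count_residues(waters_label))  # 1
```

With the label fields both waters share the same chain ID, residue ID and residue name, so the second one is never recognized as the start of a new residue.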
Furthermore, auxiliary functions that use chain IDs should refer to label_asym_id instead.
So in summary, these are the working items:
- use_author_fields=True in get_structure() and get_assembly()
- label_asym_id as keys in the return values of get_sequence and get_sse()