Protein-protein interactions (PPIs) are central to how proteins carry out their functions. Most of our PPI knowledge has emerged from experimental studies conducted over the past 40 years. Techniques such as co-immunoprecipitation, pull-down assays, yeast two-hybrid (Y2H), bioluminescence resonance energy transfer, proximity labeling, and affinity purification coupled with mass spectrometry (AP-MS) can identify the interacting partners of proteins.
Structure-based approaches traditionally focus on predicting the 3D structures of protein complexes; with recent advances in artificial intelligence (AI), these methods are increasingly capable of predicting interacting partners for a given protein. Evolutionary principles can be integrated with network- and structure-based approaches to leverage experimental data across species.
Historically, X-ray crystallography has been the primary method for obtaining high-resolution structural data. However, recent advances in cryoelectron microscopy (cryo-EM) have significantly improved its resolution, enabling the effective determination of 3D structures for large complexes that are difficult to crystallize.
A task closely related to protein complex structure modeling is determining which pairs of proteins interact under physiological conditions.
Distinguishing interacting protein pairs from non-interacting ones across the entire proteome has long been a formidable challenge for computational methods. For example, in Escherichia coli, it is estimated that there are approximately 10,000 interacting protein pairs, while the total number of possible pairs among 4,450 proteins is close to 10 million. This results in a signal-to-noise ratio on the order of 1:1,000 and makes proteome-wide PPI screening a challenging task. Nevertheless, the accumulation of protein sequence and structure data and the development of computational methods have made structural biology at the proteome scale possible.
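The arithmetic behind the 1:1,000 estimate can be checked with a short calculation. This is only a sketch using the figures quoted above (4,450 proteins, roughly 10,000 interacting pairs); the variable names are illustrative, and self-pairs (homodimers) are assumed to be excluded from the count.

```python
from math import comb

# Figures quoted in the text for Escherichia coli.
n_proteins = 4450
n_interacting = 10_000  # approximate estimate, not an exact count

# Number of unordered pairs of distinct proteins: n * (n - 1) / 2.
total_pairs = comb(n_proteins, 2)

# How many candidate pairs exist for each true interacting pair.
ratio = total_pairs / n_interacting

print(f"{total_pairs:,} possible pairs")
print(f"about 1 interacting pair per {ratio:.0f} candidate pairs")
```

The pair count grows quadratically with proteome size, so the imbalance between interacting and non-interacting pairs becomes more severe for larger proteomes unless the number of true interactions grows equally fast.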
In vivo protein crosslinking
In vivo crosslinking can be used to characterize protein interactions and ligand-receptor interactions irrespective of treatment conditions. In vivo crosslinkers of various sizes are available to target surface and intracellular proteins for analysis by different methods, such as immunoprecipitation (IP), co-immunoprecipitation (Co-IP), chromatin immunoprecipitation (ChIP), electrophoretic mobility shift assay (EMSA), western blot, immunofluorescence (IF), and immunohistochemistry (IHC).
Protein functional groups
Despite the complexity of protein structure, including the composition and sequence of 20 different amino acids, only a small number of protein functional groups comprise selectable targets for practical bioconjugation methods. In fact, just four protein chemical targets account for the vast majority of crosslinking and chemical modification techniques:
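- Primary amines (–NH2): This group exists at the N-terminus of each polypeptide chain and in the side chain of lysine (Lys, K).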
- Carboxyls (–COOH): This group exists at the C-terminus of each polypeptide chain and in the side chains of aspartic acid (Asp, D) and glutamic acid (Glu, E). Like primary amines, carboxyls are usually on the surface of protein structure.
- Sulfhydryls (–SH): This group exists in the side chain of cysteine (Cys, C). Often, as part of a protein's secondary or tertiary structure, cysteines are joined together between their side chains via disulfide bonds (–S–S–). These must be reduced to sulfhydryls to make them available for crosslinking by most types of reactive groups.
- Carbonyls (–CHO): Ketone or aldehyde groups can be created in glycoproteins by oxidizing the polysaccharide post-translational modifications (glycosylation) with sodium meta-periodate.
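The sequence-encoded targets in the list above can be tallied directly from a protein sequence. The following is a minimal sketch, not a bioconjugation tool: the function name `count_target_groups` and the example peptide are hypothetical, and the mapping assumes the standard assignment of primary amines to lysine side chains and the N-terminus. Carbonyls are left out because they are created by oxidizing glycans rather than read from the sequence, and the cysteine count is an upper bound, since cysteines held in disulfide bonds are unavailable until reduced.

```python
from collections import Counter

# Side-chain functional groups by one-letter residue code.
# Carbonyls are omitted: they come from oxidized glycans, not the sequence.
SIDE_CHAIN_GROUPS = {
    "K": "primary amine",  # lysine
    "D": "carboxyl",       # aspartic acid
    "E": "carboxyl",       # glutamic acid
    "C": "sulfhydryl",     # cysteine (upper bound; may be in disulfides)
}


def count_target_groups(sequence: str) -> dict[str, int]:
    """Count candidate crosslinking targets in one polypeptide chain."""
    counts = Counter()
    for residue in sequence.upper():
        group = SIDE_CHAIN_GROUPS.get(residue)
        if group:
            counts[group] += 1
    if sequence:
        counts["primary amine"] += 1  # N-terminal amine
        counts["carboxyl"] += 1       # C-terminal carboxyl
    return dict(counts)


example = "MKCDEKAC"  # hypothetical peptide, for illustration only
print(count_target_groups(example))
```

A count like this says nothing about accessibility: whether a given group is exposed on the protein surface and reactive depends on the folded structure, which the sequence alone does not reveal.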