Protein Secondary Structure Prediction Using Transformers

Reading time: 1 minute

📝 Original Info

  • Title: Protein Secondary Structure Prediction Using Transformers
  • ArXiv ID: 2512.08613
  • Date: 2025-12-09
  • Authors: Manzi Kevin Maxime

📝 Abstract

Predicting protein secondary structures such as alpha helices, beta sheets, and coils from amino acid sequences is critical for understanding protein function. A transformer-based model is presented, applying attention mechanisms to protein sequence data for structural motif prediction. A sliding-window data augmentation technique is applied to the CB513 dataset to expand the training data. The transformer demonstrates strong potential in generalizing across variable-length sequences and capturing both local and long-range residue interactions.
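
The sliding-window augmentation is only named in this summary, so the snippet below is a minimal sketch of how such a scheme could work in Python: a sequence and its per-residue labels (H/E/C) are cut into overlapping fixed-length fragments, each of which becomes an extra training example. The `window_size` and `stride` values, and the rule of keeping short sequences whole, are assumptions for illustration rather than details from the paper.

```python
from typing import List, Tuple

def sliding_window_augment(
    sequence: str,
    labels: str,
    window_size: int = 64,   # illustrative value, not from the paper
    stride: int = 16,        # illustrative value, not from the paper
) -> List[Tuple[str, str]]:
    """Cut a protein sequence and its per-residue labels (H/E/C)
    into overlapping fixed-length windows."""
    assert len(sequence) == len(labels), "expected one label per residue"
    # Sequences shorter than one window are kept whole (assumed behavior).
    if len(sequence) <= window_size:
        return [(sequence, labels)]
    return [
        (sequence[start:start + window_size], labels[start:start + window_size])
        for start in range(0, len(sequence) - window_size + 1, stride)
    ]

# Toy fragment with matching secondary-structure labels.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
lab = "CCHHHHHHHHHHCCEEEECCHHHHHHHHHHCCC"
for window_seq, window_lab in sliding_window_augment(seq, lab, window_size=16, stride=8):
    print(window_seq, window_lab)
```

Each overlapping window adds a training example, which is one way to enlarge a small benchmark such as CB513.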

📄 Full Content

Proteins are essential biological molecules whose functions depend on their three-dimensional structures. A key structural level is the secondary structure, comprising alpha helices (H), beta sheets (E), and coils (C). Predicting these motifs from amino acid sequences is a fundamental challenge in bioinformatics, as it enables insights into protein folding and function. Traditional methods often fail to capture long-range dependencies between residues. This study applies a transformer model that uses self-attention to predict secondary structures (H, C, E) directly from sequences, aiming to improve accuracy and generalization.
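
The summary does not give the architecture or hyperparameters, so the following is a minimal sketch, assuming a PyTorch encoder-only transformer that classifies each residue into one of the three states (H, E, C). The vocabulary mapping, embedding size, layer count, and learned positional encoding are all illustrative choices, not the paper's.

```python
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"                          # 20 standard residues
AA_TO_IDX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}   # index 0 reserved for padding
CLASSES = ["H", "E", "C"]                                     # helix, sheet, coil

class SecondaryStructureTransformer(nn.Module):
    """Encoder-only transformer producing one H/E/C logit vector per residue.
    All hyperparameters are illustrative assumptions."""

    def __init__(self, d_model=128, nhead=4, num_layers=4, max_len=512):
        super().__init__()
        self.embed = nn.Embedding(len(AMINO_ACIDS) + 1, d_model, padding_idx=0)
        self.pos = nn.Embedding(max_len, d_model)             # learned positional encoding
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=4 * d_model, batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, len(CLASSES))          # per-residue class logits

    def forward(self, tokens):                                # tokens: (batch, seq_len) int64
        positions = torch.arange(tokens.size(1), device=tokens.device)
        x = self.embed(tokens) + self.pos(positions)
        pad_mask = tokens.eq(0)                               # True where padded
        x = self.encoder(x, src_key_padding_mask=pad_mask)    # self-attention over residues
        return self.head(x)                                   # (batch, seq_len, 3)

# Toy usage: predict a secondary-structure label for every residue of one sequence.
seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
tokens = torch.tensor([[AA_TO_IDX[aa] for aa in seq]])
model = SecondaryStructureTransformer()
logits = model(tokens)                                        # (1, len(seq), 3)
prediction = "".join(CLASSES[i] for i in logits.argmax(-1)[0].tolist())
print(prediction)
```

Training such a model would typically minimize a per-residue cross-entropy loss over the flattened logits, with padded positions excluded via `ignore_index` in `nn.CrossEntropyLoss`.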

…(Content truncated for length.)

📸 Image Gallery

Capture.jpg, accuracyy_loss.png, amino.png, confusion_matrix.png, image.png, image_.png, image__2_.png, stride.png, transformer.png

Reference

This content is AI-processed based on open access ArXiv data.
