Architecture Diagram · Formal Visualization

Syntax-Aware K-Tran Transformer

Aspect-Based Sentiment Analysis · Multi-Task Learning Framework

📝 Input (Raw Text) → 🔤 Tokenize (RoBERTa Tokens) → 🧠 Encode (RoBERTa-Base) → 🌲 Parse (Syntax Bias) → ⚡ Multi-Task (ATE + ASC) → 📊 Output (Aspects + Sentiment)

Data & Preprocessing

absa_30000.csv

~30K social media sentences · aspect + polarity labels

train.xml / test.xml

SemEval-2014 · ~3,000 train / ~800 test · restaurant reviews

Cleaning & Normalisation

Remove URLs, noise, symbols common in social media

BIO Tag Conversion

Begin-Inside-Outside format for sequence labelling
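The two preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the regular expressions, the function names `clean_text` and `to_bio`, and the whitespace tokenisation are all assumptions made for the example. A real setup would need to align tags with RoBERTa subword tokens.

```python
import re

# Hypothetical patterns; the diagram does not specify the exact cleaning rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NOISE_RE = re.compile(r"[^\w\s.,!?'-]")  # drop symbols, emoji, '#', etc.


def clean_text(text):
    """Remove URLs, @mentions and stray symbols, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NOISE_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def to_bio(tokens, aspect_terms):
    """Tag whitespace tokens with B-ASP / I-ASP / O for the given aspect terms."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for term in aspect_terms:
        words = term.lower().split()
        n = len(words)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            # Only tag a span that matches and has not been tagged already.
            if lowered[i:i + n] == words and all(t == "O" for t in tags[i:i + n]):
                tags[i] = "B-ASP"
                for j in range(i + 1, i + n):
                    tags[j] = "I-ASP"
    return tags


tokens = "The food quality was good but service was slow .".split()
tags = to_bio(tokens, ["food quality", "service"])
# tags == ['O', 'B-ASP', 'I-ASP', 'O', 'O', 'O', 'B-ASP', 'O', 'O', 'O']
```

Matching aspect terms by exact token sequence keeps the sketch short; character offsets, when the dataset provides them, would be a more robust basis for the conversion.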

Core Model Architecture

RoBERTa-base

Pre-trained Transformer Encoder · 12 layers · 768-dim

🔄

Multi-Head Self-Attention

Contextualised token representations

🌿

Syntax-Aware Attention Bias

spaCy dependency weights injected into attention scores

⚙️

Feed-Forward Network

Position-wise transformation

🔗

Shared Encoder Backbone

Joint feature space for both tasks

token embeddings → <s> t₁ t₂ … tₙ </s> · 768-dim vectors
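One common way to inject a syntax bias into attention scores is to add it to the scaled dot-product logits before the softmax. The diagram does not state the exact form, so the additive formulation below is an assumption. Here $B$ denotes the syntax-aware bias matrix, $\lambda$ the syntax bias weight, and $d_k$ the per-head key dimension.

```latex
\mathrm{Attention}(Q, K, V)
  = \mathrm{softmax}\!\left( \frac{Q K^{\top}}{\sqrt{d_k}} + \lambda B \right) V
```

With $\lambda = 0$ this reduces to standard self-attention, so $\lambda$ controls how strongly dependency structure reshapes the attention distribution.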

Syntax Processing (spaCy)

food ─nsubj→ was
was ─acomp→ great
service ─nsubj→ slow

1.0  0.8  0.1  0.0
0.8  1.0  0.7  0.1
…    0.7  1.0  0.2
0.0  0.1  0.2  1.0

Syntax-Aware Attention Bias Matrix
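A matrix of this kind could be derived from dependency edges in several ways. The sketch below assumes one simple scheme, a weight that decays geometrically with distance in the dependency tree. The function name `dependency_bias`, the `decay` parameter, and the scheme itself are illustrative assumptions and do not reproduce the values shown in the diagram.

```python
from collections import deque


def dependency_bias(num_tokens, edges, decay=0.5):
    """Build a symmetric bias matrix where bias[i][j] = decay ** tree_distance(i, j).

    `edges` is a list of (head_index, dependent_index) pairs. With spaCy these
    could be collected as (token.head.i, token.i) for every non-root token.
    Token pairs with no connecting path keep a bias of 0.0.
    """
    adjacency = [[] for _ in range(num_tokens)]
    for head, dep in edges:
        adjacency[head].append(dep)
        adjacency[dep].append(head)

    bias = [[0.0] * num_tokens for _ in range(num_tokens)]
    for start in range(num_tokens):
        # Breadth-first search gives the tree distance from `start` to each token.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        for node, d in distance.items():
            bias[start][node] = decay ** d
    return bias


# Tokens: 0 = "food", 1 = "was", 2 = "great"; "was" heads both other tokens.
bias = dependency_bias(3, [(1, 0), (1, 2)])
# bias == [[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]]
```

Treating edges as undirected keeps the matrix symmetric with ones on the diagonal, which matches the general shape of the matrix shown here.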

Training & Optimisation

L_total = α·L_ATE + (1−α)·L_ASC
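The weighted objective can be illustrated in a few lines of plain Python. The helper names `mean_cross_entropy` and `joint_loss` are hypothetical, and the inputs are assumed to be already-normalised class probabilities rather than raw logits.

```python
import math


def mean_cross_entropy(batch_probs, targets):
    """Average negative log-likelihood of the target class over a batch."""
    losses = [-math.log(probs[target]) for probs, target in zip(batch_probs, targets)]
    return sum(losses) / len(losses)


def joint_loss(loss_ate, loss_asc, alpha=0.5):
    """L_total = alpha * L_ATE + (1 - alpha) * L_ASC, with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * loss_ate + (1.0 - alpha) * loss_asc


total = joint_loss(loss_ate=2.0, loss_asc=4.0, alpha=0.25)
# total == 3.5
```

Setting α closer to 1 prioritises aspect extraction, and closer to 0 prioritises sentiment classification. A suitable value is typically chosen on a validation set.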

Adam Optimizer

Cross-Entropy Loss

Mini-Batch Training

Dropout Regularisation

Gradient Clipping

Early Stopping

Loss Balancing α

Syntax Bias λ
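Two of the techniques listed above, early stopping and gradient clipping, are easy to show in isolation. The sketch below is framework-free. The class `EarlyStopping`, the function `clip_by_global_norm`, and the default values are illustrative assumptions, not the settings used for this model.

```python
import math


class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = None
        self.bad_epochs = 0

    def step(self, score):
        """Record a validation score (higher is better); return True to stop."""
        if self.best is None or score > self.best + self.min_delta:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradients so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm == 0.0 or norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


stopper = EarlyStopping(patience=2)
decisions = [stopper.step(f1) for f1 in [0.70, 0.72, 0.71, 0.72]]
# decisions == [False, False, False, True]
```

In a deep-learning framework the same roles are usually filled by built-in utilities. The point here is only the logic: stop when validation quality stalls, and bound the size of each update.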

Evaluation Metrics

Accuracy

ACC

Precision

Recall

Macro-F1

Weighted-F1

wF1

ATE F1

ATE
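The difference between Macro-F1 and Weighted-F1 is easiest to see in code. The following is a small pure-Python sketch with hypothetical function names. Macro-F1 averages per-class F1 scores equally, while Weighted-F1 weights each class by its number of true instances.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by class support in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
# macro_f1(y_true, y_pred) is about 0.778; weighted_f1(y_true, y_pred) is 0.75
```

On imbalanced polarity labels the two can diverge noticeably, which is a reason to report both.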

Task 1 · Aspect Term Extraction (ATE)

Sequence Labelling Head

BIO tagging for aspect boundary detection

B-ASP

I-ASP

B-ASP

Example: "The food quality was good but service was slow."
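Once the sequence labelling head has produced BIO tags, aspect terms are recovered by grouping each B-ASP token with the I-ASP tokens that follow it. Below is a minimal sketch of that decoding step, with a hypothetical function name and whitespace tokens assumed for simplicity.

```python
def extract_aspects(tokens, tags):
    """Group B-ASP / I-ASP tagged tokens into aspect term strings."""
    aspects = []
    current = []
    for token, tag in zip(tokens, tags):
        if tag == "B-ASP":
            # A new aspect starts; close any aspect still open.
            if current:
                aspects.append(" ".join(current))
            current = [token]
        elif tag == "I-ASP" and current:
            current.append(token)
        else:
            # "O", or an I-ASP with no preceding B-ASP, ends the current aspect.
            if current:
                aspects.append(" ".join(current))
            current = []
    if current:
        aspects.append(" ".join(current))
    return aspects


tokens = "The food quality was good but service was slow .".split()
tags = ["O", "B-ASP", "I-ASP", "O", "O", "O", "B-ASP", "O", "O", "O"]
aspects = extract_aspects(tokens, tags)
# aspects == ['food quality', 'service']
```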

Task 2 · Aspect Sentiment Classification (ASC)

Classification Head

Polarity prediction per extracted aspect

✓ Positive

✗ Negative

– Neutral

food quality → Positive · service → Negative

Figure 1. Full architecture of the Syntax-Aware K-Tran Transformer for ABSA. The model combines a RoBERTa-base encoder with a spaCy-derived dependency attention bias and is trained jointly on Aspect Term Extraction (ATE) and Aspect Sentiment Classification (ASC) using a weighted sum of the two cross-entropy losses, L_total = α·L_ATE + (1−α)·L_ASC.