Understanding the Architecture
Tags: nlp · bert · transformer
A deep dive into the BERT architecture, exploring the Transformer encoder and the multi-head self-attention mechanism.
A white-box conceptual view of how BERT works, what it learns during pretraining, and why it fits code-mixed text.
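As a minimal sketch of the multi-head self-attention computation mentioned above (standard scaled dot-product attention; the names `d_model`, `d_head`, and `num_heads`, and the random toy weights, are illustrative assumptions, not BERT's trained parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo, num_heads):
    """Scaled dot-product self-attention with num_heads parallel heads.

    X:              (seq_len, d_model) token representations
    Wq, Wk, Wv, Wo: (d_model, d_model) projection matrices
    """
    seq_len, d_model = X.shape
    d_head = d_model // num_heads

    # Project the input, then split the projection into heads:
    # (seq_len, d_model) -> (num_heads, seq_len, d_head)
    def project(W):
        return (X @ W).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    Q, K, V = project(Wq), project(Wk), project(Wv)

    # Every token attends to every other token; scores are scaled
    # by sqrt(d_head) before the softmax, per head.
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)

    # Weighted sum of values, heads concatenated back to d_model,
    # followed by the output projection.
    out = (attn @ V).transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

# Toy usage with random weights, to show the shapes involved.
rng = np.random.default_rng(0)
d_model, seq_len, heads = 8, 4, 2
X = rng.normal(size=(seq_len, d_model))
Ws = [rng.normal(size=(d_model, d_model)) for _ in range(4)]
print(multi_head_self_attention(X, *Ws, num_heads=heads).shape)  # (4, 8)
```

BERT stacks this attention step with feed-forward layers, residual connections, and layer normalization inside each Transformer encoder block; the sketch covers only the attention step itself.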