Attention & Transformers

The single mechanism that reshaped deep learning

Lessons (3)

  1. Self-Attention

     Queries, keys, values: derived and animated. (See the first sketch after this list.)

     Difficulty: Hard
  2. Multi-Head Self-Attention

     Parallel attention heads specializing in different patterns. (See the second sketch after this list.)

     Difficulty: Hard
  3. Transformer Block

     Attention + MLP + norms + residuals: one layer. (See the third sketch after this list.)

     Difficulty: Hard
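
As a preview of Lesson 01, here is a minimal NumPy sketch of scaled dot-product self-attention. The weight matrices Wq, Wk, Wv and the toy shapes are illustrative assumptions, not the lesson's actual code.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d_model)."""
    Q = X @ Wq                       # queries  (seq_len, d_k)
    K = X @ Wk                       # keys     (seq_len, d_k)
    V = X @ Wv                       # values   (seq_len, d_v)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key similarity, scaled by sqrt(d_k)
    # softmax over the key axis so each row of weights sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # each position gets a weighted mix of all values

# Toy example (assumed shapes): 4 tokens, model width 8
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```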
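
For Lesson 02, a sketch of multi-head self-attention under the same assumptions: the model width is split across heads, each head attends independently in its own subspace, and an assumed output projection Wo mixes the concatenated heads.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Multi-head self-attention: split d_model into n_heads parallel subspaces."""
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # reshape to (n_heads, seq_len, d_head) so each head attends independently
    split = lambda M: M.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    Q, K, V = split(Q), split(K), split(V)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, seq, seq)
    out = softmax(scores) @ V                            # (n_heads, seq, d_head)
    # concatenate the heads, then mix them with the output projection
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ Wo

rng = np.random.default_rng(0)
d_model, n_heads = 8, 2
X = rng.normal(size=(4, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) for _ in range(4))
print(multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads).shape)  # (4, 8)
```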
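
For Lesson 03, a sketch of one transformer block. Assumptions made here for brevity: the pre-norm layout (x + Attn(LN(x)), then x + MLP(LN(x))), single-head attention, a ReLU MLP widened 4x, and layer norm without learned gain or bias; the lesson's own layer may differ.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # normalize each position across the feature axis (no learned gain/bias here)
    return (x - x.mean(axis=-1, keepdims=True)) / np.sqrt(x.var(axis=-1, keepdims=True) + eps)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, Wq, Wk, Wv):
    # single-head self-attention, an assumed simplification of the lesson's multi-head version
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def transformer_block(X, params):
    """One pre-norm transformer layer: x + Attn(LN(x)), then x + MLP(LN(x))."""
    Wq, Wk, Wv, W1, b1, W2, b2 = params
    X = X + attention(layer_norm(X), Wq, Wk, Wv)  # attention sublayer + residual
    h = np.maximum(0, layer_norm(X) @ W1 + b1)    # ReLU MLP, hidden width 4 * d
    return X + h @ W2 + b2                        # MLP sublayer + residual

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(4, d))
params = (rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=(d, d)),
          rng.normal(size=(d, 4 * d)), np.zeros(4 * d),
          rng.normal(size=(4 * d, d)), np.zeros(d))
print(transformer_block(X, params).shape)  # (4, 8)
```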