1. Transformer ์•„ํ‚คํ…์ณ

1. Transformer Architecture

Summary

Transformer ์•„ํ‚คํ…์ณ๋Š” ์ธ๊ณต์‹ ๊ฒฝ๋ง์˜ ํ•œ ์œ ํ˜•์œผ๋กœ, ์ฃผ๋กœ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP)์—์„œ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ์ด ์•„ํ‚คํ…์ณ๋Š” ์ธ์ฝ”๋”(Encoder)์™€ ๋””์ฝ”๋”(Decoder) ๋‘ ๋ถ€๋ถ„์œผ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์œผ๋ฉฐ, ์ธ์ฝ”๋”๋Š” ์ž…๋ ฅ ํ…์ŠคํŠธ๋ฅผ ์ฒ˜๋ฆฌํ•˜์—ฌ ์ปจํ…์ŠคํŠธ ์ •๋ณด๋ฅผ ์ถ”์ถœํ•˜๊ณ , ๋””์ฝ”๋”๋Š” ์ด ์ •๋ณด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ถœ๋ ฅ ํ…์ŠคํŠธ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. Transformer๋Š” ๊ธฐ์กด์˜ ์ˆœํ™˜ ์‹ ๊ฒฝ๋ง(RNN)๊ณผ ๋‹ฌ๋ฆฌ, ๋ณ‘๋ ฌ ์ฒ˜๋ฆฌ๊ฐ€ ๊ฐ€๋Šฅํ•˜์—ฌ ํ•™์Šต ์†๋„๊ฐ€ ๋น ๋ฅด๊ณ , ๊ธด ์‹œํ€€์Šค๋ฅผ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Key Concepts

  • ์ธ์ฝ”๋”(Encoder) : ์ž…๋ ฅ ํ…์ŠคํŠธ๋ฅผ ์ฒ˜๋ฆฌํ•˜์—ฌ ์ปจํ…์ŠคํŠธ ์ •๋ณด๋ฅผ ์ถ”์ถœํ•˜๋Š” ๋ถ€๋ถ„์ž…๋‹ˆ๋‹ค.

  • ๋””์ฝ”๋”(Decoder) : ์ธ์ฝ”๋”์—์„œ ์ถ”์ถœํ•œ ์ปจํ…์ŠคํŠธ ์ •๋ณด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ถœ๋ ฅ ํ…์ŠคํŠธ๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋ถ€๋ถ„์ž…๋‹ˆ๋‹ค.

  • ์…€ํ”„-์–ดํ…์…˜(self-attention) : ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ๊ฐ ์š”์†Œ๊ฐ€ ๋‹ค๋ฅธ ๋ชจ๋“  ์š”์†Œ์™€์˜ ๊ด€๊ณ„๋ฅผ ๊ณ ๋ คํ•˜์—ฌ ์ฒ˜๋ฆฌํ•˜๋Š” ๋ฉ”์ปค๋‹ˆ์ฆ˜์ž…๋‹ˆ๋‹ค.

  • ํฌ์ง€์…”๋„ ์ธ์ฝ”๋”ฉ(positional encoding) : ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์˜ ์ˆœ์„œ ์ •๋ณด๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ ์ฒ˜๋ฆฌํ•˜๋Š” ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค.

  • ๋ ˆ์ด์–ด ๋…ธ๋ฉ€๋ผ์ด์ œ์ด์…˜(layer normalization) : ๊ฐ ๋ ˆ์ด์–ด์˜ ์ถœ๋ ฅ์„ ์ •๊ทœํ™”ํ•˜์—ฌ ํ•™์Šต์˜ ์•ˆ์ •์„ฑ์„ ํ–ฅ์ƒ์‹œํ‚ค๋Š” ๊ธฐ๋ฒ•์ž…๋‹ˆ๋‹ค.

References

URL ์ด๋ฆ„

URL

DataCamp - How Transformers Work

https://www.datacamp.com/tutorial/how-transformers-work

TrueFoundry - Transformer Architecture

https://www.truefoundry.com/blog/transformer-architecture

MLQ.ai - Understanding Transformers & the Architecture of LLMs

https://blog.mlq.ai/llm-transformer-architecture/

Wikipedia - Transformer (deep learning architecture)

https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)

YouTube - LLM Chronicles #5.1: The Transformer Architecture

https://www.youtube.com/watch?v=GhdB7UMtGqs