LLMs can be used for text generation, a form of [[Generative artificial intelligence|generative AI]], by taking an input text and repeatedly predicting the next token or word.<ref name="Bowman">{{cite arXiv |eprint=2304.00612 |class=cs.CL |first=Samuel R. |last=Bowman |title=Eight Things to Know about Large Language Models |year=2023}}</ref> Until 2020, [[Fine-tuning (machine learning)|fine-tuning]] was the only way a model could be adapted to accomplish specific tasks. Larger models such as [[GPT-3]], however, can be [[prompt engineering|prompt-engineered]] to achieve similar results.<ref name="few-shot-learners">{{cite journal |last1=Brown |first1=Tom B. |last2=Mann |first2=Benjamin |last3=Ryder |first3=Nick |last4=Subbiah |first4=Melanie |last5=Kaplan |first5=Jared |last6=Dhariwal |first6=Prafulla |last7=Neelakantan |first7=Arvind |last8=Shyam |first8=Pranav |last9=Sastry |first9=Girish |last10=Askell |first10=Amanda |last11=Agarwal |first11=Sandhini |last12=Herbert-Voss |first12=Ariel |last13=Krueger |first13=Gretchen |last14=Henighan |first14=Tom |last15=Child |first15=Rewon |date=Dec 2020 |editor1-last=Larochelle |editor1-first=H. |editor2-last=Ranzato |editor2-first=M. |editor3-last=Hadsell |editor3-first=R. |editor4-last=Balcan |editor4-first=M.F. |editor5-last=Lin |editor5-first=H. |title=Language Models are Few-Shot Learners |url=https://proceedings.neurips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf |journal=Advances in Neural Information Processing Systems |publisher=Curran Associates, Inc. 
|volume=33 |pages=1877–1901 |last25=Chess |last20=Hesse |first20=Christopher |last21=Chen |first21=Mark |last22=Sigler |first22=Eric |last23=Litwin |first23=Mateusz |last24=Gray |first24=Scott |first26=Jack |first25=Benjamin |last26=Clark |last19=Winter |last27=Berner |first27=Christopher |last28=McCandlish |first28=Sam |last29=Radford |first29=Alec |last30=Sutskever |first30=Ilya |last31=Amodei |first31=Dario |first19=Clemens |first18=Jeffrey |last18=Wu |last16=Ramesh |first16=Aditya |last17=Ziegler |first17=Daniel M.}}</ref> They are thought to acquire knowledge about syntax, semantics and "ontology" inherent in human language corpora, but also inaccuracies and [[Algorithmic bias|biases]] present in the corpora.<ref name="Manning-2022">{{cite journal |last=Manning |first=Christopher D. |author-link=Christopher D. Manning |year=2022 |title=Human Language Understanding & Reasoning |url=https://www.amacad.org/publication/human-language-understanding-reasoning |journal=Daedalus |volume=151 |issue=2 |pages=127–138 |doi=10.1162/daed_a_01905 |s2cid=248377870|doi-access=free }}</ref>
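The autoregressive generation described above — repeatedly predicting the next token and appending it to the input — can be sketched as a minimal loop. The "model" below is a hypothetical toy bigram lookup table standing in for a real trained LLM, which would instead score all tokens in its vocabulary with a neural network:

```python
# Toy stand-in for a trained language model: maps the last token to a
# "most likely" next token. A real LLM conditions on the whole context.
TOY_BIGRAM_MODEL = {
    "the": "cat",
    "cat": "sat",
    "sat": "on",
    "on": "the",
}

def predict_next_token(tokens):
    """Return the predicted next token given the context (toy stand-in)."""
    return TOY_BIGRAM_MODEL.get(tokens[-1], "<eos>")

def generate(prompt_tokens, max_new_tokens=5):
    """Generate text by repeatedly predicting and appending the next token."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = predict_next_token(tokens)
        if nxt == "<eos>":  # stop when the model emits an end-of-sequence token
            break
        tokens.append(nxt)
    return tokens

print(generate(["the"]))  # ['the', 'cat', 'sat', 'on', 'the', 'cat']
```

The loop structure is the same in production systems; only the predictor changes, and sampling strategies (greedy, temperature, nucleus) replace the single lookup.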
Some notable LLMs are [[OpenAI]]'s [[Generative pre-trained transformer|GPT]] series of models (e.g., [[GPT-3.5]] and [[GPT-4]], used in [[ChatGPT]] and [[Microsoft Copilot]]) and [[Google]]'s [[PaLM]] and [[Gemini (language model)|Gemini]] (used in [[Google Bard|Bard]]).
==History==