m Fix the "wrong wiki link error"
m →History: spell out acronym, add link, improve wording
(23 intermediate revisions by 19 users not shown)
Line 1:
{{Short description|
{{Use dmy dates|date=April 2025}}
{{Infobox software
| name = PyTorch
Line 8 ⟶ 9:
| author = {{Unbulleted list|Adam Paszke|Sam Gross|Soumith Chintala|Gregory Chanan}}
| developer = [[Meta AI]]
| released = {{Start date and age|2016|9|df=yes}}<ref>{{cite web|url=https://github.com/pytorch/pytorch/releases/tag/v0.1.1|title=PyTorch Alpha-1 release|last=Chintala|first=Soumith|website=[[GitHub]]
| latest release version = {{wikidata|property|edit|reference|P348}}
| latest release date = {{start date and age|{{wikidata|qualifier|P348|P577}}}}
Line 17 ⟶ 18:
| language = English
| genre = [[Library (computing)|Library]] for [[machine learning]] and [[deep learning]]
| license = [[BSD-3]]<ref>{{cite web |last=Claburn |first=Thomas |date=12 September 2022 |title=PyTorch gets lit under The Linux Foundation |url=https://www.theregister.com/2022/09/12/pytorch_meta_linux_foundation/ |work=[[The Register]] |access-date=18 October 2022 |archive-date=18 October 2022 |archive-url=https://web.archive.org/web/20221018040848/https://www.theregister.com/2022/09/12/pytorch_meta_linux_foundation/ |url-status=live }}</ref>
| website = {{URL|https://pytorch.org/}}
}}
{{Machine learning}}
'''PyTorch''' is
PyTorch utilises [[tensor]]s as an intrinsic datatype, much like [[NumPy]] arrays. Model training is handled by an [[automatic differentiation]] system, Autograd, which constructs a [[directed acyclic graph]] of a model's forward pass for a given input and then applies the [[chain rule]] over that graph to compute model-wide gradients.<ref>{{Cite web|title=Overview of PyTorch Autograd Engine|website=PyTorch Blog|date=8 June 2021|url=https://pytorch.org/blog/overview-of-pytorch-autograd-engine|url-status=live}}</ref> PyTorch can transparently leverage [[SIMD]] units, such as [[General-purpose computing on graphics processing units|GPGPU]]s.
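The graph-then-chain-rule idea can be illustrated with a toy reverse-mode differentiator in plain Python. This is only a sketch under stated assumptions: the <code>Value</code> class and its attributes are hypothetical names invented for this illustration, it handles scalars with addition and multiplication only, and it is not how Autograd itself is implemented.

```python
# Toy reverse-mode automatic differentiation.
# All names here (Value, parents, local_grads) are hypothetical and are
# not part of PyTorch's API.

class Value:
    """A scalar node in the computation graph."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data                # result of the forward computation
        self.grad = 0.0                 # d(output)/d(this node), set by backward()
        self.parents = parents          # nodes this value was computed from
        self.local_grads = local_grads  # d(this node)/d(parent), one per parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Order the DAG so that each node is handled only after every
        # node that depends on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            # Chain rule: pass this node's gradient to each parent,
            # scaled by the local derivative.
            for parent, local in zip(node.parents, node.local_grads):
                parent.grad += node.grad * local


x = Value(2.0)
w = Value(3.0)
b = Value(1.0)
y = w * x + b   # the forward pass records the graph
y.backward()    # the backward pass applies the chain rule
print(x.grad, w.grad, b.grad)  # 3.0 2.0 1.0
```

For <code>y = w * x + b</code> the gradients are <code>dy/dx = w</code>, <code>dy/dw = x</code> and <code>dy/db = 1</code>, which matches the printed values.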
A number of pieces of [[deep learning]] software are built on top of PyTorch, including [[Tesla Autopilot]],<ref>{{Cite web|last=Karpathy|first=Andrej|title=PyTorch at Tesla - Andrej Karpathy, Tesla|website=[[YouTube]] |date=6 November 2019 |url=https://www.youtube.com/watch?v=oBklltKXtDE}}</ref> [[Uber]]'s Pyro,<ref>{{Cite news|url=https://eng.uber.com/pyro/|title=Uber AI Labs Open Sources Pyro, a Deep Probabilistic Programming Language|date=2017-11-03|work=Uber Engineering Blog|access-date=2017-12-18|language=en-US}}</ref> [[Hugging Face]]'s Transformers,<ref>{{Citation|title=PYTORCH-TRANSFORMERS: PyTorch implementations of popular NLP Transformers|date=2019-12-01|url=https://pytorch.org/hub/huggingface_pytorch-transformers/|publisher=PyTorch Hub|access-date=2019-12-01}}</ref> [[PyTorch Lightning]],<ref>{{Citation|title=PYTORCH-Lightning: The lightweight PyTorch wrapper for ML researchers. Scale your models. Write less boilerplate|date=2020-06-18|url=https://github.com/PyTorchLightning/pytorch-lightning/|publisher=Lightning-Team|access-date=2020-06-18}}</ref><ref>{{Cite web|url=https://pytorch.org/ecosystem/|title=Ecosystem Tools|website=pytorch.org|language=en|access-date=2020-06-18}}</ref> and Catalyst.<ref>{{Citation|title=GitHub - catalyst-team/catalyst: Accelerated DL & RL|date=2019-12-05|url=https://github.com/catalyst-team/catalyst|publisher=Catalyst-Team|access-date=2019-12-05}}</ref><ref>{{Cite web|url=https://pytorch.org/ecosystem/|title= Ecosystem Tools|website=pytorch.org|language=en|access-date=2020-04-04}}</ref>
==History==
In 2001, Torch was written and released under the [[GNU General Public License]] (GPL). It was a machine-learning library written in C++ that supported methods including neural networks, [[Support vector machine|support vector machines]] (SVM), and [[Hidden Markov model|hidden Markov models]].<ref>[http://torch.ch/torch3/matos/tutorial.pdf "Torch Tutorial", Ronan Collobert, IDIAP, 2002-10-02]</ref><ref>R. Collobert, S. Bengio and J. Mariéthoz. [https://infoscience.epfl.ch/server/api/core/bitstreams/7513f344-91b6-427d-a020-7836b150a150/content Torch: a modular machine learning software library]. Technical Report IDIAP-RR 02-46, IDIAP, 2002. </ref><ref>https://web.archive.org/web/20011031104036/http://www.torch.ch/</ref> It was improved to Torch7 in 2012.<ref>{{Citation |last=Collobert |first=Ronan |title=Implementing Neural Networks Efficiently |date=2012 |work=Neural Networks: Tricks of the Trade: Second Edition |pages=537–557 |editor-last=Montavon |editor-first=Grégoire |url=https://doi.org/10.1007/978-3-642-35289-8_28 |access-date=2025-06-10 |place=Berlin, Heidelberg |publisher=Springer |language=en |doi=10.1007/978-3-642-35289-8_28 |isbn=978-3-642-35289-8 |last2=Kavukcuoglu |first2=Koray |last3=Farabet |first3=Clément |editor2-last=Orr |editor2-first=Geneviève B. |editor3-last=Müller |editor3-first=Klaus-Robert|url-access=subscription }}</ref> Development of Torch ceased in 2018, and the project was subsumed by PyTorch.<ref>[https://github.com/torch/torch7/commit/fd0ee3bbf7bfdd21ab10c5ee70b74afaef9409e1 torch/torch7, Commit fd0ee3b, 2018-07-02]</ref>
Meta (formerly known as Facebook) operated both PyTorch and Convolutional Architecture for Fast Feature Embedding ([[Caffe (software)|Caffe2]]), but models defined by the two frameworks were mutually incompatible. The Open Neural Network Exchange ([[Open Neural Network Exchange|ONNX]]) project was created by Meta and [[Microsoft]] in September 2017 for converting models between frameworks. Caffe2 was merged into PyTorch at the end of March 2018.<ref>{{cite web|url=https://medium.com/@Synced/caffe2-merges-with-pytorch-a89c70ad9eb7|title=Caffe2 Merges With PyTorch|date=2018-04-02}}</ref> In September 2022, Meta announced that PyTorch would be governed by the independent PyTorch Foundation, a newly created subsidiary of the [[Linux Foundation]].<ref>{{cite web |url=https://arstechnica.com/information-technology/2022/09/meta-spins-off-pytorch-foundation-to-make-ai-framework-vendor-neutral/ |title=Meta spins off PyTorch Foundation to make AI framework vendor neutral |date=2022-09-12 |website=[[Ars Technica]] |last=Edwards |first=Benj}}</ref>
PyTorch 2.0 was released on 15 March 2023, introducing [[TorchDynamo]], a Python-level [[compiler]] that makes code run up to 2x faster, along with significant improvements in training and inference performance across major [[cloud computing|cloud platforms]].<ref>{{cite web|title=Dynamo Overview |url=https://pytorch.org/docs/stable/torch.compiler_dynamo_overview.html }}</ref><ref>{{cite news |title=PyTorch 2.0 brings new fire to open-source machine learning |url=https://venturebeat.com/ai/pytorch-2-0-brings-new-fire-to-open-source-machine-learning/ |access-date=16 March 2023 |work=VentureBeat |date=15 March 2023 |archive-date=16 March 2023 |archive-url=https://web.archive.org/web/20230316004808/https://venturebeat.com/ai/pytorch-2-0-brings-new-fire-to-open-source-machine-learning/ |url-status=live }}</ref>
==PyTorch tensors==
{{main|Tensor (machine learning)}}
PyTorch defines a class called Tensor (<code>torch.Tensor</code>) to store and operate on homogeneous multidimensional rectangular arrays of numbers. PyTorch Tensors are similar to [[NumPy]] Arrays, but can also be operated on by a [[CUDA]]-capable [[Nvidia|NVIDIA]] [[Graphics processing unit|GPU]]. PyTorch has also been developing support for other GPU platforms, for example, AMD's [[ROCm]]<ref>{{cite web|url=https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html|title=Installing PyTorch for ROCm|date=9 February 2024
PyTorch supports various sub-types of Tensors.<ref>{{cite web |url=https://www.analyticsvidhya.com/blog/2018/02/pytorch-tutorial/ |title=An Introduction to PyTorch – A Simple yet Powerful Deep Learning Library |website=analyticsvidhya.com |access-date=
Note that the term "tensor" here does not carry the same meaning as tensor in mathematics or physics. The meaning of the word in machine learning is only superficially related to its original meaning as a certain kind of object in [[linear algebra]]. Tensors in PyTorch are simply multi-dimensional arrays.
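As a brief illustration of tensors as homogeneous multi-dimensional arrays, the sketch below creates a small tensor, inspects its shape and element type, and moves it to a GPU only when one is available. The variable names are chosen for this illustration, and the example assumes a working PyTorch installation; it falls back to the CPU when no CUDA device is present.

```python
import torch

# A 2x3 tensor; every element shares one dtype
# (float32 by default for Python floats).
a = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
print(a.shape)  # torch.Size([2, 3])
print(a.dtype)  # torch.float32

# A different element type is selected through the dtype argument.
b = torch.zeros(2, 3, dtype=torch.int64)
print(b.dtype)  # torch.int64

# Use the GPU only if a CUDA device is actually present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
c = a.to(device)
print((c * 2).sum().item())  # 42.0
```

The same arithmetic runs unchanged on either device, since only the <code>device</code> object differs.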
Line 49 ⟶ 50:
== Example ==
The following program shows the low-level functionality of the library with a simple example.
<syntaxhighlight lang="numpy" line="1">
Line 75 ⟶ 76:
# Output: tensor(-2.1540)
print(a[1, 2])  # Output of the element in the third column of the second row (zero-based)
# Output: tensor(0.5847)
print(a.max())
# Output: tensor(0.8498)
</syntaxhighlight>
The following code-block defines a neural network with linear layers using the <code>nn</code> module.
<syntaxhighlight lang="python3" line="1">
from torch import nn # Import the nn sub-module from PyTorch
Line 89 ⟶ 92:
self.flatten = nn.Flatten() # Construct a flattening layer.
self.linear_relu_stack = nn.Sequential( # Construct a stack of layers.
nn.Linear(28 * 28, 512), # Linear Layers have an input and output shape
nn.ReLU(), # ReLU is one of many activation functions provided by nn
nn.Linear(512, 512),