Talk:Convolutional neural network: Difference between revisions

{{Talk header}}
{{annual readership|scale=log}}
{{Article history
{{WikiProject Artificial Intelligence|importance=Mid}}
}}
{{User:MiszaBot/config
| algo = old(90d)
| archive = Talk:Convolutional neural network/Archive %(counter)d
| counter = 1
| maxarchivesize = 125K
| archiveheader = {{Automatic archive navigator}}
| minthreadstoarchive = 1
| minthreadsleft = 5
}}

== Feature Maps ==

Need to introduce what feature maps are for nontechnical readers. <!-- Template:Unsigned --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Shsh16|Shsh16]] ([[User talk:Shsh16#top|talk]] • [[Special:Contributions/Shsh16|contribs]]) 18:24, 15 February 2017 (UTC)</small> <!--Autosigned by SineBot-->

== Non-linear Pooling ==

It says in the article: "Another important concept of CNNs is pooling, which is a form of '''non-linear''' down-sampling."
I don't think this is correct. There are pooling techniques, like average pooling which is mentioned in this same section, which are forms of linear down-sampling. I would remove the "non-linear." [[Special:Contributions/194.117.26.63|194.117.26.63]] ([[User talk:194.117.26.63|talk]]) 15:06, 13 May 2016 (UTC)
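To make the point concrete, here is a small numpy sketch (an illustration for this discussion, not text from the article): additivity holds for average pooling but fails for max pooling, so average pooling is indeed a linear down-sampling.

```python
import numpy as np

def avg_pool(x, k=2):
    # Non-overlapping k x k average pooling over a 2-D array.
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def max_pool(x, k=2):
    # Non-overlapping k x k max pooling over a 2-D array.
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).max(axis=(1, 3))

# A deterministic counterexample for max pooling: the maxima of x and y
# sit in different cells, so pooling the sum loses one of them.
x = np.array([[1., 0.], [0., 0.]])
y = np.array([[0., 1.], [0., 0.]])

print(avg_pool(x + y), avg_pool(x) + avg_pool(y))  # [[0.5]] [[0.5]] -> equal: linear
print(max_pool(x + y), max_pool(x) + max_pool(y))  # [[1.]] [[2.]]   -> unequal: non-linear
```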
 
== Plagiarism in "Layer patterns" ==
 
The text seems to be copied from https://cs231n.github.io/convolutional-networks/#layerpat without any attribution. <small><span class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Jkoab|Jkoab]] ([[User talk:Jkoab|talk]] • [[Special:Contributions/Jkoab|contribs]]) 01:41, 8 June 2016 (UTC)</span></small><!-- Template:Unsigned --> <!--Autosigned by SineBot-->
 
:Indeed. Deleted copyvio text, see below. [[User:Maproom|Maproom]] ([[User talk:Maproom|talk]]) 09:55, 8 June 2016 (UTC)

== Copyright problem removed ==
 
Prior content in this {{#ifeq:{{NAMESPACENUMBER}}|119|draft|article}} duplicated one or more previously published sources. The material was copied from: https://cs231n.github.io/convolutional-networks/#layerpat. Copied or closely paraphrased material has been rewritten or removed and must not be restored, ''unless'' it is duly released under a compatible license. (For more information, please see [[Wikipedia:COPYRIGHT#Using_copyrighted_work_from_others|"using copyrighted works from others"]] if you are not the copyright holder of this material, or [[Wikipedia:Donating copyrighted materials|"donating copyrighted materials"]] if you are.)
 
For [[Wikipedia:Copyrights|legal reasons]], we cannot accept [[Wikipedia:Copyrights|copyrighted]] text or images borrowed from other web sites or published material; such additions will be deleted. Contributors may use copyrighted publications as a source of ''information'', and, if allowed under [[fair use]], may copy sentences and phrases, provided they are included in quotation marks and [[WP:CS|referenced]] properly. The material may also be rewritten, providing it does not infringe on the copyright of the original ''or'' [[Wikipedia:Plagiarism|plagiarize]] from that source. Therefore, such paraphrased portions must provide their source. Please see our [[Wikipedia:NFC#Text|guideline on non-free text]] for how to properly implement limited quotations of copyrighted text. Wikipedia takes copyright violations '''very seriously''', and persistent violators '''will''' be [[Wikipedia:Blocking policy|blocked]] from editing. While we appreciate contributions, we must require all contributors to understand and comply with these policies. Thank you. <!-- Template:Cclean --> [[User:Maproom|Maproom]] ([[User talk:Maproom|talk]]) 09:55, 8 June 2016 (UTC)
 
== Suggestion: Move the section "Regularization methods" to a new page ==
 
The methods listed here are applicable to deep learning in general.
This topic should be moved into a new page. [[User:OhadRubin|OhadRubin]] ([[User talk:OhadRubin|talk]]) 06:38, 27 November 2018 (UTC)
 
== Parameter Sharing Clarifications ==
 
In the "Parameter sharing" section, "relax the parameter sharing scheme" is written, but what this actually means is unclear. <!-- Template:Unsigned --><small class="autosigned">—&nbsp;Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[User:Ephsc|Ephsc]] ([[User talk:Ephsc#top|talk]] • [[Special:Contributions/Ephsc|contribs]]) 16:22, 27 September 2019 (UTC)</small> <!--Autosigned by SineBot-->
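For what it's worth, "relaxing" parameter sharing usually refers to a locally connected layer: the same sliding-window connectivity as a convolutional layer, but with a separate kernel at each output position instead of one shared kernel. A sketch with hypothetical sizes (my illustration, not article text) shows the difference in parameter count:

```python
# Parameter counts for one sliding-window layer; the sizes are
# illustrative numbers only, not taken from the article.
H = W = 32        # input height and width
K = 5             # kernel size
out = H - K + 1   # output size per dimension (stride 1, no padding)

# Full parameter sharing (convolutional): one K x K kernel reused at
# every output position.
shared = K * K

# Relaxed sharing (locally connected): the same receptive fields, but
# a separate K x K kernel at each output position.
unshared = out * out * K * K

print(shared, unshared)  # 25 19600
```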
 
== What is convolutional about a convolutional neural network? ==
 
The article fails to explain the connection between CNNs and [[convolution | convolutions]] in any meaningful way. In particular, convolutions don't act on vectors; they act on functions. Comparing with the equation on the page for convolutions, there's obviously something analogous. --[[User:Stellaathena|Stellaathena]] ([[User talk:Stellaathena|talk]]) 16:51, 14 December 2020 (UTC)
 
It's actually the DSP version of cross-correlation, not a convolution; calling it convolution is a misnomer. -AS
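A one-dimensional numpy sketch of that point (my illustration, not article text): the sliding dot product a CNN layer computes uses the kernel as-is, which is cross-correlation; true convolution flips the kernel first.

```python
import numpy as np

x = np.array([1., 2., 3., 4.])
k = np.array([1., 0., -1.])

# What a CNN layer computes at each valid position: a dot product with
# the kernel as-is. This is cross-correlation.
cnn_out = np.array([x[i:i + 3] @ k for i in range(len(x) - 2)])

# True (discrete) convolution flips the kernel before sliding it.
conv_out = np.array([x[i:i + 3] @ k[::-1] for i in range(len(x) - 2)])

print(cnn_out)   # [-2. -2.]  (cross-correlation)
print(conv_out)  # [ 2.  2.]  (convolution, matches np.convolve)
print(np.allclose(conv_out, np.convolve(x, k, mode='valid')))  # True
```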
 
== Inaccurate information about Convolutional layers ==

I would also suggest merging the section "Definition" into the introduction. The definition section is only two sentences, and it would fit better in the introduction.

== Misleading use of the term tensor ==

The article uses the term [[tensor]] in the sense of a multi-dimensional array.
But the link redirects to the article [https://en.wikipedia.org/wiki/Tensor] with the mathematical definition. These terms in computer science (namely in the library tensorflow) and in mathematics are completely different.
It's necessary to change at least the reference to [https://en.wikipedia.org/wiki/Array_data_type]. But it would be better to avoid the ambiguous use of mathematical terminology altogether.

Max [[Special:Contributions/88.201.254.120|88.201.254.120]] ([[User talk:88.201.254.120|talk]]) 22:39, 10 April 2022 (UTC)

== Merge Architecture and Building Blocks sections ==

Much overlap with no clear distinction. [[User:Lfstevens|Lfstevens]] ([[User talk:Lfstevens|talk]]) 00:36, 7 February 2023 (UTC)

== Acronym ANN ==

The use of the acronym ANN for artificial neural networks is novel to me, and I wonder whether it needlessly clutters the opening sentence. Have others worked in areas where ANN is common? [[User:Babajobu|Babajobu]] ([[User talk:Babajobu|talk]]) 04:55, 24 March 2023 (UTC)

== Introduction ==

"only 25 neurons are required to process 5x5-sized tiles". Shouldn't that be "weights" and not "neurons"? Earlier it said "10,000 weights would be required for processing an image sized 100 × 100 pixels". [[User:Ulatekh|Ulatekh]] ([[User talk:Ulatekh|talk]]) 15:53, 19 March 2024 (UTC)

:Absolutely, you're right. I was going to ask the same question. There are 25 weights connecting each neuron in the second layer to its patch of the input layer, and these 25 weights don't vary as the filter is slid across the input. Do you want to make the correction or should I, since the original editor is not responding? [[User:Iuvalclejan|Iuvalclejan]] ([[User talk:Iuvalclejan|talk]]) 22:47, 25 January 2025 (UTC)
::I made the change. [[Special:Contributions/2600:6C5D:577F:F44E:B9B2:E830:3647:8315|2600:6C5D:577F:F44E:B9B2:E830:3647:8315]] ([[User talk:2600:6C5D:577F:F44E:B9B2:E830:3647:8315|talk]]) 14:20, 27 January 2025 (UTC)

== Big picture ==

Why are convolutional NNs (or networks with several convolutional layers as opposed to none) more useful, especially for images, than networks with only fully connected layers? You mention something about translational equivariance in artificial NNs and in the visual cortex in brains, but this is a property of the neural network, not of its inputs. It's a way to reduce the number of weights per layer, but why isn't it universally useful (for all inputs and all output tasks), and why is it better for images than other ways of reducing the number of weights per layer? [[User:Iuvalclejan|Iuvalclejan]] ([[User talk:Iuvalclejan|talk]]) 23:50, 25 January 2025 (UTC)

== Wiki Education assignment: Linguistics in the Digital Age ==

{{dashboard.wikiedu.org assignment | course = Wikipedia:Wiki_Ed/University_of_Arizona/Linguistics_in_the_Digital_Age_(Spring_2025) | assignments = [[User:AshlaMaOmao|AshlaMaOmao]] | start_date = 2025-01-15 | end_date = 2025-05-09 }}

<span class="wikied-assignment" style="font-size:85%;">— Assignment last updated by [[User:FblthpTheLost|FblthpTheLost]] ([[User talk:FblthpTheLost|talk]]) 00:10, 8 May 2025 (UTC)</span>
 
== Article is incomprehensible to the intelligent layman ==
 
No blame, it's an excellent start, but I think we can write this so that it's more easily parsed by an intelligent person outside the field who is willing to put in some mental work. [[User:Babajobu|Babajobu]] ([[User talk:Babajobu|talk]]) 04:57, 24 March 2023 (UTC)
 
:No kidding. Whoever wrote this seemed in a hurry to jump right into how CNNs work and what the technical differences are between CNNs and other machine learning architectures, with numerical examples.
:That information does belong here, but further down in the article. This whole thing needs to be rearranged by an Expert who is also a good Explainer, to lead off with answers to simple questions.
:What is a CNN?
:What problems can it solve that other approaches can not, or solve more efficiently?
:Is CNN an example of a wider family of architectures? If so, compare and contrast with its relatives in that family tree.
:Some of these answers may already be embedded in the article, but the article makes the reader work too hard to find them.
:You gotta tell people where you are taking them, and WHY, before you start describing, in detail, the steps you take to get there. [[Special:Contributions/2601:283:4F81:4B00:35A1:9FF5:C8CF:11AF|2601:283:4F81:4B00:35A1:9FF5:C8CF:11AF]] ([[User talk:2601:283:4F81:4B00:35A1:9FF5:C8CF:11AF|talk]]) 21:10, 28 October 2023 (UTC)
 
== Hyperparameters ==
 
I have a question or a problem with explanation of hyperparameters.
 
1. Hyperparameters are first explained in Spatial arrangement subsection of Convolutional layer. Three hyperparameters are listed, which affect the output size. Here, I believe, kernel size ''K'' is missing, which is mentioned right away in the next paragraph.
 
2. In the Hyperparameters section, we have ''kernel size'' and ''filter size''. By my understanding, these two parameters should be the same thing? Additionally, ''number of filters'' uses depth as the number of convolutional+pooling layers, whereas depth in the Spatial arrangement (my previous point) uses ''depth'' as a number of filters. [[User:En odveč|En odveč]] ([[User talk:En odveč|talk]]) 12:32, 30 March 2023 (UTC)
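In support of point 1: the standard output-size formula does involve the kernel size ''K'' alongside depth, stride, and padding. A short sketch (mine, not article text; the 227/11/0/4 values are the often-quoted AlexNet first-layer numbers):

```python
def conv_output_size(W, K, P, S):
    # Standard spatial output size of a convolutional layer:
    # input size W, kernel size K, zero padding P, stride S.
    assert (W - K + 2 * P) % S == 0, "hyperparameters must tile the input evenly"
    return (W - K + 2 * P) // S + 1

# A 227-pixel input, 11-pixel kernel, no padding, stride 4
# gives a 55 x 55 output: (227 - 11 + 0) / 4 + 1 = 55.
print(conv_output_size(227, 11, 0, 4))  # 55
```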
 
== Incorrect description of feed-forward neural network under "Architecture" ==
 
In the "Architecture" section, the article states: "In any feed-forward neural network, any middle layers are called hidden because their inputs and outputs are masked by the activation function and final [[convolution]]."

This is not correct:

- There is not a final convolution in all feed-forward neural networks.

- The middle layers ''are'' called hidden, but not "because their inputs and outputs are masked by the activation function and final [[convolution]]." They are called hidden because they are not "externally visible".

[[User:Rfk732|Rfk732]] ([[User talk:Rfk732|talk]]) 15:48, 8 April 2023 (UTC)
 
:I have removed the sentence. [[User:Rfk732|Rfk732]] ([[User talk:Rfk732|talk]]) 10:38, 13 April 2023 (UTC)
 
== Empirical and explicit regularization? ==
 
The section ''Regularization methods'' has two different subsections: ''Empirical'' and ''Explicit''. What do we mean by empirical? And what do we mean by explicit? —[[User:Kri|Kri]] ([[User talk:Kri|talk]]) 12:43, 20 November 2023 (UTC)
 