Computer data storage: Difference between revisions

Restored revision 1234598864 by ClueBot NG (talk): Reverting vandalism
Line 8:
[[File:Sony_CRX310S-Internal-PC-DVD-Drive-Opened.jpg|thumb|Read/Write DVD drive with cradle for media extended]]
 
'''Computer data storage''' or '''digital data storage''' is a technology consisting of [[computer]] components and [[Data storage|recording media]] that are used to retain [[digital data]]. It is a core function and fundamental component of computers.<ref name="Patterson">{{Cite book |title=Computer organization and design: The hardware/software interface |last1=Patterson |first1=David A. |last2=Hennessy |first2=John L. |date=2005 |publisher=[[Morgan Kaufmann Publishers]] |isbn=1-55860-604-1 |edition=3rd |___location=[[Amsterdam]] |oclc=56213091 |url-access=registration |url=https://archive.org/details/isbn_9781558606043 }}</ref>{{rp|15–16}}
 
The [[central processing unit]] (CPU) of a computer is what manipulates data by performing computations. In practice, almost all computers use a [[memory hierarchy|storage hierarchy]],<ref name="Patterson"/>{{rp|468–473}} which puts fast but expensive and small storage options close to the CPU and slower but less expensive and larger options further away. Generally, the fast{{efn|Most contemporary computers use volatile technologies (which lose data when power is off); early computers used both volatile and persistent technologies.}} technologies are referred to as "memory", while slower persistent technologies are referred to as "storage".
 
Even the first computer designs, [[Charles Babbage]]'s [[Analytical Engine]] and [[Percy Ludgate]]'s Analytical Machine, clearly distinguished between processing and memory (Babbage stored numbers as rotations of gears, while Ludgate stored numbers as displacements of rods in shuttles). This distinction was extended in the [[Von Neumann architecture]], where the CPU consists of two main parts: the [[control unit]] and the [[arithmetic logic unit]] (ALU). The former controls the flow of data between the CPU and memory, while the latter performs arithmetic and [[Bitwise operation|logical operations]] on data.
 
== Functionality ==
Without a significant amount of memory, a computer would merely be able to perform fixed operations and immediately output the result. It would have to be reconfigured to change its behavior. This is acceptable for devices such as desk [[calculator]]s, [[digital signal processing|digital signal processors]], and other specialized devices. [[von Neumann architecture|Von Neumann]] machines differ in having a memory in which they store their operating [[Instruction set architecture#Instructions|instructions]] and data.<ref name="Patterson"/>{{rp|20}} Such computers are more versatile in that they do not need to have their hardware reconfigured for each new program, but can simply be [[computer programming|reprogrammed]] with new in-memory instructions; they also tend to be simpler to design, in that a relatively simple processor may keep [[State (computer science)|state]] between successive computations to build up complex procedural results. Most modern computers are von Neumann machines.
 
== Data organization and representation ==
A modern [[Computer|digital computer]] represents [[data]] using the [[Binary number|binary numeral system]]. Text, numbers, pictures, audio, and nearly any other form of information can be converted into a string of [[bit]]s, or binary digits, each of which has a value of 0&nbsp;or&nbsp;1. The most common unit of storage is the [[byte]], equal to 8 bits. A piece of information can be handled by any computer or device whose storage space is large enough to accommodate ''the binary representation of the piece of information'', or simply [[data (computing)|data]]. For example, the [[Complete Works of Shakespeare|complete works of Shakespeare]], about 1250&nbsp;pages in print, can be stored in about five [[megabyte]]s (40&nbsp;million bits) with one byte per character.
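
The size arithmetic in the Shakespeare example can be checked with a short sketch. It assumes that "five megabytes" means 5,000,000 bytes (decimal megabytes) and that every character occupies exactly one byte; the function name <code>storage_bits</code> is illustrative only.

```python
BITS_PER_BYTE = 8  # the byte as defined above: 8 bits


def storage_bits(num_chars: int, bytes_per_char: int = 1) -> int:
    """Return the number of bits needed to store num_chars characters."""
    return num_chars * bytes_per_char * BITS_PER_BYTE


# Assumption: about five million characters at one byte per character.
shakespeare_chars = 5_000_000
shakespeare_bits = storage_bits(shakespeare_chars)
print(shakespeare_bits)  # 40000000, i.e. 40 million bits
```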
 
Data are [[Code|encoded]] by assigning a bit pattern to each [[Character (computing)|character]], [[Numerical digit|digit]], or [[multimedia]] object. Many standards exist for encoding (e.g. [[character encoding]]s like [[ASCII]], image encodings like [[JPEG]], and video encodings like [[MPEG-4]]).
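
As an illustration of assigning a bit pattern to each character, the following sketch encodes a short string with ASCII and decodes it again. The helper names <code>encode_ascii</code> and <code>decode_ascii</code> are hypothetical, and each 7-bit ASCII code is padded to a full 8-bit byte, as is common in practice.

```python
from typing import List


def encode_ascii(text: str) -> List[str]:
    """Map each character to its 8-bit pattern (ASCII code padded to a byte)."""
    return [f"{byte:08b}" for byte in text.encode("ascii")]


def decode_ascii(patterns: List[str]) -> str:
    """Turn a list of 8-bit patterns back into the original text."""
    return bytes(int(pattern, 2) for pattern in patterns).decode("ascii")


patterns = encode_ascii("Hi")
print(patterns)                # ['01001000', '01101001']
print(decode_ascii(patterns))  # Hi
```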
 
By adding bits to each encoded unit, redundancy allows the computer to detect errors in coded data and correct them based on mathematical algorithms. Errors generally occur with low probability, due to [[Randomness|random]] bit flips, to "physical bit fatigue" (the loss of a physical bit's ability to maintain a distinguishable value of 0&nbsp;or&nbsp;1), or to errors in inter- or intra-computer communication. A random [[RAM parity|bit flip]] (e.g. due to random [[radiation]]) is typically corrected upon detection. A malfunctioning physical bit or group of bits (the specific defective bit is not always known; group definition depends on the specific storage device) is typically fenced out automatically: it is taken out of use by the device and replaced with another functioning equivalent group in the device, where the corrected bit values are restored (if possible). The [[cyclic redundancy check]] (CRC) method is typically used in communications and storage for [[error detection and correction|error detection]]. When an error is detected, the operation is retried.
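
A minimal sketch of redundancy-based error detection follows. It shows two detection schemes only (a single even-parity bit and a CRC-32 checksum computed with Python's standard <code>zlib.crc32</code>); it does not perform correction, and it is not meant to represent how any particular storage device implements these checks. The helper names and sample data are hypothetical.

```python
import zlib


def add_even_parity(bits: str) -> str:
    """Append one redundant bit so the total number of 1s is even."""
    parity = bits.count("1") % 2
    return bits + str(parity)


def parity_ok(word: str) -> bool:
    """A word with an odd number of 1s must contain an error."""
    return word.count("1") % 2 == 0


def crc_matches(payload: bytes, expected_crc: int) -> bool:
    """Recompute the CRC-32 of the payload and compare with the stored value."""
    return zlib.crc32(payload) == expected_crc


# Parity: flipping any single bit is detected.
word = add_even_parity("1011001")
flipped = ("0" if word[0] == "1" else "1") + word[1:]

# CRC: store the checksum alongside the data, then verify on read.
original = b"stored block"
stored_crc = zlib.crc32(original)
corrupted = bytes([original[0] ^ 0b00000001]) + original[1:]  # one bit flipped

print(parity_ok(word), parity_ok(flipped))
print(crc_matches(original, stored_crc), crc_matches(corrupted, stored_crc))
```

A failed check tells the reader only that the data changed; the usual response, as described above, is to retry the read or transfer.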
 
[[Data compression]] methods make it possible in many cases (such as a database) to represent a string of bits by a shorter bit string ("compress") and to reconstruct the original string ("decompress") when needed. This uses substantially less storage (tens of percent) for many types of data, at the cost of more computation (compressing and decompressing when needed). The trade-off between the storage cost saved and the cost of the related computations and possible delays in data availability is analyzed before deciding whether to keep certain data compressed.
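
The compress and decompress round trip can be sketched with Python's standard <code>zlib</code> module, used here only as one example of a lossless method. The sample record is artificial and highly repetitive, so it shrinks far more than typical data would; the function name <code>compression_ratio</code> is illustrative.

```python
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    compressed = zlib.compress(data, level)
    return len(compressed) / len(data)


# Assumption: a made-up, repetitive record such as might appear in a database.
record = b"id=42;status=active;region=eu-west;" * 200

compressed = zlib.compress(record)      # shorter bit string ("compress")
restored = zlib.decompress(compressed)  # original string ("decompress")

print(len(record), len(compressed))
print(restored == record)
```

Timing the two calls against the storage saved is one simple way to examine the trade-off described above for a given workload.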