In-memory processing: Difference between revisions


Revision as of 07:46, 23 December 2023 by Rick Jelliffe (talk \| contribs), Extended confirmed users, 10,923 edits. Edit summary: →Data structures: add "often" and weaken lots of sales-pitchy statements
Latest revision as of 13:08, 26 August 2025 by Citation bot (talk \| contribs), Bots, 5,862,126 edits. Edit summary: Added bibcode. Removed URL that duplicated identifier. Removed access-date with no URL. \| Use this bot. Report bugs. \| Suggested by Headbomb \| Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox \| #UCB_webform_linked 290/990
(28 intermediate revisions by 9 users not shown)
Line 1: {{Short description\|Processing data technology}} {{advert\|date=November 2018}} ~~{{cleanup rewrite\|date=January 2020}}~~ In [[computer science]], '''in-memory processing''' (PIM) is a [[computer architecture]] for [[data processing\|processing]] data stored in an [[in-memory database]].<ref>{{Cite journal \|last=Ghose \|first=S. \|date=November 2019 \|title=Processing-in-memory: A workload-driven perspective \|url=https://www.pdl.cmu.edu/PDL-FTP/associated/19ibmjrd_pim.pdf \|journal=IBM Journal of Research and Development \|volume=63 \|issue=6 \|pages=3:1–19\|doi=10.1147/JRD.2019.2934048 \|s2cid=202025511 }}</ref> In-memory processing may improve the [[Electric power\|power usage]] and [[Computer performance\|performance]] of moving data between the processor and the main memory.<ref>{{Cite book\|last1=Chi\|first1=Ping\|last2=Li\|first2=Shuangchen\|last3=Xu\|first3=Cong\|last4=Zhang\|first4=Tao\|last5=Zhao\|first5=Jishen\|last6=Liu\|first6=Yongpan\|last7=Wang\|first7=Yu\|last8=Xie\|first8=Yuan\|title=2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA) \|chapter=PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory \|date=June 2016\|chapter-url=https://ieeexplore.ieee.org/document/7551380\|___location=Seoul, South Korea\|publisher=IEEE\|pages=27–39\|doi=10.1109/ISCA.2016.13\|isbn=978-1-4673-8947-1}}</ref> This contrasts with systems that transfer data from [[disk storage]] to memory on demand, an architecture required when memory was small and expensive compared to disk storage and the database size, and when read-intensive applications such as [[business intelligence]] (BI) were not major applications. 
Because stored data is accessed much more quickly when it is placed in [[random-access memory]] (RAM) or [[flash memory]], in-memory processing allows data to be analyzed in [[Real-time computing\|real time]], enabling faster reporting and decision-making in business.<ref>{{cite book\|last1=Plattner\|first1=Hasso\|last2=Zeier\|first2=Alexander\|title=In-Memory Data Management: Technology and Applications\|date=2012\|publisher=Springer Science & Business Media\|isbn=9783642295744\|url=https://books.google.com/books?id=HySCgzCApsEC&q=%22in-memory%22\|language=en}}</ref><ref>{{cite journal\|first=Hao\|last=Zhang\|author2=Gang Chen\|author3=Beng Chin Ooi\|author4=Kian-Lee Tan\|author5=Meihui Zhang\|title=In-Memory Big Data Management and Processing: A Survey\|journal=IEEE Transactions on Knowledge and Data Engineering\|date=July 2015\|volume=27\|issue=7\|pages=1920–1948\|doi=10.1109/TKDE.2015.2427795\|doi-access=free}}</ref> The term is used for two different things: == Disk-based business intelligence ==▼ # In [[computer science]], '''in-memory processing''', also called '''compute-in-memory''' (CIM), or '''processing-in-memory''' (PIM), is a [[computer architecture]] in which data operations are available directly on the data memory, rather than having to be transferred to [[Central processing unit\|CPU]] registers first.<ref>{{Cite journal \|last=Ghose \|first=S. \|date=November 2019 \|title=Processing-in-memory: A workload-driven perspective \|url=https://www.pdl.cmu.edu/PDL-FTP/associated/19ibmjrd_pim.pdf \|journal=IBM Journal of Research and Development \|volume=63 \|issue=6 \|pages=3:1–19\|doi=10.1147/JRD.2019.2934048 \|s2cid=202025511 }}</ref> This may improve the [[Electric power\|power usage]] and [[Computer performance\|performance]] of moving data between the processor and the main memory. 
# In [[software engineering]], '''in-memory processing''' is a [[software architecture]] where a database is kept entirely in [[random-access memory]] (RAM) or [[flash memory]] so that usual accesses, in particular read or query operations, do not require access to [[disk storage]].<ref>{{cite journal\|first=Hao\|last=Zhang\|author2=Gang Chen\|author3=Beng Chin Ooi\|author4=Kian-Lee Tan\|author5=Meihui Zhang\|title=In-Memory Big Data Management and Processing: A Survey\|journal=IEEE Transactions on Knowledge and Data Engineering\|date=July 2015\|volume=27\|issue=7\|pages=1920–1948\|doi=10.1109/TKDE.2015.2427795\|bibcode=2015ITKDE..27.1920Z \|doi-access=free}}</ref> This may allow faster data operations such as "joins", and faster reporting and decision-making in business.<ref>{{cite book\|last1=Plattner\|first1=Hasso\|last2=Zeier\|first2=Alexander\|title=In-Memory Data Management: Technology and Applications\|date=2012\|publisher=Springer Science & Business Media\|isbn=9783642295744\|url=https://books.google.com/books?id=HySCgzCApsEC&q=%22in-memory%22\|language=en}}</ref>
Extremely large datasets may be divided between co-operating systems as in-memory [[data grid]]s.
==Hardware (PIM)==
PIM could be implemented by:<ref>{{cite web \|title=Processing-in-Memory Course: Lecture 1: Exploring the PIM Paradigm for Future Systems - Spring 2022 \| website=[[YouTube]] \| date=10 March 2022 \|url=https://www.youtube.com/watch?v=R-sEqnOmDT4 \|language=en}}</ref>
* Processing-using-Memory (PuM)
** Adding limited processing capability (e.g., floating point multiplication units, 4K row operations such as copy or zero, bitwise operations on two rows) to conventional memory modules (e.g., DIMM modules); or
** Adding processing capability to memory controllers so that the data that is accessed does not need to be forwarded to the CPU or affect the CPU's cache, but is dealt with immediately.
* Processing-near-Memory (PnM) ** New 3D arrangements of silicon with memory layers and processing layers. === Application of in-memory technology in everyday life ===▼ In-memory processing techniques are frequently used by modern smartphones and tablets to improve application performance. This can result in speedier app loading times and more enjoyable user experiences. * In-memory processing may be used by gaming consoles such as the [[PlayStation]] and [[Xbox]] to improve game speed.<ref>{{Cite web \|last=Park \|first=Kate \|date=2023-07-27 \|title=Samsung extends cut in memory chip production, will focus on high-end AI chips instead \|url=https://techcrunch.com/2023/07/27/samsung-extends-cut-in-memory-chip-production-will-focus-on-high-end-ai-chips-instead/ \|access-date=2023-12-05 \|website=TechCrunch \|language=en-US}}</ref>{{Failed verification\|date=July 2024}} Rapid data access is critical for providing a smooth game experience. * Certain wearable devices, like smartwatches and fitness trackers, may incorporate in-memory processing to swiftly process sensor data and provide real-time feedback to users. Several commonplace gadgets use in-memory processing to improve performance and responsiveness.<ref>{{Cite journal \|last1=Tan \|first1=Kian-Lee \|last2=Cai \|first2=Qingchao \|last3=Ooi \|first3=Beng Chin \|last4=Wong \|first4=Weng-Fai \|last5=Yao \|first5=Chang \|last6=Zhang \|first6=Hao \|date=2015-08-12 \|title=In-memory Databases: Challenges and Opportunities From Software and Hardware Perspectives \|url=https://doi.org/10.1145/2814710.2814717 \|journal=ACM SIGMOD Record \|volume=44 \|issue=2 \|pages=35–40 \|doi=10.1145/2814710.2814717 \|s2cid=14238437 \|issn=0163-5808\|url-access=subscription }}</ref> * In-memory processing is used by smart TVs to enhance interface navigation and content delivery. 
It is used in digital cameras for real-time image processing, filtering, and effects.<ref>{{Cite book \|doi=10.1109/ISCAS48785.2022.9937475 \|s2cid=253462291 \|chapter=Approximate In-Memory Computing using Memristive IMPLY Logic and its Application to Image Processing \|title=2022 IEEE International Symposium on Circuits and Systems (ISCAS) \|date=2022 \|last1=Fatemieh \|first1=Seyed Erfan \|last2=Reshadinezhad \|first2=Mohammad Reza \|last3=Taherinejad \|first3=Nima \|pages=3115–3119 \|isbn=978-1-6654-8485-5 }}</ref> Voice-activated assistants and other home automation systems may benefit from faster understanding and response to user orders. * In-memory processing is also used by embedded systems in appliances and high-end digital cameras for efficient data handling. Through in-memory processing techniques, certain IoT devices prioritize fast data processing and response times.<ref>{{Cite web \|title=What is processing in memory (PIM) and how does it work? \|url=https://www.techtarget.com/searchbusinessanalytics/definition/processing-in-memory-PIM \|access-date=2023-12-05 \|website=Business Analytics \|language=en}}</ref> ==Software== ▲=== Disk-based ~~business~~data ~~intelligence~~access === ▲====Data structures==== With disk-based technology, data is loaded on to the computer's [[hard disk]] in the form of multiple tables and multi-dimensional structures against which queries are run. Disk-based technologies are often [[relational database management system]]s (RDBMS), often based on the structured query language ([[SQL]]), such as [[Microsoft SQL Server\|SQL Server]], [[MySQL]], [[Oracle database\|Oracle]] and many others. RDBMS are designed for the requirements of [[Software transactional memory\|transactional processing]]. Using a database that supports insertions and updates as well as performing aggregations, [[join (SQL)\|join]]s (typical in BI solutions) are typically very slow. 
Another drawback is that SQL is designed to efficiently fetch rows of data, while BI queries usually involve fetching partial rows of data and performing heavy calculations on them.
Line 13 ⟶ 40:
Information technology (IT) staff may spend substantial development time on optimizing databases, constructing [[index (database)\|index]]es and [[aggregate (data warehouse)\|aggregate]]s, designing cubes and [[star schema]]s, [[data modeling]], and query analysis.<ref>{{cite book\|last=Earls\|first=A\|title=Tips on evaluating, deploying and managing in-memory analytics tools\|year=2011\|publisher=Tableau\|url=http://www.analyticsearches.com/site/files/776/66977/259607/579091/In-Memory_Analytics_11.10.11.pdf \|archiveurl=https://web.archive.org/web/20120425232535/http://www.analyticsearches.com/site/files/776/66977/259607/579091/In-Memory_Analytics_11.10.11.pdf \|archivedate=2012-04-25}}</ref>
====Processing speed====
Reading data from a hard disk is much slower (possibly hundreds of times slower) than reading the same data from RAM. Performance is severely degraded, especially when analyzing large volumes of data. Though SQL is a very powerful tool, arbitrarily complex queries with a disk-based implementation take a relatively long time to execute and often degrade the performance of transactional processing. In order to obtain results within an acceptable response time, many [[data warehouse]]s have been designed to pre-calculate summaries and answer specific queries only. Optimized aggregation algorithms are needed to increase performance.
=== In-memory ~~processing tools~~ data access ===
In-memory processing can be accomplished via traditional databases such as [[Oracle Database\|Oracle]], [[IBM Db2]] or [[Microsoft SQL Server]], or via [[NoSQL]] offerings such as in-memory [[data grid]]s like [[Hazelcast]], [[Infinispan]], [[Oracle Coherence]] or ScaleOut Software.
With both in-memory databases and [[data grid]]s, all information is initially loaded into RAM or flash memory instead of [[hard disk]]s. With a [[data grid]], processing occurs three [[orders of magnitude]] faster than in relational databases, whose advanced functionality such as [[ACID]] degrades performance in exchange for the additional functionality. The arrival of [[Column-oriented DBMS\|column centric databases]], which store similar information together, allows data to be stored more efficiently and with greater [[Data compression\|compression]] ratios. This allows huge amounts of data to be stored in the same physical space, reducing the amount of memory needed to perform a query and increasing processing speed. Many users and software vendors have integrated flash memory into their systems so that they can scale to larger data sets more economically. Oracle has been integrating flash memory into the [[Oracle Exadata]] products for increased performance. [[Microsoft SQL Server]] 2012 BI/Data Warehousing software has been coupled with [[Violin Memory]] flash memory arrays to enable in-memory processing of data sets greater than 20 TB.<ref>{{cite web\|title=SQL Server 2012 with Violin Memory\|url=http://download.microsoft.com/download/6/9/C/69CFB214-0699-4448-8F32-CFE03A0706A6/SQL_Server_2012_Fast_Track_Data_Warehouse_for_Violin_Memory_Datasheet.pdf\|publisher=Microsoft\|access-date=2013-06-01\|archive-url=https://web.archive.org/web/20130309045249/http://download.microsoft.com/download/6/9/C/69CFB214-0699-4448-8F32-CFE03A0706A6/SQL_Server_2012_Fast_Track_Data_Warehouse_for_Violin_Memory_Datasheet.pdf\|archive-date=2013-03-09\|url-status=dead}}</ref> Users query the data loaded into the system's memory, thereby avoiding slower database access and performance [[Bottleneck (software)\|bottlenecks]].
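The effect of storing similar values together can be illustrated with a minimal Python sketch. It is not taken from any of the products named above: the sample table, the function names and the choice of run-length encoding as the compression scheme are all assumptions made for illustration.

```python
from itertools import groupby

# Hypothetical sample table, stored row by row.
rows = [
    {"region": "EU", "product": "A", "units": 3},
    {"region": "EU", "product": "B", "units": 5},
    {"region": "EU", "product": "A", "units": 2},
    {"region": "US", "product": "A", "units": 7},
    {"region": "US", "product": "B", "units": 1},
]


def to_columns(table):
    """Pivot a list of row dicts into one list per column."""
    return {name: [row[name] for row in table] for name in table[0]}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def units_per_region(region_runs, units):
    """Sum units per region by walking the runs of the encoded column."""
    totals, start = {}, 0
    for region, length in region_runs:
        chunk = units[start:start + length]
        totals[region] = totals.get(region, 0) + sum(chunk)
        start += length
    return totals


columns = to_columns(rows)
# Adjacent equal values in the "region" column collapse into two runs.
region_runs = rle_encode(columns["region"])
totals = units_per_region(region_runs, columns["units"])
print(region_runs, totals)
```

The aggregate reads only the two columns it needs, and the repeated region values occupy two entries instead of five. Real column stores use more elaborate encodings, so this only hints at why less memory may be needed per query.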
Querying data held in memory differs from [[caching (computing)\|caching]], a very widely used method to speed up query performance, in that caches are subsets of very specific pre-defined organized data. With in-memory tools, data available for analysis can be as large as a [[data mart]] or small data warehouse which is entirely in memory. This can be accessed quickly by multiple concurrent users or applications at a detailed level and offers the potential for enhanced analytics and for scaling and increasing the speed of an application. Theoretically, the improvement in data access speed is 10,000 to 1,000,000 times compared to the disk.{{citation needed\|date=January 2016}} It also minimizes the need for performance tuning by IT staff and provides faster service for end users.
==== Advantages of in-memory processing technology ====
Certain developments in computer technology and business needs have tended to increase the relative advantages of in-memory technology.<ref>{{cite web\|title=In_memory Analytics\|url=http://www.yellowfinbi.com/Document.i4?DocumentId=104879\|publisher=yellowfin\|page=6}}</ref>
* ~~''Hardware'' becomes progressively cheaper and higher-performing, according to [[Moore's law]]. Computing power doubles every two to three years while decreasing in costs.~~ Following [[Moore's law]], the number of transistors per square unit doubles every two or so years. This is reflected in changes to price, performance, packaging and capabilities of the components. [[Random-access memory]] price and CPU computing power in particular have improved over the decades. CPU processing, memory and disk storage are all subject to some variation of this law. ~~Also~~ As well, hardware innovations such as [[Multi-core processor\|multi-core architecture]], [[NAND flash memory]], [[Parallel computing\|parallel servers]], and increased memory processing capability, ~~in addition~~ have contributed to the technical and economic feasibility of in-memory approaches.
* In turn, software innovations such as column centric databases, compression techniques and handling aggregate tables, ~~have all contributed to demand for~~ enable efficient in-memory products.<ref>{{cite web\|last=Kote \|first=Sparjan \|title=In-memory computing in Business Intelligence \|url=http://www.infosysblogs.com/oracle/2011/03/in-memory_computing_in_busines.html \|url-status=dead \|archiveurl=https://web.archive.org/web/20110424013629/http://www.infosysblogs.com/oracle/2011/03/in-memory_computing_in_busines.html \|archivedate=April 24, 2011 }}</ref>
* The advent of ''[[64-bit operating system]]s'', which allow access to far more RAM (up to 100 GB or more) than the 2 or 4 GB accessible on [[32-bit computing\|32-bit systems]]. By providing Terabytes (1 TB = 1,024 GB) of space for storage and analysis, 64-bit operating systems make in-memory processing scalable. The use of flash memory enables systems to scale to many Terabytes more economically.
* Increasing ''volumes of data'' have meant that traditional data warehouses ~~are no longer~~ may be less able to process the data in a timely and accurate way. The [[extract, transform, load]] (ETL) process that periodically updates disk-based data warehouses with operational data ~~can take anywhere from a few hours to weeks to complete. So, at any given point of time data is at least a day old.~~ may result in lags and stale data. In-memory processing ~~enables instant~~ may enable faster access to terabytes of data for better real time reporting.
* In-memory processing ~~is~~ may be available at a ''lower cost'' compared to ~~traditional BI tools~~ disk-based processing, and can be more easily deployed and maintained.
According to a Gartner survey,<ref>{{Cite web \|title=Survey Analysis: Why BI and Analytics Adoption Remains Low and How to Expand Its Reach \|url=https://www.gartner.com/en/documents/3753469 \|access-date=2023-12-05 \|website=Gartner \|language=en}}</ref> deploying traditional BI tools can take as long as 17 months. ~~Many data warehouse vendors are choosing in-memory technology over traditional BI to speed up implementation times.~~
* Decreases in power consumption and increases in throughput due to a lower access latency, and greater memory bandwidth and hardware parallelism.<ref>{{Cite book\|last1=Upchurch\|first1=E.\|last2=Sterling\|first2=T.\|last3=Brockman\|first3=J.\|title=Proceedings of the ACM/IEEE SC2004 Conference \|chapter=Analysis and Modeling of Advanced PIM Architecture Design Tradeoffs \|date=2004\|chapter-url=https://ieeexplore.ieee.org/document/1392942\|location=Pittsburgh, PA, USA\|publisher=IEEE\|pages=12\|doi=10.1109/SC.2004.11\|isbn=978-0-7695-2153-4\|s2cid=9089044 \|url=https://resolver.caltech.edu/CaltechAUTHORS:20170103-172751346 }}</ref>
==== Application in business ====
A range of in-memory products provide the ability to connect to existing data sources and access to visually rich interactive dashboards. This allows business analysts and end users to create custom reports and queries without much training or expertise. Easy navigation and the ability to modify queries on the fly are of benefit to many users. Since these dashboards can be populated with fresh data, users have access to real time data and can create reports within minutes. In-memory processing may be of particular benefit in [[call center]]s and warehouse management. With in-memory processing, the source database is queried only once instead of being accessed every time a query is run, thereby eliminating repetitive processing and reducing the burden on database servers.
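The "query the source once" pattern can be sketched with Python's standard <code>sqlite3</code> module. This is only an illustration under stated assumptions: the <code>calls</code> table, its rows and the helper names are hypothetical, and the "source" is itself an SQLite database standing in for an operational server.

```python
import sqlite3


def make_source():
    """Stand-in for an operational database (hypothetical schema and data)."""
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE calls (agent TEXT, seconds INTEGER)")
    src.executemany(
        "INSERT INTO calls VALUES (?, ?)",
        [("ana", 120), ("ben", 300), ("ana", 60), ("ben", 30)],
    )
    src.commit()
    return src


def load_snapshot(source):
    """Copy the source into a separate in-memory database in one pass."""
    snapshot = sqlite3.connect(":memory:")
    source.backup(snapshot)  # the only point at which the source is read
    return snapshot


source = make_source()
snapshot = load_snapshot(source)
source.close()  # every later query runs against the in-memory copy only

# Two different reports, neither of which touches the source database.
total_by_agent = dict(
    snapshot.execute("SELECT agent, SUM(seconds) FROM calls GROUP BY agent")
)
longest_call = snapshot.execute("SELECT MAX(seconds) FROM calls").fetchone()[0]
print(total_by_agent, longest_call)
```

Closing the source connection before running the reports makes the point explicit: once the snapshot is loaded, reporting queries add no load to the source, at the cost of working on data that is only as fresh as the last load.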
By scheduling to populate the in-memory database overnight, the database servers can be used for operational purposes during peak hours. ==== Adoption of in-memory technology ==== With a large number of users, a large amount of [[Random-access memory\|RAM]] is needed for an in-memory configuration, which in turn affects the hardware costs. The investment is more likely to be suitable in situations where speed of query response is a high priority, and where there is significant growth in data volume and increase in demand for reporting facilities; it may still not be cost-effective where information is not subject to rapid change. [[Computer security\|Security]] is another consideration, as in-memory tools expose huge amounts of data to end users. Makers advise ensuring that only authorized users are given access to the data. ▲== Application of in-memory technology in everyday life == In-memory processing techniques are frequently used by modern smartphones and tablets to improve application performance. This can result in speedier app loading times and more enjoyable user experiences. In-memory processing may be used by gaming consoles such as the [[PlayStation]] and [[Xbox]] to improve game speed.<ref>{{Cite web \|last=Park \|first=Kate \|date=2023-07-27 \|title=Samsung extends cut in memory chip production, will focus on high-end AI chips instead \|url=https://techcrunch.com/2023/07/27/samsung-extends-cut-in-memory-chip-production-will-focus-on-high-end-ai-chips-instead/ \|access-date=2023-12-05 \|website=TechCrunch \|language=en-US}}</ref> Rapid data access is critical for providing a smooth game experience. Certain wearable devices, like smartwatches and fitness trackers, may incorporate in-memory processing to swiftly process sensor data and provide real-time feedback to users. 
Several commonplace gadgets use in-memory processing to improve performance and responsiveness<ref>{{Cite journal \|last1=Tan \|first1=Kian-Lee \|last2=Cai \|first2=Qingchao \|last3=Ooi \|first3=Beng Chin \|last4=Wong \|first4=Weng-Fai \|last5=Yao \|first5=Chang \|last6=Zhang \|first6=Hao \|date=2015-08-12 \|title=In-memory Databases: Challenges and Opportunities From Software and Hardware Perspectives \|url=https://doi.org/10.1145/2814710.2814717 \|journal=ACM SIGMOD Record \|volume=44 \|issue=2 \|pages=35–40 \|doi=10.1145/2814710.2814717 \|s2cid=14238437 \|issn=0163-5808}}</ref>.In-memory processing is used by smart TVs to enhance interface navigation and content delivery. It is used in digital cameras for real-time image processing, filtering, and effects.<ref>{{Cite book \|chapter-url=https://ieeexplore.ieee.org/document/9937475 \|access-date=2023-12-05 \|doi=10.1109/ISCAS48785.2022.9937475 \|s2cid=253462291 \|chapter=Approximate In-Memory Computing using Memristive IMPLY Logic and its Application to Image Processing \|title=2022 IEEE International Symposium on Circuits and Systems (ISCAS) \|date=2022 \|last1=Fatemieh \|first1=Seyed Erfan \|last2=Reshadinezhad \|first2=Mohammad Reza \|last3=Taherinejad \|first3=Nima \|pages=3115–3119 \|isbn=978-1-6654-8485-5 }}</ref> Voice-activated assistants and other home automation systems may benefit from faster understanding and response to user orders. In-memory processing is also used by embedded systems in appliances and high-end digital cameras for efficient data handling. Through in-memory processing techniques, certain IoT devices prioritize fast data processing and response times.<ref>{{Cite web \|title=What is processing in memory (PIM) and how does it work? \|url=https://www.techtarget.com/searchbusinessanalytics/definition/processing-in-memory-PIM \|access-date=2023-12-05 \|website=Business Analytics \|language=en}}</ref> == See also ==