{{advert|date=November 2018}}
{{cleanup rewrite|date=January 2020}}
In [[computer science]], '''in-memory processing''', also known as '''processing in memory''' ('''PIM'''), is a [[computer architecture]] for [[data processing|processing]] data stored in an [[in-memory database]].<ref>{{Cite journal |last=Ghose |first=S. |date=November 2019 |title=Processing-in-memory: A workload-driven perspective |url=https://www.pdl.cmu.edu/PDL-FTP/associated/19ibmjrd_pim.pdf |journal=IBM Journal of Research and Development |volume=63 |issue=6 |pages=3:1–19|doi=10.1147/JRD.2019.2934048 |s2cid=202025511 }}</ref> In-memory processing improves [[Electric power|power usage]] and [[Computer performance|performance]] by reducing the movement of data between the processor and the main memory.<ref>{{Cite book|last1=Chi|first1=Ping|last2=Li|first2=Shuangchen|last3=Xu|first3=Cong|last4=Zhang|first4=Tao|last5=Zhao|first5=Jishen|last6=Liu|first6=Yongpan|last7=Wang|first7=Yu|last8=Xie|first8=Yuan|title=2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA) |chapter=PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory |date=June 2016|chapter-url=https://ieeexplore.ieee.org/document/7551380|___location=Seoul, South Korea|publisher=IEEE|pages=27–39|doi=10.1109/ISCA.2016.13|isbn=978-1-4673-8947-1}}</ref> Older systems have been based on [[disk storage]] and [[relational database]]s queried with [[Structured Query Language]], an approach increasingly regarded as inadequate for [[business intelligence]] (BI) needs. Because stored data is accessed much more quickly when it is placed in [[random-access memory]] (RAM) or [[flash memory]], in-memory processing allows data to be analyzed in [[Real-time computing|real time]], enabling faster reporting and decision-making in business.<ref>{{cite book|last1=Plattner|first1=Hasso|last2=Zeier|first2=Alexander|title=In-Memory Data Management: Technology and Applications|date=2012|publisher=Springer Science & Business Media|isbn=9783642295744|url=https://books.google.com/books?id=HySCgzCApsEC&q=%22in-memory%22|language=en}}</ref><ref>{{cite journal|first=Hao|last=Zhang|author2=Gang Chen|author3=Beng Chin Ooi|author4=Kian-Lee Tan|author5=Meihui Zhang|title=In-Memory Big Data Management and Processing: A Survey|journal=IEEE Transactions on Knowledge and Data Engineering|date=July 2015|volume=27|issue=7|pages=1920–1948|doi=10.1109/TKDE.2015.2427795|doi-access=free}}</ref>
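The difference between the two approaches can be illustrated with a minimal sketch using Python's built-in <code>sqlite3</code> module, chosen here purely for illustration (it is not one of the systems discussed by the cited sources): the same analytical query is issued against a disk-backed database file and against a database held entirely in RAM.

<syntaxhighlight lang="python">
# Illustrative sketch only: sqlite3 stands in for any relational engine.
import sqlite3

disk_db = sqlite3.connect("sales.db")   # table stored in a file on disk
ram_db = sqlite3.connect(":memory:")    # the same table held entirely in RAM

for db in (disk_db, ram_db):
    db.execute("CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL)")
    db.executemany("INSERT INTO sales VALUES (?, ?)",
                   [("north", 120.0), ("south", 340.0), ("north", 75.5)])
    db.commit()
    # The analytical query is identical; only the storage medium differs.
    print(db.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall())

disk_db.close()
ram_db.close()
</syntaxhighlight>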
== Disk-based business intelligence ==
===Data structures===
With disk-based technology, data is loaded onto the computer's [[hard disk]] in the form of multiple tables and multi-dimensional structures against which queries are run. Disk-based technologies are [[relational database management system]]s (RDBMS), often based on the structured query language ([[SQL]]), such as [[Microsoft SQL Server|SQL Server]], [[MySQL]], [[Oracle database|Oracle]] and many others. RDBMS are designed for the requirements of [[Transaction processing|transactional processing]]. In a database that supports insertions and updates, the aggregations and [[join (SQL)|join]]s typical of BI solutions are usually very slow to perform. Another drawback is that SQL is designed to fetch entire rows of data efficiently, whereas BI queries usually read only a few columns from many rows and involve heavy calculations.
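A minimal sketch of the two access patterns, using a hypothetical schema in <code>sqlite3</code> (again for illustration only): a transactional query fetches one complete row by key, while a BI-style query joins and aggregates over many rows but reads only a couple of columns.

<syntaxhighlight lang="python">
# Hypothetical schema; sqlite3 is used only to keep the example self-contained.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                            order_date TEXT, amount REAL);
""")

# Transactional access: fetch one complete row by its key.
db.execute("SELECT * FROM orders WHERE order_id = ?", (42,))

# BI-style access: a join plus aggregation that scans many rows but reads
# only two columns, the pattern a row-oriented RDBMS handles poorly.
db.execute("""
    SELECT c.region, SUM(o.amount)
    FROM orders AS o JOIN customers AS c ON o.customer_id = c.customer_id
    GROUP BY c.region
""")
</syntaxhighlight>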
To improve query performance, multidimensional databases or [[OLAP cube]]s, also called multidimensional online analytical processing (MOLAP) cubes, are constructed. Designing a cube is an elaborate and lengthy process, and changing the cube's structure to adapt to dynamically changing business needs may be cumbersome. Cubes are pre-populated with data to answer specific queries and, although they increase performance, they are still not suitable for answering ad hoc queries.<ref>{{cite journal|last=Gill|first=John|title=Shifting the BI Paradigm with In-Memory Database Technologies|journal=Business Intelligence Journal|year=2007|volume=12|issue=2|pages=58–62|url=http://www.highbeam.com/doc/1P3-1636785121.html|archive-url=https://web.archive.org/web/20150924203158/http://www.highbeam.com/doc/1P3-1636785121.html|url-status=dead|archive-date=2015-09-24}}</ref>
Information technology (IT) staff spend substantial development time on optimizing databases, constructing [[index (database)|index]]es and [[aggregate (data warehouse)|aggregate]]s, designing cubes and [[star schema]]s, [[data modeling]], and query analysis.<ref>{{cite book|last=Earls|first=A|title=Tips on evaluating, deploying and managing in-memory analytics tools|year=2011|publisher=Tableau|url=http://www.analyticsearches.com/site/files/776/66977/259607/579091/In-Memory_Analytics_11.10.11.pdf |archive-url=https://web.archive.org/web/20120425232535/http://www.analyticsearches.com/site/files/776/66977/259607/579091/In-Memory_Analytics_11.10.11.pdf |archive-date=2012-04-25}}</ref>
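The pre-computation and tuning work described in the two paragraphs above can be sketched as follows, with hypothetical table and index names; a plain summary table stands in for a pre-populated cube, and <code>sqlite3</code> is again used only for illustration.

<syntaxhighlight lang="python">
# Hypothetical names; a summary table stands in for a MOLAP cube.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, month TEXT, product TEXT, amount REAL)")

# Tuning work of the kind described above: an index on the detail table ...
db.execute("CREATE INDEX idx_sales_region ON sales (region, month)")

# ... and a pre-computed aggregate along the chosen dimensions (region, month).
db.execute("""
    CREATE TABLE sales_cube AS
    SELECT region, month, SUM(amount) AS total
    FROM sales
    GROUP BY region, month
""")

# A query the "cube" was designed for is answered from the small summary table.
db.execute("SELECT total FROM sales_cube WHERE region = 'north' AND month = '2020-01'")

# An ad hoc query along the dimension left out of the cube (product) cannot be
# answered from sales_cube; it falls back to scanning the detail table or
# requires the cube to be redesigned and rebuilt.
db.execute("SELECT product, SUM(amount) FROM sales GROUP BY product")
</syntaxhighlight>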
===Processing speed===
Reading data from a hard disk is much slower, possibly hundreds of times slower, than reading the same data from RAM. The performance penalty is especially severe when large volumes of data are analyzed. Though SQL is a very powerful tool, complex queries take a relatively long time to execute and often degrade the performance of transactional processing. To obtain results within an acceptable response time, many [[data warehouse]]s have been designed to pre-calculate summaries and answer only specific queries. Optimized aggregation algorithms are needed to increase performance.
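The reliance on pre-calculated summaries can be sketched as follows, with hypothetical data and <code>sqlite3</code> used only to keep the example self-contained (absolute timings will vary by machine): the anticipated query is answered from a small summary table, while the same answer computed from the detail rows requires a full aggregation pass.

<syntaxhighlight lang="python">
# Hypothetical data; the point is only the relative cost of the two queries.
import random
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?)",
               [(random.choice(["north", "south", "east", "west"]),
                 random.random() * 100) for _ in range(1_000_000)])

# Pre-calculated summary, built once ahead of time.
db.execute("""CREATE TABLE sales_summary AS
              SELECT region, SUM(amount) AS total FROM sales GROUP BY region""")

def timed(query):
    start = time.perf_counter()
    rows = db.execute(query).fetchall()
    return rows, time.perf_counter() - start

# The anticipated query reads four pre-computed rows ...
_, fast = timed("SELECT region, total FROM sales_summary")
# ... while the same answer computed from the detail data scans a million rows.
_, slow = timed("SELECT region, SUM(amount) FROM sales GROUP BY region")
print(f"summary lookup: {fast:.4f}s, full aggregation: {slow:.4f}s")
</syntaxhighlight>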