Data engineering: Difference between revisions

Reverted 2 edits by Aniketyadav01 (talk): Spam
m Disambiguating links to Data protection (link changed to Information privacy) using DisamAssist.
Line 9:
In the early 2000s, data and data tooling were generally held by the [[information technology]] (IT) teams in most companies.<ref name="hist2">{{cite web |last1=Dodds |first1=Eric |title=The History of the Data Engineering and the Megatrends |url=https://www.rudderstack.com/blog/the-data-engineering-megatrend-a-brief-history |website=Rudderstack |access-date=31 July 2022}}</ref> Other teams then used data for their work (e.g. reporting), and there was usually little overlap in data skills between these parts of the business.
 
In the early 2010s, with the rise of the [[internet]], the massive increase in data volume, velocity, and variety gave rise to the term [[big data]] to describe the data itself, and data-driven tech companies like [[Facebook]] and [[Airbnb]] started using the phrase '''data engineer'''.<ref name="hist1" /><ref name="hist2" /> Due to the new scale of the data, major firms like [[Google]], Facebook, [[Amazon (company)|Amazon]], [[Apple Inc.|Apple]], [[Microsoft]], and [[Netflix]] started to move away from traditional [[Extract transform load|ETL]] and storage techniques. They started creating '''data engineering''', a type of [[software engineering]] focused on data, and in particular [[data infrastructure|infrastructure]], [[data warehouse|warehousing]], [[Information privacy|data protection]], [[cybersecurity]], [[data mining|mining]], [[data modelling|modelling]], [[data processing|processing]], and [[metadata]] management.<ref name="hist1" /><ref name="hist2" /> This change in approach was particularly focused on [[cloud computing]].<ref name="hist2" /> Data started to be handled and used by many parts of the business, such as [[sales]] and [[marketing]], and not just IT.<ref name="hist2" />
 
== Tools ==