Wikipedia:Past and future of Wikipedia

Many advances in building the sum of all human knowledge have taken place since 15 January 2001, when English Wikipedia's first edit was made. This essay reflects on both the past and the future of that sum of all human knowledge (which includes not only Wikipedia but also its many sister projects, such as Wikimedia Commons, Wikisource, Wikibooks, and Wiki-many other things). It aims to acknowledge the advances that have been made, to dream about how much can still be done, and to warn about the dangers that threaten its future.

Past

English Wikipedia began in January 2001, when Jimmy Wales and Larry Sanger started the project (the idea of a freely licensed encyclopedia had been proposed by Richard Stallman in 1998). Over the years, versions in other languages were added, and new sister projects began: Wikimedia Commons for media files, Wikisource for freely licensed written works, Wikivoyage for travel guides, and more, all of them in multiple languages.

When Wikipedia began, free licensing was not the norm. The main digital encyclopedia was Microsoft Encarta, and the most used web browsers were Internet Explorer and Netscape Navigator, all of them proprietary software (and, in the case of Encarta, also proprietary content). The standard office suite was Microsoft Office, which back then used closed, proprietary file formats and only switched to Office Open XML around 2007; OpenOffice (the predecessor of LibreOffice) was just getting started and was far from mature. Proprietary Microsoft Windows was by far the dominant desktop operating system. On servers it shared the space with proprietary Unix versions and with the first useful free software option, Linux, which back then offered very limited desktop functionality and was still emerging on servers, so operating systems were also dominated by proprietary software. Unrelated to free licensing, smartphones were non-existent, which limited the options for reading or watching content from any place.

Since then, the ecosystem of open source software, free software, and freely licensed content has grown to the point of being dominant in many fields, and that ecosystem is the base that made Wikipedia and its sister projects possible. The enormous and (in many cases) high-quality content of Wikipedia and Wikimedia Commons is part of that big open world, which offers almost unlimited possibilities compared to a scenario where each piece of software and each individual work must be purchased separately. Wikipedia itself would not be possible without free licensing and free software at various levels (MediaWiki, database, operating system, etc.).

The past is often looked at with nostalgia. Many people have great memories of the time when Wikipedia began, in 2001. People enjoyed what they could then, just as they do now. But the availability of great amounts of freely licensed (and therefore reusable) content was not among the things they could enjoy, nor was the high-quality free software that makes it possible. Neither was the sum of all human knowledge, as its construction was just beginning. While it is a perpetual work in progress, it is now in a state that makes it something never seen before: a collection of human knowledge so big that even a full human lifetime is not enough to read it in its entirety.

Future

From the past we came to the present, and from here we will arrive at the future, which is unknown. Things can change for better or worse. We will start with the worst cases, which must be prevented, and then dream about achievements not yet reached that can open new possibilities for the future of human knowledge.

Pessimistic outcome

A saying often attributed to Albert Einstein: "Two things are infinite: the universe and human stupidity; and I'm not sure about the universe." An infinite supply of possible stupid ideas may threaten the future of Wikipedia and its sister projects.

What we have built must be taken care of and preserved. The most likely threats are not asteroid impacts or nuclear war. They are the comparatively minor things that few people pay attention to. Major threats are thoughts such as:

Example: Wikipedia is not needed any more, since AI chatbots provide answers to any question.
Counterargument: That sentence is as wise as saying that books are not needed because we have teachers. No further comment is needed.

Example: Content should only be included on Wikipedia if it's viral.
Counterargument: If most people follow this, articles (and Commons images) on non-viral topics will not be created or will stagnate, and the overall quality of Wikipedia will eventually degrade. Conversely, the fact that a subject is viral does not guarantee that there should be an article about it.

Example: This topic has an excellent article and lots of good images. There is no need to work on it at all.
Counterargument: Articles should be kept up to date, and updated images should be uploaded. Article histories will remain publicly available, and the same should be true for most Wikimedia Commons images used in previous versions of the article, so there is no reason not to have updated information. Current versions of Wikipedia articles must reflect the present, so that readers don't feel they are reading an obsolete encyclopedia.

Example: Let's focus on Wikipedia only. The other Wikimedia projects are not relevant, they should have lower priority.
Counterargument: While Wikipedia is without any doubt the backbone of Wikimedia, all projects are important. Most images you see in Wikipedia articles are not hosted on Wikipedia, but on Wikimedia Commons. Having an article about a book that is already in the public domain, however good the article is, is not the same as being able to read the full text of the book, hosted on Wikisource. The same can be said about articles on paintings or sculptures, where the artistic work can be seen in full detail in images hosted on Wikimedia Commons. An article about a country or a city can't describe it from a tourist's point of view, but that viewpoint can be found in the corresponding article on Wikivoyage. An article on a notable person can't include many quotes from them, but they can all be read on Wikiquote. Wikipedia is the starting point, but a true sum of all human knowledge must be completed with content from all the other projects.

Example: Images and videos must always be uploaded with the maximum possible image resolution.
Counterargument: While there are exceptions (for example, photos of large landscapes), there is no need to always use the maximum possible resolution, as long as the image or video has good enough quality. While storage technologies will keep improving, if commonly used image resolutions grow faster than standard storage capacity, there is a danger of eventually being unable to store all the needed image and video files (with all the needed backup copies) because an unnecessary level of resolution is being used.

Example: Wikipedia doesn't need to preserve past revisions of articles. And wiki pages can be merged and deleted (with all their edit history) without any problems. Any image from Wikimedia Commons must be deleted as soon as we have a better one. We must think about the future, for the past we have Internet Archive.
Example: This stuff isn't needed on Wikimedia Commons/Wikisource/other wiki, since it's already in the Internet Archive.
Example: This stuff isn't needed on Wikimedia Commons/Wikisource/other wiki, since somebody at some place hosts a copy for sure.
Example: Wikipedia doesn't need to worry about the links used as citations in articles, since they are at the Internet Archive.
Example: Some random pretext, since we have the Internet Archive.
Counterargument: As of July 2025, the Internet Archive is facing a lawsuit that may threaten its very existence. Even if it survives, there is no publicly available evidence confirming that it keeps full backups outside the San Francisco Bay Area, where earthquake risk is high. Some obscure, not very reliable sources say it has a full backup in Vancouver. Even if that's true... hey, that's another high-risk seismic zone! On top of this, in 2024 the Internet Archive suffered several serious cyberattacks. Let's hope all of this changes in the future, but, as of July 2025, judging from publicly available sources, there are few reasons for much optimism. The point about links is especially serious, since such links are currently preserved only in the Internet Archive; more than a future risk, it is a present-day problem should the Internet Archive shut down or suffer a huge data loss.

Example: Any of the above points mentioning the Internet Archive, replacing "Internet Archive" with "archive.today".
Counterargument: archive.today is probably even less reliable in the long term than the Internet Archive. There is no publicly known organization behind it, so it might close at any moment for an unknown reason.

Example: Wikipedia/Wiki.... XML dumps are not really needed, much less dumps with all edit history.
Example: Implementing media dumps is not a real need, we have been without them for years.
Counterargument: See the points above about the Internet Archive, and also LOCKSS.

Example: Think about how much content we have about the 19th and 20th centuries, just imagine how much 21st-century content will be preserved anyway, even without the Internet Archive and without Wikimedia projects. Let's think about what really matters.
Example: Everybody knows Wikipedia, it's everywhere, so its contents can not be lost.
Counterargument: Unlike that of previous centuries, most 21st-century content is digital and, while read from millions of devices, is stored in very few places.

Example: Don't care about potential wrongful deletion of content. I'm just thinking about next week, not next century.
Counterargument: When next month arrives, you will miss the content that was lost because you only cared about preserving it for one week. Content whose preservation was planned for the next century will certainly still be there next month.

Example: Don't care about potential wrongful deletion of content. The author will eventually request it back.
Counterargument: If they are still here.

Example: Let's save some money on backups at the Wikimedia Foundation. After all, we have been here for 24 years, and nothing serious has ever happened, no content was ever lost.
Example: Let's save some money by storing all copies in the same datacenter. A full datacenter loss is unimaginable.
Counterargument: And then, the serious thing happens.

Optimistic outcome

Encyclopædia Britannica Eleventh Edition (1911). Now all its content fits on a smartphone in your pocket while taking up very little storage space. Hopefully, some day this will also be true of Wikipedia and its sister projects in all languages combined, with all Wikimedia-hosted media files and full page edit history.

The size and scope of Wikipedia and all its sister projects combined make them something unique, a first in history. In 1911, the Encyclopædia Britannica Eleventh Edition was available to far fewer people than Wikipedia is now. For a person living in 1911, it was a truly huge amount of knowledge, and having it at hand was surely a dream for many people. Now it is available at Wikisource, as a tiny part of all the knowledge that Wikipedia and Wikimedia have to offer. Looking to the future, if things are done well and nothing (or next to nothing) is lost along the way, the current content of Wikipedia and its sister projects can become something similar: a tiny thing that fits on the smartphone in your pocket, much smaller than the usual files you have around.

Storage technologies still in development as of 2025 are capable of storing up to 360 terabytes of data on a single disk, and, unlike common current storage technologies, such a disk would be able to last for billions of years. As of 28 July 2025, the total size of all media files in Wikimedia Commons was 661.29 TB, by far the biggest part in terms of storage. XML dumps of the various Wikimedia wikis have different sizes, but the sum of all of them (compressed) may well be under 300 TB. At under roughly 961 TB in total, only 3 such 360 TB disks would be needed to store the sum of all human knowledge in full, in its current state, with all page edit history.
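The disk count above can be checked with a short calculation. The following is only a sketch: it takes the three figures quoted in this essay at face value, treats the 300 TB estimate for XML dumps as an assumed upper bound, and uses a hypothetical helper name (`disks_needed`) chosen purely for illustration.

```python
import math

# Figures taken from the essay; none of them are measured here.
COMMONS_MEDIA_TB = 661.29  # Wikimedia Commons media files, as of 28 July 2025
XML_DUMPS_TB = 300.0       # assumed upper bound for all compressed XML dumps
DISK_CAPACITY_TB = 360.0   # capacity quoted for the experimental disks


def disks_needed(total_tb: float, disk_tb: float) -> int:
    """Return the smallest whole number of disks that can hold total_tb."""
    return math.ceil(total_tb / disk_tb)


total_tb = COMMONS_MEDIA_TB + XML_DUMPS_TB
count = disks_needed(total_tb, DISK_CAPACITY_TB)
print(f"Total: {total_tb:.2f} TB, disks needed: {count}")
```

Under these assumptions the total stays below the 1,080 TB that three disks would offer, which leaves some headroom. The count covers a single copy only; any backup copies would multiply it.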

Further advances in storage technologies could make a petabyte manageable, just as happened in the past with the gigabyte and is happening now with the terabyte. As the years go by, the number of works that fall into the public domain will grow, and many of them (books, movies, songs) will be uploaded to Wikimedia Commons, while many of the books will also be added to Wikisource, and even many song lyrics will be added to Wikipedia articles. This will greatly increase the total storage needed, but technological improvements can make it possible to manage such a large amount of storage space.

Centralized online activity will always be needed so that the wikis are edited, new articles are created, new media files are uploaded, and content is kept up to date. Even so, once technological improvements make it easier to achieve, a culture of keeping many copies of the full collection could develop (see the LOCKSS principle). In this way, as is the case with printed content, the text and images that millions of people use could also exist in thousands, maybe even millions, of places at the same time. There is no better guarantee of preservation than that.

The Wikimedia Foundation Mission statement says:

The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.

In coordination with a network of individual volunteers and our independent movement organizations, including recognized Chapters, Thematic Organizations, User Groups, and Partners, the Foundation provides the essential infrastructure and an organizational framework for the support and development of multilingual wiki projects and other endeavors which serve this mission. The Foundation will make and keep useful information from its projects available on the internet free of charge, in perpetuity.

If knowledge, in addition to being built and put together, is also kept in perpetuity, this mission will be accomplished.