Talk:Abstract Wikipedia

This is an archived version of this page, as edited by Zezen (talk | contribs) at 12:18, 16 December 2021 (What about implicit bias, THE TRUTH and fuzzy logic?: Reply). It may differ significantly from the current version.


what api will be used?

Is an HTTP web API planned for Wikifunctions? Are the functions planned to be called through a web API or in some other way? --QDinar (talk) 13:02, 17 September 2021 (UTC)

We already offer a Web API to call functions from Wikifunctions, see https://notwikilambda.toolforge.org/w/api.php?action=help&modules=wikilambda_function_call. We might not always use the API for our internal use cases (e.g. Wikipedia calling a function might not go through an HTTP request); it depends on what is efficient. -- DVrandecic (WMF) (talk) 22:49, 24 September 2021 (UTC)
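As a rough illustration of what calling a function over HTTP could look like, here is a minimal Python sketch that only builds the request URL and sends nothing. The module name `wikilambda_function_call` comes from the help link in the reply above. The parameter name `wikilambda_function_call_zobject`, the use of Z7/Z7K1 for a function call, the argument key pattern, and Z41/Z42 as true/false are assumptions that should be checked against that help page.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the help link in the reply above.
API_URL = "https://notwikilambda.toolforge.org/w/api.php"


def build_function_call_url(function_zid, args):
    """Return a GET URL asking the wiki to evaluate one function call."""
    # Assumption: Z7 is the "function call" type and Z7K1 names the function.
    call = {"Z1K1": "Z7", "Z7K1": function_zid}
    # Assumption: argument keys follow the pattern <function ZID>K<position>.
    for position, value in enumerate(args, start=1):
        call[f"{function_zid}K{position}"] = value
    params = {
        "action": "wikilambda_function_call",
        # Assumed parameter name; confirm it on the API help page.
        "wikilambda_function_call_zobject": json.dumps(call),
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


# Z10026 is the "and" function on Notwikilambda (mentioned further down this
# page); Z41 and Z42 are assumed here to be the Boolean values true and false.
url = build_function_call_url("Z10026", ["Z41", "Z42"])
print(url)
```

The URL could then be fetched with any HTTP client; the shape of the response is not shown here because it is not described on this page.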

why not use existing programming language implementations?

What do you think about the idea of just using the usual programming language interpreters? I.e., the code can be on a wiki page, and it can be run. Some dangerous functions can be removed or turned off to protect against hacking and vandalism. --QDinar (talk) 13:02, 17 September 2021 (UTC)

Great idea - and we do that! Python code is run by the standard Python implementation, JavaScript by Node, etc. The way we hope to avoid dangerous functions is by running them in their own containers with limited resources and no access to the outside world. The architecture is described here. -- DVrandecic (WMF) (talk) 22:51, 24 September 2021 (UTC)
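To make the idea of "limited resources" a little more concrete, here is a minimal Python sketch. It is not the container architecture mentioned above: it swaps in a much weaker stand-in, a separate interpreter process with a wall-clock time limit, and it does not restrict memory, files, or network access. The function name `run_limited` is hypothetical.

```python
import subprocess
import sys


def run_limited(source, timeout_seconds=2.0):
    """Run Python source in a separate interpreter and return its stdout.

    Returns None if the code runs longer than the time limit.
    """
    try:
        completed = subprocess.run(
            # -I starts Python in isolated mode (no user site, no PYTHON* vars).
            [sys.executable, "-I", "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process when the timeout expires.
        return None
    return completed.stdout.strip()


print(run_limited("print(1 + 1)"))
print(run_limited("while True: pass", timeout_seconds=0.5))
```

A real deployment would need the stronger isolation the reply describes, since a time limit alone does nothing against code that reads files or opens network connections.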

what are z-ids?

What is the origin of the letter "z" in Z-IDs? Are there already Z-IDs in Wikidata? As I understand it, Z-IDs just replace multiple natural language strings; is that so? If so, why are function names like "Object_with_modifier_and_of" not also replaced with them? The code in the right block in https://notwikilambda.toolforge.org/wiki/Z10104 is hard to understand. Are the Z-codes in it planned to be replaced with natural language strings? --QDinar (talk) 13:02, 17 September 2021 (UTC)

You are totally right! "Object_with_modifier_and_of" is just the English name of the function; in reality it is identified by a ZID, and it will have a different name in Arabic, and in Russian, and in German, and in Tatar, etc. That is true for all of our functions. The "and" function is called "i" in Croatian, but the ZID in Notwikilambda is Z10026. In the User Interface though, we will try to hide all ZIDs, and instead display the names in your language. -- DVrandecic (WMF) (talk) 22:55, 24 September 2021 (UTC)
There are five questions in the main text (paragraph, body) of my post. You answered the third and fourth. What can you say about the others? --QDinar (talk) 19:23, 4 October 2021 (UTC)
There are no Z-IDs in Wikidata yet. --DVrandecic (WMF) (talk) 00:43, 24 November 2021 (UTC)
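The relationship between a ZID and its per-language names discussed in this thread can be sketched as a small lookup table. This is only an illustration in Python: the labels for Z10026 come from the reply above, while the table layout, the function name `display_name`, and the fallback to English are assumptions.

```python
# Labels keyed by ZID, then by language code. The "and"/"i" pair for Z10026
# is taken from the reply above; everything else here is illustrative.
LABELS = {
    "Z10026": {"en": "and", "hr": "i"},
}


def display_name(zid, language, fallback_language="en"):
    """Return the name of a ZID in the reader's language.

    Falls back to another language, and finally to the bare ZID, when no
    label is available (the fallback order is an assumption).
    """
    names = LABELS.get(zid, {})
    return names.get(language) or names.get(fallback_language) or zid


print(display_name("Z10026", "hr"))  # label exists in Croatian
print(display_name("Z10026", "tt"))  # no Tatar label yet, falls back
print(display_name("Z99999", "en"))  # unknown ZID, shown as-is
```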

response to an old reply (in the archives) about ML

in Talk:Abstract_Wikipedia/Archive_2#Wikidata_VS_Wikipedia_-_General_structure user:DVrandecic (WMF) said:

1. "we don't have automatic translation for many languages"

2. "The hope is that Abstract Wikipedia will generate content of a consistently high quality that it can be incorporated by the local Wikipedias without the necessity to check each individual content. .... translation doesn't help with updates. If the world changes, and the English Wikipedia article gets updated, there is nothing that keeps the local translation current."

i want to say that:

1. I saw news saying Yandex has developed machine translation for the Bashkir language using the Tatar language, because these languages are very similar and there is more content in Tatar. (So more languages may appear in ML using such techniques.)

2. I doubt that the approach you are going to use will provide more stable results than ML. Users will constantly edit the functions, renderers, constructors, and the abstract code, and something is probably going to break as well. So an idea has just come to mind: when a renderer is edited, if all uses of that renderer are linked to it, the editor may check all the use cases before applying changes... if those use cases are not too many.

It is also possible to make automatic ML updates easier to check: after one or several updates, some user could be notified of them and given an easy-to-read page showing the differences made in the original page and the differences that are going to be made in the translation.

though, there is a stronger argument against ML in this forum in Talk:Abstract_Wikipedia/Archive_3#Might_deep_learning-based_NLP_be_more_practical? by user:Stevenliuyi: "a ML-based system will make sentences more fluent, it could potentially turn a true statement into a false one".

--QDinar (talk) 23:11, 19 September 2021 (UTC)

I very much look forward to, and hope for, more high-quality machine translation becoming available for everyone. I think there's a window of opportunity where Wikifunctions / Abstract Wikipedia will provide high-quality knowledge in languages where machine translation does not yet.
I like the idea of showing the results when editing. Our first implementation of that idea is to do that with the testers. But as the system develops, we will probably be able to use more of the system to let the contributors understand the impact of their edits. -- DVrandecic (WMF) (talk) 23:16, 24 September 2021 (UTC)
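The idea of checking an edited renderer against its known uses before saving can be sketched roughly as follows. This Python sketch is an assumption-laden illustration, not how Wikifunctions testers are actually implemented: a tester is modelled as a pair of input arguments and expected output, and the renderers and the function `run_testers` are hypothetical.

```python
def run_testers(renderer, testers):
    """Return a list of (arguments, expected, actual) for every failing tester."""
    failures = []
    for arguments, expected in testers:
        actual = renderer(*arguments)
        if actual != expected:
            failures.append((arguments, expected, actual))
    return failures


# Hypothetical renderer as it exists today.
def current_renderer(name, year):
    return f"{name} was born in {year}."


# Hypothetical edit that accidentally drops a word.
def edited_renderer(name, year):
    return f"{name} born in {year}."


# Each tester pairs input arguments with the sentence we expect to see.
TESTERS = [
    (("Malala Jones", 1984), "Malala Jones was born in 1984."),
]

print(run_testers(current_renderer, TESTERS))  # no failures
print(run_testers(edited_renderer, TESTERS))   # one failure to review
```

An editor could be shown the failure list before the change is applied, which is one way to surface the impact of an edit.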

Boilerplate functions

From my point of view, Scratch is a good example of a low-code platform. It is easy to create a program with it. In Scratch there are boilerplate blocks with gaps; the different parts have different shapes, can be dragged and dropped, and can be connected into a function if they fit together. In my view, for at least some functions this principle could lower the barrier to creating functions, once the boilerplate templates have been translated into other languages, and reach people with less coding knowledge, making it possible for them to create a function. Have you thought about offering a possibility like that in the user interface? --Hogü-456 (talk) 20:42, 25 September 2021 (UTC)

I created a script that converts a script in a text file, written in a COBOL-like language, into R code. The definitions of the sentence structures are in a CSV file, and you can find the code at https://public.paws.wmcloud.org/User:Hog%C3%BC-456/TexttoCode/Structured%20Text%20to%20Code/. This is an example of what functions in Wikifunctions could be used for. I think it lowers the barrier to creating a program, and as I create more examples I will collect more such functions. For me it is important that it will be possible to use Wikifunctions functions offline. For example, schools sometimes have restrictions on web services, and from my point of view it is good if data is not transferred to another party when not necessary. What do you think about the program I wrote? Do you think it could be helpful if more sentences and their code equivalents are defined? --Hogü-456 (talk) 20:58, 22 November 2021 (UTC)
Regarding "Will it be possible to run Wikifunctions functions offline?" - we hope so! We hope that there will be evaluation engines that can run offline, and where you can then have certain functions available to run them on your own hardware, yes. We really want to support the creation of such evaluation engines, and hope that they will help with having many people run Wikifunctions functions in all kinds of environments. --DVrandecic (WMF) (talk) 00:21, 24 November 2021 (UTC)
Regarding "code that takes a COBOL-like language and translates it to R", I do hope that we will be able to support that kind of function in Wikifunctions as well, basically a compiler or transpiler from a specification language you define to another language such as R. I took a look at your directory, but have to admit that I didn't exactly figure out how it works. But yes, having several layers of code built on top of Wikifunctions, for example to have a simple declaration of Wikidata queries which then gets compiled into SPARQL and executed - that would be very good to have in Wikifunctions! I hope I understood your suggestion. --DVrandecic (WMF) (talk) 00:25, 24 November 2021 (UTC)
Regarding a Scratch-like interface, yes, that would be awesome. I am not saying everyone would need to learn Scratch in order to implement functions for Wikifunctions, but it would be great if, besides the usual text-based languages such as Python or JavaScript, we would also support Scratch as a programming language, including its UX. --DVrandecic (WMF) (talk) 00:26, 24 November 2021 (UTC)
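For readers trying to picture the transpiler idea, here is a minimal Python sketch of a rule table in CSV form that maps structured sentences to R code. It is not the script from the linked directory: the three sentence patterns, the CSV column names, and the functions `load_rules` and `transpile` are all invented for illustration.

```python
import csv
import io
import re

# Hypothetical rule table: a regular expression for each sentence shape and
# the R code template it maps to (\1, \2 refer to the captured groups).
RULES_CSV = r"""pattern,r_template
SET (\w+) TO (\d+),\1 <- \2
ADD (\d+) TO (\w+),\2 <- \2 + \1
DISPLAY (\w+),print(\1)
"""


def load_rules(csv_text):
    """Parse the CSV rule table into (compiled pattern, template) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(re.compile(row["pattern"]), row["r_template"]) for row in reader]


def transpile(script, rules):
    """Translate each structured sentence into one line of R code."""
    output = []
    for line in script.strip().splitlines():
        line = line.strip()
        for pattern, template in rules:
            match = pattern.fullmatch(line)
            if match:
                output.append(match.expand(template))
                break
        else:
            # No rule matched this sentence.
            raise ValueError(f"No rule matches: {line!r}")
    return "\n".join(output)


r_code = transpile(
    "SET total TO 5\nADD 3 TO total\nDISPLAY total",
    load_rules(RULES_CSV),
)
print(r_code)
```

Adding a new sentence shape then only means adding a row to the table, which matches the suggestion that more sentences and their code equivalents could be defined over time.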

why not use an existing "abstract language"?

Many natural language generation projects are listed in Abstract_Wikipedia/Related_and_previous_work/Natural_language_generation. Why did you decide to develop a new coding standard instead of using an existing system? --QDinar (talk) 19:35, 4 October 2021 (UTC), edited 19:37, 4 October 2021 (UTC)

Because we don't know which one of these is the right one, so we offer a platform where the community can either re-use an existing one, or come up with their own. --DVrandecic (WMF) (talk) 00:27, 24 November 2021 (UTC)
As far as I know, you planned to develop just one new coding standard, with many human languages on the outer surface, but with the inner code with Z-IDs being one language (determined by many constraints). Have you now decided to allow other "abstract" languages to be added, like new human languages can be added to Wikimedia projects (through the Incubator wiki and other possible stages)? --QDinar (talk) 18:56, 2 December 2021 (UTC)
I was hoping we would have only one single abstract language, although it might include several ways to say the same thing. We have not yet decided where the abstract content would be stored, i.e. on an incubator wiki or some other place. This is a discussion we are going to have early next year. -- DVrandecic (WMF) (talk) 20:55, 10 December 2021 (UTC)

I would like you to allow many "abstract languages"... Using only one language is like dictatorship, not freedom. Also, allowing different languages would make it possible to separate your language project from Wikimedia, or to develop several languages within Wikimedia. Your project may not gain a big community of developers, and other projects may already have big communities. But how can that be done? It seems the languages listed at that URL are not {multilingual like your project}. Such code could be held on every language Wikipedia. It could be mixed with traditional code, enclosed in tags like <code lang="..."></code>. Several stages of string generation could be made viewable in a tab near the "edit source" tab. The generated wiki code could be in one tab. Code with structured text (with the structure of human-language text), just before linearizing it (into wiki code), could be shown in another tab, if available... If multilingual code is used, it can be included with tags like <includetext from="..." lang="..." />. Its wikitext and structured text can also be shown in tabs. --QDinar (talk) 20:50, 14 December 2021 (UTC)

Mention the original name Wikilambda

There is a message box at the top with a footnote, but nonetheless I want to suggest that a (near-)native English speaker adds a note to the passage about Wikifunctions mentioning the original name Wikilambda, for various reasons:

Because of the translation system I do not simply want to add text, but my own suggestion would be: “Originally it was named Wikilambda derived from the Lambda calculus. The name Extension:WikiLambda and the Wikifunctions logo containing a lambda still are reminiscences.” Is this OK? — Speravir – 23:17, 28 October 2021 (UTC)

@Speravir: Thank you for the suggestion! I've added it to Abstract Wikipedia#Background. Sorry for the delayed response. Quiddity (WMF) (talk) 21:33, 22 November 2021 (UTC)
@Quiddity (WMF): Thank you nonetheless. — Speravir – 00:30, 23 November 2021 (UTC)

How will Abstract Wikipedia work from the editor's point of view?

I am a German with an interest in diplomats, using the VisualEditor to create and update articles. Abstract Wikipedia sounds like a great idea for my area of work. When I create an article about the new ambassador of Germany to, say, Sweden, it would be great if that article were immediately available to Swedish readers (and others worldwide) as well. And vice versa, there are ambassadors from many countries in Germany, so it would be a very efficient use of resources if we did not have to create and maintain articles about them in parallel in various languages.
Now I read in the explanations and discussions a lot about programming. Is that what Abstract Wikipedia is going to be? Just another programming language?
Do you expect me to learn this new programming language and all its functions in order to contribute to Abstract Wikipedia? Or will that be a background functionality, so that I can continue to create the article using the VisualEditor and then push a button which will then translate the (in my case German) article into the objects required for Abstract Wikipedia?
--Wikipeter-HH (talk) 16:21, 29 November 2021 (UTC)

@Wikipeter-HH: That's a great question! And in many ways the answer is "We don't know yet". There might be several ways that we will explore, here is just one possibility. I will describe it a bit, but if that isn't enough we can also make a few mockups.
First, it will likely not be like VisualEditor nor like Programming.
So, let's assume we want to create a new biography for an ambassador. Let's say the first two sentences are "Malala Jones (born January 14, 1984 in Fairfax, VA) is the current ambassador of the United States to Nigeria. She is a former member of the girlband Foxy Fairies."
When starting an article, we would need to select a constructor for the first sentence (selecting the right constructor will be one of the hardest parts). We'll likely have a constructor for "Biography start definition". Once we select that constructor, we will get a form that has several fields. In this case I could imagine fields such as "first name", "last name", "date of birth", "place of birth", "position".
There would be a page describing the constructor and the fields, and we could play around to see how it creates sentences in different languages. I imagine that "position" is something like a specific, unique role (which is why the constructor is called a definition, it identifies a specific individual). So for position we would need to use another constructor.
If we are lucky, there will be a constructor for ambassadors, which might ask "current" (a checkbox), "since", "until", "from", "to". So we choose the ambassador constructor, and another set of forms opens up and allows us to choose values.
So we would be building up the sentence. The catalog of constructors defines the expressivity we have available.
Once we are happy with the first sentence, we could add a second sentence (by clicking in the right place), and use a new constructor, e.g. "Person description", and it could ask for fields such as "person", "description", "from", "to", "location", etc. We again would select the Person (and if the renderers do their job right, it would either say "She", or "Jones", or whatever is appropriate), we would leave the "from", "to", and "location" fields free, but in the "description" field we could be happy to have a "band member noun phrase" constructor, which allows us to set "former", and add the band. The band could be an item from Wikidata, or a simple name.
But in general, the rough idea is to have a lot of forms, to fill them up with more forms, or with items or constructors, and to have a very clicky and restricted interface. If something needs to be changed, we would again click on edit, and change it in the forms. It won't be as easy as writing an article in a specific language. But it will allow us to create content in many languages at once.
One idea we will be exploring is to have a natural language input box, where you can just write in natural language, and then we have a classifier or parser that tries to figure out what the right constructors would be to create similar text. It might then switch the sentence "She is a former member of the girlband Foxy Fairies." to "She was a member of the girlband Foxy Fairies." (i.e. adjust the tense, drop the adjective), depending on what kind of constructors are available. So you would type in natural language text, it would guess the constructors, show you the text output in the languages you are comfortable with, and allow you to edit the forms of the constructors directly to fix errors.
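The "forms filled with more forms" idea can be pictured as nested records. The following Python sketch models the two constructors from the ambassador example as dataclasses; the field names follow the ones suggested above, except that "from" and "to" are renamed `from_country` and `to_country` because `from` is a reserved word in Python. The class layout and the helper `form_fields` are assumptions, not a proposed design.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Ambassador:
    """Constructor for the 'position' slot; fields follow the example above."""
    from_country: str
    to_country: str
    current: bool = False          # the checkbox
    since: Optional[str] = None
    until: Optional[str] = None


@dataclass
class BiographyStartDefinition:
    """Constructor for the first sentence of a biography."""
    first_name: str
    last_name: str
    date_of_birth: str
    place_of_birth: str
    position: Ambassador           # a nested constructor, i.e. a form in a form


def form_fields(constructor):
    """List the fields a form for this constructor would need to show."""
    return [field.name for field in fields(constructor)]


first_sentence = BiographyStartDefinition(
    first_name="Malala",
    last_name="Jones",
    date_of_birth="1984-01-14",
    place_of_birth="Fairfax, VA",
    position=Ambassador("United States", "Nigeria", current=True),
)

print(form_fields(BiographyStartDefinition))
print(form_fields(Ambassador))
```

Nothing here is language-specific: the same filled-in record would be handed to a renderer for each language.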
I hope that we will hear more ideas for the UX as we get closer to this, and I expect that our designers will be very busy to create usable workflows and user experiences.
All the constructors I named here are just suggestions. I don't want to be prescriptive about what kind of constructors we should or will have.
I hope this helps a bit! -- DVrandecic (WMF) (talk) 01:38, 11 December 2021 (UTC)
Hi @DVrandecic (WMF):,
thanks for the detailed explanation. That helps me a lot in understanding the route you are pursuing. What you describe reminds me of ancestry.com. You can enter details of a person's life (birth, parents, marriage, children, ...) and it will create a story from that. However, that is a sequence of loose sentences and sometimes a bit boring (His son John was born 12th February 1907. His daughter Mary was born 31. July 1909. ...). In my view a Wikipedia article should be a coherent text which makes an interesting read. That is where authors writing in natural language make a difference. I do like the idea of the natural language input box, which is fed into an (artificial intelligence based?) parser. That tool will actually be indispensable for getting the millions of existing articles into Abstract Wikipedia.
I will keep an eye on the further developments and if you need a pilot user, please do not hesitate to contact me. --Wikipeter-HH (talk) 12:32, 11 December 2021 (UTC)
@Wikipeter-HH: Thank you! I had this mock-up I did much earlier, and used in a few talks, but couldn't find it on Commons. So I uploaded it now. Maybe that helps, too.
 
Yes, indeed, the series of sentences is expected to be more boring and monotonous than hand-written text. How fluent we can make the results depends on how many and what kind of constructors we have. -- DVrandecic (WMF) (talk) 02:10, 14 December 2021 (UTC)

What about implicit bias, THE TRUTH and fuzzy logic?

After reading here and there and even on Github and elsewhere, I have many questions about ontology behind this project, probably naive, and I am not sure if it is the right forum. Also, I tend to (ab)use hrefs, examples and metaphors, so do tell me if anything is unclear:

1. Why do you assume that TRUTH is universal?

2. What about counterexamples:

Given a constructor Superlative with the keys subject, quality, class, and location constraint, we can have the following abstract content:

Superlative(
  subject: Crimea,
  quality: large,
  class: peninsula,
  location constraint: Russia)
Superlative(
  subject: Taiwan, Hainan,
  quality: large,
  class: island,
  location constraint: China)

In Wikifunctions, we would have the following function signature:

generate text(superlative, language) : text

[...] The application of the function to the abstract content would result in the following output content:

(in English) Crimea is the largest peninsula in Russia.

(in Croatian) Krim je najveći poluotok u Rusiji.

...

(in English) Taiwan is the largest island in China, with Hainan being the second largest one.

(in French) Taïwan est la plus grande île de Chine, Hainan étant la deuxième plus grande...

or anything with "Allah", "Prophet", "Kosovo", "Palestine" in its "superlative subject" field, or other examples from the Lamest edit wars set?

Indeed, "Romanian Wikipedia on the other hand offers several paragraphs of content about their [river] ports" but so did the Croatian one offer detailed information about their Jasenovac camp.
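To show mechanically how the quoted signature `generate text(superlative, language) : text` turns one piece of abstract content into several languages, here is a minimal Python sketch. The lexicon, the templates, and the names `Superlative`, `LEXICON`, `TEMPLATES` and `generate_text` are assumptions made for illustration. The sketch only fills templates: it has no notion of whether the content is true, disputed, or sourced, which is the gap this comment is pointing at.

```python
from dataclasses import dataclass


@dataclass
class Superlative:
    """Abstract content with the four keys named in the quoted example."""
    subject: str
    quality: str
    class_: str               # 'class' is a reserved word in Python
    location_constraint: str


# Tiny hand-written lexicon, already inflected for these two templates.
# A real renderer would need proper morphology; this is only an illustration.
LEXICON = {
    "en": {"Crimea": "Crimea", "large": "largest",
           "peninsula": "peninsula", "Russia": "Russia"},
    "hr": {"Crimea": "Krim", "large": "najveći",
           "peninsula": "poluotok", "Russia": "Rusiji"},
}

TEMPLATES = {
    "en": "{subject} is the {quality} {class_} in {location}.",
    "hr": "{subject} je {quality} {class_} u {location}.",
}


def generate_text(superlative, language):
    """Render the abstract content as one sentence in the given language."""
    words = LEXICON[language]
    return TEMPLATES[language].format(
        subject=words[superlative.subject],
        quality=words[superlative.quality],
        class_=words[superlative.class_],
        location=words[superlative.location_constraint],
    )


content = Superlative("Crimea", "large", "peninsula", "Russia")
for language in ("en", "hr"):
    print(generate_text(content, language))
```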

3. What about the resulting output being un-PC, with "GPT-3 Abstract WP making racist jokes, condoning terrorism, and accusing people of being rapists"? (Who is "a terrorist" in the first place?)

4. Has it been discussed somewhere else? Is there a FAQ for such advocati diaboli as myself?

Zezen (talk) 09:54, 14 December 2021 (UTC)

The underlying question seems to be: How will the community decide on how to write content, especially for complex topics? Just like usual - via discussions, guidelines, policies, style guides, and other standard wiki processes. Quiddity (WMF) (talk) 19:00, 15 December 2021 (UTC)
No. So mea culpa for being unclear.
The underlying challenge (question) is the implicit abandonment of the WP:PILLARS thereby.
I can see now that Heather Ford had mentioned parts of these Points 1 and 2 above in Wikipedia@20's chapter, "The Rise of the Underdog": To survive, Wikipedia needs to initiate a renewed campaign for the right to verifiability. ... the ways in which unattributed facts violate the principle of verifiability on which Wikipedia was founded ... [consumers] will see those facts as solid, incontrovertible truth, when in reality they may have been extracted during a process of consensus building or at the moment in which the article was vandalized... search engines and digital assistants are removing the clues that readers could use to (a) evaluate the veracity of claims and (b) take active steps to change that information through consensus...
Add to this Quid est veritas? or these basic THE TRUTH enwiki essays, including the WP:NOTTRUTH policy itself, for a more eloquent and strategic challenge @Quiddity (WMF) and the other gentle readers.
Should you dislike epistemology, the "rights" and anything similarly vague, abstruse, perplexing or nebulous referenced in my original comment, do answer my challenge 2:
generate text(superlative, language) : text -> Crimea is the largest peninsula in Russia.
TRUE or FALSE? Shall we accept Abstract Wikipedia generating it?
If you say TRUE (or FALSE, but yes, we accept it), then move to Challenge 3 with these Wired examples... Zezen (talk) 12:18, 16 December 2021 (UTC)