Hierarchy of Language.

Abstract. Geller's Differential Linguistics and his patents are studied; they are shown to be a development of Wittgenstein's Tractatus Logico-Philosophicus and Moore's Truth and Falsity. The only hierarchy of language – a text's context and the multiplicity of subtexts surrounding it – is thoroughly examined; the nature of a text's predicative phrases, clauses, sentences and paragraphs is clarified. A novel approach toward Knowledge is proposed: Knowledge is Nothing, a singularity; there is the Nothingness of silence, of the none-predicative definition and of text. Practical proofs of Differential Linguistics as the first ever verified Humanitarian theory are demonstrated.

Keywords: Knowledge; Wittgenstein; Moore; Russell; Differential Linguistics; Hierarchy; Clause; Sentence; Paragraph; Weight; Emotion; Practical proofs; Nothing; Patents

***

Geller suggests that only one hierarchy exists in language: a text's context and the multiplicity of subtexts surrounding it, where both are always composed of sets of paragraphs, sentences, clauses and predicative definitions (US Patent 8516013). That hierarchy was initially established by Ludwig Wittgenstein in his famous Tractatus Logico-Philosophicus; it was developed, detailed and finalized (as Differential Linguistics and the patents) by Ilya Geller, and practically implemented by Google (Sergey Brin and Larry Page), IBM, Microsoft, LexisNexis, all other Internet companies and some database companies.

I. Mathematics.
1. Set Theory is the theory that Geller uses (Set Theory is the branch of mathematical logic that studies sets, which are collections of objects); the basic units of the sets are words (Geller 2005):
none-predicative definitions are the words;
predicative definitions are always pluralities of the none-predicative definitions;
clauses may be both pluralities and singularities of the predicative definitions;
sentences may be both pluralities and singularities of the clauses;
paragraphs may be both pluralities and singularities of the sentences;
texts are always pluralities of the paragraphs.
The sets always intersect, they always compose intersections (in mathematics, the intersection A ∩ B of two sets A and B is the set that contains all elements of A that also belong to B – or, equivalently, all elements of B that also belong to A – but no other elements); the basic units, the words, do not. (A schematic sketch of this hierarchy of sets is given at the end of Section III below.)

2. The predicative phrases are vectors of meaning. The intersections of sentences and paragraphs determine a) the directions of the vectors (the texts' sense) and b) the emotional significance of the vectors, expressed as weights (US Patent 8504580). Wittgenstein came to the same idea, at 3.261: '…the definitions show the way.' I am unaware of any author who has written on sets of weighted vectors filtered and aimed by intersections of clauses, sentences and paragraphs treated as sets (Clark 2014).

II. Natural Language. Geller assumes that there is only one Living Language [7]: no part of it can be ignored and/or omitted; everything in Living Language plays an absolutely crucial role – all its words, punctuation marks, signs, etc. (for instance, in his US Patent 8447789 Geller mentions that articles are parts of predicative definitions). The attempt – traditional for Natural Language, which is artificially constructed from Living Language – to exclude so-called 'stop words' while preparing texts for further utilization by a computer is rejected. (LexisNexis: 'Stopwords are the most common words in the English language (and, the, but, etc.). Stopwords are not words that are generally searched for by reviewers. Eliminating stopwords from the index ensures that searches run much faster and efficiently.' http://help.lexisnexis.com/litigation/ac/cn_classic/index.html?update_the_stopwords_list.htm Most of the Internet references used in this article are, most probably, obsolete and no longer exist at the moment of publication.)

III. Axiom. Geller's Axiom: words exist (Geller 2005). I do not think the Axiom can be challenged.
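Purely as an illustration of the hierarchy of Section I – not a quotation from Geller's patents – the containment structure can be modelled with ordinary sets, every level above the word being a set of items of the level below it; the type names and the toy phrases below are assumptions made for this sketch only.

# A minimal sketch of the hierarchy of Section I: words -> predicative
# definitions -> clauses -> sentences -> paragraphs -> texts.
# Every level above the word is a set (plurality) of the level below it.
from typing import FrozenSet

Word = str
PredicativeDefinition = FrozenSet[Word]        # plurality of words
Clause = FrozenSet[PredicativeDefinition]      # plurality or singularity of definitions
Sentence = FrozenSet[Clause]
Paragraph = FrozenSet[Sentence]
Text = FrozenSet[Paragraph]

def definition(*words: Word) -> PredicativeDefinition:
    return frozenset(words)

# Two toy predicative definitions built from words, the atomic units:
grey_city = definition("the", "grey", "city", "is")
old_city = definition("the", "old", "city", "is")

# The sets intersect; the atoms (words) have no parts and therefore do not.
print(grey_city & old_city)   # the shared words: 'the', 'city', 'is' (in some order)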
IV. Definitions. Now I explain what Ludwig Wittgenstein and Geller mean by the abovementioned linguistic categories.

1. Predicative definition: Poincare. Wittgenstein said of words, at 3.26: 'The name cannot be analysed further by any definition. It is a primitive sign.' 'A predicative phrase is a predicative definition preferably characterized by combinations of nouns and other parts of speech, such as a verb and an adjective and an article (e.g., the-grey-city-is) (US Patent 8447789).' Geller decided that a predicative phrase may have an undefined number of words as parts of speech, and that it should answer at least these three questions: 'What?', 'What is going on with "what"?' and 'How does it look like?' (Geller 2003, Geller 2005 and US Patent 8504580). Geller pedantically followed Poincare's instructions while filing his patents and writing his articles: 'Logical inferences alone are epistemically inadequate to express the essential structure of a genuine mathematical reasoning in view of its understandability… As a consequence of the logical antinomies, one should avoid any impredicative concept formation (Poincare 1920).'

All authors in Philosophy (of Language) and (Computational) Linguistics sooner or later discuss (contextual) phrases and/or vectors of sense. In their work:
- phrases may have an undefined number of words;
- phrases are obtained from sentences;
- the role of clauses in obtaining the phrases is unclear;
- the words of a phrase belong to an unknown number of parts of speech, usually one, two or three, and the phrases may not be predicative phrases (may have no verbs);
- no method is provided for identifying the parts of speech of words objectively, without any human intervention;
- none of the authors, Wittgenstein included, uses Set Theory and Differential Analysis to find how sets of phrases convey the meaning of a text;
- none of the authors links the weights of vectors to their emotional strength (Clarke 2012, Clark 2014, Beltagy 2014, Boleda 2013, Guevara 2010 and Wittgenstein 1994).
Geller operates with predicative definitions as with featureless numbers: he does not look at their external features.

2. Declaration. I see no point in discussing (in detail) all the well-known contemporary connectionist theories which state that meanings can be represented as appropriate vectors:
- Geller has his patents, and the Patent Office found no prior art;
- Geller's theory is applied practically by many, if not all, major US companies and has already brought in hundreds of billions of dollars.
None of the abovementioned connectionist theories, however, has produced any practical results or brought real money: I see no reason at all why I should waste my readers' time and mine adding comments on air vibrations.

3. Clause. Wittgenstein said of the predicative phrase, at 3.261 and 3.262, and Geller thinks the same: 'Every defined sign signifies via those signs by which it is defined, and the definitions show the way… What does not get expressed in the sign is shown by its application. What the signs conceal, their application declares.' Geller made the division into clauses operational: 'AI Clone treats each clause of a sentence as an individual sentence – the clauses are preferably determined based upon figures of speech and punctuation marks. For example, a semicolon or comma followed by a "but" may indicate a division between clauses if they separate a subject and predicate pair (US Patent 8504580).'
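A minimal sketch of the clause division just quoted, using only the two punctuation cues named in the patent text (a semicolon, and a comma followed by 'but'); the function name and the regular expression are illustrative assumptions, not Geller's implementation.

import re

def split_into_clauses(sentence: str) -> list:
    # Treat each clause as an individual sentence, dividing at semicolons and
    # at commas followed by "but" (the two cues quoted from US Patent 8504580);
    # other figures of speech are not handled in this sketch.
    parts = re.split(r';|,(?=\s*but\b)', sentence)
    return [part.strip(' ,') for part in parts if part.strip(' ,')]

print(split_into_clauses("The city is grey, but the people are cheerful; nobody complains."))
# ['The city is grey', 'but the people are cheerful', 'nobody complains.']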
Wittgenstein said (at 3.31) that each part of a proposition which characterizes its sense is called an expression. However, Wittgenstein was unclear and did not explain what he meant, as Geller did in filing his patents. Did Wittgenstein mean phrases? Grammatical divisions – clauses? His 'complex' (at 2.0201) is too vague a definition. The fact that Geller was granted US Patent 8504580 is undoubted evidence that Geller has the priority and that there is no prior art.

4. Sentence. A sentence is a subdivision of a paragraph, separated by certain grammatical signs. All authors on Philosophy and Linguistics since the time of Ancient Greece, without a single exception, have studied sentences as the major grammatical division of language; but none of them has ever analyzed, from the perspective of Set Theory, their role as subdivisions of paragraphs – as sets of clauses, united into sentences, that take part in the aiming and emotional weighting of the vectors.

5. Paragraph. In US Patent 8516013 Geller equated paragraphs with passages: 'A passage in this context can be any suitable amount of text that can be treated as a paragraph, and may actually be a paragraph… A paragraph can be a subdivision of a written composition that comprises one or more sentences, deals with one or more points/ideas, or gives the words of one speaker…' I do not know of any attempt by any author to study passages and/or paragraphs as sets of aimed and weighted vectors, or to study their role in the aiming and weighting.

6. Subtext. Geller: 'The term subtext is used to include information other than or in addition to actual words of text. Subtext refers to information that is not explicit in a text but is or may become something that may be gleaned from the text and/or from related text (US Patent 8504580).' Here Geller refers to the idea in the Tractatus, at 3.263: 'The meanings of primitive signs can be explained by elucidations. Elucidations are propositions which contain the primitive signs. They can, therefore, only be understood when the meanings of these signs are already known.' Geller, in his US Patent 8447789, proposed to use the most appropriate definitions from a dictionary as subtexts; these definitions are always paragraphs (later Geller proposed to make use of any paragraphs of any available texts that are thematically close to the given one and pointed in the same direction; US Patent 8516013): no prior art.

7. Context. Geller: 'Context provides the textual description of present circumstances for subjects/objects; while subtext provides textual description of related to context descriptions of circumstances for the same subjects/objects (US Patent 8504580).' Wittgenstein left a hint, at 3.3: 'Only the proposition has sense; only in the context of a proposition has a name meaning.' All authors in (Computational) Linguistics deal with context, all mention it; yet none of them sees that contexts have no sense without subtexts, and none analyzes their role in the aiming and weighting of vectors.

8. Intersection. The intersections between predicative phrases, clauses, sentences and paragraphs are formed by synonymous predicative definitions (US Patent 8516013). The fact that Geller was granted the US Patent is proof that he was the first and that there are no references to prior art.
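A minimal sketch of such an intersection, not taken from the patent: two paragraphs are treated as sets of predicative definitions, each definition is expanded with synonyms from a small hand-made table (a stand-in for the synonym enrichment of US Patent 8516013), and the intersection is whatever the enriched sets share. The synonym table and the phrase sets are illustrative assumptions.

# Two paragraphs represented as sets of predicative definitions (toy data).
paragraph_a = {"the-grey-city-is", "people-cry-loudly"}
paragraph_b = {"the-grey-town-is", "children-laugh-quietly"}

# A tiny synonym table standing in for synonym enrichment.
synonyms = {"city": {"town", "metropolis"}, "cry": {"weep", "shout"}}

def enrich(phrase):
    # Expand one predicative definition into the set of its synonymous variants.
    variants = {phrase}
    for word, alternatives in synonyms.items():
        if word in phrase.split("-"):
            for alternative in alternatives:
                variants.add(phrase.replace(word, alternative))
    return variants

def intersection(p1, p2):
    # Two definitions intersect if their enriched variant sets overlap.
    return {a for a in p1 for b in p2 if enrich(a) & enrich(b)}

print(intersection(paragraph_a, paragraph_b))
# {'the-grey-city-is'} -- shared with paragraph_b via the synonym city/town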
9. Parts of speech. The vectors of meaning must be predicative definitions, but a predicative definition is not necessarily a vector. The detection of the parts of speech of words can easily be performed (by a computer) using the method of US Patent 8447789:
a) words are extracted from each predicative phrase of each clause;
b) for each of the extracted words, a connection to a definition is retrieved from a dictionary;
c) the definition is always a paragraph;
d) the definition is then profiled by the method of US Patent 8504580;
e) the profile of the definition paragraph is then compared, according to an appropriate compatibility algorithm, to the profile of the paragraph from which the words were extracted, and to some surrounding paragraphs;
f) if the algorithm is satisfied, the predicative phrase is a vector.
(A schematic sketch of these steps is given at the end of this section.) People determine parts of speech the same way. The fact that Geller was granted the US Patent shows that there is no prior art.

10. Statistics: weights. In his US Patent 8504580 Geller proposed the method of Internal weighting of vectors ('Internal' means that the statistics are obtained directly from the texts): the weight refers to the frequency with which a context phrase occurs in relation to other context phrases. For instance, take two sentences:
a) 'Fire!'
b) 'In this amazing city of Rome some people sometimes may cry in agony: "Fire!"'
Evidently the vector 'Fire!' has a different importance in each sentence, in view of the extra information in the second. This distinction is reflected in the phrase weights: the first has 1, the second 0.12; the greater weight signifies a stronger emotional 'acuteness'.

Remark. In US Patent 6199067 Geller suggested an External statistical measure, popularity, where popularity is the statistics on how popular predicative definitions are among people – how often they use them; only the Internet can provide this kind of data, for obviously there is no popularity inside databases (see Google's US Patent 20110040733).

11. Reconstruction. All one- or two-word clauses always implicitly refer to omitted words: the implicitly mentioned words can be restored, based on Geller's method (Geller 2004). There is neither prior art nor are there new publications on the preposition 'in' – humans add it automatically if they know (in the structure of predicative definitions the adjective/preposition 'in'/'interior' indicates familiarity with what the issues of the definitions are).
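The following sketch lays out steps a)-f) of Section IV.9 as a pipeline. The toy dictionary, the profiling function and the compatibility test are placeholders invented for illustration: the patents name the steps, but the concrete profile representation and the threshold used here are assumptions.

# A schematic rendering of steps a)-f): dictionary lookup, profiling of the
# definition paragraph, and a compatibility check against the source paragraph.
from collections import Counter

toy_dictionary = {  # word -> definition paragraph (placeholder entries)
    "fire": "Combustion or burning, in which substances combine with oxygen and give out heat.",
    "city": "A large town, an inhabited place of greater size than a town.",
}

def profile(paragraph):
    # Stand-in for the profiling of US Patent 8504580: here, bare word counts.
    return Counter(paragraph.lower().replace(".", "").replace(",", "").split())

def compatible(profile_one, profile_two, threshold=2):
    # Stand-in compatibility algorithm: the two profiles must share at least
    # `threshold` word occurrences.
    return sum((profile_one & profile_two).values()) >= threshold

def is_vector(word, source_paragraph):
    definition = toy_dictionary.get(word.lower())                        # steps b) and c)
    if definition is None:
        return False
    return compatible(profile(definition), profile(source_paragraph))   # steps d)-f)

print(is_vector("fire", "Some people cry in agony when a fire gives out heat."))  # True
print(is_vector("city", "Some people cry in agony when a fire gives out heat."))  # False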
V. Philosophy (of Language).

1. Text.
a) Only a text contains causes and consequences at the same time – nobody knows what will happen in the future, but a text does;
b) a text is eternal: only the physical foundation of a text vanishes, the text does not;
c) if a text is rewritten, it is a new text; the old one stays forever the same;
d) a text is one of the two existing singularities in our Universe – it is absolutely unique, and all copies of it are the same (only their physical foundations differ). The other eternal singularity is the quantum of light: light quanta are many and one. Moore called this phenomenon the 'Identity of Indiscernibles': how are we to distinguish the identical – the many but same photons, and the identical copies of one and the same text (Moore 1993)? We have a paradox and a phenomenon right before us which have not been noticed and dealt with in Science. Text is something ephemeral, transcendental: it does not exist in the way things and humans do.

2. Knowledge. In Differential Linguistics it is said that an isolated, separated word is a none-predicative definition (in Poincare's sense) – it is a noun, Knowledge. As soon as the isolated word is incorporated into a predicative phrase it becomes an opinion. Indeed, Geller earlier referred to this example: the none-predicative definition 'ggffrrtte' (Geller 2004). Who can tell what Geller meant by 'ggffrrtte' unless it is included in a phrase and, consequently, explained by multiple words and subtexts? Moore, in Truth and Falsity, wrote on Knowledge: 'So far, indeed, from truth being defined by reference to reality, reality can only be defined by reference to truth (Moore 1993).' Wittgenstein, at 1.1 and 1.3, clarified: 'The world is the totality of facts, not of things… The facts in logical space are the world.' The pillars of Analytical Philosophy divided reality and truth, world and facts, just as Geller divided Knowledge from opinions. As you can see, Russell was confused when telling what Knowledge is, as were all other authors without a single exception: '…knowledge might be defined as belief which is in agreement with the facts …and no one knows what sort of agreement between them would make a belief true (Russell 1926).' Merriam-Webster, for instance, is also lost: 'information, understanding, or skill that you get from experience or education; awareness of something; the state of being aware of something'. Knowledge is Nothing in Hegel's sense: '...pure being is the pure abstraction, and hence it is the absolutely negative, which when taken immediately, is equal nothing (Hegel 1991).' Geller understands the absence of words, silence, as the Knowledge of everything at once (Geller 2005, Geller 2006). Wittgenstein thought the same: 'Whereof one cannot speak, thereof one must be silent.' Thus, there are two kinds of Knowledge: Knowledge as silence and Knowledge as none-predicative definitions. Text is not Knowledge, but it is Nothing, though. Geller thinks that Knowledge is a limit: the function of Knowledge changes its nature as soon as it reaches the limit. This allowed Geller to use Differential Analysis in (Computational) Linguistics.

3. Differential Linguistics. Out of the given set E of all possible Knowledge, one can relate to an isolated word – let us call it x – a value designated y = f(x). One can then say that for E a function of Knowledge is provided: y = f(x), x ∈ E. According to Geller the function of Knowledge is a differentiable function for which a derivative of the function, y' = f'(x), exists, as a paragraph (Fig. 4). The second derivative y'' is a text.

Remark. It looks as if we know and can operate only with the integrals (the paragraphs); the function(s) of Knowledge and its argument(s) are inside our brains. We can track the function only discretely, by discrete requests for information: sense data (sense data are supposedly mind-dependent objects whose existence and properties are known directly to us in perception; Russell 1912) and other intermediaries between mind and the Universe exist as consciousness, as electrical-magnetic notions in brain neurons, between internal analyses and external physical acts.
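Restating Section V.3 compactly (the notation f''(x) for the second derivative is my addition; the identification of the derivatives with paragraph and text is Geller's, as given above):

y = f(x), x ∈ E    (the function of Knowledge of an isolated word x)
y' = f'(x)          (the first derivative: a paragraph)
y'' = f''(x)        (the second derivative: a text)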
4. The proofs.

1. Personalization. Personalization is the selection of the right subtexts; they implicitly explain what the words are and how they should explicitly be used.

2. The first proof. Thirteen years ago Geller was granted US Patent 6199067, which contains the idea of the abovementioned personalization, as well as of External weights and of the search for opinions by predicative definitions. In 2010 Geller obtained a Summary Judgment, suing Google. Google's own expert, professor Peters, '…specifically testified at his deposition that he identified the one major, relevant difference that he perceived between the prior art and the asserted claims:
Q. [...] Does your report set forth the differences between any of the asserted claims and the prior art?
A. Well, there are very few differences to be honest. The only one that I found that – was the use of part-of-speech tagging in linguistically profiling users and stored data files and queries for these purposes, for the purposes of personalized information retrieval. And the report does actually specifically address that difference (PA Advisors v Google 'Defendants…' 2010).'
As you can see, Google uses Geller's personalization (Google creates profiles of users, collecting their subtexts), External weighting and predicative definitions (Google recently confirmed it: 'Keywords of two or three words tend to work most effectively.' https://support.google.com/adwords/answer/1704371?hl=en) (PA Advisors v Google 'Summary…' 2010). Chief Judge of the United States Court of Appeals for the Federal Circuit Randall Rader confirmed it all. Therefore, for the first time in History, the first ever Humanitarian theory – Differential Linguistics – was firmly confirmed, on the 3rd of March, 2010.

3. Other proofs. In US Patent 8504580, instead of External popularity, Geller proposed to use Internal weights (please read how it was done before Geller: 'Frequency of a word or phrase in a particular section times the manually assigned weight (importance) given to that section. The weights for each word or phrase were then summed across sections (D'hondt 2013)'). A few months after Geller filed the patent application, Brin and Page (Google) began to apply his idea practically, as usual without referencing Geller's priority or licensing his technology: 'The significance of the subsequent keywords in the list are calculated based on the number of occurrences of each keyword compared to the number of occurrences of the keyword that appears most on the site' (http://www.canonicalseo.com/keyword-significance-details-in-google-wmt/). Having Internal statistics, Google is able to establish the internal connections between advertisements (texts and images) and what Brin and Page call 'landing pages' (texts and images), for Google AdWords and AdSense, as Geller described in his US Patent 8516013. Consequently, one more new, practical and very lucrative development of Differential Linguistics is provided, by the same notorious Brin and Page.

LexisNexis Semantic Search powered by PureDiscovery™ also applies Internal weights: 'You can… assign relative importance (weighting), eliminate concepts you don't wish to use…' (LexisNexis began to use Internal weighting in the Fall of 2009, after Geller applied for his US Patent 8504580: http://www.lexisnexis.com/en-us/about-us/media/pressrelease.page?id=125674399689744#sthash.LjjwhaI0.dpuf). IBM created the Annotation Query Language (AQL): the new language parses paragraphs and sentences into clauses, counts relative weights, finds parts of speech from the surrounding text and uses a dictionary for indexes (http://www01.ibm.com/support/knowledgecenter/SSPT3X_1.3.0/com.ibm.swg.im.infosphere.biginsights.doc/doc/biginsights_aqlref_con_aql-overview.html). Microsoft does the same: 'The Bing Ads quality score shows you how competitive your ads are in the marketplace by measuring how relevant your keywords and landing pages are to customers' search queries and other input. The quality score can range from 1 to 10, with 10 being the highest. You can see the quality score on the Keywords, Campaigns, and Ad groups tabs on the Campaigns page' (http://advertise.bingads.microsoft.com/en-us/help-topic/how-to/50813/what-is-my-quality-score-and-why-does-it-matter). Hewlett Packard and SAP use Geller's Internal measure as well, as they do the Hierarchy of language presented here (http://www.autonomy.com/technology/idol-functions/conceptual-search and http://help.sap.com/hana/SAP_HANA_Text_Mining_Developer_Guide_en.pdf).

There is no other way to get Internal weights than by using the abovementioned hierarchy of Geller's. For instance, the phrase 'amazing Rome in agony' makes no sense, considering the above example: phrases must be obtained with regard to their clauses.
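A minimal sketch of Internal weighting in the relative-frequency form described in the quotation above – the occurrences of each phrase divided by the occurrences of the phrase that occurs most. This simple scheme is only an illustration; it does not reproduce the 1 versus 0.12 figures of the 'Fire!' example in Section IV.10, which come from the richer, context-sensitive computation of US Patent 8504580.

from collections import Counter

def internal_weights(phrases):
    # Weight = occurrences of a phrase relative to the most frequent phrase,
    # obtained directly from the text itself ("Internal" statistics).
    counts = Counter(phrases)
    top = max(counts.values())
    return {phrase: count / top for phrase, count in counts.items()}

# A toy list of predicative phrases extracted from the clauses of one text.
phrases = ["city-is-grey", "people-cry-fire", "city-is-grey", "city-is-grey", "agony-is-loud"]
print(internal_weights(phrases))
# city-is-grey -> 1.0, people-cry-fire -> 0.33..., agony-is-loud -> 0.33...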
4. US Patent 8504580: another proof. The same US Patent 8504580 contains the idea of searching for information based on paragraphs. Indeed, searching a whole text rarely makes sense, because one and the same text may concentrate on many unwanted topics; it is better to search for one specific theme. Eight days after Geller filed the patent (the Patent Reform Act of 2009 switched U.S. patent priority from the existing 'first-to-invent' system to 'first-to-file') Google said that it had begun showing ads in the form of advertising known as behavioral targeting (New York Times, http://www.nytimes.com/2009/03/11/technology/internet/11google.html?_r=4&; also '…we recommend including more text-based content about these topics, including complete sentences and paragraphs, to assist our crawlers in gathering information about your pages and determining relevant ads to display.' https://support.google.com/adsense/answer/32844?hl=en&ref_topic=1628432). Before Geller formulated and published his Differential Linguistics and his patent applications nobody, including Brin and Page, searched for paragraphs, because it is not possible without Internal weights. (To obtain the weights it is an absolute must to get the phrases from their clauses – but Geller has the patent.) Google's behavioral targeting is another capital proof of Geller's Differential Linguistics.

VI. Discussion.
1. Geller found that Linguistics should use Set Theory and Differential Analysis.
2. Geller discovered what the only hierarchy of language looks like: it is a hierarchy of sets.
3. Geller discovered that the limit of the function of Knowledge is the one, singular Knowledge.
4. The problem of the Identity of Indiscernibles can find its solution in the assumption that there are two kinds of Nothingness and that text is the third.
5. As for the Humanities – they have become an exact Science, as Geller declared in 2004: everything concerning consciousness and cognition can easily be fixed, measured and evaluated (Geller 2004). Actually, Google and the other Internet companies have been doing this for years.
6. Geller finished what Poincare, Russell, Wittgenstein and Moore started.

VII. Conclusion. Many problems of Philosophy and Linguistics are ultimately resolved.

References.
1. Beltagy, I., Roller, S., Boleda, G., Erk, K. and Mooney, R. (2014) 'UTexas: Natural Language Semantics using Distributional Semantics and Probabilistic Logic', Proceedings of SemEval 2014, pp. 796-801, Dublin, Ireland, August 23-24, 2014.
2. Boleda, G., Baroni, M., McNally, L. and Pham, N. (2013) 'Intensionality was only alleged: On adjective-noun composition in distributional semantics', Proceedings of IWCS 2013 (10th International Conference on Computational Semantics), East Stroudsburg, PA: ACL.
3. Clark, S. and Pulman, S. (2007) 'Combining symbolic and distributional models of meaning', Proceedings of the AAAI Spring Symposium on Quantum Interaction, pp. 52-55, Stanford, CA.
4. Clark, S. (2014) 'Vector Space Models of Lexical Meaning',
a draft chapter to appear in the forthcoming Wiley-Blackwell Handbook of Contemporary Semantics, second edition, edited by Shalom Lappin and Chris Fox.
5. Clarke, D. (2012) 'A Context-Theoretic Framework for Compositionality in Distributional Semantics', ACL Anthology, http://www.aclweb.org/anthology-new/J/J12/
6. D'hondt, E., Verberne, S. and Koster, C. (2013) 'Text Representations for Patent Classification', Computational Linguistics.
7. Geller, I. (2003) 'The Role and Meaning of Predicative and None-predicative Definitions in the Search for Information', Proceedings of the Twelfth Text REtrieval Conference (TREC), Washington.
8. Geller, I. (2004) 'LexiClone Inc. and NIST TREC', Proceedings of the Thirteenth Text REtrieval Conference (TREC), Washington.
9. Geller, I. (2005) 'Differential linguistics at NIST TREC', Proceedings of the Fourteenth Text REtrieval Conference (TREC), Washington.
10. Geller, I. (2006) 'Answering Factoid and Definition Questions: On Information for an Object', Proceedings of the Fifteenth Text REtrieval Conference (TREC), Washington.
11. Guevara, E. (2010) 'A Regression Model of Adjective-Noun Compositionality in Distributional Semantics', Proceedings of the 2010 Workshop on GEometrical Models of Natural Language Semantics, pp. 33-37.
12. Hegel, G. (1991) The Encyclopedia Logic, Indianapolis: Hackett Publishing Company Inc., pp. 140-141.
13. Moore, G. (1993) Selected Writings, London and New York: Routledge, pp. 102-103.
14. PA Advisors v Google. (2010) 'Defendants' opposition to plaintiff's motion to exclude the testimony of Mr. Stanley Peters', http://docs.justia.com/cases/federal/district-courts/texas/txedce/2:2007cv00480/106358/446/0.pdf?ts=1267745976
15. PA Advisors v Google. (2009) 'Order', http://law.justia.com/cases/federal/district-courts/texas/txedce/2:2007cv00480/106358/263
16. PA Advisors v Google. (2010) 'Summary Judgment Order', http://docs.justia.com/cases/federal/district-courts/texas/txedce/2:2007cv00480/106358/483/
17. Poincare, H. (1920) Science et Methode, Paris: Flammarion, p. 159.
18. Russell, B. (1926) 'Theory of Knowledge', The Encyclopaedia Britannica.
19. Russell, B. (1912) The Problems of Philosophy, New York: Oxford University Press. 813.
20. US Patent 6199067. (2001) System and method for generating personalized user profiles and for utilizing the generated user profiles to perform adaptive internet searches.
21. US Patent 8447789. (2013) Systems and methods for creating structured data.
22. US Patent 8504580. (2013) Systems and methods for creating an artificial intelligence.
23. US Patent 8516013. (2013) Systems and methods for subtext searching data using synonym-enriched predicative phrases and substituted pronouns.
24. US Patent 20110040733. (2011) Systems and methods for generating statistics from search engine query logs.
25. Wittgenstein, L. (1994) Tractatus Logico-Philosophicus, Moscow: Gnosis.