• Olivier5
    6.2k
    Enough with empty philosophical discourse about language! :-) This is for empirical data, models that are testable, tools for machine translation... The topic of this thread is the scientific method applied to language, in other words linguistics.
  • Olivier5
    6.2k
    First, a brief historical overview of the discipline(s):

    The Journey of a Psycholinguist

    In this episode of The Joy of Science, Shambhavi Chidambaram speaks to Professor Shravan Vasishth, an Indian-origin professor of psycholinguistics at the University of Potsdam in Germany. [...]

    Shravan Vasishth (SV): Alright, so there’s a general misconception about what linguistics is in the lay world. What typically happens to me in day to day life is that when somebody finds out that I’m a linguist, the first question they ask me is how many languages I speak. From a professional perspective, linguistics is a very different ball game. It’s all about the study of linguistic structure. The meaning of language, language constructs, the syntactic structure of sentences, and the study of sound. Studying the statistical properties of language is another thing. Linguistics also includes the historical development of languages, and the connections between languages, regarding how language contact changes.

    SC: But linguistics, as a field, seems to have had quite a rapid change in its approach in the last one hundred years or so. Where did it start and how has it proceeded?

    SV: Linguistics, as we know it today, is very different from the way it was in the 1800s and before that. All the action really began with Pāṇini, and his work, the Aṣṭādhyāyī.

    Editor’s Note: The Aṣṭādhyāyī is Pāṇini’s most famous work, a text on Sanskrit grammar. It is known for the brevity, sophistication and logical perfection of its rules, and has influenced linguistic theory up to the present day.

    SV: Once that became known in the West, a lot of Indologists started studying not just Sanskrit but also the connection between European languages and Indian languages. They discovered that there were some historical connections there, and this led to the development of something called historical linguistics—the study of how languages connect with each other, how they have evolved over time and what the proto-language was. They tried to infer the properties of the dead languages, like the classical Vedic language and so on.

    SC: Starting with the Aṣṭādhyāyī, how has the study of languages changed over the centuries?

    SV: The Indologists, who were mostly Europeans, were studying these connections, and this study of language, from a language comparison perspective, slowly evolved into a school of research called structural linguistics, where people started to look at the linguistic structure and develop templates and patterns that they could draw generalizations from. This then became a very big area of research, especially in America, because of Leonard Bloomfield and others, who were inspired by the work of Ferdinand de Saussure, a Swiss linguist. Saussure wrote a very famous textbook in the 1800s that had a big influence on the American linguists and this led to the creation of structural linguistics. Up to the early 1900s, structural linguistics was the dominant paradigm both in the West and in India. The Christian missionaries played a major role in developing this methodology because their goal was to translate the Bible into local languages. They would go into parts of the world that were obscure for them and study the language there using this methodology. They’d eventually figure out the sound structure, the linguistic structure of the language and then write the Bible in that language. In the 1950s, a now-very-famous linguist turned up—Noam Chomsky. He created a new paradigm in linguistics, which came to be known as generative linguistics. This methodology involved consulting your own intuitions about what is possible and what is not possible in language. This was a brilliant new way to unpack the structure of languages, of your own native language. You could sit down and think about your language and develop a very elaborate syntax and semantics and phonology—the different sound patterns. Thus, it became mainstream and is still the mainstream approach in linguistics. It’s now called theoretical linguistics to distinguish it from the other strands which developed later.

    While Chomsky was developing the generative linguistics approach, in the 1950s there was a computer revolution. That’s when all the action started on the computer side and what happened was that people started to develop machine translation systems to automatically translate languages. It was a very ambitious program. Although the initial attempts were miserable failures, today this has become a very sophisticated new approach, and you see that in tools like Google Translate. These systems are able to do very sophisticated translations and are really very good at it! This area is known as computational linguistics. Linguists actually work for Google and other companies, developing the basic linguistic software. All this has its origins in this core linguistic work. In parallel to that, what happened in the 1960s was that linguists started working with psychologists, and started talking to them about how language works in the brain. This field went in a parallel direction from classical theoretical linguistics. It was an independent stream of research that eventually became psycholinguistics. There are connections between psycholinguistics and linguistic theory, but psycholinguistics is to a great extent an independent body of work, that is slightly divorced from theoretical ideas.

    SC: So, that same linguistic structure that people had been studying theoretically for so long, they just took it to an empirical level that way?

    SV: Exactly. Data entered linguistics from the computational side.

    SC: Great! But what about related fields like phonetics? Where do they come into this, exactly?

    SV: Well, phonetics is actually a part of linguistics. You can think of linguistics at several levels. There’s sound—that’s where phonetics and phonology come in, which are parts of core linguistics. Then there’s semantics—meaning—that is also a core part of linguistics and involves studying formal logic to understand how language and meaning are put together. Then there’s syntax—the structure of words—the word order and how sentences are structured. These are the core areas but there are also related areas that fall under linguistics, like pragmatics, where you study implied meaning—that is, what is not actually said, but what is inferred from the sentence.

    SC: That’s sharing a border with poetry, isn’t it?

    SV: That is indeed where all the action comes from. In poetry, there are all these implied connotations, which you don’t actually state but feel from the language. It is also part of linguistics.

    SC: Then it should also border with the social sciences, shouldn’t it? Because you’ve also got politics there, with these kinds of implied meanings, where certain things mean something specific for certain groups of people. It is probably at the overlap of semantics, linguistics and politics, for example.

    SV: Yeah, that social and cultural aspect of language also evolved into its own field and that’s called sociolinguistics. The cultural and social implications of language. [...]

    More:
    https://researchmatters.in/news/joy-science-journey-psycholinguist
  • MSC
    207
    Enough with empty philosophical discourse about language! :-) This is for empirical data, models that are testable, tools for machine translation... The topic of this thread is the scientific method applied to language, in other words linguistics.Olivier5

    Thank you for posting this! Really amazing read.

    Where does Semiotics come into play with linguistics?
  • Olivier5
    6.2k
    Glad you liked it. I was fascinated by some bits and pieces, like the Aṣṭādhyāyī, or the idea that missionaries translating the Bible were a major force in the discipline.

    Primarily, this is to flag that philosophers discussing language could make much more sense if they followed the development of the related scientific disciplines.

    I think semiotics studies signs and symbols, over and beyond articulated language, their evolution, etc.
  • MSC
    207
    Someone else was telling me recently to check out psycholinguistics.

    I was also having a discussion with @schopenhauer1 about animal linguistics. It absolutely fascinates me. Is there such a thing as Comparative Psycholinguistics yet?
  • Olivier5
    6.2k
    On the advice of Apokrisis I’ve been reading about Peirce’s theory of signs, and the importance of interpretation by a subject, which according to Howard Pattee lies at the core of the ‘hard problem’. (Pattee doesn’t solve the problem but exposes it quite well in Cell Phenomenology: The First Phenomenon.)

    I think a philosophy of life as language along the lines he exposes can work quite well to understand both human language and ‘biolanguages’. Currently on:

    On the Origin of Language - A Bridge Between Biolinguistics and Biosemiotics, by Marcello Barbieri

    Barbieri is an embryologist from the University of Ferrara. So far (p.3) he gives a good primer on the ‘school’.

    Interestingly (for some), there's a neo-thomist connection, through Jacques Maritain and Thomas Sebeok.
  • Daemon
    591
    Do you have a special interest in tools for machine translation Olivier?
  • god must be atheist
    5.1k
    when somebody finds out that I’m a linguist, the first question they ask me is how many languages I speak.Olivier5

    I would have asked instead, what is your favourite object or ice cream to lick. But I am me.
  • Mikie
    6.2k
    This methodology involved consulting your own intuitions about what is possible and what is not possible in language. This was a brilliant new way to unpack the structure of languages, of your own native language.Olivier5

    What nonsense. Chomsky's approach isn't anything like this whatsoever.
  • BC
    13.2k
    I have not found linguistics of much interest, but thanks for the historical piece.

    Google Translate is useful --no doubt about that. Google Translate was introduced around 2006. I don't remember when I first used it, but it has gotten a lot better since. It still has significant limitations, depending. Sometimes it can't render a term at all, or renders it in an exceedingly clumsy way. Specialized terms present more problems than everyday text.

    Applications which turn voice into text are quite impressive too, as are apps which can interpret handwriting. I tried it on a handheld device back in the 1990s, and was impressed that it could do a pretty good job interpreting cursive writing that was not-too-carefully performed.

    Google Ngram is one of my favorite tools. It uses the vast text scanning project that has been going on for at least 15 years. 30,000,000 books have been hoovered into its servers. From the billions of words used, Ngram charts the use-frequency of words and short phrases since 1800. Knowing word frequency is useful for, among other things, studying readability. And it's just plain interesting. The term "readability" was scarcely used before 1900, but there were peaks in its use in the 1960s and 1980s.

    One of my pet peeves is writers using inordinately obscure words. "canaille" for instance. It's a derogatory word meaning 'the common people'. The English got it from the French who got it from the Italians; it means 'pack of dogs'. It was most popular around 1820, so when a contemporary author uses it, you know the writer is 'putting on the literary Ritz' -- the hotel, not the cracker.

    I have pleonastic tendencies. Peak pleonasm was probably before 1800. It's been receding in popularity ever since.
  • BC
    13.2k
    What nonsense. Chomsky's approach isn't anything like this whatsoever.Xtrix

    The mods will punish Olivier5 for writing reckless declarative sentences. Or Xtrix, once the most faulty declarative statement has been determined. I'm an agnostic on the matter of Chomsky's linguistic theories.
  • Olivier5
    6.2k
    Google Translate is usefulBitter Crank

    There’s this new tool, DeepL, which is a bit better. I use it a lot. These things are similar to the Chinese room, in their principle.

    Une canaille in French now means a person without any morality, someone you can’t trust under any circumstances. I didn’t know it came from Italian.

    I always try to squeeze one inordinate word in every paper I write, for the fun but also to wake the reader up. The editors often whack it off. The last time, I used « impelent ».

    Have tried dictation but it doesn’t work well for me.
  • Olivier5
    6.2k
    Do you have a special interest in tools for machine translation Olivier?Daemon

    I use them professionally. DeepL is my favorite.
  • Olivier5
    6.2k
    It’s not me talking but professor Shravan Vasishth, an Indian-origin professor of psycholinguistics at the University of Potsdam in Germany.

    How would you describe Chomsky’s approach?
  • BC
    13.2k
    I like looking up those inordinately obscure words, and I keep a list of them--pearls snatched from the jaws of swine.

    I have a (now retired) interest in common short words. When I was working in AIDS prevention back in the mid 1980s, I found it frustrating and annoying that public health pros found it difficult to simplify information for readers or audiences who sometimes were marginally literate. So I started investigating readability, word frequency, "Basic English", and the Anglo-Saxon words that make up the core of English, along with common words acquired from French, making Middle English into a richer language (thanks, of course, to William the Bastard).

    One can write pretty good, best-selling texts in predominantly Anglo-Saxon and Middle-English; Tolkien wrote several volumes of it. Granted, Tolkien didn't have the problem of explaining viruses, the risks of anal sex (which is probably what the orcs preferred) or proper condom use.

    My plan was to write a manual on how to discuss publicly important matters in widely readable text. I didn't finish that project, though I did write a computer program to evaluate readability and to sort the words in texts into several buckets, which involved making a lot of lists. This was pre-WWW. One can now find tools on line to do much of what I intended.

    Unfortunately, the book on how to put the hay down where the goats can get at it, never got written.
  • Daemon
    591
    One of my translation agency customers told me he is now receiving texts translated by DeepL from his customers, with a request for the language to be improved. Meanwhile he has freelance translators sending in translations made with DeepL which they claim to have translated manually.
  • Olivier5
    6.2k
    Yes, you see some of that happening.
  • Olivier5
    6.2k
    One can write pretty good, best-selling texts in predominantly Anglo-Saxon and Middle-English; Tolkien wrote several volumes of it. Granted, Tolkien didn't have the problem of explaining viruses, the risks of anal sex (which is probably what the orcs preferred) or proper condom use.Bitter Crank

    As a native French speaker, one of the tricks in writing English is to avoid overusing words of French origin and tap into words of Germanic origin, often shorter and more colloquial. Reading Tolkien can help expand one's vocabulary in those directions, also Kipling. But an English text avoiding entirely words of French origin would be just as artificial as one exclusively relying on them.
  • TheMadFool
    13.8k
    My two cents worth:

    The way it seems to me, the scientific method operates at two levels:

    Step 1. Detecting patterns in observational data and giving it a mathematical form. The first part of this can be accomplished, and probably has been, by linguists but I don't know whether the detected patterns in language can be expressed mathematically.

    Step 2. Formulating hypotheses to explain the patterns. Linguists will need to come up with testable hypotheses on whatever aspect of language they're working on, complete with a list of predictions that can be falsified.

    Step 1 looks easier than step 2.
  • BC
    13.2k
    an English text avoiding entirely words of French origin would be just as artificialOlivier5

    It would be artificial and it would be inordinately inelegant. However complete the Old English corpus of words might have been in 800 a.d., it has been irretrievably obsolete for a long time. I do not speak French (hélas; jugez-moi à la légère) but my guess is that Old French had a much larger, more sophisticated corpus than Old English did. Middle English had a much larger corpus, and then became Modern English in the 16th and 17th century as writers reached back into Latin and Greek for word-building materials. [One of the courses for training medical transcriptionists is a study of relevant Greek and Latin vocabulary.]
  • Mikie
    6.2k
    It’s not me talking but professor Shravan Vasishth, an Indian-origin professor of psycholinguistics at the University of Potsdam in Germany.

    How would you describe Chomsky’s approach?
    Olivier5

    Chomsky's approach is best described as "biolinguistics." He's interested in universal grammar, the innate (genetic) structure of language. This presupposes the brain and treats language as a "system" (on par with the visual system), with its own set of properties -- properties described using theories of computation and mathematics. It has nothing to do with "intuition" about "what is possible or not possible." Whoever Vasishth is, he has no clue what he's talking about in this regard.

    https://www.youtube.com/watch?v=vbKO-9n5qmc
  • BC
    13.2k
    What you say sounds a lot closer to what people understand as Chomsky's approach.

    It makes sense to me that the capacity and operation of language would reside in the brain as directed by our species' genetics. Our very complex brains were not built 'de novo'. The need for, and means of, communication existed in our predecessor species. We are not born with a ROM-stored language (Chinese, Urdu, Swahili, Norwegian...) but we are born with instructions to acquire the available languages which present themselves to us. We don't have to be taught; it's more like "language falls into place in our brains".
  • Mikie
    6.2k
    What you say sounds a lot closer to what people understand as Chomsky's approach.Bitter Crank

    If by "people" you mean most laypeople, I don't know where you get that. Besides some vague notions of innateness ("nativist" tendencies) and the LAD (language acquisition device), "people" don't really have a clue. And again, I'm essentially repeating verbatim what Chomsky himself says.

    It makes sense to me that the capacity and operation of language would reside in the brain as directed by our species' genetics. Our very complex brains were not built 'de novo'. The need for, and means of, communication existed in our predecessor species. We are not born with a ROM-stored language (Chinese, Urdu, Swahili, Norwegian...) but we are born with instructions to acquire the available languages which present themselves to us. We don't have to be taught; it's more like "language falls into place in our brains".Bitter Crank

    Besides taking minor issue with terms like "communication" and "instructions," that sounds about right. Chomsky repeatedly will say that his main claim shouldn't be controversial and is, in fact, a truism. Yet it continually gets misinterpreted.
  • Olivier5
    6.2k
    my guess is that Old French had a much larger, more sophisticated corpus than Old English didBitter Crank

    I see no reason to assume such an imbalance. What I’d say is that due to the Norman conquest, two languages merged to form modern English: Old French and Old English. This means that modern English has more flexibility and power than any other language, methinks. Hybrid vigor.