CALS vs WALS: Part 2 - Nouns

PTSnoop
cuneiform
Posts: 153
Joined: 02 May 2013 00:07

Post by PTSnoop »

PART 1: PHONOLOGY
PART 2: MORPHOLOGY, NOMINAL CATEGORIES, NOMINAL SYNTAX

---

So I recently found out about CALS, the conlang world's answer to WALS. And as I noticed that all the categories were the same, and that the numbers of catalogued languages on each side were pretty similar, I started thinking that someone should go through and compare things.

And, by the great tradition of "someone should" => "I should", here we are.

For each of the features in both the WALS and CALS databases, I've converted the counts to percentages, then subtracted the WALS number from the CALS number. Nothing too mathematically profound, but it should still give us some interesting data. In effect, a value of +10% means that 10% of conlangs have that feature and "shouldn't" have that feature, while -15% means that 15% of conlangs don't have that feature and "should". (Pretty heavy inverted commas there - I'm not trying to be prescriptive - but still a good way of picturing things.)
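A minimal sketch of that calculation, in Python, using the sign convention the +10% / -15% examples imply (conlang share minus natlang share, so positive means "more common in conlangs"). The function names and the counts are made up for illustration; they are not real CALS or WALS figures.

```python
def to_percentages(counts):
    """Convert raw language counts per feature value into percentages."""
    total = sum(counts.values())
    return {value: 100.0 * n / total for value, n in counts.items()}


def cals_minus_wals(cals_counts, wals_counts):
    """Percentage-point difference (conlang share minus natlang share)."""
    cals = to_percentages(cals_counts)
    wals = to_percentages(wals_counts)
    values = sorted(set(cals) | set(wals))
    # A value missing from one database is treated as 0% there.
    return {v: cals.get(v, 0.0) - wals.get(v, 0.0) for v in values}


# Hypothetical counts, purely for illustration.
example_cals = {"has feature": 60, "lacks feature": 40}
example_wals = {"has feature": 100, "lacks feature": 100}

differences = cals_minus_wals(example_cals, example_wals)
for value, diff in differences.items():
    print(f"{value}: {diff:+.1f}")
```

Because each database is normalised to 100% first, the two can be compared even though the number of catalogued languages differs, and the differences across all values of one feature always sum to zero.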

And because it gives us a *lot* of data, much of it interesting, I've decided to split things up into a few posts so people don't get bogged down in numbers. In general, I've chosen the features with the most extreme positive or negative values, plus a few that I just find interesting. If anyone's curious about any features I've missed off, let me know and I'll throw together another graph.

Just comparing percentages doesn't give you the full picture by any means - an extra 10% on a feature that 75% of natlangs have will show up the same as an extra 10% on a feature that pretty much never happens. But it's not a bad start. Maybe I'll delve into some more complex statistical stuff at a later date.
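To make that caveat concrete, here is a small Python sketch that reports the ratio of the two shares alongside the plain difference. The ratio is just one possible choice of "more complex statistical stuff", not something used for the graphs in this thread, and the 75% and 1% baselines are illustrative numbers only.

```python
def describe_shift(natlang_pct, conlang_pct):
    """Return (percentage-point difference, conlang-to-natlang ratio)."""
    diff = conlang_pct - natlang_pct
    # The ratio is undefined when no natlang has the feature at all.
    ratio = conlang_pct / natlang_pct if natlang_pct > 0 else float("inf")
    return diff, ratio


# Same +10 points, very different baselines (made-up numbers).
common = describe_shift(75.0, 85.0)  # a feature most natlangs have
rare = describe_shift(1.0, 11.0)     # a feature that almost never happens

print(f"common feature: {common[0]:+.1f} points, ratio {common[1]:.2f}")
print(f"rare feature:   {rare[0]:+.1f} points, ratio {rare[1]:.2f}")
```

Both cases show up as the same bar height on a difference graph, but the rare feature is many times more common in conlangs than in natlangs, while the common one has barely moved.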

PART 1: PHONOLOGY

Consonant Inventories

Image

It seems that there's a tendency towards average-sized (19-25) consonant inventories here.

People seem to be shying away more from the very small (6-14) than the very large (34+) inventories - maybe small inventories aren't seen as being as interesting. ("Just one more phoneme...")

Vowel Inventories

Image

But for vowels, unlike consonants, there's a tendency away from the average. Possibly some of the huge interesting Indo-European vowel systems are pulling people away from the mean, towards larger (7+) inventories.

Voicing in Plosives and Fricatives

Image

There's a strong tendency here - the share of languages with a voicing contrast in both plosives and fricatives is 20 percentage points higher among conlangs.

Front Rounded Vowels

Image

And again, possibly fueled by the tendency towards larger vowel systems, we see more conlangers going for the most "interesting" options.

Tone

Image

As people generally assume, lots of conlangs don't have tones. But what surprised me here was how well the languages with tone matched the natlang distribution of complexity of tone system - I'd expected the tonal-conlangers to have gone much more for big dramatic contours-and-sandhi systems over simple two-way contrasts. Maybe there are more pitch-accent langs than I thought...
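One way to check a claim like this is to drop the "No tones" languages on both sides and renormalise what's left. A rough Python sketch follows. The value names match the WALS tone feature, but the counts are invented for illustration and deliberately chosen so the two tonal distributions come out identical.

```python
def share_among(counts, keep):
    """Percentages of each kept value, relative to the kept values only."""
    subset = {k: counts[k] for k in keep}
    total = sum(subset.values())
    return {k: 100.0 * n / total for k, n in subset.items()}


# Hypothetical counts, not real WALS or CALS data.
natlang_counts = {"No tones": 300, "Simple tone system": 120, "Complex tone system": 80}
conlang_counts = {"No tones": 400, "Simple tone system": 60, "Complex tone system": 40}

tonal_values = ["Simple tone system", "Complex tone system"]
natlang_tonal = share_among(natlang_counts, tonal_values)
conlang_tonal = share_among(conlang_counts, tonal_values)

for value in tonal_values:
    print(f"{value}: natlangs {natlang_tonal[value]:.1f}%, conlangs {conlang_tonal[value]:.1f}%")
```

In this toy data far fewer conlangs have tone at all, yet the simple-to-complex split among the tonal ones is the same on both sides, which is the kind of pattern described above.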

Stress

Image

Fixed stress seems unpopular. But though I'd have expected to see unpredictable stress as popular, I wouldn't have expected "Right-oriented: one of the last three" to have shown up quite so strongly.

Uncommon Consonants

Image

English rears its ugly head again. Non-sibilant dental fricatives are pretty rare in natlangs, being less common than co-articulated /kp/ - but because English has them (and, if I'm honest, because they're quite a nice sound) their share is 18.5 percentage points higher in conlangs than in natlangs. That said, I was expecting a larger number - the tendency away from tone was larger than this one.

COMING SOON: MORPHOSYNTAX
Last edited by PTSnoop on 12 Jul 2013 00:33, edited 2 times in total.
Creyeditor
MVP
Posts: 5091
Joined: 14 Aug 2012 19:32

Re: CALS vs WALS: A Comparison

Post by Creyeditor »

Very good idea [:)] (though CALS is by no means representative)
Click
runic
Posts: 2785
Joined: 21 Jan 2012 12:17

Re: CALS vs WALS: A Comparison

Post by Click »

Great work! [:D]
decem
greek
Posts: 640
Joined: 30 Aug 2012 21:31
Location: Newcastle, UK

Re: CALS vs WALS: A Comparison

Post by decem »

[+1]

awesome.
Ear of the Sphinx
mayan
Posts: 1587
Joined: 23 Aug 2010 01:41
Location: Nose of the Sun

Re: CALS vs WALS: A Comparison

Post by Ear of the Sphinx »

Moar.
Thrice the brinded cat hath mew'd.
Valkura
cuneiform
Posts: 96
Joined: 23 Apr 2013 21:16
Location: The Greater Seattle Area

Re: CALS vs WALS: A Comparison

Post by Valkura »

Do those sites have information on other uncommon consonants, such as the bilabial trill or labiodental flap? I want to see how many other conlangs have them.
Please don't read this.
Fanael
sinic
Posts: 331
Joined: 19 Jul 2012 21:26

Re: CALS vs WALS: A Comparison

Post by Fanael »

Valkura wrote: Do those sites have information on other uncommon consonants, such as the bilabial trill or labiodental flap?
They don't.
PTSnoop
cuneiform
Posts: 153
Joined: 02 May 2013 00:07

Re: CALS vs WALS: A Comparison

Post by PTSnoop »

PART 2: MORPHOLOGY, NOMINAL CATEGORIES, NOMINAL SYNTAX

Morphology was quite a short section, so I've included all the noun stuff as well. And to fit things on the graph, I've increased the y axis from ±30% to ±40%.

Head Or Dependent Marking

Image

General tendency here towards dependent-marking. But interestingly, the trend's away from "Inconsistent or other" rather than "Head marking". Clearly, we need more people to think of crazy inconsistent systems.

Reduplication

Image

This one's the main reason for my change to 40%. There's a *very* strong tendency here away from partial reduplication, scraping my limits at -39.7%.

Number Of Genders

Image

This is one of those places where we're not doing so badly. I'd have expected more of a bias towards no genders (I tend to avoid the things, myself), but if anything, we've got more than we need.

Associative Plurals

Image

Another 30%-breaker: apparently we don't like associative plurals. Or possibly (like me) we'd not really heard of them before...

Definite Articles

Image

Another place where we're not doing too badly. There's a bias towards no articles at all - plausibly to get further away from Standard Indo-European - but not as strong as I'd have thought. Maybe it's time to start reintroducing the things.

Indefinite Pronouns

Image

A strong trend away from interrogative-based indefinite pronouns. (Which is a shame, I like questions like "He ate something?" for "What did he eat?".)

Number of Cases

Image

Vague bell curve here, centered at around three or four cases, and then another big peak for the 10+ case systems. And again, it looks like we need more minimal systems and more inconsistent-borderline systems here.

Ordinal Numerals

Image

The tendency here seems to be towards the regular and consistent "one two three" and "oneth twoth threeth" systems - possibly "first twoth threeth" feels arbitrary and inconsistent. But again, natlangs prove more arbitrary and inconsistent than the average conlang...

Distributive Numerals

Image

This is consistent with what we saw earlier about people not really using reduplication.

Conjunctions and Quantifiers

Image

Another 30%-breaker. As with indefinite pronouns, conlangers seem more likely to create separate categories instead of just blending in categories we've already got.

Adjectives Without Nouns

Image

Would it be simplistic of me to assume that the "Not without noun" bar slots neatly into the "Without marking" bar, and the "marked by suffix" into the "marked by preceding word" bar? Maybe people who would otherwise have allowed unmarked adjectives-as-nouns decided against them for ambiguity reasons, while preceding-word people decided on suffixes instead? Maybe not, but I can dream.

And and With

Image

And to finish, a nice simple graph, again matching the tendency for conlangers to create multiple categories rather than reusing existing things.

COMING SOON: VERBAL CATEGORIES
Ànradh
roman
Posts: 1376
Joined: 28 Jul 2011 03:57
Location: Cumbernauld, Scotland

Re: CALS vs WALS: Part 2 - Nouns

Post by Ànradh »

I'm quite enjoying this. It seems Iriex has a slight tendency away from the norm (it includes reduplication, 'with' is the same as 'and' etc.)

It's strange though, I would have thought that reusing existing categories was a common thing to do since it requires less work.
Sin ar Pàrras agus nì sinne mar a thogras sinn. Choisinn sinn e agus ’s urrainn dhuinn ga loisgeadh.
clawgrip
MVP
Posts: 2257
Joined: 24 Jun 2012 07:33
Location: Tokyo

Re: CALS vs WALS: A Comparison

Post by clawgrip »

PTSnoop wrote: A strong trend away from interrogative-based indefinite pronouns. (Which is a shame, I like questions like "He ate something?" for "What did he eat?".)
It's the other way around: the indefinite pronoun is based on the interrogative. So you would say "What did he eat?" but, "He ate a what" instead of "He ate something," or "He didn't eat a what," for "He ate nothing."


Based on these two posts, it looks like the two least common conlang features in Himmaswa are a similarity between conjunctions and some quantifiers with interrogatives ("He didn't eat however what" = "He didn't eat anything") and interrogative-based indefinite pronouns ("He ate an instance of what"). The most common conlang features in Himmaswa are one-th two-th three-th ordinals and a lack of distributive numerals.
Xing
MVP
Posts: 4153
Joined: 22 Aug 2010 18:46

Re: CALS vs WALS: A Comparison

Post by Xing »

clawgrip wrote: It's the other way around: the indefinite pronoun is based on the interrogative. So you would say "What did he eat?" but, "He ate a what" instead of "He ate something,"
In most languages, I think it would be "he ate whatthing". In most languages, indefinites are derived from interrogatives (cf English "somewhere" and "somehow"). In only a few languages indefinites and interrogatives are identical.
PTSnoop
cuneiform
Posts: 153
Joined: 02 May 2013 00:07

Re: CALS vs WALS: Part 2 - Nouns

Post by PTSnoop »

Hmm, the people on the ZBB have pointed out to me that there are, for some inexplicable reason, natlangs recorded on CALS.

This is going to throw off all my numbers - for example, it turns out my big reduplication 40% is now actually closer to 50%. I'll go through and retcon all the earlier graphs to the conlang-only numbers once I have time.
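A rough Python sketch of what that clean-up amounts to: filter the natlang entries out of the CALS side before computing percentages. The `is_natlang` flag and the record layout are assumptions made up for this example (I'm not claiming CALS exposes such a field), and the entries are invented.

```python
def feature_percentage(entries, feature_value, conlangs_only=False):
    """Percentage of entries with the given value, optionally skipping natlangs."""
    if conlangs_only:
        entries = [e for e in entries if not e["is_natlang"]]
    if not entries:
        return 0.0
    matching = sum(1 for e in entries if e["value"] == feature_value)
    return 100.0 * matching / len(entries)


# Hypothetical database dump: 2 natlangs and 8 conlangs.
entries = (
    [{"is_natlang": True, "value": "partial reduplication"}] * 2
    + [{"is_natlang": False, "value": "partial reduplication"}] * 2
    + [{"is_natlang": False, "value": "no reduplication"}] * 6
)

with_natlangs = feature_percentage(entries, "partial reduplication")
conlangs_only = feature_percentage(entries, "partial reduplication", conlangs_only=True)
print(f"all entries: {with_natlangs:.1f}%, conlangs only: {conlangs_only:.1f}%")
```

In this toy data the natlang entries inflate the share of partial reduplication, so removing them widens the gap to the natlang figure, which is the direction of the change described above.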
clawgrip
MVP
Posts: 2257
Joined: 24 Jun 2012 07:33
Location: Tokyo

Re: CALS vs WALS: A Comparison

Post by clawgrip »

Xing wrote:
clawgrip wrote: It's the other way around: the indefinite pronoun is based on the interrogative. So you would say "What did he eat?" but, "He ate a what" instead of "He ate something,"
In most languages, I think it would be "he ate whatthing". In most languages, indefinites are derived from interrogatives (cf English "somewhere" and "somehow"). In only a few languages indefinites and interrogatives are identical.
Yeah, I guess that's kind of what I meant, but I may have simplified it a little too much. I was thinking a bit about Japanese (the non-native language I speak best), where it's a bit extreme. Indefinite pronouns are formed by adding the question particle (which also means "or") to the interrogative pronoun.