New CBB

Discuss constructed languages, cultures, worlds, related sciences and much more!

All times are UTC + 1 hour [ DST ]




PostPosted: Fri 03 May 2013, 02:37 
hieroglyphic

Joined: Thu 02 May 2013, 21:44
Posts: 25
Location: San Antonio, TX
Many aspiring conlinguists encounter difficulty when they try to flesh out the syntax of their constructed languages. I myself have experienced this, as there seems to be an aura of ambiguity around describing the syntactical configuration of a language.

Phonological descriptions of languages are more or less standardized with a set of certain basic parameters which must be included within the phonology, but this is not necessarily the case with syntactical description.

After some trial and error, I've come up with a type of notation that has helped me describe the syntax of my languages more concisely. Syntax Descriptor Language (SDL, for short) might come in handy for others too, so I'm posting the specifications for anyone to use as they see fit.

The descriptor "S(1) := [arguments here]" defines the syntax for a single sentence type. It can be read as: "Sentence type one is composed of [arguments]", where the arguments are the individual components of the sentence.

So:

S(1) := NP VP

Can be read as "Sentence type 1 is composed of a noun phrase and a verb phrase." We can further define the individual arguments of S(1) like so:

NP := N -mod1 -mod2 -mod3 (PP)
VP := V -mod1 -mod2 -mod3 (PP)

Now, let's go through these statements step by step. The first argument of a syntactical descriptor is the head of the phrase. An argument preceded by a dash (-) is an optional modifier, and an argument enclosed in parentheses is an optional argument. Technically speaking, case doesn't matter, but I like to keep modifiers in lowercase and arguments in uppercase, which helps differentiate them.

So, the first line could be read as:

"A Noun Phrase is composed of a Noun, which can be modified by [modifiers], and can be followed by [arguments]."
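As a rough illustration of that reading, here is a minimal Python sketch that splits one SDL line into its parts. The names `Rule` and `parse_rule` are made up for this example and are not part of SDL, and the sketch assumes every token is separated by whitespace. Bare arguments after the head (such as VP in "S(1) := NP VP") are simply collected as required arguments, which is my own guess at how they should be treated.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    name: str                                       # left-hand side, e.g. "NP"
    head: str                                       # first argument on the right
    modifiers: list = field(default_factory=list)   # (name, direction or None)
    optional: list = field(default_factory=list)    # parenthesised arguments
    required: list = field(default_factory=list)    # other bare arguments

def parse_rule(text):
    """Parse one SDL line of the form 'X := HEAD -mod (ARG) ...'."""
    lhs, rhs = (part.strip() for part in text.split(":=", 1))
    tokens = rhs.split()
    if not tokens:
        raise ValueError("rule has no head: " + text)
    rule = Rule(name=lhs, head=tokens[0])
    for tok in tokens[1:]:
        if tok.startswith("-"):
            # "-adj.l" -> ("adj", "l"); "-mod1" -> ("mod1", None)
            name, _, direction = tok[1:].partition(".")
            rule.modifiers.append((name, direction or None))
        elif tok.startswith("(") and tok.endswith(")"):
            rule.optional.append(tok[1:-1])
        else:
            rule.required.append(tok)
    return rule

np_rule = parse_rule("NP := N -mod1 -mod2 -mod3 (PP)")
print(np_rule.head, np_rule.modifiers, np_rule.optional)
```

Keeping modifiers and parenthesised arguments in separate lists mirrors the distinction the notation itself draws between the two kinds of optional material.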

Furthermore, direction marking can be used on the modifiers, to signal which side of the head they fall on. Thus:

NP := N -adj.l -num.l -det.r would be written like this:

"Three shaggy dogs some."

While:

NP := N -adj.l -num.l -det.l would be written hence:

"Some three shaggy dogs."
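To make the direction marking concrete, here is a small Python sketch that turns a rule plus a word for each slot into a surface string. It rests on one assumption the post does not state outright: modifiers are listed nearest-to-head first, so each later modifier on the same side lands further out from the head. The function name `linearize` and the dictionary of words are hypothetical.

```python
def linearize(rule, words):
    """Order a head and its direction-marked modifiers on the surface.

    Assumption: modifiers are listed nearest-to-head first.
    """
    tokens = rule.split(":=", 1)[1].split()
    head, mods = tokens[0], tokens[1:]
    left, right = [], []
    for mod in mods:
        name, _, side = mod.lstrip("-").partition(".")
        if name not in words:
            continue                        # dash modifiers are optional
        if side == "l":
            left.insert(0, words[name])     # further left than earlier ones
        elif side == "r":
            right.append(words[name])       # further right than earlier ones
        else:
            raise ValueError("modifier needs .l or .r: " + mod)
    return " ".join(left + [words[head]] + right)

words = {"N": "dogs", "adj": "shaggy", "num": "three", "det": "some"}
print(linearize("NP := N -adj.l -num.l -det.r", words))
# three shaggy dogs some
```

Leaving a modifier out of `words` simply drops it from the output, which matches the idea that dash-marked modifiers are optional.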

Overall, the basic notation is very simple. You have the item being described on the left side of the equation, and you have the components of the description on the right side. The first argument is the head of the phrase, and arguments preceded by a dash or enclosed in parentheses are optional.

There may be some confusion between arguments in parentheses and arguments marked with a dash. The simple explanation is this: dash arguments are not necessarily fixed in their position, while arguments in parentheses appear in the order in which they are written.

Thus:

NP := N -num -adj -det

Simply states that the head noun is modified in terms of number, adjectival quality, and definite determination. While:

NP := N (num) (adj) (det)

Indicates that the head noun is precisely followed by a number, then an adjective, and finally, a determiner, if they are all used. And last, but certainly not least:

NP := N -num.r -adj.r -det.r

Determines the directional alignment of each modifier, in the order that they are given. Thus, both the number and the adjective come to the right of the head, but the number comes to the left of the adjective, etc.

A Note on Terms and Sentences

I have used somewhat common identifiers for these examples. By convention, I feel that you should be able to use whatever identifiers you so desire, so long as they are mutually understandable. To that end, I have designed the notation so that it is easily parsed by machine.

In the future, I'd like to develop an application that generates random syntactical configurations, much like scripts that generate random lexical entries.
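As a very rough idea of what such a generator could look like, here is a minimal Python sketch that produces a random noun-phrase rule by shuffling a fixed set of modifiers and picking a side for each. Everything here is an assumption for illustration: the function name `random_np_rule`, the default modifier set, and the choice to always mark a direction.

```python
import random

def random_np_rule(modifiers=("num", "adj", "det"), rng=None):
    """Return a random SDL noun-phrase rule with direction-marked modifiers."""
    rng = rng or random.Random()
    order = list(modifiers)
    rng.shuffle(order)                       # random listing order
    parts = ["-%s.%s" % (name, rng.choice("lr")) for name in order]
    return "NP := N " + " ".join(parts)

print(random_np_rule())
```

Passing a seeded `random.Random` instance as `rng` makes the output repeatable, which is handy when you want to come back to a configuration you liked.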

One last feature to mention: It is possible to define different sentence structures for different usages. Thus, the definition of S(1) is not the same as the definition of S(2). I do this for several reasons:

1. It gives the computer a way to differentiate between different sentence types efficiently and quickly.

2. It opens up the opportunity of creating better chat bots that can speak in a fashion more similar to human speech.

3. It might have interesting possibilities in the field of computerized translation, though such possibilities may be far off.

I hope you guys enjoyed this post, and I hope you get as much use out of SDL as I have. The notation is here for anybody to use and modify as they see fit. Thanks,

Cacafire.


PostPosted: Fri 03 May 2013, 11:22 
roman

Joined: Sun 15 Aug 2010, 15:48
Posts: 513
This type of notation has existed for decades, modulo just a few symbols - the basic idea being identical. However, there are trickier things with notating syntax - e.g. how do you express this rule using those symbols:

an anaphor cannot have as its antecedent a noun that it c-commands and that is to the anaphor's right.

E.g. (where 'he' is to refer to 'John'):

After he left, John did not know what to do.
John did not know what to do after he left.
After John left, he did not know what to do.
*He did not know what to do after John left.


PostPosted: Sat 04 May 2013, 11:06 
hieroglyphic

Joined: Thu 02 May 2013, 21:44
Posts: 25
Location: San Antonio, TX
That's a good point. I don't know. :(


PostPosted: Sat 04 May 2013, 18:35 
roman

Joined: Sun 15 Aug 2010, 15:48
Posts: 513
Kleene algebra already does all that, although somewhat differently:

X* = any number of X, even zero
X = exactly one X
X | Y = either X or Y
(X|Y)* = any number of Xs or Ys, so e.g. XYX is permissible, as is YYYYYYYYYYYY as is YXYYYXY as is XXXXXXXX. So is the empty string.
(X* | YXZ*) = either any number of X, or YX followed by any number of Z
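These operators carry over directly to ordinary regular expressions, so the examples above can be checked in Python with the standard `re` module. The helper name `matches` is made up for this sketch, and the symbols X, Y and Z are treated as single letters.

```python
import re

def matches(pattern, string):
    """True if the whole string fits the pattern, not just a prefix."""
    return re.fullmatch(pattern, string) is not None

print(matches(r"(X|Y)*", "XYX"))      # True
print(matches(r"(X|Y)*", ""))         # True, the empty string is allowed
print(matches(r"(X|Y)*", "XZ"))       # False, Z is not X or Y
print(matches(r"X*|YXZ*", "YXZZZ"))   # True
print(matches(r"X*|YXZ*", "YXY"))     # False, only Z may follow YX
```

`re.fullmatch` is used instead of `re.match` because the question is whether the entire string is permitted, not whether it merely starts with something permitted.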

Often, this is done by nesting phrases:

NP = DET AdjP N | DET N | N | NP RelP
DET = this | that | my | your | his | her | the | a | an
N = n | n n
n = any noun
AdjP = Adj AdjP | Adj
Adj = any adjective
RelP = that VP | who VP | which VP | whose NP VP | Prep whom NP VP | 
...
Yeah, I know, these rules are incomplete and may even have details that are wrong. Do not use them as a guide to English grammar.
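To show how nested rules like these can be put to work, here is a small Python sketch of a recognizer that checks whether a list of words can be derived from a symbol. It is only an illustration under several assumptions: the grammar is cut down to the NP, DET, N and AdjP rules (RelP is left out because VP is not defined above), the word lists for "any noun" and "any adjective" are invented stand-ins, and the names `GRAMMAR` and `recognizes` are hypothetical.

```python
from functools import lru_cache

GRAMMAR = {
    "NP":   [["DET", "AdjP", "N"], ["DET", "N"], ["N"]],
    "DET":  [["the"], ["a"], ["my"], ["that"]],
    "N":    [["n"], ["n", "n"]],
    "AdjP": [["Adj", "AdjP"], ["Adj"]],
    # Invented stand-ins for "any noun" and "any adjective".
    "n":    [["dog"], ["house"], ["cat"]],
    "Adj":  [["shaggy"], ["old"]],
}

def recognizes(symbol, words):
    """True if the word list can be derived from the given symbol."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def derives(sym, i, j):
        # Can sym produce exactly words[i:j]?
        if sym not in GRAMMAR:                      # a literal word
            return j == i + 1 and words[i] == sym
        return any(seq(tuple(alt), i, j) for alt in GRAMMAR[sym])

    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        # Can the symbols, in order, produce exactly words[i:j]?
        if len(symbols) == 1:
            return derives(symbols[0], i, j)
        # Try every split; each symbol must cover at least one word.
        return any(derives(symbols[0], i, k) and seq(symbols[1:], k, j)
                   for k in range(i + 1, j - len(symbols) + 2))

    return len(words) > 0 and derives(symbol, 0, len(words))

print(recognizes("NP", "the shaggy old dog".split()))   # True
print(recognizes("NP", "dog the".split()))              # False
```

The recursion always terminates here because every symbol has to cover at least one word and none of the single-symbol rules loop back on themselves.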


PostPosted: Sat 04 May 2013, 21:33 
hieroglyphic

Joined: Thu 02 May 2013, 21:44
Posts: 25
Location: San Antonio, TX
Oh wow, that looks pretty cool, systemzwang. I'll have to experiment with that. :D

