Are there four additional unrecognised branches of Austroasiatic?


Austroasiatic is usually considered to have twelve branches, and these are on the whole well characterised, although the internal classification of the phylum remains controversial. This paper evaluates the possibility that there were originally four other branches, whose existence can now only be inferred from residual lexicon in languages which are currently considered non-Austroasiatic. These four hypothetical branches are:


a)   The language of the Shompen. Either unclassified or considered a language isolate, Shompen has a number of cognates with mainland Austroasiatic which are not shared with other Nicobarese languages. This raises the possibility that Shompen represents a separate and presumably earlier migration to the Nicobars, and that its apparent similarities with Nicobarese may be due to borrowing.


b)   The languages of coastal Vietnam prior to the Chamic migrations. Paul Sidwell has observed that a significant percentage of the lexicon of Chamic is neither Austronesian nor known Austroasiatic. It may be that the languages assimilated by the incoming Chamic speakers represented either a quite unknown phylum or an unrecognised branch of Austroasiatic.


c)   Acehnese. Thurgood and Sidwell treated Acehnese as related to the Chamic languages, and it certainly has a significant stratum of Chamic lexicon. However, it also has cognates with mainstream Austroasiatic, as well as vocabulary with no clear etymology. It is therefore possible, as Diffloth (p.c.) has argued, that Acehnese represents a residual Austroasiatic language that has come under heavy Chamic influence.


d)   Bornean substrate languages. Some Austronesianists (Adelaar in particular) have argued that unusual phonological features of Bornean languages such as Bidayuh point to a possible Austroasiatic substrate. Other linguists, including Robert Blust, claim that these features can be explained by processes internal to Austronesian. However, cultural evidence for a mainland presence on Borneo is extremely strong, and such early contact is quite likely.


All these hypotheses remain to be tested; the paper evaluates the case for each proposal. In some cases, the poor quality of the data (e.g. for Shompen) may mean that the question is presently undecidable. Additional types of data, in particular cultural and archaeological, provide support for some hypotheses, and these may prove valuable in interpreting the strictly linguistic findings.