Digital Extinction Haunts Several European Languages

More than 20 European languages face digital extinction because of a lack of technological support, according to a study by Europe's leading language technology experts. Scientists from The University of Manchester took part in the study.

The team concluded that digital support for 21 of the 30 languages studied was 'non-existent' or 'weak' at best. The study coincides with the European Day of Languages on Sept. 26, a reminder of how important it is to develop and preserve the rich cultural and linguistic heritage of the European continent.

According to META-NET, a network of excellence made up of 60 research centers in 34 European countries, including the University of Manchester's National Centre for Text Mining (NaCTeM), the languages that are not widely spoken run the greatest risk of extinction because the technological support they need is badly lacking.

Image: European Language Day

The study ranks Maltese, Latvian, Icelandic and Lithuanian as most at risk of disappearing, while other languages such as Greek, Polish, Bulgarian and Hungarian fare little better.

The study is comprehensive. It involved some 200 experts and is documented in the 30 volumes of the META-NET White Paper Series. It assessed language technology support for each language against four criteria:

  • Automatic translation,
  • Speech interaction,
  • Text analysis and
  • Availability of language resources.
Consistent with the ranking above, Maltese, Latvian, Icelandic and Lithuanian fare the worst on all four criteria.

One surprising finding, in the White Paper for English, is that although English was judged to have the best technology support of all the European languages studied, it does not qualify for "excellent support" but only for a more reserved "good support", according to University of Manchester researchers.

In comparison, Spanish, Italian, French, Dutch and German are classified only as having "moderate support". Languages such as Polish, Hungarian, Greek, Catalan, Basque and Bulgarian earn only a "fragmentary support" rating, which places them among the high-risk languages.

The level of language technology support can best be gauged by the number and quality of software products on the market. These are applications that process language in written or spoken form. Familiar examples include grammar and spelling checkers, interactive personal assistants on smartphones (Siri on the iPhone comes to mind), web search engines, automatic translation systems, and dialogue systems that work over the phone.

When a language falls into the high-risk category, the absence of decent supporting software puts its digital future in serious doubt. Unless drastic action is taken, there is little chance that the language will make it into the next phase of the digital world.

Most language technology systems available today are built on statistical analysis, a method that needs enormous amounts of data (both written and spoken) to work well. This clearly handicaps languages with relatively few speakers.
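
To make the data dependence concrete, here is a minimal sketch in Python. It is not the method used in the study or in any real product: the two toy corpora, the test sentence and the `coverage` function are all invented for illustration. It simply counts how many adjacent word pairs in a new sentence the system has already seen in its training text, which is roughly what a simple statistical model relies on.

```python
from collections import Counter


def bigrams(text):
    """Split text into lowercase words and return the adjacent word pairs."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def coverage(training_sentences, test_sentence):
    """Fraction of the test sentence's word pairs that occur in the training data."""
    seen = Counter()
    for sentence in training_sentences:
        seen.update(bigrams(sentence))
    test_pairs = bigrams(test_sentence)
    if not test_pairs:
        return 0.0
    known = sum(1 for pair in test_pairs if pair in seen)
    return known / len(test_pairs)


# Invented toy corpora: the "large" one stands in for a well-resourced language,
# the "small" one for a language with very little digital text.
large_corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat slept on the rug",
    "the dog slept on the mat",
]
small_corpus = large_corpus[:1]

test_sentence = "the dog slept on the rug"
print("small corpus coverage:", coverage(small_corpus, test_sentence))
print("large corpus coverage:", coverage(large_corpus, test_sentence))
```

With only one training sentence, most word pairs in the new sentence have never been seen, so a statistical system has almost nothing to go on. Real systems are far more sophisticated, but the same effect, at a vastly larger scale, is why languages with little digital text are at a disadvantage.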

Image: Digital extinction for European languages

In addition, statistical language technology systems have other inherent weaknesses, chief among them quality. Most of us have come across amusingly incoherent output from online automatic translation systems.

The report is not all gloom. It calls for a systematic, large-scale effort by the relevant parties in Europe to create the necessary technologies and apply them to the languages facing digital extinction.

Prof Sophia Ananiadou, Director of NaCTeM, observed that people in the UK often use software and applications without ever realizing that language technology is already built into them.

The truth is that language technology has already made our daily lives more efficient, if not easier, and it promises to help us even more in areas we cannot yet envisage. Just as vital is that it be available across a wide range of languages. In the European context, competent language technology support should be available for every language used on the continent. Otherwise, collaboration between European neighbors, whether for leisure or business, will be severely restricted.

Prof Hans Uszkoreit, coordinator of META-NET, concluded that the findings show an alarming trend. Two out of three European languages are clearly under-resourced, and some among them still see no real concerted effort to reverse the situation. If this persists, language problems will become a huge issue for Europe as a whole.
