  • "annacrook" started this thread

Posts: 4

Date of registration: Feb 6th 2015

Language Team: Computational linguistics

Focus Group: Other - Not Listed Above

Thanks: 163 / 0

1

Friday, February 6th 2015, 3:12pm

syllabification tools

Hello!

I am currently working on a mechanism for automatic text difficulty assessment for English, French and German. One of the difficulty parameters is the number of syllables per word/sentence. For the former two languages we have syllabification tools, but not for German. Does anybody have the slightest idea where I can get one? An algorithm, an online or offline service, an API for such a service?

Thank you,
Anna

  • "ossi11111" is male

Posts: 1,072

Date of registration: Nov 27th 2013

Language Team: German

Focus Group: Translator
Translation Proofreader
Final Reviewer
Language Coordinator
LTI Development Group

Thanks: 29461 / 364

2

Sunday, February 8th 2015, 2:08pm

Hi annacrook,

You can look up how many syllables a specific German word has here (example for the German word for "to eat"): http://www.duden.de/rechtschreibung/essen
It says "Worttrennung: es|sen", which means that you can divide the word into two syllables: "es" and "sen". But I guess you do not want to write a computer program that parses all the Duden pages for "Worttrennung", and need the data in another format.
Or what about this one? http://de.pons.com/specials/api
It seems to me that you can also download the content of Wikimedia in a computer-readable format: http://dumps.wikimedia.org/ So you might want to download the German Wiktionary http://de.wiktionary.org/wiki/Wiktionary:Hauptseite and then parse it.
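
Once hyphenation strings such as "es|sen" have been collected, turning them into syllable counts is a small step. The following is a minimal sketch, not a tested pipeline: the function names are hypothetical, and the "·" separator is included only as an assumption about how other sources might mark the break points. Note also that "Worttrennung" is orthographic hyphenation, so for some words it may show fewer break points than there are spoken syllables.

```python
import re

def split_hyphenation(marked, separators="|·"):
    """Split a hyphenation-marked word such as 'es|sen' into its parts.

    `separators` lists the characters treated as break marks; '|' is the one
    quoted from Duden above, '·' is an assumed alternative.
    """
    pattern = "[" + re.escape(separators) + "]"
    # Drop empty strings that appear if a mark sits at the start or end.
    return [part for part in re.split(pattern, marked.strip()) if part]

def count_parts(marked):
    """Number of hyphenation parts, used here as an approximate syllable count."""
    return len(split_hyphenation(marked))

print(split_hyphenation("es|sen"))  # ['es', 'sen']
print(count_parts("es|sen"))        # 2
```

Words without any break mark come back as a single part, so such entries would need a separate check before their counts are trusted.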

I hope the information I shared helps. If not, could you give me more specifics about what you are trying to accomplish, and especially how you want to accomplish it? Are you writing your own computer program that just needs the right input, or are you looking for existing programs that do what you want?

Best Regards, Tim
Signature from »ossi11111« Do you want to help with translating?
Email: GermanLingTeam@gmail.com

Videos published so far with German subtitles:
GermanLingTeam - YouTube

  • "annacrook" started this thread

3

Monday, February 9th 2015, 3:19pm

Hello, ossi11111,

Thanks a lot for your answer. Below I'll try to explain what I am looking for. First of all, I am just a linguist and not a computational linguist, so I might not know some of the terms specific to the technical domain. In simple words, I need an API for an online service or an offline program (preferably the latter) that either counts the number of syllables in a text or simply divides words into syllables graphically. A perfect solution would be a database of around 150000 words that provides the number of syllables for each word. An example of such a database is Lexique3 for French: http://www.lexique.org/telLexique.php You have to download it first to see all the information it contains.
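
To make the database idea concrete, here is a rough sketch of how such a word list could be used to count syllables in a text. Everything about the file layout is an assumption for illustration (one "word<TAB>count" row per line); it is not the Lexique3 format, and the function names and the three sample rows are hypothetical.

```python
import csv
import io
import re

def load_syllable_table(handle):
    """Read 'word<TAB>syllable_count' rows into a dict keyed by lowercase word."""
    table = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip header lines or malformed rows whose second field is not a number.
        if len(row) >= 2 and row[1].strip().isdigit():
            table[row[0].strip().lower()] = int(row[1])
    return table

def syllables_in_text(text, table):
    """Return (total syllables of known words, words missing from the table)."""
    total, missing = 0, []
    # [^\W\d_]+ matches runs of letters, including ä, ö, ü and ß.
    for word in re.findall(r"[^\W\d_]+", text):
        count = table.get(word.lower())
        if count is None:
            missing.append(word)
        else:
            total += count
    return total, missing

# Tiny in-memory stand-in for a real word list file.
SAMPLE = "essen\t2\nwir\t1\nheute\t2\n"
TABLE = load_syllable_table(io.StringIO(SAMPLE))
print(syllables_in_text("Wir essen heute Kuchen", TABLE))  # (5, ['Kuchen'])
```

Returning the unknown words separately makes it easy to see how much of a corpus the list covers before relying on the totals.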

Hope I am more or less clear ...
Best regards,
Anna

  • "annacrook" started this thread

4

Monday, February 9th 2015, 3:35pm

As for the tools that you kindly suggested, PONS seems like a really good one, but they charge for more than 1000 queries, and since our research team will test quite large text corpora, we need a free API if possible, or a database, as I mentioned previously. As for the rest, do you think I can contact the authors of DUDEN or WIKI to ask them for an API?

P.S.: Sorry for the newbie-level terminology :)

  • "ossi11111" is male

5

Monday, February 9th 2015, 8:12pm

I only know that they surely won't answer if you don't ask them. So go ahead and try your luck. :)
Unfortunately I do not have more time to research whether something else exists, because we are currently very busy translating the new movie of The Venus Project. Please let me know what you find out. Maybe I will have more time in a few weeks if you have not been successful by then.

  • "annacrook" started this thread

6

Tuesday, February 10th 2015, 10:11am

Dear ossi11111,


Thank you for your interest in this topic. If you have a free minute, may I be bold enough to ask what you think of an idea I picked up on one of the forums: make a list of vowel combinations (a, e, i, o, u, ä, ö, ü, eu, ie, ei, au, and probably a few more) and count their occurrences to estimate the number of syllables. It would probably introduce a certain margin of error, but after quite extensive research on the subject I have not managed to find any database or free API for German, so it may be worth trying. Anyway, I saved the tools you suggested, and my team and I will try to contact the owners. I am just trying to figure out the different options.
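
A minimal sketch of that vowel-group idea might look like the following. It is a heuristic, not a real syllabifier: the groups "äu", "ai", "aa", "ee" and "oo" are assumed additions to the list above, and the function name is hypothetical.

```python
import re

# Vowel groups from the list above, plus "äu", "ai" and doubled vowels
# as assumed additions.
VOWEL_GROUPS = ["eu", "äu", "ie", "ei", "au", "ai", "aa", "ee", "oo",
                "a", "e", "i", "o", "u", "ä", "ö", "ü"]

# Longer groups go first so that "eu" is matched before a lone "e".
_PATTERN = re.compile("|".join(sorted(VOWEL_GROUPS, key=len, reverse=True)))

def estimate_syllables(word):
    """Estimate syllables by counting non-overlapping vowel groups."""
    return len(_PATTERN.findall(word.lower()))

print(estimate_syllables("essen"))    # 2
print(estimate_syllables("heute"))    # 2
print(estimate_syllables("Familie"))  # 3, although the word has 4 syllables
```

The last example shows where the margin of error comes from: "ie" is treated as a single group, which is right for a word like "Liebe" but wrong for "Familie", where the two letters belong to separate syllables.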


Have a great day,
Anna

© Linguistic Team International 2017
Context In Motion