Wednesday, May 02, 2012

Lexical glosses across seven languages

I wrote briefly in my last post about being spurred on to set up a spreadsheet to map vocabulary across different languages, and then to use this as a basis for Anki flashcard files linked to pictures. I have made a good start on this, and it's been an interesting process on a number of levels.

Firstly, I gathered that initial list of 400 key words, and then I supplemented it with a list of the 2000 most common words in English. So that is a sizeable initial target of around 2400 words for an active vocabulary. I expanded the spreadsheet to cover 7 languages: English (base), Ancient Greek, Latin, Gaelic, Mongolian, German, Hebrew.
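As a rough sketch of how a spreadsheet like this could feed flashcards: the Python below takes rows keyed by language and writes tab-separated front/back pairs for one target language. The column names, the "Image" column, the sample rows and the function names are all assumptions made up for illustration, not my actual file. It also assumes the flashcard program will import a tab-separated text file and show the image tag as a picture, which is worth checking against your own version.

```python
import csv
import io

# Column layout is an assumption: one column per language plus an image file name.
LANGUAGES = ["English", "Ancient Greek", "Latin", "Gaelic",
             "Mongolian", "German", "Hebrew"]


def build_cards(rows, target):
    """Return (front, back) pairs for one target language.

    Rows with no entry for the target language are skipped, so a
    half-filled spreadsheet still produces usable cards.
    """
    cards = []
    for row in rows:
        word = (row.get(target) or "").strip()
        if not word:
            continue
        image = (row.get("Image") or "").strip()
        # Prefer a picture prompt; fall back to the English base word.
        front = f'<img src="{image}">' if image else row["English"]
        cards.append((front, word))
    return cards


def to_tsv(cards):
    """Serialise cards as tab-separated text, one card per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(cards)
    return buf.getvalue()


# Hypothetical sample rows, standing in for the real spreadsheet.
sample = [
    {"English": "dog", "Latin": "canis, canis (m./f.)",
     "German": "der Hund", "Image": "dog.jpg"},
    {"English": "water", "Latin": "aqua, aquae (f.)",
     "German": "das Wasser", "Image": ""},
    {"English": "computer", "Latin": "",
     "German": "der Computer", "Image": ""},
]

latin_cards = build_cards(sample, "Latin")
print(to_tsv(latin_cards))
```

Skipping blank cells means each language can be exported as soon as some of its column is filled in, rather than waiting for the whole list.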

Filling in Latin and Gaelic has proven the easiest, in terms of just scrolling down and filling in words from my head. There is a second process that involves making sure I have all the relevant lexical data: genitive form, gender, long-vowel marking in Latin, and accent marks in Gaelic. This is slowed further when I have to refer to a dictionary: to go English->Latin/Gaelic I look up one entry, but it almost never contains enough data to fill in, so I then have to look up the target word a second time to get that information. Particularly with English->Latin this can be frustrating, as neo-Latin words often don't have a corresponding Latin->English entry.

German is easy in a different way, as there are great online resources for English->German, and generally I just list noun + article.

Greek is harder, as the best way seems to be to use Woodhouse online, which is a multi-click process, and even then (a) many words are not found at all, and (b) some words have too many candidate equivalents, so I need to reverse the process again with LSJ or some such to work out the nuances.

Mongolian is not too bad, as a single form is usually enough and because I am currently studying it I can often fill in basic words, and I have a couple of dictionaries.

Hebrew is difficult, since there is no real English->Classical Hebrew dictionary that I have access to, so I have to resort to word-searches of Biblical Hebrew vocab documents, and then supplement non-represented words with Modern equivalents.

Then there are equivalence problems. Some words have nuances that do not map to English well. Colours are a good example: the Gaelic colour-breakdown doesn't map to English well at all. So sometimes I need to create multiple entries across languages to reflect unavoidable nuancing.
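One way to picture the "multiple entries" approach is to store one row per pairing instead of one row per English word, and then index the rows from either side. The sketch below is only an illustration: the Gaelic colour glosses and notes are rough approximations I would still want to check against a dictionary, and the names are invented for the example.

```python
from collections import defaultdict

# One row per (English headword, Gaelic word, nuance note) pairing.
# The glosses are approximate and illustrative only.
glosses = [
    ("green", "uaine", "vivid green"),
    ("green", "glas", "grey-green"),
    ("grey", "glas", "grey-green"),
    ("grey", "liath", "grey"),
    ("blue", "gorm", "blue"),
]


def index_by(entries, position):
    """Group entries by the field at the given position."""
    index = defaultdict(list)
    for entry in entries:
        index[entry[position]].append(entry)
    return dict(index)


by_english = index_by(glosses, 0)
by_gaelic = index_by(glosses, 1)

# Gaelic words that sit under more than one English headword are the
# ones that need a nuance note on their flashcards.
ambiguous = sorted(
    word for word, entries in by_gaelic.items()
    if len({english for english, _, _ in entries}) > 1
)
print(ambiguous)
```

Keeping the pairings as separate rows means a word can be added under a second English headword later without disturbing the entries already there.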

A second problem exists at a slightly higher level: the construction of meaning. For example, English leans heavily on modal verbs and similar constructions, and other languages often express the same meanings not with a modal word but with a whole syntactical pattern. I haven't really tackled this issue yet, but I imagine I will provide whole-sentence exemplar patterns.

Lastly, there is the problem of images. I pull most of mine from the internet, but as I go on I expect (indeed, I have already found) that some words and concepts will be particularly hard to 'picture'. One solution is to use target-language explanatory phrases, but that might be difficult.

However, overall I am finding this a rewarding enterprise. It's good to (a) know what I know (or ought to know!), (b) move from knowing X in language 1 to knowing X in languages 1-7, (c) build both a working lexical file for my own easy reference as well as the base for flashcard memorisation of a strong core vocabulary.
