Machine Translation
Statistical machine translation
Language models
Language models are used in MT for a) scoring arbitrary sequences of words (tokens) and b) predicting which token is likely to follow a given sequence. Formally, a language model is a probability distribution over sequences of tokens in a given language.
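Both uses can be illustrated with a minimal bigram model; the corpus below is a hypothetical toy example, not part of the original text, and real systems use smoothing and much larger n-gram orders.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of tokenized sentences.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

# Count bigrams, padding each sentence with start/end markers.
bigrams = defaultdict(Counter)
for sentence in corpus:
    for prev, nxt in zip(["<s>"] + sentence, sentence + ["</s>"]):
        bigrams[prev][nxt] += 1

def next_token_probs(prev):
    """Use b): the distribution over the next token given the previous one."""
    counts = bigrams[prev]
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

def score(sentence):
    """Use a): score a token sequence as the product of bigram probabilities."""
    p = 1.0
    for prev, nxt in zip(["<s>"] + sentence, sentence + ["</s>"]):
        p *= next_token_probs(prev).get(nxt, 0.0)
    return p
```

Here `score(["the", "cat", "sat"])` multiplies P(the|&lt;s&gt;) · P(cat|the) · P(sat|cat) · P(&lt;/s&gt;|sat); unsmoothed, any unseen bigram drives the score to zero, which is why practical models apply smoothing.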
N-gram models
Character-based models
Recently, it was shown that sub-words, characters, or even bytes can serve as the basic units for language modelling[citation needed]. A few events focus specifically on such models and, more generally, on processing language data at the sub-word level, e.g. SCLem 2017.
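The difference between character and byte units is easy to see in code; the example word below is an arbitrary illustration, and note that in UTF-8 a non-ASCII character occupies more than one byte, so the two unit inventories differ in length.

```python
# One German word as a running example (hypothetical choice).
text = "Übersetzung"

# Character-level units: one unit per Unicode character.
char_units = list(text)

# Byte-level units: one unit per UTF-8 byte; "Ü" encodes as two bytes,
# so the byte sequence is longer than the character sequence.
byte_units = list(text.encode("utf-8"))
```

Byte-level models trade a tiny, fixed vocabulary (256 symbols) for longer sequences, while character-level models keep sequences shorter at the cost of a language-dependent alphabet.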
Translation models
IBM models 1-5
Phrase-based models
Factored translation models
Syntax- and tree-based models
Synchronous phrase grammar
Parallel tree-banks
Syntactic rules extraction
Decoding
Beam search
Hybrid systems
Computer-aided translation
Translation memory
This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.