Ladino Data Hub

Tatoeba Ladino parallel corpora

Ladino sentences with English, Turkish and Spanish translations.

SKAD parallel corpora in LAD, ENG, TUR

Parallel sentences from various sources. The original sentences were written and translated by SKAD. The data was collected by Col·lectivaT.

Ladino-Spanish dictionary files

Files used in the rule-based Spanish-to-Ladino translation system. Created from "Diksionaryo de Ladino a Espanyol" by Güler, Portal and Tinoco. The data was collected by Col·lectivaT.

Synthetic parallel corpus: Ladino-English, Turkish, Spanish

Parallel data produced synthetically with rule-based Spanish-to-Ladino translation. 10.3 million sentence pairs in total.

Una fraza al diya

Ladino language-learning sentences prepared by Karen Şarhon of SKAD. Each sentence has Turkish, English and Spanish translations. Includes 397 sentences, linked with audio and images.

Ladino speech corpus

Transcribed recordings of some of the last remaining Ladino speakers of Istanbul. The corpus was created by Karen Şarhon of SKAD.

Şalom Ladino articles text corpus

307 articles from the Judeo-Espanyol section of the Şalom newspaper. The original sentences and articles belong to Şalom. The corpus was compiled in 2022 by Col·lectivaT.

Neural machine translation models

OpenNMT-based models and other necessary files for machine translation between Ladino and English, Turkish and Spanish, in both directions.

Text-to-speech (TTS) training dataset

Training dataset used to build Ladino text-to-speech.

Text-to-speech (TTS) models

Ladino text-to-speech models.