This dataset contains parallel corpora with synthetically produced Ladino sentences. The source sentences were taken from various ES-TR and ES-EN corpora in the OPUS collection. Each dataset has four columns:
Source tag, English/Turkish sentence, Spanish sentence, Synthetic Ladino sentence
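A minimal sketch of how the four columns listed above could be read in Python. The file format is not stated here, so the tab delimiter, the `Row` field names, and the `read_rows` helper are all assumptions made for illustration; adjust them to match the actual files.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Row(NamedTuple):
    """One record, in the column order given in this README (field names are hypothetical)."""
    source_tag: str
    source_sentence: str  # English or Turkish, depending on the dataset
    spanish: str
    ladino: str           # synthetic Ladino


def read_rows(lines: Iterable[str], delimiter: str = "\t") -> Iterator[Row]:
    """Yield a Row for every line that has exactly four fields.

    ASSUMPTION: fields are tab-separated and unquoted. Lines with any
    other number of fields are skipped rather than guessed at.
    """
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) == 4:
            yield Row(*fields)


# Placeholder lines standing in for real data (not taken from the dataset).
sample = [
    "some-tag\tenglish text\tspanish text\tladino text\n",
    "malformed line without enough columns\n",
]
rows = list(read_rows(sample))
print(len(rows), rows[0].source_tag)
```

To read a real file, pass an open file handle instead of the sample list, for example `open(path, encoding="utf-8", newline="")`, where `path` points to whichever of the two dataset files you downloaded.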
License: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
Citation and more information: https://arxiv.org/abs/2205.15599
Total size:
EN-ES-LAD: 5,748,013
TR-ES-LAD: 4,574,023
This resource was created as part of the "Judeo-Spanish: Connecting the two ends of the Mediterranean" project within the framework of the "Grant Scheme for Common Cultural Heritage: Preservation and Dialogue between Turkey and the EU–II (CCH-II)", implemented by the Ministry of Culture and Tourism of the Republic of Turkey with the financial support of the European Union.