Synthetic parallel corpora LAD-EN, TR, ES [Draft]

Col·lectivaT
/ Created 09/08/2022
/ Updated 21/11/2024

Synthetic parallel data produced with rule-based Spanish-to-Ladino translation.

Sizes:

Ladino-Spanish: 10,322,033 sentences

Ladino-Turkish: 4,574,021 sentences

Ladino-English: 5,748,012 sentences

Paper: https://arxiv.org/abs/2205.15599

This dataset was created as part of the project "Judeo-Spanish: Connecting the two ends of the Mediterranean", carried out by Col·lectivaT and the Sephardic Center of Istanbul within the framework of the “Grant Scheme for Common Cultural Heritage: Preservation and Dialogue between Turkey and the EU–II (CCH-II)”, implemented by the Ministry of Culture and Tourism of the Republic of Turkey with the financial support of the European Union. The content of this website is the sole responsibility of Col·lectivaT and does not necessarily reflect the views of the European Union.

Data and Resources

This dataset has no data.