Slate Desktop: your personal MT engine
Thread poster: Mohamed
Richard Hill
Richard Hill  Identity Verified
Mexico
Local time: 16:22
Member (2011)
Spanish to English
@ Tom. A few questions before I take the plunge: Sep 26, 2015

Hi Tom, I just watched the webinar and thought it was very interesting, so much so that I'm considering going for the early bird offer, but have a few questions first.

- I have Trados Studio 2015 Plus; “plus” meaning that I can use it on two machines, so my question is can I use one Slate Desktop license for both my PC and my laptop?

- My PC surpasses the system requirements for Slate but my laptop falls short. It has an Intel® Core™ i5-3360M Processor, 2 cores and 4 threads, and 8GB RAM. Is this enough to be able to use Slate effectively?

- I have a "general" TM that has all my work since I acquired Trados around 4 years ago (140,000 segments), so basically I’m not in the habit of creating field-specific TMs, mostly because I translate 95% legal texts, but the precise subject matter can vary widely, and having watched the webinar, I wonder whether it will be necessary for me to do some heavy TM maintenance work and break them down into smaller more specific subject matters to get the most out of Slate?

- If I process my “general” TM (140,000 segments) and my version (1.5 million units) of the DGT TM –or several other TMs for that matter– is there any way of giving priority to a given TM over the others during processing?

- The webinar touched on the advantages of processing termbases in Slate, from which I understand that I would need to remove notes, comments, definitions, etc. which is easy enough, in that I can create a simple two column Excel file just containing source and target terms. Can Slate process Excel files?

- In Studio, do you know if Slate translations will be propagated in the editor or will they just show up in the translation memory window? If the former is true, will my TMs have priority over what’s offered by Slate?

- Lastly, for those who buy through the early bird offer, for how long will they receive free upgrades?

Thanks in advance and all the best,

Richard

[Edited at 2015-09-27 01:14 GMT]


 
2nl (X)
2nl (X)  Identity Verified
Netherlands
Local time: 22:22
The ethics Sep 27, 2015

Richard Hill wrote:


While this software sounds interesting, I wonder if we'll see agencies using it with their sometimes massive arrays of TMs, and then posting the resulting texts as post-editing jobs, as is currently the case with MT.


I have no doubt that certain clients will use the previously delivered work of a certain translator to create machine translated "exact matches" that will be locked and excluded from payment. And I'm afraid that there is nothing we can do about that. Am I right, Tom?

[Edited at 2015-09-27 00:39 GMT]


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
"Context" respective to translation engines Sep 27, 2015

2nl wrote:

Tom,

I've just watched your webinar on Slate Desktop and near the end you say 'Computers cannot see context'. Are you referring to 'real-life context' or to the context of previous segments in the current translation project?

Let me clarify that: In CafeTran I can instruct the auto-assembling feature to use a specific translation for a certain word (source term). (See: http://cafetran.wikidot.com/inserting-alternative-target-terms-via-the-context-menu)

Can Slate Desktop, or its underlying engine (Moses), put translations for terms that you have approved of on top of the stack, so that they will be used for the rest of the project?)

Hans


In the webinar, I was talking about how engines translate sentence-by-sentence, and therefore they don't carry any context between translated segments. I think a computer that tracks "real-life context" is a bit of a dream.

I looked at your CafeTran link and the "context menu" popup looks like a great time-saving feature. We will need to look deeper to see if/how we can support that.

Slate Desktop uses many advanced features. You can enter dictionary/glossary terms that take priority over the engine's choice, i.e. force a term to translate the way you want. It will be interesting to see if we can merge this feature with the context menu. The TODO list grows.


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Apertium in OmegaT Sep 27, 2015

Milan Condak wrote:

Hi,

I also watched the webinar on Slate Desktop. I am using OmegaT for HTML translation.

In OmegaT, several MT engines can be enabled at the same time.

I translate into Czech. I mostly use MyMemory (Google Translate), enabled in OmegaT. Using Slate Desktop is similar to using Apertium offline.

Apertium is an online MT service for a number of language pairs. Czech is not supported yet. An MT engine can be converted into a *.jar file and run offline in OmegaT (see the picture at the bottom of the page).

http://www.condak.net/cat_other/omegat/2013-09-17/cs/02.html

In September 2013 I tested the offline version of Apertium in OmegaT:

http://www.condak.net/cat_other/omegat/2013-09-17/cs/03.html

The *.jar file can also run without being integrated into OmegaT:

http://www.condak.net/cat_other/omegat/2013-09-17/cs/04.html

I hope the *.jar files are OS-independent; they need Java.

On the last page, 05.html, you can see that the Apertium *.jar feature is bidirectional (PL-CS and CS-PL).

My second remark is about reusing ready-made file pairs from the Opus project prepared for Moses. This base data does not need to be converted from TMX. Is that step skipped?

Milan,
the hobbyist


Guys, thanks for taking to the term "hobbyist." That's where it all starts!

As you point out, OmegaT can use a variety of MT systems, including Apertium. Apertium is a rules-based system. I think the offline .jar files are the encapsulation of the rules-engine, but haven't looked deeply into them.

Yes, Opus is one of many publicly available corpora. Slate Desktop supports more than just TMX. We directly support:

* parallel corpus text files: my-corpus.en & my-corpus.de
* TMX
* XLIFF
* Tab
* Gettext

We do not support word processing and rendering formats (MS Office, OpenDocument, PDF, etc.). For these, you need to use your CAT or another tool such as Okapi to convert to a supported format.
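
To illustrate how two of the formats listed above relate, here is a minimal Python sketch that splits a two-column tab-delimited file into a pair of line-aligned parallel corpus text files. It is not part of Slate Desktop; the function name and file names are hypothetical, and it assumes one source/target pair per line separated by a tab.

```python
import csv


def split_tab_file(tab_path, src_path, tgt_path):
    """Split a two-column tab-delimited file into line-aligned parallel files.

    Returns the number of segment pairs written. Rows without a non-empty
    source and target are skipped so that both output files stay aligned.
    """
    written = 0
    with open(tab_path, newline="", encoding="utf-8") as tab_file, \
         open(src_path, "w", encoding="utf-8") as src_file, \
         open(tgt_path, "w", encoding="utf-8") as tgt_file:
        # QUOTE_NONE: treat quotation marks in segments as ordinary text.
        reader = csv.reader(tab_file, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue
            source, target = row[0].strip(), row[1].strip()
            if not source or not target:
                continue
            src_file.write(source + "\n")
            tgt_file.write(target + "\n")
            written += 1
    return written
```

A call such as split_tab_file("my-tm.tab", "my-corpus.en", "my-corpus.de") would produce the two parallel files, with line N of one file corresponding to line N of the other.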


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Agencies' massive arrays of TMs Sep 27, 2015

Richard Hill wrote:

While this software sounds interesting, I wonder if we'll see agencies using it with their sometimes massive arrays of TMs, and then posting the resulting texts as post-editing jobs, as is currently the case with MT.


"Personalized translation engine" is a marketing term we created to describe a specific use of the engine. Nothing stops the old-fashioned and often ill-fated "more is better" approach.

This is a classic case of working smart. Dumping huge TMs at a translation engine is hard work and it does not guarantee the engine works better. Sometimes it's worse. These tools are only as good as the technicians, engineers and translators using them.

From earlier in this thread:
Patrick Porter wrote:

...for a professional translator, an MT engine trained with your own translations can be a really powerful production tool. I use this kind of resource in my work all the time...


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Answers in hopes you take the plunge Sep 27, 2015

Richard Hill wrote:

Hi Tom, I just watched the webinar and thought it was very interesting, so much so that I'm considering going for the early bird offer, but have a few questions first.

- I have Trados Studio 2015 Plus; “plus” meaning that I can use it on two machines, so my question is can I use one Slate Desktop license for both my PC and my laptop?

- My PC surpasses the system requirements for Slate but my laptop falls short. It has an Intel® Core™ i5-3360M Processor, 2 cores and 4 threads, and 8GB RAM. Is this enough to be able to use Slate effectively?

- I have a "general" TM that has all my work since I acquired Trados around 4 years ago (140,000 segments), so basically I’m not in the habit of creating field-specific TMs, mostly because I translate 95% legal texts, but the precise subject matter can vary widely, and having watched the webinar, I wonder whether it will be necessary for me to do some heavy TM maintenance work and break them down into smaller more specific subject matters to get the most out of Slate?

- If I process my “general” TM (140,000 segments) and my version (1.5 million units) of the DGT TM –or several other TMs for that matter– is there any way of giving priority to a given TM over the others during processing?

- The webinar touched on the advantages of processing termbases in Slate, from which I understand that I would need to remove notes, comments, definitions, etc. which is easy enough, in that I can create a simple two column Excel file just containing source and target terms. Can Slate process Excel files?

- In Studio, do you know if Slate translations will be propagated in the editor or will they just show up in the translation memory window? If the former is true, will my TMs have priority over what’s offered by Slate?

- Lastly, for those who buy through the early bird offer, for how long will they receive free upgrades?

Thanks in advance and all the best,

Richard

[Edited at 2015-09-27 01:14 GMT]



I want everyone to know, all your comments are helping to shape our final product packaging. These comments are especially insightful.

RE: use it on two machines. Let me talk to my engineers to see what we can do, and check with our legal counsel to compare license wording. I think we can do something... Stay tuned.

RE: Hardware spec. The higher spec really helps with the processes that create engines (training/tuning). Your notebook should handle running the engines. Just copy them from your PC desktop. This might also be an easy way to address two-machine licensing, i.e. to run the engine for translation work on two machines, and the engine creation tools only licensed to one machine. Would that be satisfactory to everyone?

RE: heavy TM maintenance work. We try to make the maintenance work as easy as possible. The more advanced you become, the more you can reach into the deeper tools. Fortunately, the Desktop engine is not like cloud engines. You can build as many engines as you like at no extra cost. Still, you don't want to spin your wheels on maintenance. One (small agency) customer had 25 years of TMs all categorized by the way they did their work. They dumped them all into one TM for training their first engine. With that engine, a respectable 25-30% of segments on average across all jobs required zero editing. They pondered for months about how to re-organize the TMs for better engines. Finally, they decided to try building engines by their existing categorization. The TM to create the next engine was significantly smaller and focused on only one work area. When they used it for that category of work, the percentage jumped to about 75-80%. This company has all in-house translators and their TMs were never shared (or collected from) out of house.
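
For readers who want to track a figure like the one above on their own jobs, here is a minimal Python sketch of a "zero editing" percentage. The function name is hypothetical, and the definition (a segment counts as unedited when the delivered text equals the engine output, ignoring leading and trailing whitespace) is an assumption; the thread does not say how the customer measured it.

```python
def zero_edit_rate(mt_segments, final_segments):
    """Return the percentage of segments delivered exactly as the engine produced them."""
    if len(mt_segments) != len(final_segments):
        raise ValueError("both lists must contain the same number of segments")
    if not mt_segments:
        return 0.0
    unchanged = sum(
        1 for mt, final in zip(mt_segments, final_segments)
        if mt.strip() == final.strip()
    )
    return 100.0 * unchanged / len(mt_segments)
```

Comparing this number for a general engine and a category-specific engine on the same kind of job is one way to check whether splitting TMs by category pays off.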

RE: giving priority to a given TM over the others. Yes, weighting one set of data over another is possible. It's one of the more advanced topics and usually requires some manual work. Those steps will become more automated as more users become advanced and we have time for updates/upgrades.
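
The thread does not describe how Slate Desktop weights data. As a general illustration only, one simple manual technique used with statistical engines is to repeat the preferred data in the training corpus so that it counts more often. The Python sketch below shows that idea under those assumptions; the function name and the integer weights are hypothetical.

```python
def weighted_corpus(corpora_with_weights):
    """Combine corpora of (source, target) pairs, repeating each one 'weight' times.

    corpora_with_weights is a list of (pairs, weight) tuples, where weight is
    an integer >= 1. A higher weight makes that corpus count more in training.
    """
    combined = []
    for pairs, weight in corpora_with_weights:
        if not isinstance(weight, int) or weight < 1:
            raise ValueError("weight must be an integer >= 1")
        combined.extend(list(pairs) * weight)
    return combined
```

For example, passing a personal TM with weight 3 and a large public TM with weight 1 makes each personal segment appear three times in the combined training data.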

RE: Excel files. No, but my posting "Apertium in OmegaT" lists the native file formats we support. You could save a Tab-delimited file and we can import that.

RE: propagated in the Trados editor. Our first MT connector for Trados will probably be a basic interface. I know that Trados has a rich API set to control/customize many aspects of the UI. Learning more about these advanced API features is another thing for our TODO list.

RE: how long will they receive free upgrades. We will distribute free updates/bugfixes as often as necessary throughout the periods between major upgrades. I expect we'll have major upgrades every 12 - 18 months.

This last point tipped the balance for an idea I've been considering for some time. To further incentivize everyone to take the leap of faith in the early bird special, I will update the campaign with a notice that all who buy (and have bought) Super Early Bird specials will continue receiving the 40% discount on all major upgrade pricing in perpetuity for their license. Indiegogo mechanics don't let me edit Perks which people have purchased. So, I'll update the campaign's main wording to make this official.


 
RWS Community
RWS Community
United Kingdom
Local time: 22:22
English
A question on the integrations Sep 28, 2015

Hi,

I'm wondering why you don't just publish an API for this and then the various tool vendors could complete the integration themselves? In the case of Studio it might even give the option for any developer to take this on as the provider for Machine Translation is fairly straightforward.

Regards

Paul
SDL Community Support


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Re: integrations Sep 28, 2015

SDL Community wrote:

Hi,

I'm wondering why you don't just publish an API for this and then the various tool vendors could complete the integration themselves? In the case of Studio it might even give the option for any developer to take this on as the provider for Machine Translation is fairly straightforward.

Regards

Paul
SDL Community Support


Hi Paul,

This is a bit of a chicken-n-egg conundrum. We have to start somewhere. We have an installed user base on Linux that has been asking for Windows support and integration with CATs. We'll be using the provider facilities in Studio for our connector.

Slate Desktop has an API to create new functionality, both when creating new engines and when using the engines for production. We'll work with vendors to reach the tools that don't publish APIs. I expect many of our advanced customers will learn our API to create extended features.

[Edited at 2015-09-28 13:46 GMT]


 
Milan Condak
Milan Condak  Identity Verified
Local time: 22:22
English to Czech
Supported input data Sep 28, 2015

tahoar wrote:

Yes, Opus is one of many publicly available corpora. Slate Desktop supports more than just TMX. We directly support:

* parallel corpus text files: my-corpus.en & my-corpus.de
* TMX
* XLIFF
* Tab
* Gettext



Thank you for the clear answer. Could you please post a screenshot of Slate's top menu? I could not read any text on the tabs. The Slate windows in the presentation are too small for me. TIA.

Milan

[Edited at 2015-09-28 15:30 GMT]


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Screenshot of upper menu of Slate Desktop Sep 28, 2015

Milan Condak wrote:

Thank you for the clear answer. Could you please post a screenshot of Slate's top menu? I could not read any text on the tabs. The Slate windows in the presentation are too small for me.

Milan


Milan, fair enough. I need a day or two to clear some work. I'll add this to my TODO list.

[Edited at 2015-09-28 13:45 GMT]


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Slate Desktop screen shots Sep 30, 2015

Milan Condak wrote:

Thank you for the clear answer. Could you please post a screenshot of Slate's top menu? I could not read any text on the tabs. The Slate windows in the presentation are too small for me. TIA.

Milan


Here are five screen shots. As mentioned in the webinar, we borrowed this GUI from our Linux DoMT product to run on MS Windows for the demo. It exposes an intermediate level of tools. Slate Desktop will have a major overhaul and include a beginner's level that merges the tabs into one click-n-go step. There are several levels of advanced user interaction, including a scripting interface for users to create their own custom processes (tabs).

Steps to import and prepare TMs
When the user imports TMs, the app tags them with descriptive information in the values of superdomains, domains, and subdomains (these labels will change in Slate Desktop). Then, they run scrub-tm to remove markup tags from the segments. Finally, clean-tm removes segments that violate some technical limits (like over 100 words per sentence/segment).
Slate Desktop corpus tools
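
The two cleanup steps described above can be sketched roughly in Python as follows. This is not the actual scrub-tm or clean-tm code; the function names and the tag pattern are assumptions, and only the 100-word limit comes from the post.

```python
import re

# Assumed pattern for inline markup; real TMs may need format-specific handling.
TAG_PATTERN = re.compile(r"<[^>]+>")
MAX_WORDS = 100  # technical limit mentioned in the post


def scrub_segment(text):
    """Remove inline markup tags and collapse the whitespace left behind."""
    return " ".join(TAG_PATTERN.sub(" ", text).split())


def clean_pairs(pairs, max_words=MAX_WORDS):
    """Scrub each (source, target) pair and keep those within the word limit."""
    kept = []
    for source, target in pairs:
        source, target = scrub_segment(source), scrub_segment(target)
        if not source or not target:
            continue  # nothing left after removing tags
        if len(source.split()) > max_words or len(target.split()) > max_words:
            continue  # violates the length limit
        kept.append((source, target))
    return kept
```

Replacing each tag with a space is a simplification: a tag inside a word would split that word in two, so a production tool would need a more careful rule.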

Steps in the learning process
The user sets any combination of label values (superdomains, domains, subdomains). Slate Desktop consolidates the segments matching those values into BUILDS. On the build-tm tab, the smt-tm-buildname value becomes the name of the "knowledge" first part of the engine (front end). The build-lm tab has a corresponding smt-lm-buildname value that becomes the name of the "style" second part of the engine (back end). See the webinar video for more details.
Slate Desktop make engines
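
The idea of consolidating segments by label values into a build can be sketched in Python as below. The label names come from the post; the Segment class and select_build function are hypothetical and do not reflect Slate Desktop's internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    source: str
    target: str
    superdomain: str
    domain: str
    subdomain: str


def select_build(segments, **labels):
    """Return the segments whose labels equal every requested label value."""
    allowed = {"superdomain", "domain", "subdomain"}
    unknown = set(labels) - allowed
    if unknown:
        raise ValueError(f"unknown label(s): {sorted(unknown)}")
    return [
        seg for seg in segments
        if all(getattr(seg, name) == value for name, value in labels.items())
    ]
```

Calling select_build(segments, superdomain="law", domain="contracts") would return only the segments tagged with both values; calling it with no labels returns everything, which corresponds to the single "dump it all in" engine.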

Resulting engines
You give an engine a name like translate-xliff and set some information that Slate Desktop uses to identify the engine.
Slate Desktop make engines

Enabling Slate Desktop engines in OmegaT
Every CAT will do this differently. In OmegaT, you set a check next to the engine(s) you want to use.
Slate Desktop enable OmegaT

Configure OmegaT
You select the "Settings" menu option (above) and set which engine you want to use, like translate-xliff. Your CAT will use Slate Desktop's personalized translation engine like any other engine, but this one runs on your local desktop.
Slate Desktop configure OmegaT


 
Milan Condak
Milan Condak  Identity Verified
Local time: 22:22
English to Czech
Screenshots and links to MTM15 Sep 30, 2015

tahoar wrote:

Here are five screen shots.


Tom, thank you for your screenshots.
==
Previous posts raised some questions whose answers can be found elsewhere.

I recommend reading the PDF presentation "Real-World Application of an Machine Translation Workflow"

http://ufal.mff.cuni.cz/mtm15/files/02-real-world-application-of-mt-workflow-tomas-fulajtar.pdf

and programme (or other presentations)

http://ufal.mff.cuni.cz/mtm15/programme.html

Keynote: Real-World Application of an Machine Translation Workflow
..., ...
page Open topics:
Morphologically rich languages challenges –Czech, Korean, Finno-Ugric, Turkish
Adding syntactic features into MT
==
Research is continuing. Researchers are using Linux. There are many articles from similar events.

One workshop is about Depfix.

https://ufal.mff.cuni.cz/events/deep-machine-translation-workshop
==
Statistical MT works better from Czech into English. For translation English into Czech I get better results from rule based MT.

Milan


 
Tom Hoar (X)
Tom Hoar (X)
United States
Local time: 17:22
English
Bus vs Mini Cooper Sep 30, 2015

Milan Condak wrote:

Statistical MT works better from Czech into English. For translation English into Czech I get better results from rule based MT.


Milan, which statistical system(s) have you used/evaluated for your work to confirm these academic papers? Did you watch the entire webinar?

Your PDF link is indeed a good read to learn how large agencies (10th ranked Moravia in this presentation) benefit from the MT-PE cycle. The fifth page, titled "The MT Ecosystem", is especially enlightening as it shows MT at the center of a 4-step cycle of Academic, Commercial Development, LSP (Language Service Providers), and In-house MT Owners. Is it a simple oversight that there's no mention of translators participating in the ecosystem?

Machine translation engines are like internal combustion engines as part of a vehicle. Large translation agencies design large buses around a large engine. The bus driver navigates a route with many post-editors to predetermined bus stops. Post-editors learn to adhere to the bus' schedule and walk to/from the bus stop. The large bus requires expensive maintenance and large quantities of fuel, with low fuel efficiency. Yet, agencies continue to design and use large buses because of the benefits they achieve with the engine despite the bus' inefficiencies.

We design Slate Desktop as a Mini Cooper around a smaller engine. There's no need to change your habits because you drive it when you want and to/from your destination. You can afford the maintenance and it requires less fuel. The Mini, however, carries one (or 2) passengers whereas the bus carries many.

In the name of full disclosure, although Slate Desktop is the world's first SMT engine for MS Windows desktops, it is not our only product. We have other products that serve large and medium agencies. For example, our server product is an engine behind Welocalize's WeMT brand (9th ranked). We're not against agencies using engines. I'm merely highlighting that different system designs result in different performance. I'm not a translator, but I drive a Toyota and I sometimes ride the bus.

I expressed the ideas above during the webinar in Slides 5 & 7 (20 minutes).

Regarding your link to the Machine Translation Workshop, academia plays a vital role in this technology. I acknowledged that in Slide 19 (60 minutes). Academia's role is fundamental research into making the internal combustion engine better. As such, academia is concerned with pistons, compression ratios, camshafts and manifolds. Our role as a software engineering company is to make easy off-the-shelf products that benefit translators. As such, we focus on vehicle size, shape, ergonomics and driver control.

As a hobbyist, a driver might tinker with his Mini Cooper's engine on the weekends, but there's no need for every driver to understand compression ratios to drive to the grocery store. The driver simply needs to know how to use the key (start button), accelerator pedal, brake pedal, steering wheel and mirrors. The screen shots above show the steps translators learn to "drive" Slate Desktop.

[Edited at 2015-10-01 02:29 GMT]


 
2nl (X)
2nl (X)  Identity Verified
Netherlands
Local time: 22:22
Nice comparison Oct 1, 2015

tahoar wrote:

We design Slate Desktop as a Mini Cooper around a smaller engine.



The developer of CafeTran likes to compare his CAT tool with a Cessna: light-weight and agile.


 
Milan Condak
Milan Condak  Identity Verified
Local time: 22:22
English to Czech
I took part in MTM13 Oct 1, 2015

tahoar wrote:

Milan Condak wrote:

Statistical MT works better from Czech into English. For translation English into Czech I get better results from rule based MT.


Milan, which statistical system(s) have you used/evaluated for your work to confirm these academic papers? Did you watch the entire webinar?



I took part in MTM13

http://ufal.mff.cuni.cz/mtm13/registered.php

Condak.net s.r.o.

and then I made two presentations.

First:

http://www.condak.net/akce/mtm2013/cs/00.html

I published scripts for connecting Wordfast Classic (which works in MS Word) to online MT and to Wordfast Server (which can run on a desktop, in a LAN or on the web).

http://www.condak.net/akce/mtm2013/cs/02.html

I compare translations of the sentence "I can buy a can, but a can can not buy me" on page 03:

http://www.condak.net/akce/mtm2013/cs/03.html

Moses, TectoMT = machine translation at Charles University

PC Translator (rule-based Czech MT) on my desktop, priced at 180 EUR (EN-CS, CS-EN).

In PC Translator I can see:
- words and their translation
- translated sentence = "Já mohu koupit plechovku, ale plechovka nemůže koupit mě." is OK.
- offers from Google or Bing, too.

I compare Depfix vs. PC Translator on page 04. Results from the built-in parser in PC Translator are better.

http://www.condak.net/akce/mtm2013/cs/04.html

Page 05 shows a demo of MateCat in a browser:

http://www.condak.net/akce/mtm2013/cs/05.html

The testing game is in the second presentation:

http://www.condak.net/cat_other/matecat/2013-09-30/cs/00.html

At a bottom of page is a pane Strojový překlad (Machine translation) with two offers, one is
and second is

http://www.condak.net/cat_other/matecat/2013-09-30/cs/11.html
--
One engine can provide good "grammar" or up-to-date terminology (Google), and a second MT engine can provide the one correct terminology.

HTH,
Milan


 