Create TM, termbase and glossaries for CAT from proz term base of answered questions
Thread poster: Richard Hill
Richard Hill  Identity Verified
Mexico
Local time: 05:55
Member (2011)
Spanish to English
May 28, 2011

I do not use CAT tools yet, but I've looked into them a little, as I hope to find the time to learn how to use them. From my limited understanding of how they work, it occurred to me that good, large TMs, glossaries and/or termbases could be created from the term bases of the ProZ website and possibly made available (for sale?) to ProZ members.

 
FarkasAndras  Identity Verified
Local time: 12:55
English to Hungarian
Unlikely May 28, 2011

I've proposed this before, asking the ProZ people to make the glossaries available in some simple, manageable format (xls, txt, etc.).
I got no answer and I don't think you'll get one, either. Here's hoping I will be surprised!


 
hazmatgerman (X)
Local time: 12:55
English to German
there May 28, 2011

also is the minor point of quality assurance to be considered. How will you assess the quality of the data if you have no reliable information about its genesis?

 
Samuel Murray  Identity Verified
Netherlands
Local time: 12:55
Member (2006)
English to Afrikaans
TM no, and 2 types of term bases May 29, 2011

rich11 wrote:
...the possibility occurred to me of creating good, large TMs, glossaries and/or termbases from the term bases of the ProZ.com website...


I don't think the possibilities for TMs are very good, since TMs store sentences, and the ProZ.com term bases are just words or phrases.

As for term bases, you get two types -- prescriptive and non-prescriptive. A prescriptive term base is one that you must create yourself, to remind you of terms that you must use. The non-prescriptive term base is one that you consult to find possible translations of terms, but which you don't necessarily have to use.

Instead of trying to create a non-prescriptive term base, then, why not just do web searches on the ProZ.com term base via your browser? Or use something like IntelliWebSearch for that?


 
FarkasAndras  Identity Verified
Local time: 12:55
English to Hungarian
Workflow May 29, 2011

Samuel Murray wrote:

Instead of trying to create a non-prescriptive term base, then, why not just do web searches on the ProZ.com term base via your browser? Or use something like IntelliWebSearch for that?

Because it's slow and inconvenient. You have to switch to the source text side, select the word and initiate the search. It doesn't compare to having the term presented to you as you're going along without breaking your stride.
Not everyone works the same way as you do, Samuel. I personally use termbases to cut down on the thinking time (I glance at the TB hits when the right term doesn't come to me instantly) and to speed up typing. Studio's autosuggest paired with a good termbase means you often only have to type the first two letters of a long technical term or phrase.


 
Manticore (X)  Identity Verified

Local time: 12:55
English to German
@rich11 May 30, 2011

You have to be 100% sure that the TM and the glossary are correct for a specific context. There are quite a few questionable entries in ProZ's term base. I don't know of any website that will give you good results. Sorry, you have to work a bit and build your own TM and glossary according to the subjects you are specialising in.

 
FarkasAndras  Identity Verified
Local time: 12:55
English to Hungarian
To Roland May 30, 2011

Roland Fischer wrote:

You have to be 100% sure that the TM and the glossary are correct for a specific context.

No you don't, not before importing the glossary. You just have to use your judgement when applying hits from the TB, as always.

Roland Fischer wrote:
There are quite a few questionable entries in Proz's term base. I don't know of any web site that will give you good results.


That applies to every single dictionary and glossary ever created, and every dictionary and glossary that will be created in the future.
That doesn't mean that making existing imperfect resources available in a more convenient format is a pointless exercise.


 
Samuel Murray  Identity Verified
Netherlands
Local time: 12:55
Member (2006)
English to Afrikaans
@Farkas May 30, 2011

FarkasAndras wrote:
Roland Fischer wrote:
You have to be 100% sure that the TM and the glossary are correct for a specific context.

No you don't, not before importing the glossary. You just have to use your judgement when applying hits from the TB, as always.


It is true that we all work in slightly different ways. I myself would like to know that I can trust my TB nearly without thinking. If I have to think about using a term every time the TB suggests it, it would take me much longer to do the translation, because I would have to search the TM or previously translated portion of the source file each time to ensure that I use the terms consistently.

If your CAT tool has a dictionary function (such as OmegaT), then you can load the ProZ.com term base as a dictionary into that, but then you know that the hits you get from that are mere suggestions.

FarkasAndras wrote:
Samuel Murray wrote:
Instead of trying to create a non-prescriptive term base, then, why not just do web searches on the ProZ.com term base via your browser? Or use something like IntelliWebSearch for that?

Because it's slow and inconvenient. You have to switch to the source text side, select the word and initiate the search.


I guess it depends on your CAT tool. Your CAT tool has a "source text side" to which you are forced to "switch" every time you want to select a word in it. Mine doesn't.

But even if it did, you can also type in the word manually to do the web search. If you are using an autocomplete tool that works outside the CAT tool as well as inside it (e.g. LetMeType), then you can type the word in the IntelliWebSearch box and the word will be autocompleted after typing just one or two letters. But even if your autocomplete tool only works inside your CAT tool, you can still type it in the target field and then initiate the keyboard shortcut as soon as the word appears.

I agree that seeing a list of words automatically can be useful, but only if that list does not contain a whole lot of useless information that you have to wade through. If 1/3 of the words in a 30-word sentence have TB matches, and some of them have multiple matches, your TB pane will be something that requires quite a bit of reading, and that can also slow you down.
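
To put rough numbers on that kind of pane clutter, here is a minimal Python sketch. It assumes the term base is held as a plain dictionary and matches single words only; the function names and the sample entries are invented for illustration and are not taken from any CAT tool.

```python
def termbase_hits(sentence, termbase):
    """Return {word: [targets]} for each distinct word of the sentence
    that has an entry in the term base (single-word matching only)."""
    hits = {}
    for word in sentence.lower().split():
        word = word.strip(".,;:!?\"'()")  # drop surrounding punctuation
        if word in termbase and word not in hits:
            hits[word] = termbase[word]
    return hits


def pane_size(hits):
    """Total number of suggestions the translator would have to read."""
    return sum(len(targets) for targets in hits.values())


# Invented sample term base: one term with two candidates, one with a single candidate.
sample_tb = {"valve": ["Ventil", "Klappe"], "pressure": ["Druck"]}
sample_hits = termbase_hits("Check the valve pressure before opening the valve.", sample_tb)
sample_total = pane_size(sample_hits)
```

With a large, mixed-source term base, the number of candidates per word is what makes the total grow, which is the point being made above.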

FarkasAndras wrote:
Not everyone works the same way as you do, Samuel. I personally use termbases to cut down on the thinking time (I glance at the TB hits when the right term doesn't come to me instantly) and to speed up typing.


I also use it to cut down on typing time (that is why I tend to add long words to the TB even if I know their translation), but I mainly use the TB to ensure consistency, and you can't have that if your TB is a hodgepodge of terms from various sources.



[Edited at 2011-05-30 11:10 GMT]


 
Richard Hill  Identity Verified
Mexico
Local time: 05:55
Member (2011)
Spanish to English
TOPIC STARTER
thanks all for your comments Jul 17, 2011

Having read your responses and having gained a little more understanding, I agree with Samuel when he says, "I myself would like to know that I can trust my TB nearly without thinking", thus it's better to create one's own termbases for reliability, unless the glossaries were made available in some text file format, and you could do some quality checking and editing.

Cheers!

rich
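
For anyone who does get hold of a glossary as a plain text file, the quality checking mentioned above could start with something like the following. This is only a rough Python sketch under stated assumptions: the file is assumed to hold one `source<TAB>target` pair per line, the function names are made up, and the sample entries are invented rather than taken from the ProZ.com term base. It drops exact duplicates and sets aside source terms that come with more than one translation, so those can be reviewed by hand before anything is imported into a termbase.

```python
from collections import defaultdict


def load_glossary(text):
    """Parse lines of 'source<TAB>target' into a list of (source, target) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        source, target = fields[0].strip(), fields[1].strip()
        if source and target:
            pairs.append((source, target))
    return pairs


def check_glossary(pairs):
    """Split entries into unambiguous ones and ones needing manual review.

    Source terms are compared case-insensitively; exact duplicate
    translations are dropped.
    """
    targets_by_source = defaultdict(list)
    for source, target in pairs:
        key = source.lower()
        if target not in targets_by_source[key]:
            targets_by_source[key].append(target)
    clean = {s: t[0] for s, t in targets_by_source.items() if len(t) == 1}
    conflicts = {s: t for s, t in targets_by_source.items() if len(t) > 1}
    return clean, conflicts


# Invented sample data, for illustration only.
sample = (
    "acta constitutiva\tarticles of incorporation\n"
    "acta constitutiva\tarticles of incorporation\n"
    "Acta constitutiva\tdeed of incorporation\n"
    "poder notarial\tpower of attorney\n"
)
clean, conflicts = check_glossary(load_glossary(sample))
```

Here `clean` holds the entries that could go into a prescriptive termbase after a quick read-through, while `conflicts` lists the terms where someone still has to decide which translation to trust.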


 

