  • "brunodc" is male
  • "brunodc" started this thread

Posts: 2,259

Date of registration: May 31st 2011

Language Team: French

Focus Group: Translator
Translation Proofreader
Language Coordinator
LTI Administration Group
LTI Development Group

Location: France

Thanks: 69824 / 740

1

Tuesday, November 22nd 2011, 8:17am

**** Timeshifting considerations ****

This thread is meant for us to share some thoughts about timestamp adjustment on our various videos, which will (hopefully) help us be more efficient with the transcription process and reduce the time between a video's release and the completion of its English transcription.

  • "brunodc" is male
  • "brunodc" started this thread

2

Tuesday, November 22nd 2011, 9:15am

A few thoughts I'd like to share:

  • I do not trust the way dotSUB handles the timing of the videos, because a web-based tool will hardly match the precision we get with an offline tool. That is why I use VisualSubSync exclusively to transcribe and timeshift videos. Not only do we do a much better job with such tools, but we also work much faster.
  • I find it more comfortable when the subtitles appear very slightly before the speaker utters his or her first words. I have always felt that my eyes are slower than my ears, and that I can fine-tune the timestamps within -100 ms to +100 ms to make them more comfortable to read. I read somewhere that there is a threshold below which we can't really tell that speech is out of sync with video: 80 ms. I think I'm going to try to make a video this weekend so that we can run a little LTI study on the subject, with a questionnaire and everything. That should be fun! Anyway, here's what I do when I timeshift to make sure the sync is OK:
    • I usually try to avoid the flv format if I have a choice, because this format tends to mess up the timing of the video (not always, but it happens frequently).
    • When timeshifting the video, I start the timestamp exactly when the speaker starts.
    • When the timeshifting is complete, I load the subtitles in a video player and get a feel for the reading comfort. If I feel that the subtitles start too late, I change all timings to start 50 ms or 100 ms earlier (I do 50 ms 95% of the time). I repeat the operation until I get it right. I use the "delay" function in Subtitle Workshop, but most subtitle editors have that function.

  • The more I timeshift, the more I find myself bending the 70 character rule. This rule states that subtitle strings shouldn't be longer than 70 characters, to give translators a chance to fit their translations into the same string.
    But here's why I say this: I find that the duration is much more important than the number of characters in the string, because what good is it to have two 1.5-second strings if most readers won't have time to read the text in them? Of course, the 70 character rule is still very important and should be respected 95% of the time, but I never hesitate to set it aside if I feel that doing so gives translators a better chance to convey the meaning of the sentence. In those cases, all I try to do is make sure that the string doesn't spill over onto a third line, because NO STRING should exceed 2 lines. VisualSubSync lets us check that very easily.
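
For anyone who wants to script the global shift mentioned above (moving every timing 50 ms or 100 ms earlier), here is a minimal Python sketch. It assumes SubRip (.srt) style timestamps of the form HH:MM:SS,mmm; the name shift_srt is made up for illustration, and this is not how Subtitle Workshop implements its "delay" function.

```python
import re

# Matches SubRip timestamps such as 00:02:12,500 (assumed input format).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def shift_srt(text: str, offset_ms: int) -> str:
    """Shift every timestamp in SRT text by offset_ms (negative = earlier)."""
    def shift(match: re.Match) -> str:
        h, m, s, ms = (int(g) for g in match.groups())
        total = (h * 3600 + m * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # never move a timing before the start of the video
        h, rest = divmod(total, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
    return TIMESTAMP.sub(shift, text)

sample = "1\n00:02:12,000 --> 00:02:16,000\nIt has banks.\n"
print(shift_srt(sample, -50))  # every timing starts and ends 50 ms earlier
```

One caveat: the pattern is applied to the whole file, so a subtitle whose text happens to contain something that looks like a timestamp would be shifted too.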

That's all that comes to mind. Please don't hesitate to share your thoughts on the subject with the group; feedback is really important to us.

  • "lizardman" is male

Posts: 578

Date of registration: Jun 3rd 2011

Language Team: Bulgarian

Focus Group: Translation Proofreader
LTI Administration Group

Location: Plovdiv, Bulgaria

Thanks: 25455 / 483

3

Tuesday, November 22nd 2011, 6:31pm

I would like to give some feedback on the 70 character rule. Yes, I very much support what you're saying - the biggest priority is to have a meaningful string (cut in the most logical places) appearing for a reasonable amount of time. If that bends the 70 character rule, so be it. But there are limitations to this, of course. A few letters more wouldn't make much of a difference, but it would definitely be hard, on some occasions, to deal with something like 75 characters. So perhaps there should be an upper limit that is not to be exceeded.

Of course, trying to keep up with no more than 70 characters is a very important guideline and what I'm saying above applies more to special cases.

  • "ichernev" is male

Posts: 24

Date of registration: Jul 4th 2011

Language Team: English Team

Focus Group: English Transcriber
Timestamp Adjustment Team

Location: Bulgaria

Thanks: 929 / 1

4

Friday, November 25th 2011, 2:13pm

About the starting time of subtitles: when I timeshift I use the waveform and also play the video at the exact spot where the subs start, to check whether it really is the right spot. I noticed that when playing in a movie player the subs somehow seem a little delayed relative to the speech, but that is perfectly fine for me (it feels delayed, but it doesn't feel wrong). I think displaying subs before the voice is awkward, but of course this is just my opinion :))

About the 70-char limit: when I timeshift I try to shorten all strings longer than 70 chars, but I leave some that are < 75 chars if they cannot be split properly. PJ talks pretty fast in some interviews, so you end up with either one very long subtitle or two of short duration. I sometimes break the rule about starting subs exactly when the speaker starts: if the previous subtitle is too short (< 1.5 secs), I make it at least 1.5 - 1.6 secs long, and the next one starts delayed, but at least you can read it all.
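
The "lengthen the short one and delay the next" idea can be sketched in a few lines of Python. This is only an illustration: the Cue class and enforce_min_duration are hypothetical names, times are in milliseconds, and the cues are assumed to be sorted by start time.

```python
from dataclasses import dataclass

@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str

def enforce_min_duration(cues, min_ms=1500):
    """Lengthen cues shorter than min_ms, delaying the following cue if needed.

    Modifies the cues in place and returns the same list.
    """
    for i, cue in enumerate(cues):
        if cue.end_ms - cue.start_ms < min_ms:
            cue.end_ms = cue.start_ms + min_ms
        if i + 1 < len(cues) and cues[i + 1].start_ms < cue.end_ms:
            # The next subtitle now starts late, but both stay readable.
            # (This also pushes back a cue that already overlapped.)
            cues[i + 1].start_ms = cue.end_ms
    return cues

cues = [Cue(0, 1000, "Too short."), Cue(1000, 4000, "The next one starts a bit late.")]
enforce_min_duration(cues)
print(cues)
```

Because the list is processed in order, a cue that was squeezed by the previous one is itself lengthened when its turn comes, so the delay can ripple forward through fast speech. A real pass would probably want to cap how far it lets that drift go.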

Characters per second is another important characteristic. 25 chars/sec seems to be the upper limit (I read this somewhere), so I often prefer short subtitles with a short duration (at least 1.5 secs) if the chars/sec is ~ 10-15, especially if the next subtitle is much longer.
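
For reference, the chars/sec figure is just the string length divided by the display time. A small Python sketch follows; the function name is made up, spaces and punctuation are counted as characters (a choice, not a rule), and the 25 chars/sec ceiling is only the value recalled above, not a verified standard.

```python
MAX_CPS = 25  # upper limit as recalled in the post above; treat as an assumption

def chars_per_second(text: str, start_ms: int, end_ms: int) -> float:
    """Reading speed a subtitle demands, counting every character including spaces."""
    duration_s = (end_ms - start_ms) / 1000
    if duration_s <= 0:
        raise ValueError("subtitle must end after it starts")
    return len(text) / duration_s

cps = chars_per_second("prisons and police. We don't have any of those.", 136_000, 139_000)
print(f"{cps:.1f} chars/sec, over the limit: {cps > MAX_CPS}")
```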

About the number of lines: this is very much configurable. I always change the subtitle font in my player (mplayer), and even at the same font size, different videos/subs have a different visible size on screen. If you are talking about lines shown in a particular place (like dotsub or youtube), that is something else. But even there, on youtube for example, don't you have more space per line in fullscreen? I mean, just stick with the 70-75 char rule and it should be fine for everybody (or change it to 75-80 if you will), but it should be measured in characters, not lines.


Posts: 3,056

Date of registration: May 24th 2011

Language Team: English

Focus Group: Timestamp Adjustment Team
Public Relations Group

Location: Malmö, Sweden

Thanks: 93602 / 2397

5

Sunday, November 27th 2011, 4:34pm

Hi! I am also a timeshift maniac and my name is nomada :hypno:

I agree with all that you say above. There are a few points where our approaches diverge: how precise we should be with synchronization, and how far we should go in breaking the 70-character guideline.

On the first point, I think everything works fine as long as the subs show up synchronized with the voice; whether slightly before or after doesn't make much difference. But a few days ago I noticed that Addendum's subtitles on the TZMOfficial YouTube are appearing too soon. So we could try to identify all the situations that can cause this and avoid them. Bruno already advises against the flv format and dotsub, and in the case of Addendum I think the timeshifting was done online on dotsub, if I am not mistaken.


About the string structure guidelines: I find them more and more important, and I also feel the guidelines "sliding" a bit in several directions to adapt to the situations we encounter: breaking the synchronization guideline to give a string more room, allowing more characters per string, etc. I don't have a limit that I always follow. After timeshifting, if the analyze tool still reports strings with more than 70 characters, or strings shorter than 2 seconds with more than 35 characters (bad reading time), I go back and review those. It all depends on whether we can actually read them, plus the importance of the content of that string, which determines how long I make the string or whether I cut it. I can't really state a rule. Sometimes the message is not so important, or it has more grunging, or it repeats what was said in the previous string, etc., and I leave the string really long, because I know that the translation of the meaning can make a huge difference in those "not so technical terms used" situations. To sum up, reading time is what influences my decisions most.
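
The two checks mentioned above (more than 70 characters, or under 2 seconds with more than 35 characters) are easy to reproduce outside any particular editor. Here is a rough Python sketch; review_flags and its parameter names are hypothetical, and this is not the analysis tool of any specific program.

```python
def review_flags(text, start_ms, end_ms, max_chars=70, short_ms=2000, short_chars=35):
    """Return the reasons (if any) why a string deserves a second look."""
    flags = []
    if len(text) > max_chars:
        flags.append("too long")
    if end_ms - start_ms < short_ms and len(text) > short_chars:
        flags.append("too little reading time")
    return flags

print(review_flags("We talk at each other. That means sometimes a person", 0, 1800))
```

The thresholds are parameters on purpose, since the right values are exactly what this thread is debating; the function only points at candidates, and the decision to cut or keep a string stays with the timeshifter.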

Still on the 70-character guideline: I very often notice that strings with 10 more characters can be narrower than others with 10 fewer. I don't know if all alphabets have equally fat and thin letters like "G" and "I" but, if that is the tendency, shouldn't we be aiming at a virtually ideal length rather than a maximum length?

Another point of view on this issue: if the limit of characters can change so much depending on the player/website where we apply the subs, maybe we should focus more on that, for a start? Should we follow the cinema (or whatever) official limit of 42 characters/string? Or should we consider that our target population are mostly Youtube users and we should adapt our guidelines according to that? Or should we research which are the limits in most frequently used players, and then find an in between point for the English transcriptions target (not maximum limit, because in the translations it all varies without a possible reference)?


Last point: I really think it is more "costly" to A) teach the guidelines to every new transcription volunteer (who may come just to work on a single video they like and then leave) and later spend as much time correcting their work during timeshifting as we would have needed to transcribe in their place, than it would be for B) us (everyone transcribing regularly, plus the timeshifters) to focus on offline transcription and save a good amount of string restructuring. Sometimes when I timeshift I feel like I am repeating the transcription task. It is like taking a puzzle board and pieces built by someone else and using scissors to cut the edges so that they fit, when we could have just made them from our own mental molds (integrated guidelines :D ). Transcribing offline is the only option for both Shane (in China) and Gert (off the grid somewhere in Canada). So we are already starting with this approach, if they stick around. Let me know what you think.

hugs

  • "georgyvlad" is male

Posts: 52

Date of registration: Jul 24th 2011

Language Team: Bulgarian

Focus Group: English Transcriber
English Proofreader
Translator
Translation Proofreader
LTI Development Group

Location: Mountain View, CA, USA

Thanks: 1941 / 34

6

Sunday, November 27th 2011, 6:35pm

Hi folks,

Here is my input...

- I like the online tool (dotSub) for several reasons:
1) it is more fun and easier for multiple people to collaborate;
2) it works pretty well when using the keyboard shortcuts;
3) I don't have to trust and install any software on my computer;
4) some proofreading can start between multiple transcribers even before we start proofreading.
I suggest that the people who want to work offline, post in the forum what section they will work on (e.g. from 5:00 to 10:00) and later download the most current online text, merge their work and upload back to dotSUB (I hope that's not too hard in offline tools)

- I agree with the general sentiment that rules can be bent in rare cases

- I have a general, radical idea :-) How about we divide the transcribing from the timestamping altogether? I think it has more benefits than drawbacks. Here they are:
1) More fluent English speakers / listeners can focus on the transcription and we can use their time more efficiently
2) The transcription can happen very quickly, if we don't focus on timestamps. This means that we can very quickly produce a transcript (even proofread) for purposes other than subtitles - access for the deaf and ability to quickly review video material content
3) Timestamping can be done by people with even very limited English fluency
4) Timestamping can be done with offline tools
5) We save the need for an extra proofreading step after timestamps are adjusted (timeshifting) in the current process

  • "georgyvlad" is male

7

Sunday, November 27th 2011, 6:45pm

By the way, I like when the subtitles start exactly when the speaker pronounces the first word. Showing them before will be more confusing to me but I have wanted to do it sometimes when I need to bend rules to fit abnormal speaker timing within our guidelines...

  • "sydstf" is male

Posts: 75

Date of registration: Jun 20th 2011

Language Team: Chinese

Focus Group: Language Coordinator

Location: Kaohsiung, Taiwan

Thanks: 3487 / 4

8

Monday, November 28th 2011, 4:46am

- I have a general, radical idea :-) How about we divide the transcribing from the timestamping altogether? I think it has more benefits than drawbacks. Here they are:
1) More fluent English speakers / listeners can focus on the transcription and we can use their time more efficiently
2) The transcription can happen very quickly, if we don't focus on timestamps. This means that we can very quickly produce a transcript (even proofread) for purposes other than subtitles - access for the deaf and ability to quickly review video material content
3) Timestamping can be done by people with even very limited English fluency
4) Timestamping can be done with offline tools
5) We save the need for an extra proofreading step after timestamps are adjusted (timeshifting) in the current process

Basically I agree with the suggestions above. We can separate timeshifting from transcription. Actually, as long as the transcription is correct, timeshifting is really the easiest task of all if someone can use offline tools for the adjustment, such as the powerful free timeshifting software Aegisub.

Therefore, my suggestion is this: the transcription (including the final, fully proofread subtitles) should be completed first, by whatever means. Then timeshifting members who are good with offline tools adjust the timing offline and simply upload the result to dotsub, without working on dotsub again or changing the timestamps (which the offline tools have already set accurately).

This is because I personally find that some subtitles, even from the Repository section, still often have overlapping timestamps, so that two separate lines may show up at the same time. I therefore still have to use offline tools to adjust them again myself. Personally, I don't recommend timeshifting on dotsub. I strongly recommend finishing the timeshifting with offline tools instead, though some of you here may not agree with me, and that's fine too. :D
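
Overlaps like the ones described above can also be detected and trimmed with a short script before uploading. The Python sketch below is only an illustration: cues are assumed to be (start_ms, end_ms, text) tuples already sorted by start time, and find_overlaps, trim_overlaps and gap_ms are made-up names.

```python
def find_overlaps(cues):
    """Return index pairs (i, i + 1) where cue i is still on screen when cue i + 1 starts."""
    return [
        (i, i + 1)
        for i in range(len(cues) - 1)
        if cues[i][1] > cues[i + 1][0]
    ]

def trim_overlaps(cues, gap_ms=0):
    """Return a copy in which each cue ends at least gap_ms before the next one starts."""
    fixed = [list(c) for c in cues]
    for i in range(len(fixed) - 1):
        latest_end = fixed[i + 1][0] - gap_ms
        if fixed[i][1] > latest_end:
            # Never let a cue end before its own start.
            fixed[i][1] = max(latest_end, fixed[i][0])
    return [tuple(c) for c in fixed]

cues = [(0, 2100, "It has banks."), (2000, 4000, "It has armies and navies,"), (4000, 5000, "prisons and police.")]
print(find_overlaps(cues))
print(trim_overlaps(cues))
```

Trimming the end of the earlier cue keeps each start aligned with the speech; the alternative of delaying the later cue is equally easy to write, and which one is preferable is a judgment call for the timeshifter.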

  • "Sue" is female

Posts: 928

Date of registration: May 26th 2011

Language Team: Spanish

Focus Group: Language Coordinator

Location: Mallorca, Spain

Thanks: 27078 / 523

9

Monday, November 28th 2011, 7:15am

Hi :)
I'm not a timeshifter but a translator. I do however appreciate the hard work you guys put into the synchronizing of the subs and understand it's a very time consuming job.
I find sometimes though that some of the sentences seem to be cut in places where just one or two words are left for the next line, then one or two from the next line left for the following, and so on. The result is a few strings one after another which are just half sentences.
Here are a couple of examples.
02:12 02:16
It has banks. It has armies and navies
02:16 02:19
prisons and police. We don't have any of those.

I think there could be three strings here
It has banks.
It has armies and navies, prisons and police.
We don't have any of those.
----
03:35 03:37
We talk at each other. That means
03:37 03:40
sometimes a person will say "Have a nice weekend!"

I'd look at this like so..
We talk at each other.
That means sometimes a person will say;
"Have a nice weekend!"
----
03:53 03:56
When you read the Bible, you say "Jesus meant this...," and he says
03:56 03:59
"No, he meant that," and another person..."He meant this."

When you read the Bible, you say "Jesus meant this..."
and he says "No, he meant that"
and another person..."He meant this..."
-----
08:41 08:44
You have to raise children, because children can learn
08:44 08:48
anything at all. They can learn geology, physics, chemistry.

You have to raise children, because children can learn anything at all.
They can learn geology, physics, chemistry.
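
As a rough first pass, this kind of re-cutting at sentence boundaries can be sketched in Python. The function name is made up, and the approach is deliberately naive: it joins consecutive strings and splits after ".", "!" or "?" followed by whitespace, so it does not handle closing quotation marks, abbreviations, or sentences longer than the character limit, and the timings still have to be redistributed by hand.

```python
import re

def split_on_sentences(*strings):
    """Join consecutive subtitle strings and re-cut them at sentence ends."""
    joined = " ".join(s.strip() for s in strings)
    # Split on whitespace that directly follows ., ! or ?
    return re.split(r"(?<=[.!?])\s+", joined)

for line in split_on_sentences(
    "It has banks. It has armies and navies",
    "prisons and police. We don't have any of those.",
):
    print(line)
```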

A video I find very well timeshifted is Jacque on Free Will.

I hope this helps.

VP Hugs
:D

  • "cris" is female

Posts: 101

Date of registration: Jun 5th 2011

Language Team: Spanish Team

Focus Group: Translator

Location: Madrid

Thanks: 2861 / 4

10

Saturday, December 3rd 2011, 4:32pm

notes on my first experience

Hi everyone,
I've only done one job, but I was translating before, and I've tried to keep in mind some of the things I often found strange in the strings and timestamps, such as (these are only examples): adjectives on one line and the noun they go with on the next; more time for a short line than for a really long line that comes straight after it; or a line ending with a conjunction (I've actually had to do this, so as to gain some time, but I have ended those lines with "...").

Anyway, what I've tried to do is first to look for good places to "cut" sentences, respecting the different phrases, and then to align them with the beginning of the sound (sometimes a little bit before, and I agree with some of you on that: not only is it imperceptible to our eyes/ears, but we gain some milliseconds to play with).

Concerning the tasks of transcribing and timeshifting that some have commented on, I would also encourage having them done by different people, for the reasons already given. I found it far less time-consuming to timeshift than to transcribe, and still, as far as I know, fewer people like doing it.

As for what's been said about the 70-character rule, it didn't give me as much trouble as the 35-character one. But something we should account for, which I don't think is mentioned anywhere, is this (I learnt it in connection with teaching reading skills in Spanish, but I guess it applies to all languages): when you read a common word it takes you less time, since you "catch" the whole word at once. Conversely, if it's a word you're not familiar with, you have to do, whether mentally or aloud, what is called (in Spanish) a syllabic reading, which is what little children do when they're learning. I suppose this can be taken into account when we give more or less time to a string: words can be longer or shorter in different languages, but if they're common, they're common everywhere, and this makes them quicker to read.

Finally, I strongly recommend offline tools for timeshifting. I used "subtitleeditor", which has options to prevent strings shorter than 1500 ms or gaps shorter than 100 ms, and it shows the duration and character count in separate columns, so you only have to concentrate on giving each string the appropriate time for its number of characters.

Hoping my comments are useful (although, as I said, I haven't got that much experience). Regards,

Cris

© Linguistic Team International 2019