  • "lizardman" is male
  • "lizardman" started this thread

Posts: 578

Date of registration: Jun 3rd 2011

Language Team: Bulgarian

Focus Group: Translation Proofreader
LTI Administration Group

Location: Plovdiv, Bulgaria

Thanks: 25298 / 483

1

Tuesday, October 16th 2012, 3:12pm

Tutorial: subtitle file comparisons - before and after revision

Sometimes it is useful to have a file that shows the changes made during proofreading, for example for educational purposes. This thread shows how to create a PDF file that clearly displays the differences between two versions of a subtitle file. To see the end result, scroll down to the end of this post and open the attached file (it's in Bulgarian, but you will get the idea). Then read on if you'd like to know how to create such files for your language.

Update: scroll down to the next posts to see a much easier way to make your comparison.

In this tutorial we will use several programs, all of them freely available. I hope you will not be discouraged by this, but rather see it as an opportunity, because the tricks shown here can be very useful in a variety of situations and save a lot of work.

First, remember to save a copy of the subtitles before you start proofreading. When you're done editing, you will have two versions, and therefore two files: one from before the proofreading and one from after.


Getting rid of the timecodes

Download and install Aegisub, then open one of the subtitle files with it. Go to File=>Export Subtitles, click Export... and save the file in Plain Text (*.txt) format. Repeat the procedure with the other subtitle file. This removes the timecodes (e.g. 00:00:00,400 --> 00:00:03,860), which are only a distraction in the comparison.
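If you prefer a script to Aegisub, the same step can be sketched in Python. This is only a rough sketch, not what Aegisub does internally: the function name srt_to_text is made up for illustration, and it assumes a well-formed .srt file in which subtitles are separated by empty lines. Line breaks inside a subtitle are written as \N so that the later steps of this tutorial still apply.

```python
import re

# Matches an SRT timecode line such as "00:00:00,400 --> 00:00:03,860".
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_text(srt: str) -> list[str]:
    """Return one string per subtitle, without index numbers or timecodes."""
    # Normalise line endings and drop a possible byte order mark.
    cleaned = srt.lstrip("\ufeff").replace("\r\n", "\n").strip()
    subtitles = []
    # Assumption: subtitle blocks are separated by empty lines.
    for block in re.split(r"\n\s*\n", cleaned):
        lines = block.split("\n")
        text = [
            ln for i, ln in enumerate(lines)
            # Skip the numeric index on the first line and the timecode line.
            if not (i == 0 and ln.strip().isdigit()) and not TIMECODE.match(ln)
        ]
        if text:
            # Mark line breaks inside a subtitle with \N.
            subtitles.append("\\N".join(text))
    return subtitles


# Hypothetical sample standing in for the contents of a real .srt file.
sample = (
    "1\n00:00:00,400 --> 00:00:03,860\nHello\nworld\n\n"
    "2\n00:00:04,000 --> 00:00:05,000\nBye\n"
)
print("\n".join(srt_to_text(sample)))
```

For a real file you would read its contents first, for example with open(path, encoding="utf-8"), and write the result to a .txt file.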


Inserting empty lines

Now we need the program Notepad++. Use it to open one of the two exported text files. Remove the first line, which says that the file has been exported by Aegisub, so that the second line takes its place. The file now contains only text lines, but nothing separates them, and they will not look good in the comparison file. So we will insert an empty line between every two lines of text.

We will not do this manually (as people usually do); instead we'll use a macro. In Notepad++, click at the beginning of the first line and go to Macro=>Start Recording. Press the down arrow on your keyboard, then Enter. Then go to Macro=>Stop Recording. You have recorded a simple action that you can now repeat as many times as you like.

Go to Macro=>Run a Macro Multiple Times, select Run until the end of file, and click OK. You now have empty lines between all text lines throughout the file. Save the file, repeat the procedure with the other file, and you're done with this step.
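For those who would rather script this step than record a macro, here is a minimal Python sketch of the same idea. The function name space_out and the header text in the example are hypothetical; the sketch simply assumes that the first line of the exported file is the Aegisub note mentioned above.

```python
def space_out(exported: str) -> str:
    """Drop the export header line and put one empty line between text lines."""
    lines = exported.splitlines()
    # Assumption: the first line is the "exported by Aegisub" note.
    body = lines[1:]
    # Joining with two newlines leaves exactly one empty line between lines.
    return "\n\n".join(body) + "\n"


# Hypothetical exported text; the real header wording may differ.
example = "(note that the file was exported by Aegisub)\nHello\nBye\n"
print(space_out(example))
```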


Line breaks

If you have line breaks in one or both of your subtitle files, you can make them look nice and easy to spot. If you don't have line breaks, you can skip this step.

If you have line breaks, Aegisub will have exported them as \N symbols. This doesn't look good to me, and worse, the symbol is not separated from the rest of the text. I personally chose this symbol instead: |, surrounded by spaces. So let's replace every \N with |. In Notepad++, select one of the \N symbols and press Ctrl+F. Go to the Replace tab and in the Replace with field enter (space)|(space). Then click Replace All. Now we have nice symbols for the line breaks throughout the file. :) Just save the changes. If anything goes wrong, here or anywhere else, Ctrl+Z (undo) is your friend.
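The same replacement is a one-liner in Python, in case you are scripting the other steps anyway. The function name mark_line_breaks is made up for this sketch, and the marker is a parameter so you can pick a different symbol.

```python
def mark_line_breaks(text: str, marker: str = " | ") -> str:
    """Replace every literal \\N line-break symbol with a visible marker."""
    # "\\N" is a backslash followed by a capital N, exactly as it appears in the file.
    return text.replace("\\N", marker)


print(mark_line_breaks("Hello\\Nworld"))
```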


Making the comparison

For this I use WinMerge. After you install it, a nice feature you can activate is the advanced menu: go to Edit=>Options, open Shell Integration, and click Enable Advanced Menu. Now when you right-click a file, you will have a Compare To option. Click it, then right-click another file and choose Compare. The two files will open in WinMerge automatically.

OK, so give your two text files (the former subtitle files) proper names and open them in WinMerge. Lines with differences are coloured yellow, and the differences themselves are highlighted in an additional colour. Read through the whole comparison to check that everything is OK. If needed, you can make changes right there in WinMerge. Press F5 to refresh and see the effect of your changes.
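As an alternative to WinMerge (not a description of how WinMerge works), Python's standard difflib module can produce a similar side-by-side comparison as an HTML page. The sketch below is only an illustration: the function name compare_to_html and the sample lines are made up, and the colours and layout will differ from WinMerge's.

```python
import difflib


def compare_to_html(before: list[str], after: list[str]) -> str:
    """Build a side-by-side HTML comparison of two lists of subtitle lines."""
    differ = difflib.HtmlDiff(wrapcolumn=60)  # wrap long subtitle lines
    return differ.make_file(before, after, fromdesc="Before", todesc="After")


# Hypothetical sample lines standing in for the two prepared text files.
before = ["Hello wrld", "", "How are you"]
after = ["Hello world", "", "How are you?"]
html = compare_to_html(before, after)
```

You can write the returned string to a file such as comparison.html (with UTF-8 encoding) and open it in a browser to read through the differences.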


Extracting the PDF

A small but useful program is DoPDF. It lets you generate PDF files from any application, as though you were using a printer: instead of a real printer, you select the DoPDF "printer", which generates PDFs. Install the program so that you can use it.

When you've checked that everything is OK in WinMerge and have installed DoPDF, go to File=>Print in WinMerge. Select DoPDF as the printer and click OK. A PDF document will be generated. :)


If you have a different number of strings

If your two subtitle files are very different from each other, for example if they have a different number of strings (strings = subtitle lines), things get trickier. You will have trouble in WinMerge because it will probably not compare the lines as it should. What I do is put a dot in the places where I have divided a string into two (meaning I have one subtitle line against two), or combine two subtitle lines, or make whatever adjustments are necessary for WinMerge to compare the correct lines. Always press F5 to see the effect of your changes. This part can be difficult and frustrating, because WinMerge takes the whole length of both files into account and then compares the lines it thinks correspond to each other. So you might have to go down the file, make some changes, go back up, make more changes, and then press F5. The goal of all this is for both files to have the same number of lines, which is needed for the comparison to be correct.
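To find the places that need such adjustments more quickly, a small script can list the spots where the two versions stop lining up. The sketch below uses difflib.SequenceMatcher from Python's standard library; it is not what WinMerge does, and the function name find_uneven_blocks is made up. It returns index ranges (counted from 0, end not included) of changed blocks where the two files have a different number of lines.

```python
import difflib


def find_uneven_blocks(before: list[str], after: list[str]) -> list[tuple[int, int, int, int]]:
    """Return (i1, i2, j1, j2) for changed blocks with unequal line counts.

    before[i1:i2] corresponds to after[j1:j2]; indices start at 0.
    """
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    uneven = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only changed blocks whose two sides differ in length.
        if tag != "equal" and (i2 - i1) != (j2 - j1):
            uneven.append((i1, i2, j1, j2))
    return uneven


# Hypothetical example: the second string was divided into two during proofreading.
before = ["a", "b c", "d"]
after = ["a", "b", "c", "d"]
print(find_uneven_blocks(before, after))
```

An empty result means every changed block has the same number of lines on both sides, so the files should line up in the comparison.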



Hopefully most people won't have a different number of strings and so won't need to concern themselves with the previous section. If you do need to go through it, however, you can manage it with some time and patience. You can always contact me for assistance.

Use this thread for questions and suggestions. :)
lizardman has attached the following file:

  • "ossi11111" is male

Posts: 1,137

Date of registration: Nov 27th 2013

Language Team: German

Focus Group: Translator
Translation Proofreader
Final Reviewer
Language Coordinator
LTI Development Group

Thanks: 40141 / 377

2

Sunday, March 23rd 2014, 10:23am

Getting rid of the timecodes
You do not need to install Aegisub to do this.

Here you will find a very easy-to-use tool: http://forum.linguisticteam.org/srt_tool/
You will need the password "n0tv3rycr34t1v3" (without the quotation marks). Select option 7. After you upload the srt file, a txt file containing only the text lines is created for you, so there are no timecodes anymore.

An empty line is added automatically between the text lines, so you won't need the second step, "Inserting empty lines", either. The same holds for the "Line breaks" step, which also becomes unnecessary.

@lizardman:
Please check whether the tool does what you described here. (Maybe I did not understand it correctly.)
Signature from »ossi11111« Do you want to help with translating?
Email: GermanLingTeam@gmail.com

Videos published so far with German subtitles:
GermanLingTeam - YouTube

  • "lizardman" is male
  • "lizardman" started this thread

3

Sunday, March 23rd 2014, 11:47am

Unfortunately, the srt tool gives me either a one-line text document or an empty text document.

Since this tutorial was written, we have discovered a great tool that makes all of this much easier. Its comparison seems to be much more intelligent than that of the other tools I know. It's the site Diff Checker, an online diff tool that compares text to find the differences between two text files. Check it out and see if you get better results. Open one subtitle file with a program like Notepad, copy everything and paste it into the Original text section in diffchecker. Then do the same for the other subtitle file, pasting it into the second text section.

See the result of my comparison: Saved diff g9sel55r - Diff Checker (the differences are after the middle)

Diffchecker also lets you store your comparison forever, which is quite cool. But we know it's not really forever: for example, the site may stop functioning after some months or years. If you want to keep your diff for years to come, you can take a screenshot of the entire webpage, so that you have a picture of the comparison on your computer. To take such a screenshot, you can use a Firefox add-on like Fireshot. I'm attaching the screenshot of the comparison that I made with it.
lizardman has attached the following image:
  • FireShot Screen Capture.png

  • "ossi11111" is male

4

Sunday, March 23rd 2014, 1:06pm

That's very strange. I will look into it. Maybe I have to make some minor changes for Windows users. :)
As lizardman and I figured out together, the tool only supports languages that mainly use the letters of the English alphabet.

This post has been edited 1 times, last edit by "ossi11111" (Jun 12th 2014, 9:55am)


© Linguistic Team International 2019