Saturday, October 26, 2019

Splitting Kanji and okurigana at the end of the line


A question on Linguistics Stack Exchange about East Asian languages was titled: How are line breaks handled in ideographic scripts?


The answer made me think, so I'm asking here to get a wider range of answers on the matter.


When writing Japanese, if we reach the end of the page, can we split kanji from their okurigana? I can't think of an example right now, so feel free to provide one if I don't add one first.


A follow-up question: does this change between handwritten and typed Japanese?



Answer



If you hit the end of the line and you're out of space, then yes, you can freely split kanji and their okurigana. I have a novel right in front of me that does it two lines in a row on the second page:




  • 彼女と初 // めて会った

  • // い出してみるがいいよ.


Wikipedia says that the rules governing line-splitting in Japanese are called 禁則【きんそく】処理【しょり】, and there are slight variations in implementation.


For example, LibreOffice uses the following lists and rules by default when typesetting Japanese text:



Characters that should not begin a line


(If a character typed here is positioned at the beginning of a line after a line break, it is automatically moved to the end of the previous line.)


!%),.:;?]}¢°’”‰′″℃、。々〉》」』】〕ぁぃぅぇぉっゃゅょゎ゛゜ゝゞァィゥェォッャュョヮヵヶ・ーヽヾ!%),.:;?]}。」、・ァィゥェォャュョッー゙゚¢


(essentially: do not begin a line with a suffix, a terminating or closing punctuation mark, or any kana that changes the pronunciation¹ of the kana that ended the previous line, including the sokuon)



Characters that should not end a line


(If a character typed here is positioned at the end of a line due to a line break, it is automatically moved to the beginning of the next line.)


$([¥{£¥‘“〈《「『【〔$([{「£¥


(essentially: do not end a line with a prefix or opening punctuation mark)
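
To make the two rules concrete, here is a minimal sketch of a line breaker that applies them. It is not LibreOffice's actual implementation: the function name `wrap_kinsoku` is made up for illustration, the two character sets are small subsets of the lists quoted above, and every character is assumed to take one full-width cell.

```python
# Illustrative subsets of the two lists quoted above (not the full lists).
NO_LINE_START = set("、。」』】〕々ぁぃぅぇぉっゃゅょァィゥェォッャュョー")
NO_LINE_END = set("「『【〔〈《")


def wrap_kinsoku(text: str, width: int) -> list[str]:
    """Break text into lines of roughly `width` characters (width >= 1).

    Hypothetical helper: a line may run over `width` when a character that
    cannot begin a line is pulled back onto it, and may fall short when a
    character that cannot end a line is pushed to the next one.
    """
    lines = []
    pos = 0
    while pos < len(text):
        end = min(pos + width, len(text))
        if end < len(text):
            # Rule 1: a character that must not begin a line is moved
            # to the end of the current line.
            while end < len(text) and text[end] in NO_LINE_START:
                end += 1
            # Rule 2: a character that must not end a line is moved
            # to the beginning of the next line.
            while end - 1 > pos and text[end - 1] in NO_LINE_END:
                end -= 1
        lines.append(text[pos:end])
        pos = end
    return lines


print(wrap_kinsoku("彼は「はい」と言った", 3))
# ['彼は', '「はい」', 'と言っ', 'た']
print(wrap_kinsoku("彼女と初めて会った。", 4))
# ['彼女と初', 'めて会っ', 'た。']
```

In the first call the opening 「 is pushed down to the next line and the closing 」 is pulled up, so lines come out one character short or long. In the second call nothing stops the break between 初 and めて, which is exactly the kanji/okurigana split from the novel.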



This basically lines up with what I see in novels. The process leaves the line a character or so short, and in typeset text, justification is normally applied, by increasing (追い出し) or decreasing (追い込み) the space between characters. On 原稿【げんこう】用紙【ようし】, where this is obviously not possible, I'd guess that punctuation which can't begin the next line is just stuck after and outside the last box on the end of the previous line, or wedged into that last box.
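
As a rough illustration of that justification step, the spacing adjustment can be computed per line. This is only a sketch under simplifying assumptions: the function name `extra_gap` is hypothetical, every character is treated as exactly one cell wide, and the slack is spread evenly over all gaps, which real typesetters do not necessarily do.

```python
def extra_gap(line: str, cells: int) -> float:
    """Extra space, in character widths, to add to each gap between
    characters so that `line` exactly fills a measure of `cells` cells.

    Positive result: spacing is increased (the line was short).
    Negative result: spacing is decreased (the line was too long).
    Assumes every character is one full-width cell.
    """
    gaps = len(line) - 1
    if gaps <= 0:
        return 0.0  # nothing to stretch on an empty or one-character line
    return (cells - len(line)) / gaps


# A 19-character line in a 20-cell measure: each gap grows by 1/18 of a cell.
print(extra_gap("あ" * 19, 20))
# A 21-character line in a 20-cell measure: each gap shrinks by 1/20 of a cell.
print(extra_gap("あ" * 21, 20))
```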


However, when the location of the end of line is only roughly defined instead of rigidly enforced, for example when writing in a nebulously-defined small region of a larger piece of paper, there's often a tendency to split where it comes most naturally rather than when you run out of room. In that case I feel (but I'm not a native, uh, writer) that it would be unusual to split a kanji and its okurigana.




¹ Incidentally, I find it amusing that they offer readers this comfort, yet have no qualms about leaving the pronunciation of the last kanji on a page entirely in question until you turn over...

