Discussion:
Linebreaks in paragraphs and character entities
bvh
2003-12-12 12:13:23 UTC
Permalink
Hi,

I am interested in docbook->latex conversion. I wrote a simple C++ program for
that (naartex). Unfortunately, only after I first published it did someone
point me to db2latex.

Now I am trying to use db2latex instead because it is much more mature than
my first rudimentary conversion, and it would save me a lot of time if I could
use something existing instead of brewing my own solution. (I have to use this
in a commercial project.) However, I've run into 3 small issues.

1. Use of character entities

The documents I have to translate use character and entity references for
special characters like accents, the euro symbol and many more.

Entity references in the latin1 set (like &agrave; etc.) are converted to
latin1 encoding by db2latex (I believe?). I know about the latinenc package
to solve this problem, but what about characters outside that set? Are there
other LaTeX packages I need to use?

Character entities like &#8364; (I believe this to be the euro symbol) are
left alone by db2latex. Again the problem is that (my installation of) LaTeX
doesn't know how to handle these. What should I do to get that working?

The FAQ mentions this, but it is not fully clear to me how to handle
character references in the numerical form (such as &#8364;).

2. Line breaks in paragraphs

My documents sometimes contain things like this:

<para>
This is

one paragraph
</para>

I believe the intent is to render this as one single paragraph. However when
I convert to LaTeX with db2latex the single white line is still there
and becomes a hard line break in the typeset document.

3. We have documents where the scaling factor for the imagedata has a fraction.
This didn't work with db2latex. However when rereading the docbook specification
before posting I saw that the scaling factor must be an integer so I'll have
to change those sources then.

Thanks for taking the time to reply, and for developing db2latex.

cu bart
--
http://www.irule.be/bvh/


-------------------------------------------------------
This SF.net email is sponsored by: IBM Linux Tutorials.
Become an expert in LINUX or just sharpen your skills. Sign up for IBM's
Free Linux Tutorials. Learn everything from the bash shell to sys admin.
Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
James Devenish
2003-12-13 02:00:13 UTC
Permalink
Hi,
Post by bvh
1. Use of character entities
[...]
Post by bvh
Entity references in the latin1 set like (&agrave; etc) are converted to
latin1 encoding by db2latex
It's the way XML works :-/
Post by bvh
Character entities like &#8364; (I believe this to be the euro symbol) are
left alone by db2latex. Again the problem is that (my installation of) LaTeX
doesn't know how to handle these. What should I do to get that working?
There are several solutions provided by DB2LaTeX. You will need to turn
these features on, though. I personally go for Unicode handling. This
means that I have installed the 'ucs' ('unicode') LaTeX package and then
use the following variables in my XSL files:

<xsl:output encoding="UTF-8"/>
<xsl:variable name="latex.inputenc">utf8</xsl:variable>
<xsl:variable name="latex.use.ucs">1</xsl:variable>
<xsl:variable name="latex.ucs.options">postscript</xsl:variable>

Some examples are available in the snapshot 'samples' tarball or can be
viewed on the web:
<http://cvs.sourceforge.net/viewcvs.py/db2latex/db2latex/xsl/sample/test_entities/>.
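As a small illustration of why the euro needs Unicode handling while &agrave; does not, here is a minimal Python sketch. It is not part of DB2LaTeX, and the function name describe_reference is made up for this example. It resolves a reference with the standard library's html.unescape and checks whether the resulting character fits in Latin-1.

```python
import html


def describe_reference(ref: str) -> dict:
    """Resolve one character/entity reference and report how it encodes.

    Hypothetical helper for illustration only; assumes `ref` resolves
    to exactly one character.
    """
    char = html.unescape(ref)
    try:
        char.encode("latin-1")
        in_latin1 = True
    except UnicodeEncodeError:
        in_latin1 = False
    return {
        "char": char,
        "codepoint": f"U+{ord(char):04X}",
        "in_latin1": in_latin1,
        "utf8_bytes": char.encode("utf-8"),
    }


euro = describe_reference("&#8364;")     # numeric character reference
agrave = describe_reference("&agrave;")  # named entity reference

print(euro["codepoint"], euro["in_latin1"])
print(agrave["codepoint"], agrave["in_latin1"])
```

The euro sign lies outside Latin-1, so a Latin-1 input encoding cannot carry it. A UTF-8 output encoding, as in the settings above, can.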
Post by bvh
2. Line breaks in paragraphs
[...]
Post by bvh
<para>
This is
one paragraph
</para>
[...]
Post by bvh
However when I convert to LaTeX with db2latex the single white line is
still there and becomes a hard line break in the typeset document.
This doesn't happen to me. Maybe I worked around it in CVS. Could you
try a snapshot? <http://db2latex.sourceforge.net/snapshot/> I tried
to track down where this was fixed, but I couldn't find it. Confusing.
Post by bvh
3. We have documents where the scaling factor for the imagedata has a fraction.
This didn't work with db2latex. However when rereading the docbook specification
before posting I saw that the scaling factor must be an integer so I'll have
to change those sources then.
Yep, the @scale attribute is a percentage (0--100). See also
<http://db2latex.sourceforge.net/reference/rn30re79.html>.
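To illustrate the percentage convention, here is a rough sketch, in Python rather than XSL, of turning an integer scale value into the fractional scale= factor accepted by the graphicx \includegraphics command. This is not DB2LaTeX's actual code. The helper name and the file name are hypothetical.

```python
def includegraphics(fileref: str, scale_percent: str) -> str:
    """Build an \\includegraphics command from an integer percentage.

    Hypothetical helper: `scale_percent` is the raw attribute text.
    """
    # int() rejects fractional text such as "62.5" with ValueError,
    # mirroring the integer-only rule discussed above.
    percent = int(scale_percent)
    if percent <= 0:
        raise ValueError("scale must be a positive integer percentage")
    factor = percent / 100
    return f"\\includegraphics[scale={factor:g}]{{{fileref}}}"


print(includegraphics("figure.eps", "50"))
```

A source document with a fractional value would fail at the int() call, which is roughly the situation described in issue 3.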




James Devenish
2003-12-13 02:09:02 UTC
Permalink
Post by James Devenish
Post by bvh
2. Line breaks in paragraphs
[...]
Post by James Devenish
Post by bvh
However when I convert to LaTeX with db2latex the single white line is
still there and becomes a hard line break in the typeset document.
[...]
Post by James Devenish
This doesn't happen to me. Maybe I worked around it in CVS.
Ah yes, found it. It's taken care of by the trim-outer template in
normalize-scape.mod.xsl (snapshots / CVS only).
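For readers who want a feel for this kind of whitespace clean-up, here is a minimal Python sketch of the behaviour of XPath's normalize-space() function. It is an assumption that this resembles what trim-outer does, since the template itself is not shown in this thread. The function name below is only illustrative.

```python
import re

# XPath 1.0 treats space, tab, carriage return and line feed as whitespace.
_XML_WHITESPACE = " \t\r\n"


def normalize_space(text: str) -> str:
    """Collapse whitespace runs to one space and trim both ends."""
    collapsed = re.sub(r"[ \t\r\n]+", " ", text)
    return collapsed.strip(_XML_WHITESPACE)


# Text content of the <para> example, including the blank line.
para_text = "\nThis is\n\none paragraph\n"
print(normalize_space(para_text))
```

A blank line in LaTeX source starts a new paragraph, so collapsing it to a single space keeps the text in one paragraph.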




bvh
2003-12-13 13:42:21 UTC
Permalink
Post by James Devenish
Ah yes, found it. It's taken care of by the trim-outer template in
normalize-scape.mod.xsl (snapshots / CVS only).
This answers my other question in the thread. I didn't realize that XPath is
part of XSLT.

cu bart
--
http://www.irule.be/bvh/


bvh
2003-12-13 13:37:48 UTC
Permalink
Post by James Devenish
Some examples are available in the snapshot 'samples' tarball or can be
<http://cvs.sourceforge.net/viewcvs.py/db2latex/db2latex/xsl/sample/test_entities/>.
OK. I'll try these.
Post by James Devenish
Post by bvh
However when I convert to LaTeX with db2latex the single white line is
still there and becomes a hard line break in the typeset document.
This doesn't happen to me. Maybe I worked around it in CVS. Could you
try a snapshot? <http://db2latex.sourceforge.net/snapshot/> I tried
to track down where this was fixed, but I couldn't find it. Confusing.
Yep, it is fixed in the snapshot. The last released version (0.7?) still
contains this problem, but the snapshot from 13-12 no longer has it.
Thanks!

Just out of curiosity: how do text transformations like these work in
XSL? I've only just started to play with XSL transformations, so I still
have a _very_ hard time navigating around complex XSLT stylesheets like
db2latex.

cu bart
--
http://www.irule.be/bvh/

