Fun with LRS

I’ve been using my Sony PRS-600 for about a month now, and I’ve collected a number of public domain titles for it in the LRF format. Today I discovered the tools lrf2lrs and lrs2lrf from calibre. These tools allow for transforming the LRF files to and from their source code format, LRS, and after playing around a bit I found it to be a really great system, especially for anyone who’s vaguely used to CSS and HTML/XML.

Setting up the environment

In order to use the above tools, you’ll have to install calibre and then install the command line tools under the preferences menu. From there, you can run the two tools from the command line. All you’ll need to do, in terms of syntax, is to be able to run:

lrf2lrs <filename>
lrs2lrf <filename>

You can add the -o <output-filename> option to either command to redirect the output, though I haven’t personally found this essential. By default, each tool reuses the input file name with the extension changed to .lrs or .lrf. In my workflow I just make sure to save a copy of the LRS file when necessary so that I can roll back changes.
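For anyone who would rather script this than type it, here is a minimal Python sketch of the same workflow. It only assembles the command lines described above and makes a backup copy of a file before you edit it. The helper names build_command and backup, and the .bak suffix, are my own inventions for illustration and are not part of calibre.

```python
import shutil
from pathlib import Path


def build_command(tool, source, output=None):
    """Return the argument list for lrf2lrs or lrs2lrf (hypothetical helper)."""
    if tool not in ("lrf2lrs", "lrs2lrf"):
        raise ValueError(f"unexpected tool: {tool}")
    command = [tool, str(source)]
    if output is not None:
        # -o <output-filename> redirects the output, as described above.
        command += ["-o", str(output)]
    return command


def backup(path, suffix=".bak"):
    """Copy a file next to itself with an extra suffix and return the copy's path."""
    path = Path(path)
    copy = path.with_name(path.name + suffix)
    shutil.copy2(path, copy)
    return copy
```

With the calibre command line tools installed, something like subprocess.run(build_command("lrf2lrs", "book.lrf"), check=True) should then run the conversion, and backup("book.lrs") keeps a copy to roll back to.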

Editing the LRS files

If you aren’t all that familiar with code, especially XML, don’t expect any miracles. That said, the meta information, such as the author and title, is very easy to change: just replace the text that you see with the text that you want.
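If you have many files to fix, the same change can be scripted. The sketch below is only an illustration: it assumes the title and author live in elements named Title and Author, and the sample fragment is a simplified stand-in rather than a complete LRS file. The function name set_metadata is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up fragment; a real LRS file has more elements around this.
SAMPLE = """<BookInfo>
  <Title>Old Title</Title>
  <Author>Old Author</Author>
</BookInfo>"""


def set_metadata(lrs_text, title=None, author=None):
    """Return the XML text with the Title and/or Author text replaced."""
    root = ET.fromstring(lrs_text)
    for tag, value in (("Title", title), ("Author", author)):
        if value is None:
            continue  # leave this field untouched
        element = root.find(f".//{tag}")
        if element is None:
            raise KeyError(f"no <{tag}> element found")
        element.text = value
    return ET.tostring(root, encoding="unicode")


print(set_metadata(SAMPLE, title="New Title"))
```

Note that ElementTree rewrites the whole document when it serializes, so keep a copy of the original LRS before overwriting anything.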

Editing the Table of Contents

The top of the file contains the table of contents, under the tag <TOC>, with entries as <TocLabel refobj="X">Chapter Name</TocLabel>, where X is the target objid. There is also generally a “refpage” attribute in TocLabel, though you can safely ignore it, since it will be recalculated when the file is converted back into the LRF format.
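To see at a glance what the table of contents points at, a short Python sketch like the following can list the entries and flag any refobj that has no matching objid. The sample is a cut-down fragment I made up for illustration (the Book wrapper element in particular is a placeholder), and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up fragment: one TOC entry points at an objid that does not exist.
SAMPLE = """<Book>
  <TOC>
    <TocLabel refobj="12" refpage="10">Chapter One</TocLabel>
    <TocLabel refobj="47" refpage="45">Chapter Two</TocLabel>
  </TOC>
  <Main>
    <Page objid="10">
      <TextBlock objid="12"><P>Text of chapter one</P></TextBlock>
    </Page>
  </Main>
</Book>"""


def toc_entries(root):
    """Return (refobj, chapter name) pairs in document order."""
    return [(label.get("refobj"), (label.text or "").strip())
            for label in root.iter("TocLabel")]


def missing_targets(root):
    """Return the refobj values that match no objid anywhere in the file."""
    known = {el.get("objid") for el in root.iter() if el.get("objid") is not None}
    return [ref for ref, _ in toc_entries(root) if ref not in known]


root = ET.fromstring(SAMPLE)
for refobj, name in toc_entries(root):
    print(refobj, name)
print("missing:", missing_targets(root))
```

The refpage attribute is deliberately ignored here, since it gets recalculated anyway.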

objid: object IDs

The important thing to know about the objid (think object ID) is that it must be unique. It does not have to be sequential, however, since these numbers are recalculated when you run the conversion to LRF (and back to LRS, if so desired). The objects to which objids are assigned can be things such as a Page, TextBlock, Image, BlockStyle, TextStyle, etc. To keep things straight, I often give new objects fresh ID numbers way above the maximum in the document, then convert to LRF and back to LRS. This way, the new LRS has sequential IDs again.
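Checking uniqueness by eye gets tedious in a long book, so here is a small Python sketch that reports duplicated objid values and the first number above the current maximum. It assumes every objid is a plain integer; the sample fragment and the function name objid_report are made up for illustration.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Made-up fragment with a deliberate duplicate (objid 2 appears twice).
SAMPLE = """<Book>
  <Main>
    <Page objid="1">
      <TextBlock objid="2"><P>First</P></TextBlock>
      <TextBlock objid="2"><P>Second</P></TextBlock>
    </Page>
  </Main>
  <Style>
    <TextStyle objid="7"/>
  </Style>
</Book>"""


def objid_report(lrs_text):
    """Return (sorted duplicate objids, first unused number above the maximum)."""
    root = ET.fromstring(lrs_text)
    ids = [int(el.get("objid")) for el in root.iter()
           if el.get("objid") is not None]
    counts = Counter(ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    next_free = max(ids, default=0) + 1
    return duplicates, next_free


duplicates, next_free = objid_report(SAMPLE)
print("duplicates:", duplicates)
print("next free objid:", next_free)
```

Any number from next_free upwards is safe for a new object, and the round trip through LRF will renumber everything afterwards.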

Body content and styling

The bulk of the text of the LRS document is contained within Main, which holds a number of Page and TextBlock elements. These in turn contain paragraphs (P), line breaks (CR), and formatting tags such as Italic.

After Main, there is a Style block towards the end of the LRS file. It contains style elements with their own objid numbers, which are referenced in the attributes of the elements within Main. This is vaguely like CSS, though it is more awkward to write, since the styles are identified only by numbers, which can and will be re-indexed when converting between the LRS and LRF formats.
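Because the styles are only numbers, it helps to have something that follows the references for you. The Python sketch below looks up the style objects that each TextBlock points to. It assumes the references are stored in attributes named blockstyle and textstyle, as in the files I have looked at; the sample fragment, its attribute values, and the function name resolve_styles are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up fragment: one TextBlock referencing two style objects by objid.
SAMPLE = """<Book>
  <Main>
    <Page objid="1">
      <TextBlock objid="2" blockstyle="10" textstyle="11"><P>Hello</P></TextBlock>
    </Page>
  </Main>
  <Style>
    <BlockStyle objid="10" blockwidth="575"/>
    <TextStyle objid="11" fontsize="100"/>
  </Style>
</Book>"""


def resolve_styles(lrs_text):
    """Map each TextBlock objid to the attributes of the styles it references."""
    root = ET.fromstring(lrs_text)
    style_block = root.find("Style")
    styles = {}
    if style_block is not None:
        styles = {el.get("objid"): el for el in style_block
                  if el.get("objid") is not None}
    resolved = {}
    for block in root.iter("TextBlock"):
        refs = {}
        for attr in ("blockstyle", "textstyle"):
            target = styles.get(block.get(attr))
            # None marks a reference that points at no style object.
            refs[attr] = dict(target.attrib) if target is not None else None
        resolved[block.get("objid")] = refs
    return resolved


for objid, refs in resolve_styles(SAMPLE).items():
    print(objid, refs)
```

A None in the result is a quick hint that a style number was mistyped or that the style object was deleted.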

Conclusion

There is obviously much, much more to creating files from LRS, and to the tricks of editing these source files in more depth. Starting from an existing document, however, it is not too painful to make certain rearrangements and additions to the structure, and in some cases to simplify code blocks which may have been mangled by conversion or user error. With some trial and error, and attention to detail in maintaining proper XML, it is nice to be able to control almost all aspects of a book by editing a single file. Additionally, even if your target is not LRF, conversion to EPUB or other formats is relatively easy using calibre and other open source tools available online.
