This is Bolderbast.Inducks.org, part of the inducks.org website.

How to use DIZNI

1. WHAT CAN DIZNI DO FOR YOU

If you are an indexer, the program DIZNI can do several things for you.
The most important things are:

- Check your input file(s)
- Generate internal (and isv) files
- Generate a dump file
- Generate output files

In the following chapters, we'll explain in more detail what and how.
But first, a chapter about what we assume you already know, and a chapter
about how to get the DIZNI program.


2. WHAT DO YOU NEED TO KNOW BEFORE READING THE REST

First of all, we assume you have read the "How to Index"/"More Detail" pages
on the Bolderbast site (http://bolderbast.inducks.org/xh2.html).
So you know about "input", "internal", and "output" files.

Secondly, you need to know something about command lines, batch files, and
directory structures, for instance as used in a DOS box in MS-Windows.
In the rest of this text we'll just assume you are an MS-Windows user.
Users of other operating systems (like Linux) are usually smart enough to
read the text anyway. 8-)

It's very handy if you know how to use CVS (http://www.cvshome.org),
but you can also do lots of things without it.


3. GETTING DIZNI ITSELF

For using DIZNI on MS-Windows, all you need is the file DIZNI.EXE. This is
available on: http://bolderbast.inducks.org/xh34.html

For other operating systems, you can try to compile the C++ source code of
DIZNI.
The source code is available through CVS (or ask HF by e-mail).
The "make" instruction is very simple: compile *.cpp, and link all object
files together.
(There is a gnu-Makefile available that does just that.)

There are also some example batch files (on CVS:
inducks/programs/dizni/batchfiles/).
These mainly illustrate what we're writing here.
In the examples below, we show the contents of these batch files.
The directories we use there are just examples.


4. GETTING THE INPUT FILES

We assume you will be running DIZNI starting with input files only.
If you want to start from the internal files, skip this chapter, as well as
the next 2 chapters.
You cannot make a dump in that case, but you can still make output files.

The input files are all available in CVS. In our examples, they are checked
out under the directory c:\fluks\i\inducks.

Some files are not in CVS as one file, but are split up into several files.
Before running DIZNI, you need to concatenate them by using a "copy" command.

And then there are a few files that are needed not only in the first step
(chapter 5), but also in other steps. It's handy to copy them to a place
other than the CVS tree.

See the file SUB-GETCVS.BAT:

  copy c:\fluks\i\inducks\stories\fsb\*.fsb d:\fluks\data\i-fsb.dbs
  copy c:\fluks\i\inducks\issues\fr\*.dbi d:\fluks\data\fr.dbi
  copy c:\fluks\i\inducks\issues\it\*.dbi d:\fluks\data\it.dbi
  copy c:\fluks\i\inducks\auxil\countries.dbx d:\fluks\data\
  copy c:\fluks\i\inducks\auxil\languages.dbx d:\fluks\data\
  copy c:\fluks\i\inducks\auxil\producers.dbx d:\fluks\data\
  copy c:\fluks\i\inducks\auxil\dbsnames.dbx d:\fluks\data\
  copy c:\fluks\i\inducks\auxil\equivs.dbx d:\fluks\data\

In this example, the file fr.dbi will be in d:\fluks\data\, which is the
"working directory" that we will use further on.
You need to create this directory yourself (mkdir). DIZNI will put its
internal and output files, as well as some scratch files, here.
DIZNI will make a lot of large files.
You will need at least 420 megabytes of disc space for the first step
(described in the next chapter).
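
For users who prefer a script to a batch file, the preparation above can be
sketched in Python. This is only a minimal sketch: the function names
"concatenate" and "has_enough_space" are ours, not part of DIZNI, and we
assume the split files may be joined in sorted file-name order. The 420
megabytes is the figure mentioned above.

```python
import glob
import shutil


def concatenate(pattern, target):
    """Join all files matching `pattern` into one file, like "copy *.dbi x.dbi".

    Assumption: the parts are joined in sorted file-name order.
    The target should not itself match the pattern.
    """
    parts = sorted(glob.glob(pattern))
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts


def has_enough_space(directory, needed_mb=420):
    """Return True if `directory` has at least `needed_mb` megabytes free."""
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= needed_mb * 1024 * 1024
```

A call like concatenate("c:\\fluks\\i\\inducks\\issues\\fr\\*.dbi",
"d:\\fluks\\data\\fr.dbi") would then stand in for the second "copy" line of
SUB-GETCVS.BAT.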


5. GENERATING THE INTERNAL FILES

5.0. RUNNING DIZNI THE STANDARD WAY

If all input files are in place, you can run (see DOIT-DIZNI-XX.BAT):

  dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\

to generate the internal files in the directory d:\fluks\data\ (always end
the directory name with a \).
This generation can take a long time (though on HF's fast machine it takes
less than 4 minutes), so be patient.
After the generation, you'll have a bunch of new files:

*.log1 and *.log2 and *.log3 - these files are informational (they contain
log messages).
If the directory d:\fluks\data\log\ exists, these files are written there.

*.ine, *.ins, *.inl, *.ind - these are the internal files. This is what you
did it all for.

*.isv - new since September 2005
If the directory d:\fluks\data\isv\ exists, these files are written there.

*.ut3 - these files can be thrown away.

a.ut2, s.ut2, unsolved.ut1 - these files are *only* needed for making a dump
file (chapter 6).
If you don't plan to do that, you can throw them away.
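
Cleaning up after the generation could be sketched like this in Python. The
function name "scratch_files" is our own invention; the file names are the
ones listed above. The sketch only lists the files, so you can look at the
list before deleting anything.

```python
import glob
import os

# Files that are only needed for making a dump file (chapter 6).
DUMP_ONLY = ("a.ut2", "s.ut2", "unsolved.ut1")


def scratch_files(data_dir, keep_for_dump=True):
    """List the files in `data_dir` that can be thrown away after "dizni -xx"."""
    removable = sorted(glob.glob(os.path.join(data_dir, "*.ut3")))
    if not keep_for_dump:
        for name in DUMP_ONLY:
            path = os.path.join(data_dir, name)
            if os.path.exists(path):
                removable.append(path)
    return removable
```

Passing each returned path to os.remove() then deletes the files. Leave
keep_for_dump at True if you still want to make a dump file.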


You can add several options to the "DIZNI -xx" command. The most
interesting are:

  -q  tells DIZNI to keep quiet (no messages on the screen).
  -o  read owner fields from OWNER.DBX
  -e  also read file EXTRA.DBI
  -s  also perform the options -xse -xss -xsl -xsi with standard linewidth
      etc.

The files OWNER.DBX and EXTRA.DBI are not official Inducks files.
You can create them for your own convenience. OWNER.DBX lists which comics
you own.
EXTRA.DBI can contain indexes of "unofficial" publications.
HF uses EXTRA.DBI to index the stories that he has as photocopies or as scans.
Some of us put a copy of these files in CVS, in inducks/localfiles.

Example:

  copy c:\fluks\i\inducks\localfiles\owner-hf.dbx d:\fluks\data\owner.dbx
  copy c:\fluks\i\inducks\localfiles\extra-hf.dbi d:\fluks\data\extra.dbi
  dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -o -e


5.1. RUNNING DIZNI WITH LESS MEMORY

If running "dizni -xx" causes problems (because it uses a LOT of memory), you
can try to run the following instead:

  dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -1
  dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -2
  dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -3
  dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -4

This does the same as in 5.0, except that it writes more intermediate data to
disc.
So you'll need more disc space. Also, it may be a bit slower.
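
The four runs could also be started from a small Python script, as sketched
here. We assume that dizni can be found on the PATH and that it returns a
non-zero exit code when a pass fails; neither is stated above, so check this
before relying on it. The function names are ours.

```python
import subprocess


def pass_commands(cvs_dir, data_dir, passes=(1, 2, 3, 4)):
    """Build the "dizni -xx" command line for each pass of chapter 5.1."""
    return [
        ["dizni", "-xx", "-C", cvs_dir, "-D", data_dir, "-%d" % number]
        for number in passes
    ]


def run_passes(cvs_dir, data_dir):
    """Run the passes in order and stop at the first one that fails."""
    for command in pass_commands(cvs_dir, data_dir):
        # Assumption: dizni signals failure through its exit code.
        # check=True then raises subprocess.CalledProcessError.
        subprocess.run(command, check=True)
```

Stopping at the first failure avoids running pass 2 on the incomplete
intermediate data of a failed pass 1.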


5.2. RUNNING DIZNI WITH FEWER FILES

If you don't want to download *all* the input files, it's OK to leave most of
them out.
You can run DIZNI anyway, but the results will be less complete, of course.
For instance, if you don't have the file DK.DBI, you won't get any Danish
reprints in your internal files (nor in any files generated from them).

There are only a few files that MUST exist when running DIZNI.
These are 3 of the DBX files copied in chapter 4 (they are all in the CVS
directory "auxil"): countries.dbx, languages.dbx, producers.dbx.

Besides this, at least one DBI file and at least one DBS file must exist.
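
Before a long run, the minimum requirements above can be checked with a few
lines of Python. This is a sketch only: "missing_inputs" is a name we made
up, and you have to collect the file names yourself (from the CVS tree and
the working directory), because we do not assume here where DIZNI looks for
each file.

```python
import os

# The three DBX files that must exist, according to the text above.
REQUIRED_DBX = ("countries.dbx", "languages.dbx", "producers.dbx")


def missing_inputs(file_names):
    """Return a list describing what is still missing for a DIZNI run."""
    names = [os.path.basename(name).lower() for name in file_names]
    problems = [name for name in REQUIRED_DBX if name not in names]
    for extension in (".dbi", ".dbs"):
        if not any(name.endswith(extension) for name in names):
            problems.append("at least one *%s file" % extension)
    return problems
```

An empty list means the minimum set of input files is present.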


6. GENERATING THE DUMP FILE

The dump file can be very big: it contains *all* information from *all* input
files, as well as log messages.
The information is sorted by story code, so this file is handy when you want
to compare the indexes of several countries.
BLi wrote a text about how he worked with the dump file. See appendix A.

To create a dump file, you must still have the files from the generation
(a.ut2, s.ut2, unsolved.ut1) as well as some internal files (*.dbx).
Then type (see GEN-DUMP.BAT):

  dizni -xud -c yy -D d:\fluks\data\

This will create a file d:\fluks\data\dump.udm. At the moment (November 2004),
the file is 110 megabytes.
(When BLi wrote his text about dump files in 1999, the file was only 20 MB!)

  dizni -xud -c nl -D d:\fluks\data\

This will create a file d:\fluks\data\dump.udm, with only the stories in it
that were reprinted in the country "nl".


7. GENERATING "STANDARD" OUTPUT FILES

Note: most of the "standard" output files can also be generated by adding -s
to the dizni -xx call (see above).


7.0 ISSUE AND STORY FILES

On the Inducks website (http://coa.inducks.org/inducks/files/) there are
output files available.
You can run DIZNI yourself to generate these files from the internal files.
The options are:

  dizni -xse -D d:\fluks\data\
  dizni -xss -D d:\fluks\data\

These two runs will generate lots of TXT files.
The story indexes are written to d:\fluks\data\stories\ (if that directory
exists, else to d:\fluks\data\).
The issue indexes are written to d:\fluks\data\issues\ (if that directory
exists, else to d:\fluks\data\).


7.1 LEGEND FILES

See DOIT-COA.BAT:

  dizni -xsl -D d:\fluks\data\

This creates a bunch of TXT files with legend information (persons, heroes,
etc.) in the directory d:\fluks\data\legends\ (if that directory exists, else
in d:\fluks\data\).
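
Since DIZNI only uses the subdirectories log, isv, stories, issues and
legends when they already exist, it can help to create them before the first
run. A minimal Python sketch (the function name "make_output_dirs" is ours):

```python
import os

# Optional subdirectories mentioned in chapters 5 and 7.
OPTIONAL_SUBDIRS = ("log", "isv", "stories", "issues", "legends")


def make_output_dirs(data_dir):
    """Create the optional subdirectories of `data_dir` that do not exist yet."""
    created = []
    for name in OPTIONAL_SUBDIRS:
        path = os.path.join(data_dir, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created
```

Directories that already exist are left alone, so the function can be called
before every run.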


7.2 INDEXED ISSUES FILES

  dizni -xsi -f indexed-issues.txt -c all -D d:\fluks\data\

This creates the file "indexed-issues.txt", listing all issues of all
countries.

  dizni -xsi -f indexed-issues-dk.txt -c dk -D d:\fluks\data\

This creates the file "indexed-issues-dk.txt", listing all issues of
Denmark (dk).


You can also make listings of "owned" and "not owned" comics (if you used
OWNER.DBX in the generation step), using the -o option.
And there's a -n option to make nice output that can serve as a checklist.

Examples:

  dizni -xsi -f owned-issues.txt -c all -o -D d:\fluks\data\
  dizni -xsi -f owned-nl-issues.txt -c nl -o -D d:\fluks\data\
  dizni -xsi -f owned-us-issues.txt -c us -o -D d:\fluks\data\
  dizni -xsi -f owned-de-issues.txt -c de -o -D d:\fluks\data\
  dizni -xsi -f nl-checklist.txt -c nl -o -n -D d:\fluks\data\
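
If you want such listings for many countries, the command lines can be
generated instead of typed. A sketch in Python, using only the options shown
in the examples above; the function name and the file-name pattern
"owned-XX-issues.txt" simply follow those examples.

```python
def owned_issue_commands(countries, data_dir):
    """Build one "dizni -xsi ... -o" command line per country code."""
    commands = []
    for country in countries:
        if country == "all":
            file_name = "owned-issues.txt"
        else:
            file_name = "owned-%s-issues.txt" % country
        commands.append(
            ["dizni", "-xsi", "-f", file_name, "-c", country, "-o", "-D", data_dir]
        )
    return commands
```

Each list in the result can be passed to subprocess.run(), assuming dizni is
on the PATH.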


8. MORE SOPHISTICATED OUTPUT OPTIONS

Using DIZNI -xos, you can make selections in the data.
For instance, to make a CB index of all stories that I own, with all known
reprints in the world:

  dizni -xos -u c -a CB -c all -o yes -f CBindex.txt -D d:\fluks\data\ -w 132

Run "dizni -xos -?" for a full explanation of the options.


9. OTHER DIZNI UTILITIES

If you want to find out *all* options of DIZNI, try "DIZNI -?" on a command
line.
The output will give a list of main options.
For every main option (for instance "un"), you can try "DIZNI -xun -?".
This will give you a list of all switches that you can use with that main
option.

In the following paragraphs, we describe the most used utilities.
Other DIZNI options are mainly meant for HF only.


9.1 MAXIMISE

With this option, you can add information from the INDUCKS story file to your
input file.
The call is (see MAXIMISE.BAT):

  dizni -xun -f dekopie.dbi -9 -c nl -D d:\fluks\data\

The directory d:\fluks\data\ should contain dekopie.dbi, stories.ins, and the
dbx files (no other files are used).
DIZNI generates an output file d:\fluks\data\maximised.dbi.

The -c option indicates the country of the DBI file.
This is important for recognising special story items (e.g. HC-coded items in
a Dutch DBI file).
There are several items that DIZNI can add to the file.
       -1   add pages field (and pagelayout, brokpg)
       -2   add app / xapp field
       -3   add hero field
       -4   add plot field
       -5   add writ field
       -6   add art field
       -7   add ink field
       -8   add desc field
       -9   change internal field (to @ if unsolved, to ! if special, to ?
            if idc)

All additions (so not the "-9" changes) are logged in the output file itself,
in a hidden comment, always starting with [/`maxim:
This is done so you can edit the file afterwards, and change the "[/`maxim:"
text to something more meaningful.
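
To find the places that need editing, you can search the output file for the
marker. A small Python sketch; the function name "maxim_lines" is ours, and
it only assumes that the marker text appears literally in the file, as
described above.

```python
MARKER = "[/`maxim:"


def maxim_lines(lines):
    """Yield (line number, text) for every line that holds a maximise comment."""
    for number, line in enumerate(lines, start=1):
        if MARKER in line:
            yield number, line.rstrip("\n")
```

The function accepts any sequence of lines, for instance an opened
maximised.dbi file. Which text encoding to open that file with is not
described here, so that is left to you.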


9.2 CHECKEQUIV

The call is (see CHECKEQUIV.BAT):

  dizni -xuq -C c:\fluks\i\inducks\ -D d:\fluks\data\ -s -g

This option needs the DBX files, all applicable DBI files and headers.ine.
It writes a report to d:\fluks\data\equiv.log.
This report can be used to spot errors. For instance:

    Reporting equivalents of:
      #de1 = de/LTB 301
      #nl2 = nl/PO3  88
    Not listed in all 2 issues:
      D 98172/49                is(+) or is not(-) in: #de1- #nl2+
      D 98172/50                is(+) or is not(-) in: #de1+ #nl2-

Apparently, the "nl" entry has a wrong page count (unless the issues are not
fully equivalent, of course).
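
For long reports, the "is(+) or is not(-)" lines can be picked out with a
script. The Python sketch below assumes that every such line looks exactly
like the two lines in the example above; we have not checked other forms of
the report. The function name "parse_presence" is ours.

```python
import re

# Matches lines like:
#   D 98172/49                is(+) or is not(-) in: #de1- #nl2+
LINE = re.compile(
    r"^\s*(?P<code>.+?)\s+is\(\+\) or is not\(-\) in:\s*(?P<flags>.*)$"
)
FLAG = re.compile(r"#(\w+)([+-])")


def parse_presence(line):
    """Return (story code, {issue tag: present?}) or None for other lines."""
    match = LINE.match(line)
    if match is None:
        return None
    flags = FLAG.findall(match.group("flags"))
    presence = {tag: sign == "+" for tag, sign in flags}
    return match.group("code"), presence
```

Lines of the report that do not have this form (such as the "#de1 = ..."
lines) give None and can be skipped.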


9.3 LANGUAGE FILE CHECK

The call is:

  dizni -xut -c nl -C c:\fluks\i\inducks\ -D d:\fluks\data\

This option reads the file heroes-nl.dbi and produces a file heroes-nl.cpy.
This cpy file contains a checked version of all characters. Missing characters are
added (with empty translation), while wrong character names are noted.


9.4 CHECK (AND COPY) DBI FILE

The call is (see CHECKEDCOPY.BAT):

  dizni -xuc -f nl.dbi -C c:\fluks\i\inducks\ -D d:\fluks\data\ -c -h

This option reads nl.dbi and makes an "improved" copy of it
(d:\fluks\data\nl.cpy).
In this CPY file, the most obvious errors are automatically corrected (if
the -c option is given).
These corrections are also marked in the CPY file (if the -h option is given),
for instance like this: [/`hero changed, was: D+M]
It uses the DBL input files from the CVS tree, as well as the usual DBX files
and OLDCODES.DBX.
The file "nl.dbi" should be in d:\fluks\data\; DIZNI does *not* look in the
CVS tree for it (because it cannot guess in which subdir the file is).

Error messages are written to DIZNI.LOG.


APPENDIX A. HOW TO WORK WITH THE DUMP FILE

Text of BLi (October 13th 1999); heavily edited by HF on 10-11-2004.

The indexer's dump is a file containing *all* information from the story *and*
issue input files gathered conveniently in one place.
It looks for instance like this:

  ------------
    /           D 92311      12           VR      DD  [xapp:GL,HDL,US,DD]
  de/93-41b     D 92311      12           VR      DD  Die Glücksritter
  dk/AA93-38c   D 92311      12           VR      DD  Den store gevinst
  fr/JM 2287f   D 92311      12           VR      DD  La chance tourne
  it/MG  451k   D 92311                              
  nl/94-26b     D 92311      12           VR      DD  [xapp:GL,HDL,US,DD]
  se/93-38c     D 92311      12                   DD  Många turer
  uk/M93-39c    D 92311                               Luck Of The Draw!
  ------------

It's almost the only file I'm using lately for any research.
This dump file can't be generated from the internal files.
You need *.ut2 files which are temporary files in the process of generating
the internal files.
See chapters 5 and 6.

HF will certainly upload a zipped version of the complete dump if you ask for
it.
The dump file generation also has an option to make a dump file with only
the entries of stories with reprints in a certain country.

These are good for checking for errors or locating codes not yet published
in a certain country.
They can also easily be made from the complete dump file with SKO's lfi.
E.g. the command line

  lfi /p1 "de/" dump.udm > de.txt

results in a dump called de.txt with the entries for stories with reprints in
Germany.

The command line

  lfi /v /p1 "de/" dump.udm > -de.txt

results in a dump file with the rest of the stories, which have no reprint in
Germany.
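
If you don't have lfi, the same kind of selection can be sketched in Python.
This is not a description of how lfi works. We assume that the entries in
the dump are separated by lines of dashes (as in the example above), that a
reprint line starts with the country code and a slash, and that a whole
entry is kept or dropped as a unit. The function names are ours.

```python
def split_entries(lines):
    """Group dump lines into entries; a line of dashes separates entries."""
    entry = []
    for line in lines:
        line = line.rstrip("\n")
        stripped = line.strip()
        if stripped and set(stripped) == {"-"}:
            if entry:
                yield entry
            entry = []
        else:
            entry.append(line)
    if entry:
        yield entry


def entries_for_country(lines, country, invert=False):
    """Yield entries with (or, if invert is True, without) a reprint line."""
    prefix = country + "/"
    for entry in split_entries(lines):
        found = any(line.startswith(prefix) for line in entry)
        if found != invert:
            yield entry
```

With invert=True you get the entries without a reprint in that country,
comparable to the second lfi call above.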

With the help of HF's diznutil.exe these files can be searched very
effectively.
The trick is that the contents of a story entry in the dump file (like the
example above) are put on a single line, so all the information from all the
files can be searched with SKO's find program (lfi).
I'll give an actual example (I identified a code especially for this):

I have this unidentified entry:

  74-13d   W              6  @        PAl     AR  Wo die Milch herkommt
  [xapp:ARK,Duchess] [/xapp Hunde von AR?!!?]

The search

  f /p15 W -de1|f /p28 " 6"|f /p49 AR|f PAl|d D > find.txt

results in a find.txt

  dk/AA72-37f   W ARIS  1-03  6           PAl     AR  Mælkens vej
  se/72-37f     W ARIS  1-03  6                   AR  kommer med mjölken 
  w /ARIS  1-03 W ARIS  1-03  6           PAl     AR  
  ------------

which obviously is the story searched for (Mælkens and mjölken are the
Scandinavian words for Milch = milk).

This was the ideal case because it really was the only entry displayed.
Usually there are a few more that could match, but the descriptions mostly
reveal if it's the story in question or not.

Let me explain. First I renamed SKO's lfi.exe to f.exe (so I don't have to
type that much).
For the same reason I renamed diznutil.exe to d.exe.
The dump file I was searching (no German reprints) has the name -de1.

The command above means:

Look in the file -de1 for lines with a "W" at position 15 (= W-code), with
" 6" at position 28 (page count),
with AR at position 49 (hero Aristocats) and containing the artist PAl (no
position is necessary here, but of course it is also possible to use one).

The file -de1 already contains the information on single lines for searching
(hence the 1 at the end).
The "d D" at the end converts the result back to separate lines, in readable
format, and the output goes into find.txt.
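
The position-based search can be imitated in Python as follows. This sketch
rests on an assumption: that "/pN text" means the text must start at column
N of the line, counting from 1. That is how the explanation above reads, but
we have not checked it against lfi itself. The function names are ours.

```python
def has_at(line, position, text):
    """True if `text` starts at column `position` (counting from 1)."""
    start = position - 1
    return line[start:start + len(text)] == text


def search(lines, positioned=(), anywhere=()):
    """Yield lines that match all positioned and all free-floating texts."""
    for line in lines:
        fixed_ok = all(has_at(line, pos, text) for pos, text in positioned)
        loose_ok = all(text in line for text in anywhere)
        if fixed_ok and loose_ok:
            yield line
```

The search in the example then becomes search(lines, positioned=[(15, "W"),
(28, " 6"), (49, "AR")], anywhere=["PAl"]), where lines are the lines of the
single-line dump file.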

Is it clear from what I wrote? I hope so, because I want to stop now.

Just two last commands

  diznutil d dump1

sorts the content of dump.udm in searchable single line format into dump1

  diznutil D dump.udm

converts it back (essentially this is what the |d D does at the end of the
search command).

--end--
This page was generated on 2007-05-11 by DVEGEN 4.5a © Harry Fluks 2003.
For more information contact Harry Fluks (hfl at inducks.org - replace the at)