How to use DIZNI
1. WHAT CAN DIZNI DO FOR YOU
If you are an indexer, the program DIZNI can do several things for you.
The most important things are:
- Check your input file(s)
- Generate internal (and isv) files
- Generate a dump file
- Generate output files
In the following chapters, we'll explain in more detail what and how.
But first, a chapter about what we assume you already know, and a chapter
about how to get the DIZNI program.
2. WHAT DO YOU NEED TO KNOW BEFORE READING THE REST
First of all, we assume you have read the "How to Index"/"More Detail" pages
on the Bolderbast site (http://bolderbast.inducks.org/xh2.html).
So you know about "input", "internal", and "output" files.
Secondly, you need to know something about command lines, batch files, and
directory structures, for instance as used in a DOS box in MS-Windows.
In the rest we'll just assume you are an MS-Windows user.
Users of other operating systems (like Linux) are usually smart enough to
read the text anyway. 8-)
It's very handy if you know how to use CVS (http://www.cvshome.org),
but you can also do lots of things without.
3. GETTING DIZNI ITSELF
For using DIZNI on MS-Windows, all you need is the file DIZNI.EXE. This is
available on: http://bolderbast.inducks.org/xh34.html
For other operating systems, you can try to compile the C++ source code of
DIZNI.
The source code is available through CVS (or ask HF by e-mail).
The "make" instruction is very simple: compile *.cpp, and link all object
files together.
(There is a gnu-Makefile available that does just that.)
There are also some example batch files (on CVS:
inducks/programs/dizni/batchfiles/).
These mainly illustrate what we're writing here.
In the examples below, we're showing the contents of these batchfiles.
The directories we use there are just an example.
4. GETTING THE INPUT FILES
We assume you will be running DIZNI starting with input files only.
If you want to start from the internal files, skip this chapter, as well as
the next 2 chapters.
You cannot make a dump in that case, but you can still make output files.
The input files are all available on CVS. For instance under directory
c:\fluks\i\inducks.
Some files are not in CVS as one file, but are split up in several files.
Before running DIZNI, you need to concatenate them by using a "copy" command.
And then there are a few files that are not only needed in the first step
(chapter 5),
but also in other steps. It's handy to copy them to a place other than the
CVS tree.
See the file SUB-GETCVS.BAT:
copy c:\fluks\i\inducks\stories\fsb\*.fsb d:\fluks\data\i-fsb.dbs
copy c:\fluks\i\inducks\issues\fr\*.dbi d:\fluks\data\fr.dbi
copy c:\fluks\i\inducks\issues\it\*.dbi d:\fluks\data\it.dbi
copy c:\fluks\i\inducks\auxil\countries.dbx d:\fluks\data\
copy c:\fluks\i\inducks\auxil\languages.dbx d:\fluks\data\
copy c:\fluks\i\inducks\auxil\producers.dbx d:\fluks\data\
copy c:\fluks\i\inducks\auxil\dbsnames.dbx d:\fluks\data\
copy c:\fluks\i\inducks\auxil\equivs.dbx d:\fluks\data\
In this example, the file fr.dbi will be in d:\fluks\data\, which is the
"working directory" that we will use further on.
You need to create (mkdir) this directory. DIZNI will put its internal and
output files, as well as some scratch files here.
DIZNI will make a lot of large files.
You will need at least 420 megabytes of disc space for the first step
(described in the next chapter).
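For users who cannot use the DOS "copy" command, the concatenation step and the disc space requirement can be sketched in Python. This is only an illustration, not part of DIZNI: the function names are made up here, and joining the parts in alphabetical order is an assumption.

```python
import shutil
from pathlib import Path


def concatenate(parts_dir, pattern, target):
    """Join all files in parts_dir that match pattern into one target file."""
    # Assumption: the parts are joined in alphabetical order of file name.
    parts = sorted(Path(parts_dir).glob(pattern))
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return [part.name for part in parts]


def has_free_space(directory, needed_mb=420):
    """True if the drive holding directory has at least needed_mb megabytes free."""
    return shutil.disk_usage(directory).free >= needed_mb * 1024 * 1024
```

A call like concatenate(r"c:\fluks\i\inducks\issues\fr", "*.dbi", r"d:\fluks\data\fr.dbi") would then correspond to the second copy line of SUB-GETCVS.BAT.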
5. GENERATING THE INTERNAL FILES
5.0. RUNNING DIZNI THE STANDARD WAY
If all input files are in place, you can run (see DOIT-DIZNI-XX.BAT):
dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\
to generate the internal files in the directory d:\fluks\data\ (always end
the directory name with a \).
This generation can take a long time (though on HF's fast machine it takes
less than 4 minutes), so be patient.
After the generation, you'll have a bunch of new files:
*.log1 and *.log2 and *.log3 - these files are informational (they contain
log messages).
If the directory d:\fluks\data\log\ exists, these files are written there.
*.ine, *.ins, *.inl, *.ind - these are the internal files. These are what you
did it all for.
*.isv - new since September 2005
If the directory d:\fluks\data\isv\ exists, these files are written there.
*.ut3 - these files can be thrown away.
a.ut2, s.ut2, unsolved.ut1 - these files are *only* needed for making a dump
file (chapter 6).
If you don't plan to do that, you can throw them away.
To the command to run "DIZNI -xx", you can add several options. The most
interesting are:
-q tells DIZNI to keep quiet (no messages on the screen).
-o read owner fields from OWNER.DBX
-e also read file EXTRA.DBI
-s also perform the options -xse -xss -xsl -xsi with standard linewidth
etc.
The files OWNER.DBX and EXTRA.DBI are not official Inducks files.
You can create them for your own convenience. OWNER.DBX lists which comics
are owned by you.
EXTRA.DBI can contain indexes of "unofficial" publications.
HF uses EXTRA.DBI to index the stories that he has in xerox form or as scans.
Some of us put a copy of these files in CVS, in inducks/localfiles.
Example:
copy c:\fluks\i\inducks\localfiles\owner-hf.dbx d:\fluks\data\owner.dbx
copy c:\fluks\i\inducks\localfiles\extra-hf.dbi d:\fluks\data\extra.dbi
dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -o -e
5.1. RUNNING DIZNI WITH LESS MEMORY
If running "dizni -xx" causes problems (because it uses a LOT of memory), you
can try to run the following instead:
dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -1
dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -2
dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -3
dizni -xx -C c:\fluks\i\inducks\ -D d:\fluks\data\ -4
This does the same as in 5.0, except that it writes more intermediate data to
disc.
So you'll need more disc space. Also, it may be a bit slower.
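If you prefer a script over typing the four calls, the command lines can be built as in the Python sketch below. The function names are hypothetical; the sketch only assembles the arguments shown above and makes sure that both directory names end with a \.

```python
def with_trailing_backslash(directory):
    """Return directory with a single trailing backslash appended if missing."""
    return directory if directory.endswith("\\") else directory + "\\"


def low_memory_commands(cvs_dir, data_dir):
    """Build the four 'dizni -xx' calls of section 5.1 as argument lists."""
    base = [
        "dizni", "-xx",
        "-C", with_trailing_backslash(cvs_dir),
        "-D", with_trailing_backslash(data_dir),
    ]
    # One call per step: -1, -2, -3, -4, in that order.
    return [base + ["-%d" % step] for step in range(1, 5)]
```

Each list can be passed to subprocess.run() in turn. Whether DIZNI reports a failed step through its exit status is not documented here, so check the log files before starting the next step.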
5.2. RUNNING DIZNI WITH FEWER FILES
If you don't want to download *all* the input files, it's OK to leave most of
them out.
You can run DIZNI anyway, but the results will be less complete, of course.
For instance, if you don't have the file DK.DBI, you won't get any Danish
reprints in your internal files
(nor in any files generated from that).
There are only a few files that MUST exist when running DIZNI.
These are 3 of the DBX files copied in chapter 4 (they are all in the CVS
directory "auxil"): countries.dbx, languages.dbx, producers.dbx.
Besides this, at least one DBI file and at least one DBS file must exist.
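The minimum requirements above can be checked before a run with a small Python sketch like this one. It is not part of DIZNI; the function name is made up, and treating file names as case-insensitive is an assumption. It works on a plain list of file names, so it does not assume where DIZNI looks for each file.

```python
REQUIRED_DBX = ("countries.dbx", "languages.dbx", "producers.dbx")


def missing_inputs(filenames):
    """Return a list describing which minimum input files are missing."""
    # Assumption: file names are compared case-insensitively.
    names = [name.lower() for name in filenames]
    missing = [required for required in REQUIRED_DBX if required not in names]
    if not any(name.endswith(".dbi") for name in names):
        missing.append("at least one DBI file")
    if not any(name.endswith(".dbs") for name in names):
        missing.append("at least one DBS file")
    return missing
```

An empty result means the minimum set is present; the list of names can come from os.listdir() on the directories you use.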
6. GENERATING THE DUMP FILE
The dump file can be very big: it contains *all* information from *all* input
files, as well as log messages.
The information is sorted by story code, so this file is handy when you want
to compare the indexes of several countries.
BLi wrote a text about how he worked with the dump file. See appendix A.
To create a dump file, you must still have the files from the generation
(a.ut2, s.ut2, unsolved.ut1) as well as some internal files (*.dbx).
Then type (see GEN-DUMP.BAT):
dizni -xud -c yy -D d:\fluks\data\
This will create a file d:\fluks\data\dump.udm. At the moment (November 2004),
the file is 110 megabytes.
(When BLi wrote his text about dump files in 1999, the file was only 20 MB!)
dizni -xud -c nl -D d:\fluks\data\
This will create a file d:\fluks\data\dump.udm, with only the stories in it
that were reprinted in the country "nl".
7. GENERATING "STANDARD" OUTPUT FILES
Note: most of the "standard" output files can also be generated by adding -s
to the dizni -xx call (see above).
7.0 ISSUE AND STORY FILES
On the Inducks website (http://coa.inducks.org/inducks/files/) there are
output files available.
You can run DIZNI yourself to generate these files from the internal files.
The options are:
dizni -xse -D d:\fluks\data\
dizni -xss -D d:\fluks\data\
These two runs will generate lots of TXT files.
The story indexes are written to d:\fluks\data\stories\ (if that directory
exists, else to d:\fluks\data\).
The issue indexes are written to d:\fluks\data\issues\ (if that directory
exists, else to d:\fluks\data\).
7.1 LEGEND FILES
See DOIT-COA.BAT:
dizni -xsl -D d:\fluks\data\
This creates a bunch of TXT files with legend information (persons,
heroes, etc.) in
the directory d:\fluks\data\legends\ (if that directory exists, else to
d:\fluks\data\).
7.2 INDEXED ISSUES FILES
dizni -xsi -f indexed-issues.txt -c all -D d:\fluks\data\
This creates the file "indexed-issues.txt", listing all issues of all
countries.
dizni -xsi -f indexed-issues-dk.txt -c dk -D d:\fluks\data\
This creates the file "indexed-issues-dk.txt", listing all issues of
Denmark (dk).
You can also make listings of "owned" and "not owned" comics (if you used
OWNER.DBX in the generation step), using the -o option.
And there's a -n option to make nice output that can serve as a checklist.
Examples:
dizni -xsi -f owned-issues.txt -c all -o -D d:\fluks\data\
dizni -xsi -f owned-nl-issues.txt -c nl -o -D d:\fluks\data\
dizni -xsi -f owned-us-issues.txt -c us -o -D d:\fluks\data\
dizni -xsi -f owned-de-issues.txt -c de -o -D d:\fluks\data\
dizni -xsi -f nl-checklist.txt -c nl -o -n -D d:\fluks\data\
8. MORE SOPHISTICATED OUTPUT OPTIONS
Using DIZNI -xos, you can make selections in the data.
For instance a CB index of all stories that I own, with all known reprints in the world:
dizni -xos -u c -a CB -c all -o yes -f CBindex.txt -D d:\fluks\data\ -w 132
Run "dizni -xos -?" for a full explanation of the options.
9. OTHER DIZNI UTILITIES
If you want to find out *all* options of DIZNI, try "DIZNI -?" on a command
line.
The output will give a list of main options.
For every main option (for instance "un"), you can try "DIZNI -xun -?".
This will give you a list of all switches that you can use with that main
option.
In the following paragraphs, we describe the most used utilities.
Other DIZNI options are mainly meant for HF only.
9.1 MAXIMISE
With this option, you can add information from the INDUCKS story file to your
input file.
The call is (see MAXIMISE.BAT):
dizni -xun -f dekopie.dbi -9 -c nl -D d:\fluks\data\
The directory d:\fluks\data\ should contain dekopie.dbi, stories.ins, and the
dbx files (no other files are used).
DIZNI generates an output file d:\fluks\data\maximised.dbi.
The -c option indicates the country of the DBI file.
This is important for recognising special story items (e.g. HC-coded items in
a Dutch DBI file).
There are several items that DIZNI can add to the file.
-1 add pages field (and pagelayout, brokpg)
-2 add app / xapp field
-3 add hero field
-4 add plot field
-5 add writ field
-6 add art field
-7 add ink field
-8 add desc field
-9 change internal field (to @ if unsolved, to ! if special, to ?
if idc)
All additions (so not the "-9" changes) are logged in the output file itself,
in a hidden comment, always starting with [/`maxim:
This is done so you can edit the file afterwards, and change the "[/`maxim:"
text to something more meaningful.
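To find the places that still need this manual edit, you can list the lines that contain the marker. The Python sketch below is only an illustration (the function name is hypothetical); it relies on nothing but the marker text given above.

```python
MARKER = "[/`maxim:"


def lines_with_maxim_comments(text):
    """Return the 1-based numbers of all lines that contain the maxim marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MARKER in line
    ]
```

Applied to the contents of maximised.dbi, this gives the line numbers to visit in your editor.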
9.2 CHECKEQUIV
The call is (see CHECKEQUIV.BAT):
dizni -xuq -C c:\fluks\i\inducks\ -D d:\fluks\data\ -s -g
This option needs the DBX files, all applicable DBI files and headers.ine.
It writes a report to d:\fluks\data\equiv.log.
This report can be used to spot errors. For instance:
Reporting equivalents of:
#de1 = de/LTB 301
#nl2 = nl/PO3 88
Not listed in all 2 issues:
D 98172/49 is(+) or is not(-) in: #de1- #nl2+
D 98172/50 is(+) or is not(-) in: #de1+ #nl2-
Apparently, the "nl" entry has a wrong page count (unless the issues are not
fully equivalent, of course).
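For long reports it can help to collect these difference lines automatically. The Python sketch below assumes that every difference line looks like the two lines in the example above; the function name and the regular expressions are illustrations and are not taken from DIZNI itself.

```python
import re

# Matches e.g. "D 98172/49 is(+) or is not(-) in: #de1- #nl2+"
LINE = re.compile(r"^(?P<story>.+?)\s+is\(\+\) or is not\(-\) in:\s*(?P<flags>.*)$")
FLAG = re.compile(r"#(\w+)([+-])")


def parse_differences(report):
    """Map each story code to {issue label: True if the story is listed there}."""
    result = {}
    for line in report.splitlines():
        match = LINE.match(line.strip())
        if match:
            result[match.group("story")] = {
                label: sign == "+"
                for label, sign in FLAG.findall(match.group("flags"))
            }
    return result
```

For the report above, the entry for "D 98172/49" would say that the story is missing from #de1 and present in #nl2.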
9.3 LANGUAGE FILE CHECK
The call is:
dizni -xut -c nl -C c:\fluks\i\inducks\ -D d:\fluks\data\
This option reads the file heroes-nl.dbi and produces a file heroes-nl.cpy.
This cpy file contains a checked version of all characters. Missing characters are
added (with empty translation), while wrong character names are noted.
9.4 CHECK (AND COPY) DBI FILE
The call is (see CHECKEDCOPY.BAT):
dizni -xuc -f nl.dbi -C c:\fluks\i\inducks\ -D d:\fluks\data\ -c -h
This option reads nl.dbi and makes an "improved" copy of it
(d:\fluks\data\nl.cpy).
In this CPY file, the most obvious errors are automatically corrected (if
the -c option is given).
These corrections are also marked in the CPY file (if the -h option is given),
for instance like this: [/`hero changed, was: D+M]
It uses the DBL input files from the CVS tree, as well as the usual DBX files
and OLDCODES.DBX.
The file "nl.dbi" should be in d:\fluks\data\; DIZNI does *not* look in the
CVS tree for it (because it cannot guess in which subdir the file is).
Error messages are written to DIZNI.LOG.
APPENDIX A. HOW TO WORK WITH THE DUMP FILE
Text of BLi (October 13th 1999); heavily edited by HF on 10-11-2004.
The indexer's dump is a file containing *all* information from the story *and*
issue input files gathered conveniently in one place.
It looks for instance like this:
------------
/ D 92311 12 VR DD [xapp:GL,HDL,US,DD]
de/93-41b D 92311 12 VR DD Die Glücksritter
dk/AA93-38c D 92311 12 VR DD Den store gevinst
fr/JM 2287f D 92311 12 VR DD La chance tourne
it/MG 451k D 92311
nl/94-26b D 92311 12 VR DD [xapp:GL,HDL,US,DD]
se/93-38c D 92311 12 DD Många turer
uk/M93-39c D 92311 Luck Of The Draw!
------------
It's almost the only file I'm using lately for any research.
This dump file can't be generated from the internal files.
You need *.ut2 files which are temporary files in the process of generating
the internal files.
See chapters 5 and 6.
HF will certainly upload a zipped version of the complete dump if you ask for
it.
The dump file generation also has an option to make a dump file with only
the entries of stories with reprints in a certain country.
These are good for checking for errors or locating codes not yet published
in a certain country.
These can also easily be made from the complete dump file with SKO's lfi.
E.g. The command line
lfi /p1 "de/" dump.udm > de.txt
results in a dump called de.txt with the entries for stories with reprints in
Germany.
The command line
lfi /v /p1 "de/" dump.udm > -de.txt
results in a dumpfile with the rest of the stories having no reprint in
Germany.
With the help of HF's diznutil.exe these files can be searched very
effectively.
The trick is that the contents of a story entry in the dump file (like the
example above) are put on a single line, so all the information from all the
files can be searched with SKO's find program (lfi).
I'll give an actual example (I identified a code especially for this):
I have this unidentified entry:
74-13d W 6 @ PAl AR Wo die Milch herkommt
[xapp:ARK,Duchess] [/xapp Hunde von AR?!!?]
The search
f /p15 W -de1|f /p28 " 6"|f /p49 AR|f PAl|d D > find.txt
results in a find.txt
dk/AA72-37f W ARIS 1-03 6 PAl AR Mælkens vej
se/72-37f W ARIS 1-03 6 AR kommer med mjölken
w /ARIS 1-03 W ARIS 1-03 6 PAl AR
------------
which obviously is the story searched for (Mælkens and mjölken are the
Scandinavian words for Milch = milk).
This was the ideal case because it really was the only entry displayed.
Usually there are a few more that could match, but the descriptions mostly
reveal if it's the story in question or not.
Let me explain. First, I renamed SKO's lfi.exe to f.exe (so I don't have to
type that much).
For the same reason I renamed diznutil.exe to d.exe.
The dump file I was searching (no German reprints) has the name -de1.
The command above meant:
Look in the file -de1 for lines with a "W" at position 15 (= W-code), with
" 6" at position 28 (page count),
with AR at position 49 (hero Aristocats) and containing the artist PAl (no
position necessary, but of course it is also possible to use one).
The file -de1 already contains the information on single lines for searching
(hence the 1 at the end).
The "d D" at the end splits the result back into separate lines in a readable
format, which is written to find.txt.
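For readers who do not have lfi, the kind of search described above can be imitated in Python. This is only a sketch of the meaning explained in the text (a text at a given position, counting from 1, or a text anywhere in the line); it is not how lfi itself is implemented, and the function names are made up.

```python
def has_at(line, position, text):
    """True if text starts at the given column of line (first column is 1)."""
    start = position - 1
    return line[start:start + len(text)] == text


def search(lines, positioned=(), anywhere=()):
    """Keep the lines that satisfy all positioned and all unpositioned tests."""
    return [
        line
        for line in lines
        if all(has_at(line, position, text) for position, text in positioned)
        and all(text in line for text in anywhere)
    ]
```

The search above would then read search(lines, positioned=[(15, "W"), (28, " 6"), (49, "AR")], anywhere=["PAl"]), where lines holds the single-line entries of the file -de1.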
Is it clear from what I wrote? I hope so, because I want to stop now.
Just two last commands
diznutil d
This page was generated on 2007-05-11 by DVEGEN 4.5a © Harry Fluks 2003.
For more information contact Harry Fluks (hfl at inducks.org - replace the at)