head	1.8;
access;
symbols;
locks; strict;
comment	@# @;


1.8
date	2001.02.02.15.19.20;	author whmoseley;	state dead;
branches;
next	1.7;

1.7
date	2001.01.27.21.54.02;	author whmoseley;	state Exp;
branches;
next	1.6;

1.6
date	2001.01.22.09.39.58;	author jmruiz;	state Exp;
branches;
next	1.5;

1.5
date	2001.01.13.22.25.56;	author whmoseley;	state Exp;
branches;
next	1.4;

1.4
date	2001.01.05.19.44.49;	author whmoseley;	state Exp;
branches;
next	1.3;

1.3
date	2000.12.30.07.05.25;	author whmoseley;	state Exp;
branches;
next	1.2;

1.2
date	2000.12.23.07.24.46;	author whmoseley;	state Exp;
branches;
next	1.1;

1.1
date	2000.12.13.07.14.32;	author whmoseley;	state Exp;
branches;
next	;


desc
@@


1.8
log
@This doesn't need to be in CVS as it is made when make dist is run
@
text
@NAME
    The SWISH-E README File

What is SWISH-E?
    SWISH-E is Simple Web Indexing System for Humans - Enhanced. SWISH-E can
    quickly and easily index directories of files or remote web sites and
    search the generated indexes.

    SWISH was created by Kevin Hughes to fill the need of the growing number
    of Web administrators on the Internet - many of the indexing systems
    were not well documented, were hard to use and install, and were too
    complex for their own good. The system was widely used for several
    years, long enough to collect some bug fixes and requests for
    enhancements.

    In Fall 1996, The Library of UC Berkeley received permission from Kevin
    Hughes to implement bug fixes and enhancements to the original binary.
    The result is SWISH-Enhanced or SWISH-E, brought to you by the SWISH-E
    Development Team.

    SWISH-E version 2 represents a major rewrite of the code and the
    addition of many new features.

  Key features

    *   Quickly index a large number of documents in different formats
        including text, HTML, and XML

    *   Includes a web spider for indexing remote documents over HTTP

    *   Use "filters" to index any type of file such as PDF, gzip, or
        Postscript

    *   Document "properties" (some subset of the source document, usually
        defined as META or XML elements) may be stored in the index and
        returned with search results

    *   Document summaries can be returned with each search

    *   Word stemming and soundex indexing

    *   Phrase searching and wildcard searching

    *   Results can be sorted by relevance or by any number of properties in
        ascending or descending order

    *   Limit searches to parts of documents such as certain HTML tags
        (META, TITLE, comments, etc.) or to XML elements.

    *   It's open source and FREE! You can customize SWISH-E and you can
        contribute your fancy new features to the project.

Where do I get SWISH-E?
    The current version of SWISH-E can be found at:

    http://sunsite.berkeley.edu/SWISH-E/

    Please make sure you use a current version of swish-e.

    Information about Windows binary distributions can also be found at this
    site.

The SWISH-E Documentation
    Documentation is provided in the SWISH-E distribution package in two
    forms: POD (Plain Old Documentation) and HTML format. The POD
    documentation is in the pod directory, and the HTML documentation is in
    the html directory, of course.

    The distribution make files can also generate the documentation in these
    formats:

        Postscript
        PDF (Adobe Acrobat)
        system man pages

    You may also build a "split" version of the documentation where each
    topic heading is a separate web page. Building the split version also
    creates a SWISH-E index of the documentation that makes the
    documentation searchable via the included Perl CGI program.

    Building these other forms of documentation requires additional helper
    applications -- most modern Linux distributions will include all that's
    needed. At least mine does...

    Online documentation can be found at the SWISH-E web site listed above.

    See INSTALL for information on creating the PDF and Postscript versions
    of the documentation, and for information on installing the SWISH-*
    documentation as Unix man(1) pages.

  How do I read the SWISH-E documentation?

    The SWISH-E documentation is in POD format, and the documentation can be
    found in the pod directory. POD documentation is displayed by the
    "perldoc" command that is included with every Perl installation. For
    example, to view the swish-e installation documentation page called
    "INSTALL", type

       perldoc pod/INSTALL

    or to make life easier,

       cd pod
       perldoc INSTALL
       perldoc SWISH-RUN

    Complain to your system administrator if the `perldoc' command is not
    available on your machine.

  Included Documentation

    The following documentation is included in this SWISH-E distribution:

    *   README -- this file

    *   INSTALL -- Installation and basic usage instructions

    *   SWISH-CONFIG -- Configuration File Directives

    *   SWISH-RUN -- Running Swish and Command Line Switches

    *   SWISH-SEARCH -- All about Searching with SWISH-E

    *   SWISH-FAQ -- Common questions, and some answers

    *   SWISH-LIBRARY -- Interface to the SWISH-E C library

    *   SWISH-PERL -- Instructions for using the Perl library

    *   CHANGES -- List of feature changes and bug fixes

  Document Generation

    The SWISH-E documentation in HTML format was created with
    Pod::HtmlPsPdf, a package of Perl modules written and/or modified by
    Stas Bekman to automate the conversion of documents in pod format (see
    perldoc perlpod) to HTML, Postscript, and PDF. A slightly modified
    version of this package is included with the SWISH-E distribution and
    used for building the HTML. As distributed, SWISH-E contains only the
    pod and HTML documentation. See INSTALL for instructions on creating
    man(1), Postscript, and PDF formats.

    Thanks, Stas, for your help!

Where do I get help with SWISH-E?
    If you need help with installing or using SWISH-E please subscribe to
    the SWISH-E mailing list. Visit the SWISH-E web site listed above
    for information on subscribing to the SWISH-E list.

    Before posting any questions please read QUESTIONS AND TROUBLESHOOTING
    in the INSTALL documentation page.

SWISH-E Development
    SWISH-E is currently being developed as an open source project on
    SourceForge http://sourceforge.net. This documentation is updated
    frequently from CVS at http://swishe.sourceforge.net/

    See http://sourceforge.net/projects/swishe/ for more information.

Document Info
    Each document in the SWISH-E distribution contains this section. It
    refers only to the specific page it's located in, and not to the SWISH-E
    program or the documentation as a whole.

    $Id: README,v 1.7 2001/01/27 21:54:02 whmoseley Exp $

    .

@


1.7
log
@*** empty log message ***
@
text
@d165 1
a165 1
    $Id: README.pod,v 1.1 2001/01/27 20:33:06 whmoseley Exp $
@


1.6
log
@This upload contains the work of the weekend. Mainly done in xml.c

search.c:
Revised some of the sorting routines.
Removed icomp
Added comparison function compResultsByRank for use with qsort
Added comparison function compResultsByFileNum for use with qsort
Modified sortresultsbyrank and sortresultsbyfilenum to use compResultsByRank
and compResultsByFileNum. The code of these functions is simpler and uses
less memory.

string.c
Added remove_controls routine to remove control chars

txt.c
Added a call to remove_controls. As stated by Rainer, an external filter can return control chars, which should be removed.

index.c
Fixed a segfault when there are files to index but no valid words in any of them. E.g., indexing an empty file. It does not make much sense to do things like this, but ...

xml.c
The big changes here. Well, not so much changes but a lot of work done.
Added convertentities.
Properties are stored up to the end tag. Until now, the property stopped when it encountered a '<'. Nested tags are allowed. E.g.:

PropertyNames meta1

<meta1>
this
<meta2>
is
</meta2>
the property
</meta1>

Old xml.c stored "this" as property meta1
New xml.c should store "this is the property" as property meta1

NOT yet done in xml.c:

Support for things like:
<meta1 prop1="bla bla" prop2="bla bla">
bla bla
</meta1>

Should prop1 and prop2 be indexed, or even stored as properties, if they are specified in metanames or propertynames?

New config directive to increase counters as suggested by Bill Moseley
@
text
@d65 3
a67 1
    forms, POD (Plain Old Documentation), and in html format.
d78 1
a78 1
    creates a SWISH-E index of the documentation that is makes the
d93 5
a97 4
    The SWISH-E documentation is in POD format. POD documentation is
    displayed by the "perldoc" command that is included with every Perl
    installation. For example, to view the swish-e installation
    documentation page called "INSTALL", type
d99 5
d105 1
d165 1
a165 1
    $Id: README.pod,v 1.4 2001/01/05 18:56:08 whmoseley Exp $
@


1.5
log
@Added make dist and updated documentation
version is now stored in top-level configure.in
combined src/configure.in with configure.in
@
text
@@


1.4
log
@Updated html and pod docs
@
text
@@


1.3
log
@Moved .pod files into top level directory
These are the docs that can be read via "perldoc <file>"
@
text
@d63 28
a90 1
How do I read the SWISH-E documentation?
d92 3
a94 3
    displayed by the `perldoc' command that is included with every Perl
    installation. For example, to view the installation documentation for
    swish-e, type
d98 2
a99 6
    Check with your system administrator if this command is not available.

    Online documentation can be found at the SWISH-E web site listed above.
    Optionally, you may view the documentation in HTML format in the html
    directory of the distribution. PDF or Postscript format is also
    available.
d111 1
a111 1
    *   SWISH-INDEX -- Running Swish and Command Line Switches
d123 13
d141 10
d152 5
a156 1
    $Id: README.pod,v 1.2 2000/12/28 04:30:05 whmoseley Exp $
@


1.2
log
@Syncing with 2.1.10-devel6 release
Committing initial documentation, and the doc generation scripts
I'm including much of perl modules as they have been modified for
Swish's use.  They may be removed in later versions.
@
text
@d4 16
a19 4
What is SWISHE?
    SWISH-Enhanced is a fast, powerful, flexible, free, and easy to use
    system for indexing collections of documents. The documents may be on a
    local file system, or remotely accessed via http.
d21 4
a24 1
  Key features include
d27 3
a29 1
        including text, html, and XML
d31 2
a32 1
    *   Use "filters" to index any type of file such as PDF, gzip
d56 1
a56 1
        http://sunsite.berkeley.edu/SWISH-E/
d71 1
a71 1
    Check with your system adminstrator if this command is not available.
d74 5
a78 4
    Optionally, if you have the pod2html utility on your system you can
    convert the documenation into HTML format for viewing with a web
    browser. Instructions for converting the documentation can be found in
    INSTALL.
d82 1
a82 2
    README
        this file
d84 1
a84 2
    INSTALL
        Installation and basic usage instructions
d86 1
a86 2
    SWISH-INDEX
        All about creating a SWISH-E index
d88 1
a88 2
    SWISH-SEARCH
        All about searching with SWISH-E
d90 1
a90 2
    SWISH-FAQ
        Common questions, and some answers
d92 1
a92 2
    SWISH-FILTERS
        Using filters to index non-standard documents
d94 1
a94 2
    SWISH-LIBRARY
        Interface to the SWISH-E C library
d96 1
a96 2
    SWISH-PERL
        Instructions for using the Perl library
d98 1
a98 2
    CHANGES
        List of feature changes and bug fixes
d103 4
a106 1
    for informatation on subscribing to the SWISH-E list.
d108 1
a108 2
Document revision
    $Id: README,v 1.1 2000/12/13 07:14:32 whmoseley Exp $
@


1.1
log
@Adjusted Makefile.in for install and install-lib, and to pass libdir
to src/configure
Added README INSTALL and SWISH-PERL docs -- initial edit
Rewrote perl/test.pl
@
text
@d1 97
a97 45
The SWISH-E README File

SWISH-Enhanced is a fast, powerful, flexible, free, and easy to use system for
indexing collections of Web pages or other text files. Key features include the
ability to limit searches to certain HTML tags (META, TITLE, comments, etc.).

The current version of SWISH-E can be found at:
    http://sunsite.berkeley.edu/SWISH-E/
Please make sure you use a current version of swish.    
    
The SWISH-E documentation is in POD format.  POD documentation is displayed by
the `perldoc' command that is included with every Perl installation.
For example, to view the installation documentation for swish-e, type

   perldoc INSTALL

Check with your system adminstrator if this command is not available.

Online documentation can be found at the SWISH-E web site listed above.
Additional documentation can be found in the html directory.

The following documentation is included in this SWISH-E distribution:

   README           -- this file
   INSTALL          -- Installation and basic usage instructions
   SWISH-INDEX      -- All about creating a SWISH-E index
   SWISH-SEARCH     -- All about searching with SWISH-E
   SWISH-FAQ        -- Common questions, and some answers
   SWISH-FILTERS    -- Using filters to index non-standard documents
   SWISH-LIBRARY    -- Interface to the SWISH-E C library
   SWISH-PERL       -- Instructions for using the Perl library
   CHANGES          -- List of feature changes and bug fixes


   README-SWISH-E   -- General configuration and features documentation
   html/index.html  -- HTML-based documentation.


If you need help with installing or using SWISH-E please subscribe to
the SWISH-E mailing list.  See visit the SWISH-E web site listed above
for informatation on subscribing to the SWISH-E list.


$Id:$

@

