February 24, 2000
January 12, 2000 uri-2.7 is available. The uri struct member was renamed to _uri because some compilers reject the original name as a name clash.
January 3, 2000 mifluz-0.11 is available. Several bug fixes, speedups, and code cleanups. Added the ability to monitor what is going on inside the indexing process. We are preparing for full-scale, real-world tests.
December 16, 1999 mifluz-0.10, webbase-5.6 and uri-2.6 are available. These versions must be used together. See each product page for more information on the modifications. We've fixed memory leaks, configuration errors and bugs.
December 09, 1999 mifluz-0.9 is available. A new compression algorithm was implemented. It reduces the index size by a factor of 8 compared to an uncompressed index. It works in the same context as the previously implemented compression (it compresses and uncompresses pages within Berkeley DB as they are written to and read from the db file), but the compression algorithm is specifically designed for compressing DB pages (the previous compression used zlib). Since pages are generally full of redundant data, this can achieve good compression ratios.
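To illustrate why redundant pages compress well, here is a minimal sketch using zlib, the library the earlier compression layer relied on. It is not the mifluz algorithm: the page size, the key layout and the function names are all assumptions made for illustration.

```python
import struct
import zlib

PAGE_SIZE = 8192  # assumed page size, for illustration only


def build_page(word_id: int, first_doc: int) -> bytes:
    """Build a synthetic page of sorted (word_id, doc_id) keys.

    Hypothetical layout: each key is two big-endian 32-bit integers.
    Consecutive keys share most of their bytes, which is the kind of
    redundancy a sorted index page tends to contain.
    """
    entries = PAGE_SIZE // 8
    return b"".join(
        struct.pack(">II", word_id, first_doc + i) for i in range(entries)
    )


def compress_page(page: bytes) -> bytes:
    """Compress a page before it is written to the file."""
    return zlib.compress(page, 9)


def uncompress_page(blob: bytes) -> bytes:
    """Restore a page after it is read back from the file."""
    return zlib.decompress(blob)


page = build_page(word_id=42, first_doc=1000)
blob = compress_page(page)
print(f"{len(page)} bytes -> {len(blob)} bytes "
      f"(ratio {len(page) / len(blob):.1f})")
```

A general-purpose compressor such as zlib already benefits from the repeated key prefixes. An algorithm designed around the page layout can exploit that structure more directly.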
December 8, 1999 Search-Mifluz-0.01 is available. This is the pre-release version of the Perl interface to mifluz. It was generated using SWIG. We had to patch SWIG in order to achieve proper package encapsulation. The patches will be integrated in the next SWIG version but at present they are included in the Search-Mifluz distribution. The release of Search-Mifluz was also an opportunity to use SourceForge as a repository for the project. SourceForge provides all the facilities available on Senga for OpenSource projects. If we're satisfied with SourceForge for Search-Mifluz, we will consider moving all the products to SourceForge. It's much easier to contribute to a shared source distribution environment than to deal with it on our own :-)
December 7, 1999 webbase-5.5 is available. In this minor maintenance release we've fixed a few leaks and a memory overrun. It has been tested on a set of 150 000 URLs, some of them containing really weird data.
November 29, 1999 webbase-5.4 is available. The most important thing is that many memory leaks have been removed. The crawler has been extensively tested (around 2 million URLs crawled on 150 000 different web sites). The mifluz full text indexing library is now integrated. It generates very big indexes at present but will improve dramatically next week thanks to Marcel Bosc. For more information on this subject refer to the mifluz mailing list and the htdig3-dev mailing list (on htdig). The hook to the full text indexing library is located in the new hooks library. In order to definitively fix the problems related to long URLs, the url field is now a text field. To resolve the indexing issue, a field was added to the url and start tables: url_md5. Following the same idea, the directory tree that contains the temporary copies of the pages (WLROOT) now contains cryptic MD5-based file names. This is activated by default with version 2.4 of the uri library. The MySQL connection functions have been upgraded so that they take a ~/.my.cnf file into account. Always using -user, -password etc. is no longer mandatory. The -schema option was added to crawler and displays the builtin database schema. It's useful if you want to add fields of your own to the start table. Thanks to Bertrand Demiddelaer, who fixed a timeout problem. Many other small bugs were fixed while testing; refer to the ChangeLog for detailed information.
November 05, 1999 mifluz-0.8 is available. Version 0.7.0 forgot to include the examples subdirectory... Some portability and bug fixes. The docs on the API were extended, and some examples were added to help getting started with mifluz. The storage key (WordKey) class has evolved a bit: accessors for getting numerical fields were added. Input operators for streaming were added to WordKey, WordList, WordReference... A speed-up that skips useless sequential walking when using partially defined search keys was added, as well as tests.
The use of the (important) WordList::Walk method was simplified.
October 12, 1999 mifluz-0.6 is available. After two months of maturation and coding, the first working version of mifluz-0.6 is finally available. It is in alpha stage but we strongly believe that the architectural choices are appropriate and will allow mifluz to reach maturity rapidly. It provides very little functionality and is merely an inverted index manipulation library. It knows nothing about parsing documents or displaying search results. We worked very closely with the Ht://dig Group and Berkeley DB staff. mifluz-0.6 is used in the 3.2 version of Ht://dig (or mifluz-0.6 is a packaging of the Ht://dig indexing library, depending on your point of view :-). We implemented a transparent compression layer in Berkeley DB 2.2.7 that will (maybe) be included in future releases of Berkeley DB. A new developer, Marcel Bosc (bosc@senga.org), joined Senga two days ago. He will eventually take over mifluz. The work required is huge and having someone working full time on this subject is great news. The immediate goal is to integrate mifluz with the crawler and Catalog.
September 7, 1999 Catalog-1.01 is available. This is a maintenance release.
Don't hesitate to submit bugs or ideas to bugzilla. Hopefully the next version of Catalog will have a fast full text indexing mechanism and I'll be able to implement new functionality. Have fun!
July 13, 1999 The first releases of the URI manipulation C library (uri) and the internet crawler C library (webbase) are available. These two libraries are core components of our search engine. One might say: what? Another internet crawler? We already have dozens! Of course there is a difference with this one: it is able to efficiently crawl millions of URLs. The crawler information is stored in a MySQL database.
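The webbase-5.4 entry above describes keying long URLs by an MD5 digest (url_md5) and naming the temporary page copies after that digest. A minimal sketch of the idea follows. The function names, the UTF-8 encoding choice and the two-level directory layout are assumptions for illustration, not the scheme webbase actually uses.

```python
import hashlib
from pathlib import PurePosixPath


def url_md5(url: str) -> str:
    """Return a fixed-length hex digest usable as an index key.

    A text column holding an arbitrarily long URL is awkward to index,
    whereas the digest is always 32 hexadecimal characters.
    """
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def cache_path(root: str, url: str) -> PurePosixPath:
    """Map a URL to a file name derived from its digest.

    Hypothetical layout: the first hex characters pick two directory
    levels so that no single directory grows too large.
    """
    digest = url_md5(url)
    return PurePosixPath(root) / digest[:2] / digest[2:4] / digest


long_url = "http://www.example.com/search?" + "&".join(
    f"param{i}=value{i}" for i in range(200)
)
print(len(long_url), "characters ->", url_md5(long_url))
print(cache_path("/tmp/wlroot", long_url))
```

The digest length does not depend on the URL length, which is what makes it suitable both as an index key and as a file name.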
July 6, 1999 The whole www.senga.org site has been restructured. It now contains general information about Senga at the home page level. The top level menu on the left gives access to the bug tracking system for all the products (Bug Track) and to a catalog of resources that we use for development (Links). The Products page points to all the products or development projects at Senga. This is where you will find Catalog.
July 3, 1999 Catalog-1.00 is available. This release includes PHP3 code to display a catalog. The author is Weston Bustraan (weston@infinityteldata.net). The main motivation for jumping directly to version 1.00 is to avoid version number problems on CPAN.
July 2, 1999 Catalog-0.19 is available. This is a minor release. The most noticeable addition is the new search mechanism.
Many thanks to Tim Bunce for his numerous contributions and ideas. He is the architect of the Text-Query and Text-Query-SQL modules; Eric Bohlman and Loic Dachary did the programming. Thanks to Eric Bohlman for his help on the Text-Query module. He was very busy but managed to spend the time needed to release it. There is not yet anything usable for full text indexing but we keep working on it. The storage management is now handled by the reiserfs file system thanks to Hans Reiser, who is working full time on this. Loic Dachary is doing his best to get something working; if you're interested go to http://www.senga.org/mifluz/. For some mysterious reason CPAN lost track of the Catalog name. In order to install Catalog you should use perl -MCPAN -e 'install Catalog::db'. Weird but temporary. Have fun!
May 26, 1999 There are currently four contributors to Catalog. Here they are:
May 18, 1999 Catalog-0.10 replaces the Catalog-0.9 version published yesterday because of an installation bug that makes it completely unusable except for people upgrading from Catalog-0.5. Thank you for your patience.
May 17, 1999 Catalog-0.10 is available. This is a maintenance release. We are happy to announce that Catalog is now available at your nearest CPAN mirror. The bug tracking system installed two weeks ago has proved very useful. It allows anyone to enter bug reports, ideas and suggestions about Catalog. If you are in need of commercial support for Catalog, two new companies are entering the business: Alcove and Atrid (for details go to the support page).
For more details on bug fixes you can search the bug tracking system (bugzilla). We are working hard on the full text indexing library. There will be more on this subject very soon. Have fun!
May 2, 1999 The Bugzilla bug tracking system is installed at http://www.senga.org/bugzilla/. It is used not only to report bugs in Catalog but also to suggest enhancements or new features. Anyone can add an entry, go ahead!
April 19, 1999 Catalog-0.5 is available. The main features added to this version are:
Although Catalog was added to CPAN last month, the module list has not been regenerated since then and we are waiting impatiently for it. A mirror of dmoz.org has been loaded to show that Catalog is able to handle a large number of records and categories.
March 16, 1999 Catalog-0.4 is available. The main features added to this version are:
Catalog now depends on the MD5 Perl module. A copy of this module is kept on the www.senga.org download page. We have upgraded the MySQL distribution to 3.22.19 because it is now stable. Some users may have noticed formatting errors in the HTML version of the documentation: they have been fixed. Two real-world uses of Catalog may be seen at Ghana International Trade Fair (English) and Interbat (French). The example delivered with Catalog is also available on www.senga.org for browsing only: a thematic catalog, a chronological catalog and an alphabetical catalog. Last but not least, the Catalog name space was approved by the Perl maintainers and Catalog should appear at your nearest CPAN site in the following weeks.
February 24, 1999 Catalog-0.3 is available. The main features added to this version are:
Since a subtle bug was found in mysql-3.22.8-beta, we have switched to the latest version, mysql-3.22.16a-gamma. At the same time we've upgraded the DBI version used and the mysql module. Those upgrades are not mandatory. Catalog now uses the Test module to run tests. This requires perl 5.005. If you were running perl 5.004 (native on RedHat 5.2), you will have to compile perl 5.005. There is no rpm at the moment.
February 10, 1999 Catalog-0.2 is available. It fixes installation problems, the documentation and some bugs. The installation process has been made simpler by removing the need to set the password and user of the MySQL database after the installation. This was confusing because most people thought it was a fatal error message. The make test now works with a local invocation of the MySQL daemon to prevent possible corruption of an existing database. At the request of Lynx users, all images of this site now have alt tags.