Changelog History
v0.4.4 Changes

January 29, 2020

- ⬆️ Upgrade Kotlin to 1.3.61
- ⚡️ Upgrade kotlinx.coroutines. This required an update to some of the places where coroutine builders were called internally.
- ⬆️ Upgrade Gradle wrapper
v0.4.3 Changes

November 26, 2017

- ➕ Added ability to clear crawl queues by RequestId and Age; see Krawler#removeUrlsByRootPage and Krawler#removeUrlsByAge
- ➕ Added config option to prevent crawler shutdown on empty queues
- ➕ Added a new single-byte priority field to KrawlQueueEntry. Queues will always attempt to pop the lowest priority entry available. Priority can be assigned by overriding the Krawler#assignQueuePriorty method.
- ⚡️ Update dependencies
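The "pop the lowest priority entry" behavior can be illustrated with a small self-contained sketch. The `QueueEntry` type and the queue setup below are hypothetical stand-ins, not Krawler's actual classes; they only show how a single-byte priority field yields ascending-priority ordering.

```kotlin
import java.util.PriorityQueue

// Hypothetical stand-in for a crawl queue entry: a URL plus a
// single-byte priority, where lower values are served first.
data class QueueEntry(val url: String, val priority: Byte)

fun main() {
    // Pop order follows ascending priority, mirroring the changelog's
    // "pop the lowest priority entry available" description.
    val queue = PriorityQueue<QueueEntry>(compareBy { it.priority })
    queue.add(QueueEntry("http://example.com/low", 10))
    queue.add(QueueEntry("http://example.com/urgent", 0))
    queue.add(QueueEntry("http://example.com/mid", 5))

    println(queue.poll().url) // the priority-0 entry comes out first
}
```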
v0.4.1 Changes

August 16, 2017

- ✂ Removed logging implementation from dependencies to prevent logging conflicts when used as a library.
- ⚡️ Updated Kotlin version to 1.1.4
- ⚡️ Updated kotlinx.coroutines to 0.17
v0.4.0 Changes

May 16, 2017

- 🚚 Rewrote the core crawl loop to use Kotlin 1.1 coroutines. This effectively turned the crawl process into a multi-stage pipeline. The architectural change removed the need for some locking by eliminating resource contention between multiple threads.
- ⚡️ Updated the build file to build the simple example as a runnable jar
- Minor bug fixes in the KrawlUrl class.
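A multi-stage coroutine pipeline of the kind described above can be sketched with kotlinx.coroutines channels. This is an illustrative sketch, not Krawler's actual code: the stage names (`frontier`, `fetch`) and the fake fetch result are assumptions. Because each stage owns its data and hands it off over a channel, stages do not share mutable state and need no locks.

```kotlin
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.*

// Stage 1: emit URLs from a seed list.
fun CoroutineScope.frontier(urls: List<String>): ReceiveChannel<String> = produce {
    for (url in urls) send(url)
}

// Stage 2: "fetch" each URL; a real crawler would do an HTTP request here.
fun CoroutineScope.fetch(urls: ReceiveChannel<String>): ReceiveChannel<String> = produce {
    for (url in urls) send("<html>$url</html>") // placeholder for a real fetch
}

fun main() = runBlocking {
    val pages = fetch(frontier(listOf("http://example.com/a", "http://example.com/b")))
    for (page in pages) println(page)
}
```

Each `produce` builder runs its stage as its own coroutine, so back-pressure comes for free: a slow downstream stage simply suspends the upstream `send`.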
v0.3.2 Changes

March 03, 2017

- 🛠 Fixed a number of bugs that would result in a crashed thread (and subsequently an incorrect number of crawled pages), as well as cause slowdowns due to a reduced number of worker threads.
- ➕ Added a new utility function to wrap doCrawl and log any uncaught exceptions during crawling.
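The wrapper described above can be sketched as follows. The function name `safeCrawl` and its signature are illustrative assumptions, not Krawler's actual API; the point is that an uncaught exception is logged rather than silently killing the worker thread.

```kotlin
// Hedged sketch: wrap a crawl action so that any uncaught exception is
// logged instead of crashing the worker. Names here are hypothetical.
fun safeCrawl(url: String, doCrawl: (String) -> Unit) {
    try {
        doCrawl(url)
    } catch (e: Exception) {
        // A real crawler would use its configured logger;
        // println keeps the sketch dependency-free.
        println("Uncaught exception while crawling $url: ${e.message}")
    }
}

fun main() {
    // The failing crawl is logged, and the worker keeps going.
    safeCrawl("http://example.com/bad") { throw IllegalStateException("boom") }
    println("worker still alive")
}
```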
v0.3.1 Changes

February 02, 2017

- 🐎 Created a 1:1 mapping between threads and the number of queues used to serve URLs to visit. URLs have an affinity for a particular queue based on their domain, so all URLs from that domain end up in the same queue. This improves parallel crawl performance by reducing the frequency with which the politeness delay affects requests. For crawls bound to fewer domains than queues, the excess queues are not used.
- 🛠 Many bug fixes, including a fix that eliminates accidental over-crawling.
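Domain-to-queue affinity like this can be sketched by hashing the URL's host to a queue index. The function below is an assumption for illustration, not Krawler's implementation; it just shows why every URL from one domain lands in the same queue, and why excess queues stay empty when there are fewer domains than queues.

```kotlin
import java.net.URI

// Map a URL to one of numQueues queues by hashing its host, so all
// URLs from the same domain share a queue (and its politeness delay).
fun queueIndexFor(url: String, numQueues: Int): Int {
    val host = URI(url).host ?: url
    // floorMod keeps the index non-negative even for negative hash codes.
    return Math.floorMod(host.hashCode(), numQueues)
}

fun main() {
    val a = queueIndexFor("http://example.com/page1", 4)
    val b = queueIndexFor("http://example.com/page2", 4)
    println(a == b) // same domain, same queue
}
```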