  1. 2024-11-10 - V5.5.0
  2. * Set hOCR capabilities ocrp_dir and ocrp_lang unconditionally.
  3. * Calculate row bounding box in single-word mode (issue #4304).
  4. * Reduce clock syscalls (#4303).
  5. * Several small performance and other code fixes.
  6. * Modernized code.
  7. * Print time for tessedit_timing_debug in milliseconds.
  8. * Print time for ErrorCounter::ComputeErrorRate in milliseconds.
  9. * cmake: Correctly set the soversion based on SemVer properties.
  10. * Do not export PDBs for static libraries (issue #4279).
  11. * Several other small fixes and improvements for builds and CI.
  12. * Modernize code for renderers and remove filename conversion for Windows (#4330).
  13. * Add build rule for Windows installer.
  14. * Support symbolic values for --oem and --psm options.
  15. * Remove Tensorflow support.
  16. * Add RISC-V V support (#4346).
  17. * Remove broken GitHub action msys2-4.1.1.
  18. 2024-06-11 - V5.4.1
  19. * Avoid FP overflow in NormEvidenceOf (fixes issue #4257) (#4259)
  20. * Small build fixes and code improvements (#4262, #4263, #4266, #4267)
  21. 2024-06-06 - V5.4.0
  22. * Small build fixes and code improvements
  23. (#4241, #4243, #4244, #4245, #4246, #4248, #4249, #4250, #4253)
  24. 2024-05-19 - V5.4.0-rc2
  25. * Fix setup of datadir on installations with Conda (issue #4230) (#4240)
  26. * Fix FP exception in Wordrec::angle_change (issue #4242) (#4243)
  27. 2024-05-12 - V5.4.0-rc1
  28. * Build fixes, code refactoring and other smaller changes.
  29. * Fix grey result of indexed PNG in pdfrenderer.
  30. * Rename frk -> deu_latf (ISO 639-3, ISO 15924).
  31. * Remove broken Dockerfile.
  32. * Fixes for several issues reported by Coverity Scan.
  33. * Remove unsupported OpenCL code and related API functions (#4220).
  34. * Facilitate vectorization for generic build (#4223).
  35. * Add PAGE XML renderer / export (#4214).
  36. * Support training without lstmf files.
  37. * Improve CCUtil::main_setup (fixes issue #4230 related to Conda).
  38. * Allow for text angle/gradient to be retrieved (#4070).
  39. 2024-01-18 - V5.3.4
  40. * Fixes for scrollview
  41. * Fixes for autoconf, clang and sw builds
  42. * Improve OCR for an image URL
  43. * Fail on curl download errors
  44. * New parameter curl_cookiefile
  45. * Set User-Agent: header field in HTTP request for curl downloads
  46. * Output directory list from "combine_tessdata -d" to stdout
  47. * Other small improvements for code and documentation.
  48. 2023-10-05 - V5.3.3
  49. * Small code fixes and improvements to fix Coverity Scan issues.
  50. * Disable -mfpu=neon for aarch64.
  51. * Fix build without git clone in cloned directory (required for FreeBSD).
  52. * Other build fixes for autotools, cmake and sw.
  53. * Fix regression in layout detection which was introduced in release 5.0.0.
  54. * Fix regression which prevented loading of submodels, introduced in release 5.0.0-rc2.
  55. * Other small improvements for code and documentation.
  56. 2023-07-11 - V5.3.2
  57. * Updates for snap package building.
  58. * Support for Sgaw and W Pwo Karen languages in the Myanmar validator (#4065).
  59. * Improve format of logging from lstmtraining.
  60. * Use fewer digits in filenames of checkpoints written by lstmtraining.
  61. * Replace deprecated sprintf.
  62. * Remove unused code in function fix_rep_char.
  63. * Avoid 32 bit overflow in multiplication (fixes 3 CodeQL CI alerts).
  64. * Avoid conversions from std::string to char* to std::string.
  65. * Abort with error message if OSD is requested with LSTM-only model.
  66. * cmake: allow to disable tiff (-DDISABLE_TIFF=ON).
  67. * cmake: provide info about disabled LibArchive and CURL.
  68. * cmake: check if leptonica was built with tiff support.
  69. * Remove old broken GitHub action vcpkg-4.1.1 (fixes issue #4078).
  70. * Create config.yml.
  71. * Fix typos.
  72. 2023-04-01 - V5.3.1
  73. * Bug fixes for some special scenarios:
  74. * Fix issue #4010.
  75. * textord: Catch empty rows in block iterator (fixes #4039).
  76. * Fix FP division by zero (issue #3995).
  77. * Improve documentation and log messages.
  78. * Build fixes and improvements (mainly for cmake).
  79. 2022-12-22 - V5.3.0
  80. * Minor updates for documentation and cmake builds.
  81. 2022-12-13 - V5.3.0-rc1
  82. * Fix the training tools for the legacy OCR engine (fix issue #3925).
  83. * PDF renderer: Ignore non-text blocks (fix issue #3957).
  84. * Remove colormap before thresholding (fix issue #3940).
  85. * Fix a number of performance issues reported by Coverity Scan.
  86. * Training tools: Replace call of exit function by return statement in main function.
  87. * Fix double free in function vigorous_noise_removal (fix issue #3876).
  88. * Create to_win if needed in Textord::make_spline_rows (fix issue #3875).
  89. * Bug fixes for ScrollView viewer:
  90. * Fix memory issues in ScrollView::MessageReceiver.
  91. * Catch potential nullptr in SVNetwork::SVNetwork.
  92. * Move svpaint.cpp from src/viewer to src/.
  93. * Add rule for svpaint executable in Autotools.
  94. * Bug fixes and improvements for build tools:
  95. * Fix AMD64 detection with autobuild on FreeBSD (fix issue #3964).
  96. * Fix tesseract.pc generated from CMake to match Autotools.
  97. * Detect availability of AVX512-VNNI.
  98. * configure.ac: fix build on aarch64_be.
  99. * Drop CI for old versions of macOS and Ubuntu.
  100. 2022-07-06 - V5.2.0
  101. * Improvements and fixes for continuous integration,
  102. autoconf and cmake builds.
  103. * Set /Os for some 32 bit MS compilers (fixes #3769).
  104. * Improve comments and other documentation.
  105. * Add initial support for Intel AVX512F.
  106. * Fix for very large PDF files on 32 bit hosts (fixes #3805).
  107. * Fix NEON detection on FreeBSD.
  108. * Fix regression with UZN files (fixes #3837).
  109. * Fix calling delete[] for memory allocated by malloc in C API.
  110. * Add an API function to init tesseract with traineddata from memory
  111. (fixes #3691).
  112. * Replace direct access to Leptonica internal data structures by
  113. function calls and support latest releases of Leptonica.
  114. * Replace std::regex by std::string functions (fixes issue #3830).
  115. * Use compiled-in TESSDATA_PREFIX also on Windows (fixes #3767).
  116. * Add new parameter 'invert_threshold', change the default threshold
  117. from 0.5 to 0.7 and mark parameter 'tessedit_do_invert' as deprecated.
  118. 2022-03-01 - V5.1.0
  119. * Handle image and line regions in output formats ALTO, hOCR and text.
  120. * New parameter curl_timeout for curl_easy_setopt.
  121. * Build fixes and improvements.
  122. * Catch nullptr in PageIterator::Orientation to improve robustness.
  123. * Remove unused code.
  124. 2022-01-06 - V5.0.1
  125. * Add SPDX-License-Identifier to public include files.
  126. * Support redirections when running OCR on a URL.
  127. * Lots of fixes and improvements for cmake builds.
  128. Distributions should use the autoconf build.
  129. * Fix broken msys2 build with gcc 11.
  130. * Fix parameter certainty_scale (was duplicated).
  131. * Fix some compiler warnings and clean code.
  132. * Correctly detect amd64 and i386 on FreeBSD.
  133. * Add libarchive and libcurl in continuous integration actions.
  134. * Update submodule googletest to release v1.11.0.
  135. 2021-11-22 - V5.0.0
  136. * Faster training and recognition by default (float instead of
  137. double calculations)
  138. * More options for binarization
  139. * Improved support for ARM NEON
  140. * Modernized code
  141. * Removed proprietary data types like GenericVector and STRING
  142. from public API
  143. * pdf.ttf no longer needed, now integrated into the code
  144. * Faster flat build with automake
  145. * New options for combine_tessdata to show details of traineddata files
  146. * Improved training messages
  147. * Improved unit tests and fuzzing tests
  148. * Lots of bug fixes
  149. 2021-11-15 - V4.1.3
  150. * Fix build regression for autoconf build
  151. 2021-11-14 - V4.1.2
  152. * Add RowAttributes getter to PageIterator
  153. * Allow line images with larger width for training
  154. * Fix memory leaks
  155. * Improve build process
  156. * Don't output empty ALTO sourceImageInformation (issue #2700)
  157. * Extend URI support for Tesseract with libcurl
  158. * Abort LSTM training with integer model (fixes issue #1573)
  159. * Update documentation
  160. * Make automake builds less noisy by default
  161. * Don't use -march=native in automake builds
  162. 2019-12-26 - V4.1.1
  163. * Implemented sw build (cppan is deprecated)
  164. * Improved cmake build
  165. * Code cleanup and optimization
  166. * A lot of bug fixes...
  167. 2019-07-07 - V4.1.0
  168. * Added new renderers: ALTO, LSTMBox, WordStrBox.
  169. * Added character boxes in hOCR output.
  170. * Added Python training scripts (experimental) as an alternative to shell scripts.
  171. * Better support for AVX / AVX2 / SSE.
  172. * Disable OpenMP support by default (see e.g. #1171, #1081).
  173. * Fix for bounding box problem.
  174. * Implemented support for whitelist/blacklist in LSTM engine.
  175. * Improved cmake configuration.
  176. * Code modernization and improvements.
  177. * A lot of bug fixes...
  178. 2018-10-29 - V4.0.0
  179. * Added new neural network system based on LSTMs, with major accuracy gains.
  180. * Improvements to PDF rendering.
  181. * Fixes to trainingdata rendering.
  182. * Added LSTM models+lang models to 101 languages. (tessdata repository)
  183. * Improved multi-page TIFF handling.
  184. * Fixed damage to binary images when processing PDFs.
  185. * Fixes to training process to allow incremental training from a recognition model.
  186. * Made LSTM the default engine, pushed cube out.
  187. * Deleted cube code.
  188. * Changed OEModes --oem 0 for legacy tesseract engine, --oem 1 for LSTM, --oem 2 for both, --oem 3 for default.
  189. * Avoid use of Leptonica debug parameters or functions.
  190. * Fixed multi-language mode.
  191. * Removed support for VS2010.
  192. * Added Support for VS2015 and VS2017 with CPPAN.
  193. * Implemented invisible text only for PDF.
  194. * Added AVX / SSE support for Windows.
  195. * Enabled OpenMP support.
  196. * Parameter unlv_tilde_crunching changed to false.
  197. * Miscellaneous Fixes.
  198. * Detailed Changelog can be found at https://tesseract-ocr.github.io/tessdoc/4.0x-Changelog.html and https://tesseract-ocr.github.io/tessdoc/ReleaseNotes.html#tesseract-release-notes-oct-29-2018---v400
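
The --oem values listed in the V4.0.0 entry above can be summarised as a small lookup table. The Python sketch below is illustrative only: the names OEM_MODES and build_command are hypothetical, the file names are placeholders, and only the numeric meanings are taken from the entry itself.

```python
# Meaning of each --oem value, as listed in the V4.0.0 entry above.
OEM_MODES = {
    0: "legacy Tesseract engine",
    1: "LSTM engine",
    2: "legacy and LSTM engines combined",
    3: "default",
}


def build_command(image, output_base, oem=3):
    """Return an argv list for the tesseract CLI with the given --oem value.

    Hypothetical helper: it only assembles the arguments and does not run
    tesseract or check that the input file exists.
    """
    if oem not in OEM_MODES:
        raise ValueError(f"unknown --oem value: {oem}")
    return ["tesseract", image, output_base, "--oem", str(oem)]


if __name__ == "__main__":
    # Placeholder file names; prints the command line for the LSTM engine.
    print(" ".join(build_command("page.png", "page", oem=1)))
```

The list returned by build_command could be passed to subprocess.run on a system where the tesseract executable is installed.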
  199. 2017-02-16 - V3.05.00
  200. * Made some fine tuning to the hOCR output.
  201. * Added TSV as another optional output format.
  202. * Fixed ABI break introduced in 3.04.00 with the AnalyseLayout() method.
  203. * text2image tool - Enable all OpenType ligatures available in a font. This feature requires Pango 1.38 or newer.
  204. * Training tools - Replaced asserts with tprintf() and exit(1).
  205. * Fixed Cygwin compatibility.
  206. * Improved multipage tiff processing.
  207. * Improved the embedded pdf font (pdf.ttf).
  208. * Enable selection of OCR engine mode from command line.
  209. * Changed tesseract command line parameter '-psm' to '--psm'.
  210. * Write output of tesseract --help, --version and --list-langs to stdout instead of stderr.
  211. * Added new C API for orientation and script detection, removed the old one.
  212. * Increased minimum autoconf version to 2.59.
  213. * Removed dead code.
  214. * Require Leptonica 1.74 or higher.
  215. * Fixed many compiler warnings.
  216. * Fixed memory and resource leaks.
  217. * Fixed some issues with the 'Cube' OCR engine.
  218. * Fixed some openCL issues.
  219. * Added option to build Tesseract with CMake build system.
  220. * Implemented CPPAN support for easy Windows building.
  221. 2016-02-17 - V3.04.01
  222. * Added OSD renderer for psm 0. Works for single page and multi-page images.
  223. * Improve tesstrain.sh script.
  224. * Simplify build and run of ScrollView.
  225. * Improved PDF output for OS X Preview utility.
  226. * INCOMPATIBLE fix to hOCR line height information - commit 134ebc3.
  227. * Added option to build Tesseract without Cube OCR engine (-DNO_CUBE_BUILD).
  228. * Enable OpenMP support.
  229. * Many bug fixes.
  230. 2015-07-11 - V3.04.00
  231. * Tesseract development is now done with Git and hosted at github.com (Previously we used Subversion as a VCS and code.google.com for hosting).
  232. * Tesseract now requires leptonica 1.71 or a higher version.
  233. * Removed official support for VS 2008.
  234. * Added support for 39 additional scripts/languages, including: amh, asm, aze_cyrl, bod, bos, ceb, cym, dzo, fas, gle, guj, hat, iku, jav, kat, kat_old, kaz, khm, kir, kur, lao, lat, mar, mya, nep, ori, pan, pus, san, sin, srp_latn, syr, tgk, tir, uig, urd, uzb, uzb_cyrl, yid
  235. * Major updates to training system as a result of extensive testing on 100 languages.
  236. * New training data for over 100 languages
  237. * Improved performance with PIC compilation option.
  238. * Significant change to invisible font system in pdf output to improve correctness and compatibility with external programs, particularly ghostscript.
  239. * Improved font identification.
  240. * Major change to improve layout analysis for heavily diacritic languages: Thai, Vietnamese, Kannada, Telugu etc.
  241. * Fixed problems with shifted baselines so recognition can recover from layout analysis errors.
  242. * Major refactor to improve speed on difficult images, especially when running a heap checker.
  243. * Moved params from global in page layout to tesseractclass.
  244. * Improved single column layout analysis.
  245. * Allow ocr output to multiple formats using tesseract command line executable.
  246. * Fixed issues with mixed eng+ara scripts.
  247. * Improved script consistency in numbers.
  248. * Major refactor of control.cpp to enable line recognition.
  249. * Added tesstrain.sh - a master training script.
  250. * Added ability to text2image training tool to just list available fonts.
  251. * Added ability to text2image to underline words.
  252. * Improved efficiency of image processing for PDF output.
  253. * Added parameter description for each parameter listed with 'print-parameters' command line option.
  254. * Added font info to hOCR output.
  255. * Enabled streaming input and output of multi-page documents.
  256. * Many bug fixes.
  257. 2014-02-04 - V3.03(rc1)
  258. * Added new training tool text2image to generate box/tif file pairs from
  259. text and truetype fonts.
  260. * Added support for PDF output with searchable text.
  261. * Removed entire IMAGE class and all code in image directory.
  262. * Tesseract executable: support for output to stdout; limited support for one
  263. page images from stdin (especially on Windows)
  264. * Added Renderer to API to allow document-level processing and output
  265. of document formats, like hOCR, PDF.
  266. * Major refactor of word-level recognition, beam search, eliminating dead code.
  267. * Refactored classifier to make it easier to add new ones.
  268. * Generalized feature extractor to allow feature extraction from greyscale.
  269. * Improved sub/superscript treatment.
  270. * Improved baseline fit.
  271. * Added set_unicharset_properties to training tools.
  272. * Many bug fixes.
  273. * More training source data included.
  274. 2012-02-01 - V3.02
  275. * Moved ResultIterator/PageIterator to ccmain.
  276. * Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic.
  277. * Added paragraph detection in layout analysis/post OCR.
  278. * Fixed inconsistent xheight during training and over-chopping.
  279. * Added simultaneous multi-language capability.
  280. * Refactored top-level word recognition module.
  281. * Added experimental equation detector.
  282. * Improved handling of resolution from input images.
  283. * Blamer module added for error analysis.
  284. * Cleaned up externally used namespace by removing includes from baseapi.h.
  285. * Removed dead memory management code.
  286. * Tidied up constraints on control parameters.
  287. * Added support for ShapeTable in classifier and training.
  288. * Refactored class pruner.
  289. * Fixed training leaks and randomness.
  290. * Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding.
  291. * Improved line detection and removal.
  292. * Added fixed pitch chopper for CJK.
  293. * Added UNICHARSET to WERD_CHOICE to make multi-language handling easier.
  294. * Fixed problems with internally scaled images.
  295. * Added page and bbox to string in tr files to identify source of training data better.
  296. * Fixes to Hindi Shirorekha splitter.
  297. * Added word bigram correction.
  298. * Reduced stack memory consumption and eliminated some ugly typedefs.
  299. * Added new uniform classifier API.
  300. * Added new training error counter.
  301. * Fixed endian bug in dawg reader.
  302. * Many other fixes, including the way in which the chopper finds chops and messes with the outline while it does so.
  303. 2010-11-29 - V3.01
  304. * Removed old/dead serialise/deserialize methods on *LISTIZED classes.
  305. * Total rewrite of DENORM to better encapsulate operation and make
  306. for potential to extract features from images.
  307. * Thread-safety! Moved all critical global and static variables to members of the appropriate class. Tesseract is now thread-safe (multiple instances can be used in parallel in multiple threads), with the minor exception that some control parameters are still global and affect all threads.
  308. * Added Cube, a new recognizer for Arabic. Cube can also be used in combination with normal Tesseract for other languages with an improvement in accuracy at the cost of (much) lower speed. *There is no training module for Cube yet.*
  309. * `OcrEngineMode` in `Init` replaces `AccuracyVSpeed` to control cube.
  310. * Greatly improved segmentation search with consequent accuracy and speed improvements, especially for Chinese.
  311. * Added `PageIterator` and `ResultIterator` as cleaner ways to get the full results out of Tesseract, that are not currently provided by any of the `TessBaseAPI::Get*` methods. All other methods, such as the `ETEXT_STRUCT` in particular are deprecated and will be deleted in the future.
  312. * ApplyBoxes totally rewritten to make training easier. It can now cope with touching/overlapping training characters, and a new boxfile format allows word boxes instead of character boxes, BUT to use that you have to have already bootstrapped the language with character boxes. "Cyclic dependency" on traineddata.
  313. * Auto orientation and script detection added to page layout analysis.
  314. * Deleted *lots* of dead code.
  315. * Fixxht module replaced with scalable data-driven module.
  316. * Output font characteristics accuracy improved.
  317. * Removed the double conversion at each classification.
  318. * Upgraded oldest structs to be classes and deprecated PBLOB.
  319. * Removed non-deterministic baseline fit.
  320. * Added fixed length dawgs for Chinese.
  321. * Handling of vertical text improved.
  322. * Handling of leader dots improved.
  323. * Table detection greatly improved.
  324. * Fixed a couple of memory leaks.
  325. * Fixed font labels on output text. (Not perfect, but a lot better than before.)
  326. * Cleanup and more bug fixes
  327. * Special treatments for Hindi.
  328. * Support for build in VS2010 with Microsoft Windows SDK for Windows 7 (thanks to Michael Lutz)
  329. 2010-09-21 - V3.00
  330. * Preparations for thread safety:
  331. * Changed TessBaseAPI methods to be non-static
  332. * Created a class hierarchy for the directories to hold instance data,
  333. and began moving code into the classes.
  334. * Moved thresholding code to a separate class.
  335. * Added major new page layout analysis module.
  336. * Added HOCR output (issues 221, 263: thanks to amkryukov).
  337. * Added Leptonica as main image I/O and handling. Currently optional,
  338. but in future releases linking with Leptonica will be mandatory.
  339. * Ambiguity table rewritten to allow definite replacements in place
  340. of fix_quotes.
  341. * Added TessdataManager to combine data files into a single file.
  342. * Some dead code deleted.
  343. * VC++6 no longer supported. It can't cope with the use of templates.
  344. * Many more languages added.
  345. * Doxygenation of most of the function header comments.
  346. * Added man pages.
  347. * Added bash completion script (issue 247: thanks to neskiem)
  348. * Fix integer overflow in thresholding (issue 366: thanks to Cyanide.Drake)
  349. * Add Danish Fraktur support (issues 300, 360: thanks to
  350. dsl602230@vip.cybercity.dk)
  351. * Fix file pointer leak (issue 359, thanks to yukihiro.nakadaira)
  352. * Fix an error using user-words (Issue 345: thanks to max.markin)
  353. * Fix a memory leak in tablefind.cpp (Issue 342, thanks to zdravco)
  354. * Fix a segfault due to double fclose (Issue 320, thanks to souther)
  355. * Fix an automake error (Issue 318, thanks to ichanjz)
  356. * Fix a Win32 crash on fileFormatIsTiff() (Issues 304, 316, 317, 330, 347,
  357. 349, 352: thanks to nguyenq87, max.markin, zdenop)
  358. * Fixed a number of errors in newer (stricter) versions of VC++ (Issues
  359. 301, among others)
  360. 2009-06-30 - V2.04
  361. * Integrated bug fixes and patches and misc changes for portability.
  362. * Integrated a patch to remove some of the "access" macros.
  363. * Removed dependence on lua from the viewer, speeding it up
  364. dramatically.
  365. * Fixed the viewer so it compiles and runs properly!
  366. * Specifically fixing issues: 1, 63, 67, 71, 76, 81, 82, 106, 111,
  367. 112, 128, 129, 130, 133, 135, 142, 143, 145, 147, 153, 154, 160,
  368. 165, 170, 175, 177, 187, 192, 195, 199, 201, 205, 209, 108, 169
  369. 2008-04-22 - V2.03
  370. * Fixed crash introduced in 2.02.
  371. * Fixed lack of tessembedded.cpp in distribution.
  372. * Added test for leptonica header files and conditional test for lib.
  373. 2008-04-21 - V2.02 (again)
  374. * Fixed namespace collisions with jpeg library (INT32).
  375. * Portability fixes for Windows for new code.
  376. * Updates to autoconf system for new code.
  377. 2008-01-23 - V2.02
  378. * Improvements to clustering, training and classifier.
  379. * Major internationalization improvements for large-character-set
  380. languages, e.g. Kannada.
  381. * Removed some compiler warnings.
  382. * Added multipage tiff support for training and running.
  383. * Updated graphics output to talk to new java-based viewer.
  384. * Added ability to save n-best lists.
  385. * Added leptonica support for more file types.
  386. * Improved Init/End to make them safe.
  387. * Reduced memory use of dictionaries.
  388. * Added some new APIs to TessBaseAPI.
  389. 2007-08-27 - V2.01
  390. * Fixed UTF8 input problems with box file reader.
  391. * Fixed various infinite loops and crashes in dawg code.
  392. * Removed include of config_auto.h from host.h.
  393. * Added automatic wctype encoding to unicharset_extractor.
  394. * Fixed dawg table too full error.
  395. * Removed svn files from tarball.
  396. * Added new functions to tessdll.
  397. * Increased maximum utf8 string in a classification result to 8.
  398. 2007-07-02 - V2.00
  399. * Converted internal character handling to UTF8.
  400. * Trained with 6 languages.
  401. * Added unicharset_extractor, wordlist2dawg.
  402. * Added boxfile creation mode.
  403. * Added UNLV regression test capability.
  404. * Fixed problems with copyright and registered symbols.
  405. * Fixed extern "C" declarations problem.
  406. 2007-05-15 - V1.04
  407. * Added dll exports for Windows.
  408. * Fixed name collisions with stl etc.
  409. * Made some preliminary changes ready for unicodeization.
  410. * Several bug fixes discovered during unicodeization.
  411. 2007-02-02 - V1.03
  412. * Added mftraining and cntraining.
  413. * Added baseapi with adaptive thresholding for grey and color.
  414. * Fixed many memory leaks.
  415. * Fixed several bugs including lack of use of adaptive classifier.
  416. * Added ifdefs to eliminate graphics code and add embedded platform support.
  417. * Incorporated several patches, including 64-bit builds, Mac builds.
  418. * Minor accuracy improvements.
  419. 2006-10-04 - V1.02
  420. * Removed dependency on Aspirin.
  421. * Fixed a few missing Apache license headers.
  422. * Removed $log.
  423. 2006-09-07 - V1.01
  424. * Added mfcpch.cpp and getopt.cpp for VC++.
  425. * Fixed problem with greyscale images and no libtiff.
  426. * Stopped debug window from being used for the usage output.
  427. * Fixed load of inttemp for big-endian architectures.
  428. * Fixed some Mac compilation issues.
  429. 2006-06-16 - V1.0 of open source Tesseract checked-in.