v0.2.0

@taku910 released this 19 Feb 16:08
· 64 commits to master since this release

Major changes

N/A

New features

  • [ALL] Added the SentencePieceNormalizer class in C++/Python. It provides nearly the same functionality as spm_normalize. (C++ Sample)
  • [ALL] Added the SentencePieceProcessor::Normalize method in C++/Python. (Python Sample, C++ Sample)
  • [ALL] Added the ability to override the normalization spec before processing. (Python Sample)
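
To give a rough idea of what text normalization does before tokenization, here is a minimal sketch that uses only Python's standard unicodedata module, not SentencePiece itself. It applies plain NFKC followed by whitespace collapsing, which only approximates the normalization rules SentencePiece ships with. The function name approx_normalize is hypothetical and is not part of the library.

```python
import unicodedata


def approx_normalize(text: str) -> str:
    """Approximate normalization: NFKC folding, then whitespace collapsing.

    Illustration only. This is NOT SentencePiece's implementation, and its
    output can differ from spm_normalize on some inputs.
    """
    # NFKC maps compatibility characters (full-width letters, ligatures,
    # ideographic spaces) to their canonical equivalents.
    folded = unicodedata.normalize("NFKC", text)
    # str.split() with no arguments splits on any run of whitespace,
    # so joining with a single space collapses repeated spaces.
    return " ".join(folded.split())


print(approx_normalize("ＫＡＤＯＫＡＷＡ　 ABC"))  # prints "KADOKAWA ABC"
```

The full-width letters and the ideographic space are folded to their ASCII counterparts, which is the kind of cleanup a normalizer is expected to perform on raw input.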

Bug fixes & minor changes

  • Improve support for building with external abseil and protobuf. #869
  • Build a universal binary for the OSX release package. #892
  • Add the set_min_log_level function to the Python wrapper so the log level can be changed from Python. #893
  • Use the log-sum-exp technique when computing marginal probabilities in n-best tokenization to avoid underflow.
  • Support Python 3.12. #932
  • Improve thread utilization in batch encoding/decoding.
  • Fix a nasty bug in BPE position encoding.
  • Fix bugs in the handling of duplicated bigrams.
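
The underflow fix in the list above relies on the log-sum-exp trick. The following is a minimal sketch of that trick in plain Python, not the library's actual code. The function name log_sum_exp and the sample log-scores are illustrative assumptions.

```python
import math


def log_sum_exp(log_values):
    """Return log(sum(exp(v) for v in log_values)) without underflow."""
    if not log_values:
        return float("-inf")
    m = max(log_values)
    if m == float("-inf"):
        return m
    # Subtracting the maximum keeps every exponent <= 0 and makes the
    # largest term exactly exp(0) == 1, so the sum never collapses to 0.
    return m + math.log(sum(math.exp(v - m) for v in log_values))


# Hypothetical log-scores for three candidate segmentations.
log_scores = [-1000.0, -1001.0, -1002.0]

# Naive approach: each exp() underflows to 0.0, so the total is 0.0
# and the marginals cannot be computed.
naive_total = sum(math.exp(s) for s in log_scores)

# Stable approach: normalize in log space, exponentiate only at the end.
log_total = log_sum_exp(log_scores)
marginals = [math.exp(s - log_total) for s in log_scores]
print(naive_total, marginals)
```

Working in log space until the final step is what keeps very small path probabilities from vanishing when many candidates are combined.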
