20 Commits

Author SHA1 Message Date
caf916236f
Fix tests 2023-03-23 19:52:17 +01:00
85a4f9dbb0
Remove python 2023-03-22 00:07:38 +01:00
Mariatta Wijaya
82facf911f
Update the perf image path in readme (#68)
Closes #67
2023-03-17 13:46:32 -07:00
Shantanu Jain
529de22652 Bump version in pyproject.toml 2023-03-16 18:13:39 -07:00
Shantanu Jain
446cb49aff Bump version, sync codebase 2023-03-16 18:11:50 -07:00
Shantanu Jain
3e8620030c Bump version, sync codebase 2023-03-12 22:01:24 -07:00
Shantanu
b2e85f1423
Build aarch64 wheels under emulation (#54)
Co-authored-by: messense <messense@icloud.com>
2023-03-12 20:34:02 -07:00
Shantanu Jain
ec7c121e38 Bump version, sync codebase 2023-03-02 11:54:12 -08:00
Fritz Obermeyer
f5fbc9c5e9
Expose p50k_edit encoding (#32) 2023-02-26 11:36:53 -08:00
Shantanu Jain
fbaa86e0f0 Sync codebase 2023-02-25 21:01:46 -08:00
Nick Stathas
c4b8770184
Improve performance of byte_pair_merge (#31)
The improvements to `byte_pair_merge` are:
- Changing the `parts` vector to avoid repetition of data.

  This vector used to store ranges for which the invariant
  `parts[i].end == parts[i + 1].start` holds, which makes the vector twice
  as big as it needs to be. Keeping this vector small improves CPU-cache
  efficiency.
- Using `usize::MAX` as a sentinel in lieu of `Option` for the
  computation of the minimum rank.

  This change removes branching from the loop that computes the minimum
  rank, generating assembly that uses conditional moves instead.

  Ideally, we could keep the `Option` and inform it of the sentinel, much
  like `Option<NonZeroUsize>`. As far as I could tell, the old Rust
  [RFC](https://github.com/rust-lang/rfcs/pull/41) for specifying custom
  sentinels for `Option` has stalled, so we don't get to have nice things.
- Minimizing the number of lookups into `ranks` by looking up ranks once
  and iteratively updating them after each merge.

  This reduces the number of rank lookups from `n*m` to `n + O(m)`.
2023-02-25 20:37:35 -08:00
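The three changes described in the `byte_pair_merge` commit above can be sketched roughly as follows. This is a minimal illustration written from the commit message, not the code that was merged: the function name `byte_pair_merge_sketch`, the `(start, rank)` tuple layout, and the `get_rank` helper are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical sketch of a BPE merge loop. `ranks` maps byte sequences to a
/// merge priority (lower merges first). Returns the start offset of every
/// final token, followed by the end offset of the piece.
fn byte_pair_merge_sketch(piece: &[u8], ranks: &HashMap<Vec<u8>, usize>) -> Vec<usize> {
    // Each entry stores only a start offset plus the rank of merging this part
    // with the next one. The end of part i is the start of part i + 1, so it
    // is not stored. usize::MAX is the sentinel for "no merge available".
    let mut parts: Vec<(usize, usize)> = (0..=piece.len()).map(|i| (i, usize::MAX)).collect();

    // Rank of the bytes from the start of parts[i] to the start of
    // parts[i + skip + 2], or the sentinel if out of range or unknown.
    let get_rank = |parts: &[(usize, usize)], i: usize, skip: usize| -> usize {
        if i + skip + 2 < parts.len() {
            let key = &piece[parts[i].0..parts[i + skip + 2].0];
            ranks.get(key).copied().unwrap_or(usize::MAX)
        } else {
            usize::MAX
        }
    };

    // Look up the rank of every adjacent pair once, up front.
    for i in 0..parts.len().saturating_sub(2) {
        let rank = get_rank(&parts, i, 0);
        parts[i].1 = rank;
    }

    while parts.len() > 1 {
        // Minimum scan over plain integers: the sentinel never wins the
        // comparison, so no Option matching is needed inside the loop.
        let mut min_rank = (usize::MAX, 0usize);
        for (i, &(_, rank)) in parts[..parts.len() - 1].iter().enumerate() {
            if rank < min_rank.0 {
                min_rank = (rank, i);
            }
        }
        if min_rank.0 == usize::MAX {
            break; // nothing left to merge
        }

        let i = min_rank.1;
        // Only the merged part and its left neighbour change rank. Both are
        // recomputed as if parts[i + 1] were already removed (skip = 1).
        let merged_rank = get_rank(&parts, i, 1);
        parts[i].1 = merged_rank;
        if i > 0 {
            let left_rank = get_rank(&parts, i - 1, 1);
            parts[i - 1].1 = left_rank;
        }
        parts.remove(i + 1);
    }

    parts.iter().map(|p| p.0).collect()
}
```

With ranks `"ab" -> 0`, `"cd" -> 1`, and `"abcd" -> 2`, the piece `b"abcd"` collapses to the boundaries `[0, 4]`, a single token. Each merge triggers at most two further lookups, which is where the `n + O(m)` figure in the commit message comes from.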
Shantanu Jain
7830ed537b Bump version, sync codebase 2023-02-03 12:35:59 -08:00
Ted Sanders
156eff92d2
Add link to OpenAI Cookbook example (#20) 2023-01-23 11:48:52 -08:00
Arvid Lunnemark
cf385cada0
Fix docstring, type annotation for private method (#19) 2023-01-19 14:51:15 -08:00
Shantanu Jain
40d9b1f14e Update codebase 2023-01-03 13:57:17 -08:00
Nguyen-Khanh Vu
0f8ec705e2
Remove redundant word in docstring (#7)
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
2022-12-16 20:41:08 -06:00
Henrik Torget
4226a6c008
Remove duplicate words in docstring (#3) 2022-12-16 15:55:30 -06:00
Shantanu Jain
ab3688a401 README.md: minor improvements 2022-12-16 03:26:13 -06:00
Shantanu Jain
1f098ca4d7 Build wheels; update codebase 2022-12-14 18:15:24 -08:00
Shantanu Jain
a1a9f16826 [tiktoken] hello world 2022-12-14 15:33:00 -08:00