Pinecone Canopy, tokenizer, poetry…
Pinecone released Canopy, a framework for RAG. It originally uses OpenAI as the LLM and embedding model provider, and Pinecone wants to cooperate with Anyscale for open-source LLM support.
The project was delayed a couple of weeks due to the war situation. Now it's back on track: I submitted a PR for Anyscale Endpoints (AE) support, and it is under final review.
A couple of things I learned from this process:
1. Tokenizer
You don't realize all the tricks around tokenizers if you simply call Llama 2's tokenizer from the `transformers` library. But initially we didn't want to pull in that heavyweight library; we wanted to use the lighter `tokenizers` library directly, and to avoid gated models like Llama 2, which require an HF token.
- So I tried `OpenLLM/tokenizer`, which gives slightly different tokenized results.
- `tokenizers` only loads from a JSON file, while `transformers` tokenizer classes load JSON and model files.
- We finally decided to go back to `from transformers import LlamaTokenizerFast` with `hf-internal-testing/llama-tokenizer`. This is the closest tokenizer we can get to Llama 2's (see the sketch below).
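Here is a minimal sketch of the two loading paths discussed above, assuming `transformers` and `tokenizers` are installed; the `tokenizer.json` path in the second half is a placeholder for wherever you saved the file.

```python
from tokenizers import Tokenizer
from transformers import LlamaTokenizerFast

# Un-gated copy of the Llama 2 tokenizer; no HF token needed.
hf_tok = LlamaTokenizerFast.from_pretrained("hf-internal-testing/llama-tokenizer")
print(hf_tok.encode("Hello, world!"))

# The lightweight `tokenizers` library can only load a single JSON file.
raw_tok = Tokenizer.from_file("tokenizer.json")  # placeholder path
print(raw_tok.encode("Hello, world!").ids)
```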
2. `pytest` for unit tests
Unit and system tests were added for Canopy, which is a good practice. Run `pytest test.py` to verify the results.
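As a sketch of what such a unit test might look like (the test body is my own illustration, not Canopy's actual test):

```python
# test.py: a minimal pytest sketch; the assertion is illustrative.
from transformers import LlamaTokenizerFast

def test_encode_decode_roundtrip():
    tok = LlamaTokenizerFast.from_pretrained("hf-internal-testing/llama-tokenizer")
    text = "Hello, world!"
    ids = tok.encode(text, add_special_tokens=False)
    # Decoding the ids should give back the original string.
    assert tok.decode(ids) == text
```

Then `pytest test.py` discovers and runs any `test_*` functions in the file.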
3. `flake8` and `mypy`
`flake8` is a style guide enforcement tool and `mypy` is a static type checker for Python. I tried `black` first, and it solves most of the formatting issues except for lines being too long; that can be handled with proper `flake8` configs, as in the example below.
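A common way to reconcile `black` and `flake8` is a config like this; the values are `black`'s defaults, not necessarily Canopy's actual settings:

```ini
# .flake8 (illustrative values)
[flake8]
# Match black's default line length instead of flake8's default of 79.
max-line-length = 88
# E203 (whitespace before ':') conflicts with black's slice formatting.
extend-ignore = E203
```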
4. `poetry`
`poetry` is a tool for dependency management and packaging in Python. At first `flake8` didn't pick up the proper configs because of how it was installed; that was solved by `poetry install` and `poetry run`.
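The workflow looks roughly like this, assuming `flake8` and `mypy` are declared as dev dependencies in the project's `pyproject.toml`:

```bash
# Install the project and its dependencies into poetry's virtualenv.
poetry install
# Run the tools from that environment so they pick up the pinned
# versions and the project's config files.
poetry run flake8 .
poetry run mypy .
```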