Pinecone Canopy, tokenizer, poetry…
Pinecone released Canopy, a framework for RAG. It originally has OpenAI as its LLM and embedding model provider, and wants to cooperate with Anyscale for open source LLM support.
The project was delayed a couple of weeks due to the war situation. Now it's back on track; I submitted a PR for AE support, which is under final review.
A couple of things I learned from this process:
1. Tokenizer
You don't realize all the tricks around tokenizers if you simply call Llama 2's tokenizer from the Transformers library. But initially we didn't want to use that heavyweight library; we wanted to use `tokenizers` directly instead, and to avoid gated models like Llama 2, which require an HF token.
So I tried:

- `OpenLLM/tokenizer`, which gives slightly different tokenized results.
- `tokenizers` only loads from a JSON file, while `transformers` tokenizer classes load both JSON and model files.
- We finally decided to go back to `from transformers import LlamaTokenizerFast` with `hf-internal-testing/llama-tokenizer`. This is the closest tokenizer we can get for Llama 2 (see the sketch below).
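A minimal sketch of that final choice; the repo id is the ungated tokenizer named above, and the sample text is just illustrative:

```python
# Load a Llama 2-compatible tokenizer without an HF token, via the
# ungated hf-internal-testing/llama-tokenizer checkpoint.
from transformers import LlamaTokenizerFast

tokenizer = LlamaTokenizerFast.from_pretrained("hf-internal-testing/llama-tokenizer")

text = "Canopy is a framework for RAG."  # illustrative sample
print(tokenizer.tokenize(text))                          # SentencePiece pieces
print(tokenizer.encode(text, add_special_tokens=False))  # token ids, no BOS/EOS
```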
2. pytest
`pytest` is used for unit testing. Unit and system tests were added for Canopy, which is a good practice. Run `pytest test.py` to verify the results.
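As a rough illustration (not Canopy's actual tests; the file and test names here are made up), a unit test for the tokenizer above could look like:

```python
# test.py -- hypothetical unit test: plain ASCII text should
# round-trip through encode/decode unchanged.
from transformers import LlamaTokenizerFast

def test_encode_decode_roundtrip():
    tok = LlamaTokenizerFast.from_pretrained("hf-internal-testing/llama-tokenizer")
    text = "Canopy is a framework for RAG."
    ids = tok.encode(text, add_special_tokens=False)
    assert len(ids) > 0
    assert tok.decode(ids) == text
```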
3. flake8 and mypy
flake8 is a style guide enforcement tool and mypy is a static type checker for Python.
I tried `black` first and it solved most of the formatting issues, except for lines being too long. That can be handled with proper flake8 configs.
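For example, a hypothetical setup.cfg (88 is black's default line length, so this keeps the two tools from disagreeing):

```ini
[flake8]
# Match black's 88-character default instead of flake8's 79.
max-line-length = 88
# E203 (whitespace before ':') conflicts with how black formats slices.
extend-ignore = E203

[mypy]
# Don't fail on third-party packages that ship no type stubs.
ignore_missing_imports = True
```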
4. poetry
poetry is a tool for dependency management and packaging in Python.
At first I couldn't get the proper flake8 configs picked up because of installation issues. That was solved by `poetry install` and running the tools through `poetry run`.
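Concretely, something along these lines (assuming the linters are declared as dev dependencies in pyproject.toml):

```bash
poetry install         # create the venv and install declared dependencies
poetry run flake8 .    # run flake8 inside poetry's environment
poetry run mypy .      # same for mypy
poetry run pytest      # and for the test suite
```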
