Will OpenAI release a tokenizer with vocab size > 150k by end of 2024?
9 traders · Ṁ246 volume · closes Dec 31
42% chance
The GPT-2 model used r50k_base: vocab size ≈ 50k (50,257)
The GPT-3 model used r50k_base: vocab size ≈ 50k (50,257)
The GPT-3.5 model used cl100k_base: vocab size ≈ 100k (100,277)
The GPT-4 model used cl100k_base: vocab size ≈ 100k (100,277)
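The progression above can be sketched as a simple resolution check. This is a minimal illustration, not part of the market's official resolution criteria: the vocab sizes are the `n_vocab` values reported by OpenAI's `tiktoken` library for these encodings, and the 150k threshold comes from the question title.

```python
# Approximate vocab sizes of OpenAI tokenizers released so far
# (values match tiktoken's n_vocab for each encoding; listed here
# as plain data so the sketch needs no third-party dependency).
VOCAB_SIZES = {
    "r50k_base": 50_257,     # GPT-2 / GPT-3
    "cl100k_base": 100_277,  # GPT-3.5 / GPT-4
}

def resolves_yes(vocab_size: int, threshold: int = 150_000) -> bool:
    """Hypothetical helper: would a tokenizer of this size resolve the market YES?"""
    return vocab_size > threshold

# As of the tokenizers listed above, no released encoding clears 150k.
print(any(resolves_yes(v) for v in VOCAB_SIZES.values()))  # False
```

So YES requires a new encoding with a strictly larger vocabulary than cl100k_base by roughly 50%, rather than an incremental bump.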
This question is managed and resolved by Manifold.
Related questions
Will OpenAI release a tokenizer with more than 210000 tokens before 2026?
24% chance
Will the next major LLM by OpenAI use a new tokenizer?
77% chance
Will a flagship (>60T training bytes) open-weights LLM from Meta which doesn't use a tokenizer be released in 2025?
43% chance
Will there be an AI language model that strongly surpasses ChatGPT and other OpenAI models before the end of 2025?
49% chance
Will OpenAI release an AI product with a cool name by Jan 1, 2025?
36% chance
Will OpenAI release next-generation models with varying capabilities and sizes?
77% chance
Will OpenAI reveal a textless LLM before 2025?
12% chance
Will OpenAI release a version of Voice Engine by the end of 2024?
81% chance
Will OpenAI release o2 (or o3) before 2026?
98% chance
Will OpenAI release a LLM with a bigger context length than 128K by 2026?
93% chance